From amy at demarco.com Mon Nov 1 00:20:01 2021 From: amy at demarco.com (Amy Marrich) Date: Sun, 31 Oct 2021 19:20:01 -0500 Subject: [Diversity] Diversity and Inclusion Meeting Reminder Message-ID: The Diversity & Inclusion WG invites members of all OIF projects to attend our next meeting Monday November 1st, at 17:00 UTC in the #openinfra- diversity channel on OFTC. The agenda can be found at https://etherpad.openstack.org/p/diversity-wg-agenda. Please feel free to add any topics you wish to discuss at the meeting. Thanks, Amy (apotz) -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Nov 1 07:01:22 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 01 Nov 2021 08:01:22 +0100 Subject: Router interfaces are down In-Reply-To: References: Message-ID: <4390784.LvFx2qVVIh@p1> Hi, Please don't drop the ML from the thread. You have to go to the node where Your router is hosted and investigate there in the agent's logs. If ports are DOWN, I would start with checking the L2 agent logs (neutron-ovs-agent or linuxbridge-agent, depending on what You are using exactly). If there are no errors there, You can also check the neutron-server logs for why ports aren't set to be UP, as well as checking the neutron-l3-agent logs on the node where the router is hosted. On Thursday, 28 October 2021 18:41:29 CET Jibsan Joel Rosa Toirac wrote: > Here I let you some screenshots of my Network Topology: > > > > On Thu, 28 Oct 2021 at 7:43, Jibsan Joel Rosa Toirac () > wrote: > > Yes, it's a neutron router. Well, the router is centralized. It is located > > between the subnets and the node; all the subnets will pass through the > > router to the internet, but I don't know what else to check to set it up. I'm > > using a friend's Openstack instance on a server to check if I'm missing > > something and both nodes are the same. I will send you a screenshot of my > > network Topology. 
> > > > Greetings > > > > On Thu, Oct 28, 2021 at 2:33 AM Slawek Kaplonski > > > > wrote: > >> Hi, > >> > >> On Wednesday, 27 October 2021 21:39:55 CEST Jibsan Joel Rosa Toirac > >> > >> wrote: > >> > Hello, I'm trying to route all the requests from a private vlan to > >> > internet. I have a private network and all the Virtual Machines inside > >> > >> the > >> > >> > subnet config can do everything between them, but if I ping to > >> > >> Internet, it > >> > >> > doesn't work. > >> > > >> > When I see the router_external it says all the interfaces are DOWN. > >> > >> By router_external, You mean neutron router, right? > >> If so, what kind of router is it, centralized HA or non-HA, or maybe DVR? > >> Is the router scheduled properly to some node? You can check that with a > >> command > >> like "neutron l3-agent-list-hosting-router ". > >> > >> > I have searched everywhere but I can't find a solution for this. > >> > > >> > Thank you for your time > >> > >> -- > >> Slawek Kaplonski > >> Principal Software Engineer > >> Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From lokendrarathour at gmail.com Mon Nov 1 08:22:07 2021 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Mon, 1 Nov 2021 13:52:07 +0530 Subject: [ Tacker ] Passing a shell script/parameters as a file in cloud config In-Reply-To: References: Message-ID: Hello EveryOne, Any update on this, please. 
-Lokendra On Thu, Oct 28, 2021 at 2:43 PM Lokendra Rathour wrote: > Hi, > *In Tacker, while deploying a VNFD, can we pass a file (parameter file) and > keep it at a defined path using the cloud-config way?* > > Like in *generic hot template*s, we have the below-mentioned way to pass > a file directly, as below: > parameters: > foo: > default: bar > > resources: > > the_server: > type: OS::Nova::Server > properties: > # flavor, image etc > user_data: > str_replace: > template: {get_file: the_server_boot.sh} > params: > $FOO: {get_param: foo} > > > *but when using this approach in Tacker BaseHOT it gives an error saying * > "nstantiation wait failed for vnf 77693e61-c80e-41e0-af9a-a0f702f3a9a7, > error: VNF Create Resource CREATE failed: resources.obsvrnnu62mb: > resources.CAS_0_group.Property error: > resources.soft_script.properties.config: No content found in the "files" > section for get_file path: Files/scripts/install.py > 2021-10-28 00:46:35.677 3853831 ERROR oslo_messaging.rpc.server > " > do we have a defined way to use the HOT capability in Tacker? > > Defined Folder Structure for CSAR: > . > ├── BaseHOT > │ └── default > │ ├── RIN_vnf_hot.yaml > │ └── nested > │ ├── RIN_0.yaml > │ └── RIN_1.yaml > ├── Definitions > │ ├── RIN_df_default.yaml > │ ├── RIN_top_vnfd.yaml > │ ├── RIN_types.yaml > │ ├── etsi_nfv_sol001_common_types.yaml > │ └── etsi_nfv_sol001_vnfd_types.yaml > ├── Files > │ ├── images > │ └── scripts > │ └── install.py > ├── Scripts > ├── TOSCA-Metadata > │ └── TOSCA.meta > └── UserData > ├── __init__.py > └── lcm_user_data.py > > *Objective: * > To pass a file at a defined path on the VDU after the VDU is > instantiated/launched. > > -- > ~ Lokendra > skype: lokendrarathour > > > -- ~ Lokendra www.inertiaspeaks.com www.inertiagroups.com skype: lokendrarathour -------------- next part -------------- An HTML attachment was scrubbed... 
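[Editor's note on the objective stated above: the generic cloud-init way of placing a file at a defined path once the VDU boots is a `write_files` entry in the user data. A minimal, hedged sketch — the target path and payload are placeholders, and whether your BaseHOT/user-data class passes such a block through unchanged should be verified against your Tacker version:]

```yaml
#cloud-config
write_files:
  - path: /opt/scripts/install.py     # hypothetical target path on the VDU
    permissions: '0755'
    owner: root:root
    content: |
      #!/usr/bin/env python3
      # placeholder payload
      print("installed")
```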
URL: From mark at stackhpc.com Mon Nov 1 09:03:08 2021 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 1 Nov 2021 09:03:08 +0000 Subject: Openstack ansible VS kolla ansible In-Reply-To: References: Message-ID: I would recommend kolla-ansible, but then I am biased. Ultimately, both are capable of giving you a production-ready OpenStack deployment, you just need to work out which is the best fit for you. Mark On Sat, 30 Oct 2021 at 21:10, A Monster wrote: > > Openstack-ansible uses LXC containers to deploy openstack services, while Kolla uses docker containers instead. Which of these two deployment tools should I use for an OpenStack deployment, and what are the differences between them? From amonster369 at gmail.com Mon Nov 1 09:29:58 2021 From: amonster369 at gmail.com (A Monster) Date: Mon, 1 Nov 2021 10:29:58 +0100 Subject: Using ceph for openstack storage Message-ID: Thank you for your response. Sadly, I'm talking about actual production, but I'm very limited in terms of hardware. I was thinking about using RAID for the controller node as data redundancy, because I had the idea of maximizing the number of nova compute nodes. So basically I thought of using a controller with the following services (Nova, Neutron, Keystone, Horizon, Glance, Cinder and Swift). Following the configuration you suggested, I would have: - 3 Controllers that are also Ceph Monitors - 9 Nova compute nodes and Ceph OSDs My questions are: - is having multiple Ceph monitors for the sake of redundancy, or does it have a performance goal? - wouldn't combining Ceph OSD and Nova compute have performance drawbacks or damage the integrity of data stored in each node? Wouldn't it be better in this case to use two separate servers for swift and glance and use RAID for data redundancy instead of using Ceph SDS? Thank you very much for your time. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zigo at debian.org Mon Nov 1 12:12:48 2021 From: zigo at debian.org (Thomas Goirand) Date: Mon, 1 Nov 2021 13:12:48 +0100 Subject: Using ceph for openstack storage In-Reply-To: References: Message-ID: <73528885-aa0b-da12-1e46-4c45a094453e@debian.org> On 11/1/21 10:29 AM, A Monster wrote: > Thank you for your response. > > Sadly, I'm talking about actual production, but I'm very limited in > terms of hardware. > > I was thinking about using RAID for the controller node as data redundancy, > because I had the idea of maximizing the number of nova compute nodes. > So basically I thought of using a controller with the following > services (Nova, Neutron, Keystone, Horizon, Glance, Cinder and Swift). > Following the configuration you suggested, I would have: > - 3 Controllers that are also Ceph Monitors > - 9 Nova compute nodes and Ceph OSDs > > My questions are: > - is having multiple Ceph monitors for the sake of redundancy, or does it > have a performance goal? > - wouldn't combining Ceph OSD and Nova compute have performance > drawbacks or damage the integrity of data stored in each node? > > Wouldn't it be better in this case to use two separate servers for swift > and glance and use RAID for data redundancy instead of using Ceph SDS? > > Thank you very much for your time. > Hi, RAID will protect you from only a single type of failure on your controllers. RAID is *not* a good idea at all for Ceph or Swift (it will slow things down, and won't help much with redundancy). If you need, for example, to upgrade the operating system (for example because of a kernel security fix), you will have to restart your controllers, meaning there's going to be API downtime. If you set the Ceph Mon on the controllers, then you will have the issue of the Ceph Mon not being reachable during the upgrade, meaning you may end up with stuck I/O on all of your VMs. 
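[Editor's note, to make the monitor-availability point concrete: before rebooting a controller that also hosts a Ceph mon, it is usual to confirm quorum and temporarily disable rebalancing. A hedged sketch using standard Ceph CLI commands — exact output and flags vary by release:]

```
# Confirm all mons are currently in quorum before taking one down
ceph quorum_status --format json-pretty

# Keep OSDs from being marked "out" (and triggering data movement)
# while the node reboots
ceph osd set noout

# ... reboot / upgrade the node ...

# Re-enable normal behaviour afterwards
ceph osd unset noout
```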
Of course, combining Ceph OSD and Nova compute is less nice than having a dedicated cluster (especially: busy VMs may slow down your Ceph and increase latency). But considering your constraints, it's still better: for any serious Ceph setup, you need to be able to "lose" at least 10% of your Ceph cluster so it can recover without impacting your overall cluster too much. The same way, I would suggest running at least the swift-object service on your compute nodes: it's common to have Swift account + containers on SSD, to speed it up. It's ok-ish to run account+container on your 3 controllers, IMO. Again, the piece of advice I'm giving is only valid because of your constraints, otherwise I would suggest a larger cluster. I hope this helps, Cheers, Thomas Goirand (zigo) From mihalis68 at gmail.com Mon Nov 1 12:45:57 2021 From: mihalis68 at gmail.com (Chris Morgan) Date: Mon, 1 Nov 2021 08:45:57 -0400 Subject: Using ceph for openstack storage In-Reply-To: References: Message-ID: <996D3999-85B9-4244-AB07-AF727CDF5DEC@gmail.com> VMs and OSDs on the same node ("hyperconverged") is not a good idea in our experience. We used to run that way but moved to splitting nodes into either compute or storage. One of our older hyperconverged clusters OOM killed a VM only last week because ceph used up more memory than when the VM was scheduled. You also have different procedures to make a node safe for the compute role than for storage. It's tedious to have to worry about both when needing to take a node down for maintenance. Chris Morgan Sent from my iPhone > On Nov 1, 2021, at 5:36 AM, A Monster wrote: > > Thank you for your response. > > Sadly, I'm talking about actual production, but I'm very limited in terms of hardware. 
> > I was thinking about using RAID for the controller node as data redundancy, because I had the idea of maximizing the number of nova compute nodes, > So basically I thought of using a controller with the following services (Nova, Neutron, Keystone, Horizon, Glance, Cinder and Swift). > Following the configuration you suggested, I would have: > - 3 Controllers that are also Ceph Monitors > - 9 Nova compute nodes and Ceph OSDs > > My questions are: > - is having multiple Ceph monitors for the sake of redundancy, or does it have a performance goal? > - wouldn't combining Ceph OSD and Nova compute have performance drawbacks or damage the integrity of data stored in each node? > > Wouldn't it be better in this case to use two separate servers for swift and glance and use RAID for data redundancy instead of using Ceph SDS? > > Thank you very much for your time. > From satish.txt at gmail.com Mon Nov 1 13:11:34 2021 From: satish.txt at gmail.com (Satish Patel) Date: Mon, 1 Nov 2021 09:11:34 -0400 Subject: is OVS+DPDK useful for general purpose workload In-Reply-To: References: Message-ID: Thank you Laurent, Now I fully agree with you that running DPDK only on the host doesn't gain anything, because your VM guest will be the bottleneck. But again, most of the documents keep saying you will gain performance, and they never clarify what you said: that it's not for everyone but only for DPDK-based VMs. (I wish there were a general-purpose virtio-based PMD suitable for all kinds of workloads.) The only solutions left are XDP and SRIOV (sriov is complicated to deploy because it doesn't support bonds). On Sun, Oct 31, 2021 at 5:37 PM Laurent Dumont wrote: > > Most of the implementations I have seen for OVS-DPDK mean that the VM side would also use DPDK. > > Because even from a DPDK perspective at the compute level, the VM will become the bottleneck. 200k PPS with OVS-DPDK + non-DPDK VM is about what you get with OVS + OVSfirewall + non-DPDK VM. 
> > On Sun, Oct 31, 2021 at 12:21 AM Satish Patel wrote: >> >> Folks, >> >> I have deployed openstack and configured OVS-DPDK on compute nodes for >> high performance networking. My workload is general purpose workload >> like running haproxy, mysql, apache and XMPP etc. >> >> When I did load testing I found performance was average and after a >> 200kpps packet rate I noticed packet drops. I heard and read that DPDK >> can handle millions of packets but in my case it's not true. I am using >> virtio-net in the guest vm, which processes packets in the kernel, so I >> believe my bottleneck is my guest VM. >> >> I don't have any guest-based DPDK applications like testpmd etc. Does >> that mean OVS+DPDK isn't useful for my cloud? How do I take advantage >> of OVS+DPDK with a general purpose workload? >> >> Maybe I have the wrong understanding about DPDK so please help me :) >> >> Thanks >> ~S >> 
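[Editor's note, as a first diagnostic step for the kind of drops described above: OVS with the DPDK (userspace) datapath exposes per-PMD counters on the compute node. A hedged sketch using standard ovs-appctl commands, valid for recent OVS releases:]

```
# Show PMD core utilisation and packet/cycle statistics; saturated
# PMD threads show up here before the guest is even involved
ovs-appctl dpif-netdev/pmd-stats-show

# Show which rx queues are assigned to which PMD cores
ovs-appctl dpif-netdev/pmd-rxq-show

# Reset counters, run the load test, then sample pmd-stats-show again
ovs-appctl dpif-netdev/pmd-stats-clear
```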
A brief set of highlights is: - The store holding all the charms (https://jaas.ai/) just happens to be migrating to a new home (https://charmhub.io/) within the timescales of the Yoga cycle and some of the changes in behaviour provide some challenges and also opportunities to both simplify the charms and provide a better operator experience. The discussions highlighted the pain points and the general direction that the migration could take around: - Charm upgrades - OpenStack upgrades - Series upgrades of the underlying operating system. - The charms deployment of OpenStack supports versions back to mitaka (pre yoga) and queens (yoga cycle onwards). This has recently presented challenges around Python 3 support, particularly with py35 being EOL September 2020, and with py36 (bionic LTS) EOL 23rd December 2021. We discussed various strategies and approaches around maintenance, security and support of queens onwards in terms of support and upgrades. - The charms team also maintains a CI infrastructure based on Jenkins and (more recently) self-hosted Zuul to test installation and upgrade of OpenStack solutions. We discussed the further migrations of services from Jenkins to Zuul and issues associated with it. - Discussions around building new charms and refreshing old charms for the new components and features in the previous xena cycle and oncoming yoga cycle. Our raw notes are here: https://etherpad.opendev.org/p/charms-yoga-ptg Thank you again for all the participation in setting the direction and priorities for the yoga cycle. -- Alex Kavanagh - PTL Yoga -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Mon Nov 1 16:04:06 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 01 Nov 2021 11:04:06 -0500 Subject: [all][tc] Continuing the RBAC PTG discussion In-Reply-To: <17cc3068765.11a9586ce162263.9148179786177865922@ghanshyammann.com> References: <17cc3068765.11a9586ce162263.9148179786177865922@ghanshyammann.com> Message-ID: <17cdc3e5964.e42274e1372366.1164114504252831335@ghanshyammann.com> ---- On Wed, 27 Oct 2021 13:32:37 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > As decided in the PTG, we will continue the RBAC discussion from where we left off in the PTG. We will have a video > call next week based on the availability of most of the interested members. > > Please vote for your available time in the doodle poll below by Thursday (or Friday morning central time). > > - https://doodle.com/poll/6xicntb9tu657nz7 As per the doodle voting, I have scheduled it for Nov 3rd Wed, 15:00 - 16:00 UTC. Below is the link to join the call: https://meet.google.com/uue-adpp-xsm We will be taking notes in this etherpad https://etherpad.opendev.org/p/policy-popup-yoga-ptg -gmann > > NOTE: this is not specific to TC or people working on the RBAC work but is more for the wider community to > give feedback and finalize the direction (like what we did in the PTG session). > > Meanwhile, feel free to review Lance's updated proposal for the community-wide goal > - https://review.opendev.org/c/openstack/governance/+/815158 > > -gmann > > From manchandavishal143 at gmail.com Mon Nov 1 16:12:49 2021 From: manchandavishal143 at gmail.com (vishal manchanda) Date: Mon, 1 Nov 2021 21:42:49 +0530 Subject: [horizon][dev]Handle multiple login sessions from same user in Horizon In-Reply-To: References: Message-ID: Hi Arthur, Thanks for adding a new blueprint, just approved it. Looking forward to the implementation patches. Feel free to reach out to me or the horizon team for any further queries on IRC (#openstack-horizon) at OFTC n/w. 
Regards, Vishal Manchanda On Sat, Oct 30, 2021 at 12:47 AM Luz de Avila, Arthur < Arthur.LuzdeAvila at windriver.com> wrote: > Hi everyone, > > In order to improve the system, my colleagues and I would like to bring up > a new feature to Horizon. We found out that a user is able to log in to > Horizon with the same credentials on multiple devices and/or browsers. This > may not be very secure, as the user can log in on many different devices > and/or browsers with the same credential. > > Thinking about that, we would like to bring more control to the admin of the > system in a way that the admin can enable or disable multiple login > sessions according to the needs of the system. > > For a better follow-up of this proposal, a blueprint has been opened with > more details about the idea and concepts, and we would like the > opinion of the community on whether this feature makes sense to implement or not. > The blueprint is opened on launchpad: > https://blueprints.launchpad.net/horizon/+spec/handle-multiple-login-sessions-from-same-user-in-horizon > > Kind regards, > Arthur Avila > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abraden at verisign.com Mon Nov 1 16:35:50 2021 From: abraden at verisign.com (Braden, Albert) Date: Mon, 1 Nov 2021 16:35:50 +0000 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> Message-ID: <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> Hi Adrian, I don't think I'm qualified to be PTL but I'm willing to help, and I've asked for permission. We aren't using Adjutant at this time because we're on Train and I learned at my last contract that running Adjutant on Train is a hassle, but I hope to start using it after we get to Ussuri. Has anyone else volunteered? 
-----Original Message----- From: Adrian Turjak Sent: Wednesday, October 27, 2021 1:41 AM To: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] Adjutant needs contributors (and a PTL) to survive! Caution: This email originated from outside the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe. Hello fellow OpenStackers! I'm moving on to a different opportunity and my new role will not involve OpenStack, and there sadly isn't anyone at Catalystcloud who will be able to take over project responsibilities for Adjutant any time soon (not that I've been very onto it lately). As such Adjutant needs people to take over, and lead it going forward. I believe the codebase is in a reasonably good position for others to pick up, and I plan to go through and document a few more of my ideas for where it should go in storyboard so some of those plans exist somewhere should people want to pick up from where I left off before going fairly silent upstream. Plus if people want/need to they can reach out to me or add me to code review and chances are I'll comment/review because I do care about the project. Or I may contract some time to it. There are a few clouds running Adjutant, and people who have previously expressed interest in using it, so if you still are, the project isn't in a bad place at all. The code is stable, and the last few major refactors have cleaned up many of my biggest pain points with it. Best of luck! - adriant From yasufum.o at gmail.com Mon Nov 1 16:50:05 2021 From: yasufum.o at gmail.com (yasufum) Date: Tue, 2 Nov 2021 01:50:05 +0900 Subject: [tacker][ptg] Yoga PTG summary Message-ID: <8ba67bb6-3455-e5fd-ee02-9b366cf4301a@gmail.com> Hi everyone, Thank you for participating in the Yoga PTG sessions. We discussed 17 items in total, including 13 specs, and agreed to implement all of the proposed features. We also decided to set the spec freeze on 31st Dec. 
The etherpad of our PTG is at https://etherpad.opendev.org/p/tacker-yoga-ptg Here is a summary of the PTG sessions. *Day 1 1. Introduce Tacker installer * To reduce uninteresting steps of setting up Tacker, we introduce a dedicated installer for developers. * For beginners, we should minimize the difficulties of deploying and let them focus on their use cases. 2. Prometheus monitoring and AutoHeal support for Kubernetes Cluster VNF via FM Interface * Add fault management interface support compliant with the ETSI SOL003 standard, including polling mode and notify mode. * Polling mode responds to the resources named "Alarms" and "Individual alarm". On the other hand, notify mode responds to the resources named "Subscriptions" and "Notification endpoint". 3. Support VNF update operations reflecting changes to VNF instances using MgmtDriver * Add update APIs for container VNF instances by using ConfigMap and Secret of k8s. 4. Support CentOS stream * Revise our current incomplete CentOS support for the latest 'stream'. * We will provide some functional tests for the update, but non-voting for a while. 5. Add VNF Package sample for practical use cases * Provide more practical examples of Tacker use cases, because the current use cases in the documentation are simple and not sufficient for actual cases. * There are three examples proposed: "use multiple deployment flavours", "deploy VNF connected to external network" and "deploy VNF as HA cluster". *Day 2 6. Support handling large query results by ETSI NFV * Introduce an extended feature for handling large result sets for some queries defined in the ETSI standards. * The idea is simply dividing the results into several pieces with paging. 7. Support FT of Multi tenant * There are two problems in the multi-tenancy case: (1) no restriction in assigning a tenant, (2) notifying events on different tenants. 
* The policy of assigning a tenant should be clarified, and the notification process under multi-tenancy should also be changed. * We need to clarify the RBAC policy (is it allowed for the admin user to access all resources?). 8. Sample Ansible Driver * There are no management drivers that enable VNF configuration using Ansible, so add the scripts as samples and docs. * Revising the directory names to follow the conventions in the tacker repo is necessary. * This can be used for the 'Add VNF Package sample for practical use cases' proposal. We will support the sample development. 9. Support Robot Framework API * Currently Tacker functional tests mainly focus on checking various VNF patterns such as a simple VNF, multi VDU, volume attach and affinity set. The Tacker community is advancing ETSI NFV standard compliance, and coverage of compliant API testing becomes important. * The proposal is to use Robot Framework to achieve automated API testing and, in the first step, adopt the API test code released by ETSI NFV-TST010. 10. Add Tacker-horizon Unit Test Cases * `tacker-horizon` provides very limited features, such as showing a list of registered VIMs, VNFDs or so. We don't use it much beyond quickly checking Tacker instances via an intuitive Web GUI, so we have not maintained tacker-horizon very actively. * One of the reasons why bugs in tacker-horizon have not been fixed is that we have no unit tests. Although we can find a bug coincidentally while using the features, we should implement unit tests because bugs in horizon are somewhat contextual and not so easy to find by hand. *Day 3 11. Report for ETSI NFV API Conformance Test Results * Shared the results of Remote NFV&MEC API Plugtest 2021: in total, 136 API Conformance test sessions were executed for [NFV-SOL003] as well as for NFV-SOL005. * https://hackmd.io/q4DzQ6_2Q0e-TdmtBikVlQ?view 12. 
Reduce code clone from sol-kubernetes job of FT * There are many similar files in the functional tests that can be reduced; they make maintenance more complicated and difficult. The goal is to reduce the clone rate to less than 20%. * In particular, since sol_kubernetes's code clone rate is up to 40%, we think it will be better to refactor to make future maintenance easier. * Other details: https://hackmd.io/Wo8cBIH_RPe6ll1hNwmx1w?view 13. Reduce unnecessary test codes * This is similar to the above item. We have very similar YAML files for definitions and useless template files for tests that can be reduced. 14. Enhance NFV SOL_v3 LCM operation * Introduce the latest V2 APIs for LCM operation as below. * Scale VNF (POST /vnf_instances/{vnfInstanceId}/scale) * Heal VNF (POST /vnf_instances/{vnfInstanceId}/heal) * Modify VNF Information (PATCH /vnf_instances/{vnfInstanceId}) * Change External VNF Connectivity (POST /vnf_instances/{vnfInstanceId}/change_ext_conn) 15. Support ETSI NFV-SOL_v3 based error-handling operation * Introduce the latest V2 APIs for error handling operation as below. * Retry operation (POST /vnf_lcm_op_occs/{vnfLcmOpOccId}/retry) * Fail operation (POST /vnf_lcm_op_occs/{vnfLcmOpOccId}/fail) * Rollback operation (POST /vnf_lcm_op_occs/{vnfLcmOpOccId}/rollback) * The test scenario needs to include raising an error, becoming FAILED_TEMP, and executing the ErrorHandling API. * Adjusting timers and using an inappropriate VNF package can cause errors. 16. Support ChangeCurrentVNFPackage * Add an API for ChangeCurrentVNFPackage by which blue-green deployment and rolling update are supported. * Both VIMs, OpenStack and Kubernetes, are covered. 17. Support heal and scale method in lcm_user_data * Enable customizing stack parameters for heal and scale operations in the user script, `user_lcm_data.py` more specifically. * Call the proposed methods if they exist in the script for the operations. 
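[Editor's note, illustrating the v2 LCM endpoints listed in items 14 and 15: a hedged sketch of a Scale VNF request. The request body fields follow the ETSI SOL003 ScaleVnfRequest; the endpoint variable, token, and aspect id are placeholders to adapt to a concrete deployment:]

```
curl -X POST "$TACKER_ENDPOINT/vnflcm/v2/vnf_instances/$VNF_INSTANCE_ID/scale" \
  -H "Content-Type: application/json" \
  -H "X-Auth-Token: $OS_TOKEN" \
  -d '{"type": "SCALE_OUT", "aspectId": "worker", "numberOfSteps": 1}'
```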
Thanks, Yasufumi From manchandavishal143 at gmail.com Tue Nov 2 05:54:06 2021 From: manchandavishal143 at gmail.com (vishal manchanda) Date: Tue, 2 Nov 2021 11:24:06 +0530 Subject: [horizon] Skip tomorrow Weekly IRC Meeting Message-ID: Hi all, As discussed during the last weekly meeting[1], there will be no horizon weekly meeting tomorrow. See you next week! Thanks & regards, Vishal Manchanda [1] https://meetings.opendev.org/meetings/horizon/2021/horizon.2021-10-27-15.00.log.html#l-93 -------------- next part -------------- An HTML attachment was scrubbed... URL: From oleg.bondarev at huawei.com Tue Nov 2 07:36:09 2021 From: oleg.bondarev at huawei.com (Oleg Bondarev) Date: Tue, 2 Nov 2021 07:36:09 +0000 Subject: [neutron] Bug Deputy Report Oct 25 - 31 Message-ID: <0de9ec1bc641455da86e3820cdc77fc1@huawei.com> Hi everyone, Please find the Bug Deputy report for the week Oct 25 - 31st below. 1 Critical bug in stable/train is already in progress; 1 High bug needs an assignee. Several OVN bugs need triage from OVN folks + assignees. 
Critical - https://bugs.launchpad.net/neutron/+bug/1948804 - [stable/train] neutron-tempest-plugin scenario jobs fail "sudo: guestmount: command not found" o In Progress: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/815518 o Assigned to bcafarel High - https://bugs.launchpad.net/neutron/+bug/1948832 - "_disable_ipv6_addressing_on_interface" can't find the interface o Confirmed o Unassigned Medium - https://bugs.launchpad.net/neutron/+bug/1948676 - rpc response timeout for agent report_state is not possible o Fix released: https://review.opendev.org/c/openstack/neutron/+/815310 o Fixed by Tobias Urdin - https://bugs.launchpad.net/bugs/1948642 - Configuration of the ovs controller by neutron-ovs-agent isn't idempotent o In progress: https://review.opendev.org/c/openstack/neutron/+/815255 o Assigned to slaweq - https://bugs.launchpad.net/neutron/+bug/1948891 - [ovn] Using ovsdb-client for MAC_Binding could theoretically block indefinitely o Confirmed o Unassigned - https://bugs.launchpad.net/neutron/+bug/1949059 - ovn-octavia-provider: incorrect router association in NB when network is linked to more than 1 router o Confirmed o Unassigned - https://bugs.launchpad.net/neutron/+bug/1949081 - [OVN] check_for_mcast_flood_reports maintenance task not accounting for localnet "mcast_flood" changes o Confirmed o Assigned to lucasgomes Undecided - https://bugs.launchpad.net/neutron/+bug/1949097 - Cloud-Init cannot contact Meta-Data-Service on Xena with OVN o New o Unassigned - https://bugs.launchpad.net/neutron/+bug/1949202 - ovn-controllers are listed as agents but cannot be disabled o New o Unassigned - https://bugs.launchpad.net/neutron/+bug/1949230 - OVN Octavia provider driver should implement allowed_cidrs to enforce security groups on LB ports o New o Unassigned Invalid - https://bugs.launchpad.net/neutron/+bug/1948656 - toggling explicitly_egress_direct from true to false does not clean flows Thanks, Oleg --- Advanced Software Technology 
Lab Huawei -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Tue Nov 2 07:51:04 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 12:51:04 +0500 Subject: [neutron][ovn] Stateless Security Group Message-ID: Hi, I have upgraded my lab to the latest xena release and ovn 21.09 and ovs 2.16. I am trying to create a stateless security group, but it fails with the error message below. # openstack security group create --stateless sec02-stateless Error while executing command: BadRequestException: 400, Unrecognized attribute(s) 'stateful' I see the below logs in the neutron server logs. 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted ('172.16.40.45', 41272) server /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] Request body: {'security_group': {'name': 'sec02-stateless', 'stateful': False, 'description': 'sec02-stateless'}} prepare_request_body /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] An exception happened while processing the request body. 
The exception message is [Unrecognized attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized attribute(s) 'stateful' 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] create failed (client error): Unrecognized attribute(s) 'stateful' 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 Any advice on how to fix it? Ammad -------------- next part -------------- An HTML attachment was scrubbed... URL: From iurygregory at gmail.com Tue Nov 2 07:51:46 2021 From: iurygregory at gmail.com (Iury Gregory) Date: Tue, 2 Nov 2021 08:51:46 +0100 Subject: [ironic] Proposing Aija Jauntēva for sushy-core Message-ID: Hello everyone! I would like to propose Aija Jauntēva (irc: ajya) to be added to the sushy-core group. Aija has been in the ironic community for a long time, she has a lot of knowledge about redfish and is always providing good reviews. ironic-cores please vote with +/- 1. -- *Att[]'s Iury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the ironic-core and puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skaplons at redhat.com Tue Nov 2 08:03:53 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 09:03:53 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: Message-ID: <4691610.31r3eYUQgx@p1> Hi, On wtorek, 2 listopada 2021 08:51:04 CET Ammad Syed wrote: > Hi, > > I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. I > am trying to create stateless security group. But its getting failed with > below error message. > > # openstack security group create --stateless sec02-stateless > Error while executing command: BadRequestException: 400, Unrecognized > attribute(s) 'stateful' > > I see below logs in neutron server logs. > > 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > ('172.16.40.45', 41272) server > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] Request body: > {'security_group': {'name': 'sec02-stateless', 'stateful': False, > 'description': 'sec02-stateless'}} prepare_request_body > /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] An exception happened > while processing the request body. 
The exception message is [Unrecognized > attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > attribute(s) 'stateful' > 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] create failed (client > error): Unrecognized attribute(s) 'stateful' > 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > Any advice on how to fix it ? > > Ammad Do You have the 'stateful-security-group' API extension enabled? You can check it with the command: neutron ext-list If it's not loaded, You can check in the neutron-server logs why it wasn't loaded properly. -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From syedammad83 at gmail.com Tue Nov 2 08:09:14 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 13:09:14 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: <4691610.31r3eYUQgx@p1> References: <4691610.31r3eYUQgx@p1> Message-ID: Hi Slawek, I don't see any output with the below command. neutron ext-list | grep stateful-security-group I have checked logs and found below in neutron-server.log. 
# grep stateful-security-group neutron-server.log 2021-11-02 13:02:20.846 998 DEBUG neutron.api.extensions [req-b022ced2-f365-4ab9-9f61-25e915231e02 - - - - -] Ext name="Stateful security group" alias="stateful-security-group" description="Indicates if the security group is stateful or not" updated="2019-11-26T09:00:00-00:00" _check_extension /usr/lib/python3/dist-packages/neutron/api/extensions.py:416 2021-11-02 13:02:20.846 998 INFO neutron.api.extensions [req-b022ced2-f365-4ab9-9f61-25e915231e02 - - - - -] Extension stateful-security-group not supported by any of loaded plugins Do I need to do any change in neutron server configuration? Ammad On Tue, Nov 2, 2021 at 1:04 PM Slawek Kaplonski wrote: > Hi, > > On wtorek, 2 listopada 2021 08:51:04 CET Ammad Syed wrote: > > Hi, > > > > I have upgraded my lab to latest xena release and ovn 21.09 and ovs > 2.16. I > > am trying to create stateless security group. But its getting failed with > > below error message. > > > > # openstack security group create --stateless sec02-stateless > > Error while executing command: BadRequestException: 400, Unrecognized > > attribute(s) 'stateful' > > > > I see below logs in neutron server logs. 
> > > > 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > > ('172.16.40.45', 41272) server > > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > 98687873a146418eaeeb54a01693669f - default default] Request body: > > {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > 'description': 'sec02-stateless'}} prepare_request_body > > /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > 98687873a146418eaeeb54a01693669f - default default] An exception happened > > while processing the request body. The exception message is [Unrecognized > > attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > attribute(s) 'stateful' > > 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > 98687873a146418eaeeb54a01693669f - default default] create failed (client > > error): Unrecognized attribute(s) 'stateful' > > 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > > /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > > > Any advice on how to fix it ? > > > > Ammad > > Do You have 'stateful-security-group' API extension enabled? You can check > it > with command > > neutron ext-list > > If it's not loaded, You can check in the neutron-server logs while it > wasn't > loaded properly. > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... 
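[Editor's note] Neutron only exposes an API extension when at least one loaded plugin advertises its alias, which is why the server logs "Extension stateful-security-group not supported by any of loaded plugins" and the API then rejects the 'stateful' attribute. A minimal sketch of that filtering check (the names below are illustrative, not neutron's actual implementation; the real logic lives in neutron/api/extensions.py and the plugin classes):

```python
# Simplified model of neutron's extension filtering: an API extension is
# kept at startup only if some loaded plugin lists its alias.

def extension_supported(alias, plugins):
    # Mirrors the "supported by any of loaded plugins" check from the log.
    return any(
        alias in getattr(plugin, "supported_extension_aliases", [])
        for plugin in plugins
    )

class FakeOvnMl2Plugin:
    # Hypothetical plugin whose supported list omits the alias, mirroring
    # the situation reported in this thread.
    supported_extension_aliases = ["security-group", "port-security"]

loaded = [FakeOvnMl2Plugin()]
print(extension_supported("stateful-security-group", loaded))  # False
print(extension_supported("security-group", loaded))           # True
```

When the check returns False, the extension (and with it the 'stateful' request attribute) is simply never registered, producing the 400 seen above rather than an explicit "extension disabled" error.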
URL: From syedammad83 at gmail.com Tue Nov 2 08:25:27 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 13:25:27 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: <4691610.31r3eYUQgx@p1> Message-ID: I have below plugins loaded in neutron.conf core_plugin = ml2 service_plugins = ovn-router, qos, segments, port_forwarding and below extension drivers ml2_conf.ini mechanism_drivers = ovn extension_drivers = port_security, qos Ammad On Tue, Nov 2, 2021 at 1:09 PM Ammad Syed wrote: > Hi Slawek, > > I don't see any output with below command. > > neutron ext-list | grep stateful-security-group > > I have checked logs and found below in neutron-server.log. > > # grep stateful-security-group neutron-server.log > > 2021-11-02 13:02:20.846 998 DEBUG neutron.api.extensions > [req-b022ced2-f365-4ab9-9f61-25e915231e02 - - - - -] Ext name="Stateful > security group" alias="stateful-security-group" description="Indicates if > the security group is stateful or not" updated="2019-11-26T09:00:00-00:00" > _check_extension > /usr/lib/python3/dist-packages/neutron/api/extensions.py:416 > 2021-11-02 13:02:20.846 998 INFO neutron.api.extensions > [req-b022ced2-f365-4ab9-9f61-25e915231e02 - - - - -] Extension > stateful-security-group not supported by any of loaded plugins > > Do I need to do any change in neutron server configuration? > > Ammad > > On Tue, Nov 2, 2021 at 1:04 PM Slawek Kaplonski > wrote: > >> Hi, >> >> On wtorek, 2 listopada 2021 08:51:04 CET Ammad Syed wrote: >> > Hi, >> > >> > I have upgraded my lab to latest xena release and ovn 21.09 and ovs >> 2.16. I >> > am trying to create stateless security group. But its getting failed >> with >> > below error message. >> > >> > # openstack security group create --stateless sec02-stateless >> > Error while executing command: BadRequestException: 400, Unrecognized >> > attribute(s) 'stateful' >> > >> > I see below logs in neutron server logs. 
>> > >> > 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted >> > ('172.16.40.45', 41272) server >> > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 >> > 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base >> > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 >> 19844bf62a7b498eb443508ef150e9b8 >> > 98687873a146418eaeeb54a01693669f - default default] Request body: >> > {'security_group': {'name': 'sec02-stateless', 'stateful': False, >> > 'description': 'sec02-stateless'}} prepare_request_body >> > /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 >> > 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base >> > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 >> 19844bf62a7b498eb443508ef150e9b8 >> > 98687873a146418eaeeb54a01693669f - default default] An exception >> happened >> > while processing the request body. The exception message is >> [Unrecognized >> > attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized >> > attribute(s) 'stateful' >> > 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource >> > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 >> 19844bf62a7b498eb443508ef150e9b8 >> > 98687873a146418eaeeb54a01693669f - default default] create failed >> (client >> > error): Unrecognized attribute(s) 'stateful' >> > 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi >> > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 >> 19844bf62a7b498eb443508ef150e9b8 >> > 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST >> > /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 >> > >> > Any advice on how to fix it ? >> > >> > Ammad >> >> Do You have 'stateful-security-group' API extension enabled? You can >> check it >> with command >> >> neutron ext-list >> >> If it's not loaded, You can check in the neutron-server logs while it >> wasn't >> loaded properly. 
>> >> -- >> Slawek Kaplonski >> Principal Software Engineer >> Red Hat > > > > -- > Regards, > > > Syed Ammad Ali > -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Tue Nov 2 08:25:44 2021 From: katonalala at gmail.com (Lajos Katona) Date: Tue, 2 Nov 2021 09:25:44 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: Message-ID: Hi, stateless security groups are only available with the iptables-based drivers: https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/notes/stateful-security-group-04b2902ed9c44e4f.yaml For OVS and OVN we have open RFEs, but as far as I know at the moment nobody works on them: https://bugs.launchpad.net/neutron/+bug/1885261 https://bugs.launchpad.net/neutron/+bug/1885262 Regards Lajos Katona (lajoskatona) Ammad Syed wrote (on Tue, 2 Nov 2021 at 9:00): > Hi, > > I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. > I am trying to create stateless security group. But its getting failed with > below error message. > > # openstack security group create --stateless sec02-stateless > Error while executing command: BadRequestException: 400, Unrecognized > attribute(s) 'stateful' > > I see below logs in neutron server logs. 
> > 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > ('172.16.40.45', 41272) server > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] Request body: > {'security_group': {'name': 'sec02-stateless', 'stateful': False, > 'description': 'sec02-stateless'}} prepare_request_body > /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] An exception happened > while processing the request body. The exception message is [Unrecognized > attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > attribute(s) 'stateful' > 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] create failed (client > error): Unrecognized attribute(s) 'stateful' > 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > Any advice on how to fix it ? > > Ammad > -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Tue Nov 2 08:29:13 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 13:29:13 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: Message-ID: Thanks Lajos, I was checking the release notes and found that stateless acl is supported by ovn in xena. 
https://docs.openstack.org/releasenotes/neutron/xena.html#:~:text=Support%20stateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B.%20The%20stateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80%9Callow-stateless%E2%80%9D%20OVN%20ACL%20verb . Ammad On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona wrote: > Hi, > statefull security-groups are only available with iptables based drivers: > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/notes/stateful-security-group-04b2902ed9c44e4f.yaml > > For OVS and OVN we have open RFE, nut as I know at the moment nobody works > on them: > https://bugs.launchpad.net/neutron/+bug/1885261 > https://bugs.launchpad.net/neutron/+bug/1885262 > > Regards > Lajos Katona (lajoskatona) > > Ammad Syed ezt ?rta (id?pont: 2021. nov. 2., K, > 9:00): > >> Hi, >> >> I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. >> I am trying to create stateless security group. But its getting failed with >> below error message. >> >> # openstack security group create --stateless sec02-stateless >> Error while executing command: BadRequestException: 400, Unrecognized >> attribute(s) 'stateful' >> >> I see below logs in neutron server logs. 
>> >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted >> ('172.16.40.45', 41272) server >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 >> 98687873a146418eaeeb54a01693669f - default default] Request body: >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, >> 'description': 'sec02-stateless'}} prepare_request_body >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 >> 98687873a146418eaeeb54a01693669f - default default] An exception happened >> while processing the request body. The exception message is [Unrecognized >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized >> attribute(s) 'stateful' >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 >> 98687873a146418eaeeb54a01693669f - default default] create failed (client >> error): Unrecognized attribute(s) 'stateful' >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 >> >> Any advice on how to fix it ? >> >> Ammad >> > -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 2 08:50:43 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 09:50:43 +0100 Subject: [neutron] CI meeting - Tuesday 02.11.2021 Message-ID: <1800045.tdWV9SEqCh@p1> Hi, As we discussed during the PTG and in the last week's CI meeting, this week's meeting will be a video call. 
Please join https://meetpad.opendev.org/neutron-ci-meetings at 3:00pm UTC if You are interested in the Neutron CI. We will also keep the meeting open in the #openstack-neutron IRC channel in case anyone would like to participate that way. Agenda for the meeting is at https://etherpad.opendev.org/p/neutron-ci-meetings -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From elfosardo at gmail.com Tue Nov 2 08:51:54 2021 From: elfosardo at gmail.com (Riccardo Pittau) Date: Tue, 2 Nov 2021 09:51:54 +0100 Subject: Re: [ironic] Proposing Aija Jauntēva for sushy-core In-Reply-To: References: Message-ID: +1 Aija has done a great job so far :) On Tue, Nov 2, 2021 at 9:00 AM Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jauntēva (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > > > *Att[]'s Iury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skaplons at redhat.com Tue Nov 2 08:54:07 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 09:54:07 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: Message-ID: <2800702.e9J7NaK4W3@p1> Hi, On Tuesday, 2 November 2021 09:29:13 CET Ammad Syed wrote: > Thanks Lajos, > > I was checking the release notes and found that stateless acl is supported > by ovn in xena. > > https://docs.openstack.org/releasenotes/neutron/xena.html#:~:text=Support%20stateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B.%20The%20stateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80%9Callow-stateless%E2%80%9D%20OVN%20ACL%20verb . It should be supported by the OVN driver now IIRC. Maybe we forgot about adding this extension to the list: https://github.com/openstack/neutron/blob/master/neutron/common/ovn/extensions.py#L93 Can You try to add it there and see if the extension will be loaded then? > > Ammad > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona wrote: > > Hi, > > statefull security-groups are only available with iptables based > drivers: > > > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/notes/stateful-security-group-04b2902ed9c44e4f.yaml > > > > For OVS and OVN we have open RFE, nut as I know at the moment nobody works > > on them: > > https://bugs.launchpad.net/neutron/+bug/1885261 > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > Regards > > Lajos Katona (lajoskatona) > > > > Ammad Syed wrote (on Tue, 2 Nov 2021 at 9:00): > >> Hi, > >> > >> I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. > >> I am trying to create stateless security group. But its getting failed with > >> below error message. 
> >> > >> # openstack security group create --stateless sec02-stateless > >> Error while executing command: BadRequestException: 400, Unrecognized > >> attribute(s) 'stateful' > >> > >> I see below logs in neutron server logs. > >> > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > >> ('172.16.40.45', 41272) server > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > >> 'description': 'sec02-stateless'}} prepare_request_body > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > >> 98687873a146418eaeeb54a01693669f - default default] An exception happened > >> while processing the request body. The exception message is [Unrecognized > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > >> attribute(s) 'stateful' > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > >> 98687873a146418eaeeb54a01693669f - default default] create failed (client > >> error): Unrecognized attribute(s) 'stateful' > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > >> > >> Any advice on how to fix it ? > >> > >> Ammad -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From dtantsur at redhat.com Tue Nov 2 08:58:47 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 2 Nov 2021 09:58:47 +0100 Subject: Re: [ironic] Proposing Aija Jauntēva for sushy-core In-Reply-To: References: Message-ID: On Tue, Nov 2, 2021 at 8:58 AM Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jauntēva (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > Wait, why just 1? I have the whole +2! :) Aija has been extremely helpful in all things around Redfish, I feel absolutely confident in trusting her with the core rights. Welcome! Dmitry > > -- > > > *Att[]'s Iury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Tue Nov 2 09:04:40 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 14:04:40 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: <2800702.e9J7NaK4W3@p1> References: <2800702.e9J7NaK4W3@p1> Message-ID: Hi Slawek, Yes, after adding the extension, the SG is created with stateful=false. # neutron ext-list | grep stateful-security-group neutron CLI is deprecated and will be removed in the Z cycle. Use openstack CLI instead. 
| stateful-security-group | Stateful security group # openstack security group create --stateless sec02-stateless +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Field | Value | +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | created_at | 2021-11-02T09:02:42Z | | description | sec02-stateless | | id | 29c28678-9a03-496c-8157-4afbcdc8f2af | | name | sec02-stateless | | project_id | 98687873a146418eaeeb54a01693669f | | revision_number | 1 | | rules | created_at='2021-11-02T09:02:42Z', direction='egress', ethertype='IPv6', id='17079c04-dc1d-4fbd-9f15-e79c6e585932', standard_attr_id='2863', updated_at='2021-11-02T09:02:42Z' | | | created_at='2021-11-02T09:02:42Z', direction='egress', ethertype='IPv4', id='fadfbf09-f759-453d-b493-e6f73077113a', standard_attr_id='2860', updated_at='2021-11-02T09:02:42Z' | | stateful | False | | tags | [] | | updated_at | 2021-11-02T09:02:42Z | +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Let me test this feature further. Ammad On Tue, Nov 2, 2021 at 1:54 PM Slawek Kaplonski wrote: > Hi, > > On wtorek, 2 listopada 2021 09:29:13 CET Ammad Syed wrote: > > Thanks Lajos, > > > > I was checking the release notes and found that stateless acl is > supported > > by ovn in xena. > > > > https://docs.openstack.org/releasenotes/neutron/ > xena.html#:~:text=Support%20st > > ateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B. > %20The%20st > > > > ateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80%9C > > allow-stateless%E2%80%9D%20OVN%20ACL%20verb . 
> > It should be supported by the OVN driver now IIRC. Maybe we forgot about > adding this extension to the list: > https://github.com/openstack/neutron/blob/ > master/neutron/common/ovn/extensions.py#L93 > > Can You try to add it there and see if the extension will be loaded then? > > > > > Ammad > > > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona > wrote: > > > Hi, > > > statefull security-groups are only available with iptables based > drivers: > > > > > > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/ > note > > > s/stateful-security-group-04b2902ed9c44e4f.yaml > > > > > > For OVS and OVN we have open RFE, nut as I know at the moment nobody > works > > > on them: > > > https://bugs.launchpad.net/neutron/+bug/1885261 > > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > > > Regards > > > Lajos Katona (lajoskatona) > > > > > > Ammad Syed ezt ?rta (id?pont: 2021. nov. 2., > K, > > > > > > 9:00): > > >> Hi, > > >> > > >> I have upgraded my lab to latest xena release and ovn 21.09 and ovs > 2.16. > > >> I am trying to create stateless security group. But its getting > failed > with > > >> below error message. > > >> > > >> # openstack security group create --stateless sec02-stateless > > >> Error while executing command: BadRequestException: 400, Unrecognized > > >> attribute(s) 'stateful' > > >> > > >> I see below logs in neutron server logs. 
> > >> > > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > > >> ('172.16.40.45', 41272) server > > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > >> 'description': 'sec02-stateless'}} prepare_request_body > > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > >> 98687873a146418eaeeb54a01693669f - default default] An exception > happened > > >> while processing the request body. The exception message is > [Unrecognized > > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > >> attribute(s) 'stateful' > > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > >> 98687873a146418eaeeb54a01693669f - default default] create failed > (client > > >> error): Unrecognized attribute(s) 'stateful' > > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > >> > > >> Any advice on how to fix it ? > > >> > > >> Ammad > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... 
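[Editor's note] Per the Xena release note cited earlier in the thread, a security group with stateful=False is mapped to the new "allow-stateless" OVN ACL verb (available with OVN 21.06+), while stateful groups keep the conntrack-backed "allow-related" action. A rough sketch of that mapping, with an illustrative function name (the real logic lives in neutron's OVN mechanism driver and also checks the OVN schema):

```python
# Rough model of the stateful flag -> OVN ACL action mapping described in
# the Xena release note; this is a sketch, not neutron's actual code.

def ovn_acl_action(sg_stateful, ovn_supports_stateless):
    if not sg_stateful and ovn_supports_stateless:
        # OVN 21.06+ "allow-stateless" verb: traffic is allowed without
        # committing connections to conntrack.
        return "allow-stateless"
    # Default stateful behaviour: allow, and track related return traffic.
    return "allow-related"

print(ovn_acl_action(sg_stateful=False, ovn_supports_stateless=True))   # allow-stateless
print(ovn_acl_action(sg_stateful=True, ovn_supports_stateless=True))    # allow-related
print(ovn_acl_action(sg_stateful=False, ovn_supports_stateless=False))  # allow-related
```

The third case illustrates why the OVN version matters: on an older OVN without the verb, a stateless group can only fall back to stateful ACL behaviour.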
URL: From skaplons at redhat.com Tue Nov 2 09:45:38 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 10:45:38 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: <2800702.e9J7NaK4W3@p1> Message-ID: <2724584.mvXUDI8C0e@p1>
Hi,
On Tuesday, 2 November 2021 10:04:40 CET Ammad Syed wrote:
> Hi Slawek,
>
> Yes, after adding extension, SG created with stateful=false.

That's good. Can You report a Launchpad bug for that? And You can also propose that change as the fix for that bug too :)

> # neutron ext-list | grep stateful-security-group
> neutron CLI is deprecated and will be removed in the Z cycle. Use openstack CLI instead.
> | stateful-security-group | Stateful security group
>
> # openstack security group create --stateless sec02-stateless
> +-----------------+--------------------------------------+
> | Field           | Value                                |
> +-----------------+--------------------------------------+
> | created_at      | 2021-11-02T09:02:42Z                 |
> | description     | sec02-stateless                      |
> | id              | 29c28678-9a03-496c-8157-4afbcdc8f2af |
> | name            | sec02-stateless                      |
> | project_id      | 98687873a146418eaeeb54a01693669f     |
> | revision_number | 1                                    |
> | rules           | created_at='2021-11-02T09:02:42Z', direction='egress', ethertype='IPv6', id='17079c04-dc1d-4fbd-9f15-e79c6e585932', standard_attr_id='2863', updated_at='2021-11-02T09:02:42Z' |
> |                 | created_at='2021-11-02T09:02:42Z', direction='egress', ethertype='IPv4', id='fadfbf09-f759-453d-b493-e6f73077113a', standard_attr_id='2860', updated_at='2021-11-02T09:02:42Z' |
> | stateful        | False                                |
> | tags            | []                                   |
> | updated_at      | 2021-11-02T09:02:42Z                 |
> +-----------------+--------------------------------------+
>
> Let me test this feature further.
>
> Ammad
>
> On Tue, Nov 2, 2021 at 1:54 PM Slawek Kaplonski wrote:
> > Hi,
> > On Tuesday, 2 November 2021 09:29:13 CET Ammad Syed wrote:
> > > Thanks Lajos,
> > >
> > > I was checking the release notes and found that stateless acl is supported by ovn in xena.
> > > https://docs.openstack.org/releasenotes/neutron/xena.html#:~:text=Support%20stateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B.%20The%20stateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80%9Callow-stateless%E2%80%9D%20OVN%20ACL%20verb .
> >
> > It should be supported by the OVN driver now IIRC. Maybe we forgot about adding this extension to the list: https://github.com/openstack/neutron/blob/master/neutron/common/ovn/extensions.py#L93
> > Can You try to add it there and see if the extension will be loaded then?
> >
> > > Ammad
> > >
> > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona wrote:
> > > > Hi,
> > > > statefull security-groups are only available with iptables based drivers:
> > > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/notes/stateful-security-group-04b2902ed9c44e4f.yaml
> > > >
> > > > For OVS and OVN we have open RFE, nut as I know at the moment nobody works on them:
> > > > https://bugs.launchpad.net/neutron/+bug/1885261
> > > > https://bugs.launchpad.net/neutron/+bug/1885262
> > > >
> > > > Regards
> > > > Lajos Katona (lajoskatona)
> > > >
> > > > Ammad Syed wrote (on Tue, 2 Nov 2021 at 9:00):
> > > >> Hi,
> > > >>
> > > >> I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16.
> > > >> I am trying to create stateless security group. But its getting failed with below error message.
> > > >>
> > > >> # openstack security group create --stateless sec02-stateless
> > > >> Error while executing command: BadRequestException: 400, Unrecognized attribute(s) 'stateful'
> > > >>
> > > >> I see below logs in neutron server logs.
> > > >>
> > > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted ('172.16.40.45', 41272) server /usr/lib/python3/dist-packages/eventlet/wsgi.py:992
> > > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] Request body: {'security_group': {'name': 'sec02-stateless', 'stateful': False, 'description': 'sec02-stateless'}} prepare_request_body /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729
> > > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] An exception happened while processing the request body.
The exception message is > > > > [Unrecognized > > > > > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > > >> attribute(s) 'stateful' > > > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > >> 98687873a146418eaeeb54a01693669f - default default] create failed > > > > (client > > > > > >> error): Unrecognized attribute(s) 'stateful' > > > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > > > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > > >> > > > >> Any advice on how to fix it ? > > > >> > > > >> Ammad > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From swogatpradhan22 at gmail.com Tue Nov 2 09:59:27 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Tue, 2 Nov 2021 15:29:27 +0530 Subject: [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria Message-ID: Hi, I have 2 OpenStack setups: Setup1. OpenStack Queens + Ceph Nautilus Setup2. OpenStack Victoria + Ceph Octopus So, I am trying to migrate some VMs (Windows and Linux) from Setup1 to Setup2. To migrate the VMs, I am using rbd export on Setup1 and then rbd import on Setup2. I have successfully migrated 21 VMs. I am now facing an issue with the 22nd VM: after migration, the VM is stuck at the Windows logo screen and does not boot any further, and I can't work out how to approach it. 
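For reference, the rbd export/import flow described above can be sketched roughly as follows. The pool and image names are hypothetical placeholders, not taken from the thread, and the `run` wrapper only prints each command so the sketch can be read and sanity-checked without a live Ceph cluster; drop the echo to execute for real.

```shell
# Hedged sketch of migrating one RBD-backed volume between two Ceph clusters.
# The names below (pool "volumes", image "volume-22") are illustrative only.
run() { echo "+ $*"; }   # print-only; replace the echo with "$@" to actually execute

VOL="volume-22"

# On the source (Queens/Nautilus) cluster: dump the image to a flat file.
run rbd -p volumes export "$VOL" "/tmp/$VOL.img"

# ...transfer /tmp/$VOL.img to the destination cluster, e.g. via scp...

# On the destination (Victoria/Octopus) cluster: re-create the image from the file.
run rbd -p volumes import "/tmp/$VOL.img" "$VOL"
```

When a guest that was copied this way hangs at boot, the RBD data itself is often fine; differences in the guest-visible hardware of the new libvirt XML (disk bus, machine type, firmware) are a common culprit worth comparing first.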
Attached are the instance.xml files of both the Queens and Victoria setups for the same VM. With regards, Swogat pradhan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: queens Type: application/octet-stream Size: 5191 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: victoria Type: application/octet-stream Size: 5320 bytes Desc: not available URL: From swogatpradhan22 at gmail.com Tue Nov 2 10:01:38 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Tue, 2 Nov 2021 15:31:38 +0530 Subject: [Update] [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria In-Reply-To: References: Message-ID: Hi, I have even tried uploading the primary drive to Glance in the Queens setup and launching a VM from that Glance image in the Victoria setup, but I still face the same problem. On Tue, Nov 2, 2021 at 3:29 PM Swogat Pradhan wrote: > Hi, > I have 2 openstack setups > Setup1. Openstack queens + ceph nautilaus > Setup2. Openstack victoria + ceph octopus > > So, I am trying to migrate some VM's (windows and linux) from Setup1 to > Setup2. > For migrating the VM's i am using rbd export on setup1 and then rbd import > on setup2. > > I have successfully migrated 21 VM's. > > i am now facing an issue in the 22nd vm which, after migrating the VM the > vm is stuck in the windows logo screen and not moving forward, and i can't > seem to understand how to approach it. > > > Attached the instance.xml files of both queens and victoria setup's of the > same VM. > > With regards, > Swogat pradhan > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arne.wiebalck at cern.ch Tue Nov 2 10:16:57 2021 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Tue, 2 Nov 2021 11:16:57 +0100 Subject: Re: [ironic] Proposing Aija Jauntēva for sushy-core In-Reply-To: References: Message-ID: <2c67fc97-fabf-6603-d9fe-96a53297519f@cern.ch> +1 Great job, Aija! On 02.11.21 08:51, Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jauntēva (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > /Att[]'s > Iury Gregory Melo Ferreira > //MSc in Computer Science at UFCG > / > /Part of the ironic-core and puppet-manager-core team in OpenStack/ > //Software Engineer at Red Hat Czech// > /Social/:https://www.linkedin.com/in/iurygregory > > /E-mail: iurygregory at gmail.com / From syedammad83 at gmail.com Tue Nov 2 10:24:20 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 15:24:20 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: <2724584.mvXUDI8C0e@p1> References: <2800702.e9J7NaK4W3@p1> <2724584.mvXUDI8C0e@p1> Message-ID: Hi, I have reported the bug but not sure how to propose that change. Any guide to propose change would be highly appreciated. https://bugs.launchpad.net/neutron/+bug/1949451 On Tue, Nov 2, 2021 at 2:45 PM Slawek Kaplonski wrote: > Hi, > > On wtorek, 2 listopada 2021 10:04:40 CET Ammad Syed wrote: > > Hi Slawek, > > > > Yes, after adding extension, SG created with stateful=false. > > That's good. Can You report an Launchpad bug for that? And You can also > propose that change as fix for that bug too :) > > > > > # neutron ext-list | grep stateful-security-group > > neutron CLI is deprecated and will be removed in the Z cycle. Use > openstack > > CLI instead. 
> > > > | stateful-security-group | Stateful security group > > > > # openstack security group create --stateless sec02-stateless > > +----------------- > +----------------------------------------------------------- > > > > ------------------------------------------------------------------------------ > > ---------------------------------------+ > > | Field | Value > > > > +----------------- > +----------------------------------------------------------- > > > > ------------------------------------------------------------------------------ > > ---------------------------------------+ > > | created_at | 2021-11-02T09:02:42Z > > | > > | > > | description | sec02-stateless > > | > > | > > | id | 29c28678-9a03-496c-8157-4afbcdc8f2af > > | > > | > > | name | sec02-stateless > > | > > | > > | project_id | 98687873a146418eaeeb54a01693669f > > | > > | > > | revision_number | 1 > > | > > | > > | rules | created_at='2021-11-02T09:02:42Z', > direction='egress', > > > > ethertype='IPv6', id='17079c04-dc1d-4fbd-9f15-e79c6e585932', > > standard_attr_id='2863', updated_at='2021-11-02T09:02:42Z' | > > > > | | created_at='2021-11-02T09:02:42Z', > direction='egress', > > > > ethertype='IPv4', id='fadfbf09-f759-453d-b493-e6f73077113a', > > standard_attr_id='2860', updated_at='2021-11-02T09:02:42Z' | > > > > | stateful | False > > | > > | > > | tags | [] > > | > > | > > | updated_at | 2021-11-02T09:02:42Z > > > > +----------------- > +----------------------------------------------------------- > > > > ------------------------------------------------------------------------------ > > ---------------------------------------+ > > > > Let me test this feature further. > > > > Ammad > > > > On Tue, Nov 2, 2021 at 1:54 PM Slawek Kaplonski > wrote: > > > Hi, > > > > > > On wtorek, 2 listopada 2021 09:29:13 CET Ammad Syed wrote: > > > > Thanks Lajos, > > > > > > > > I was checking the release notes and found that stateless acl is > > > > > > supported > > > > > > > by ovn in xena. 
> > > > > > > > https://docs.openstack.org/releasenotes/neutron/ > > > > > > xena.html#:~:text=Support%20st > > > > > > > ateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B. > > > > > > %20The%20st > > > > > > > > > > > ateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80% > > > 9C> > > > > allow-stateless%E2%80%9D%20OVN%20ACL%20verb . > > > > > > It should be supported by the OVN driver now IIRC. Maybe we forgot > about > > > adding this extension to the list: > > > https://github.com/openstack/neutron/blob/ > > > master/neutron/common/ovn/extensions.py#L93 > > > extensi > > > ons.py#L93> Can You try to add it there and see if the extension will > be > > > loaded then?> > > > > Ammad > > > > > > > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona > > > > > > wrote: > > > > > Hi, > > > > > statefull security-groups are only available with iptables based > > > > > > drivers: > > > > > > > > > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/ > > > note > > > > > > > > s/stateful-security-group-04b2902ed9c44e4f.yaml > > > > > > > > > > For OVS and OVN we have open RFE, nut as I know at the moment > nobody > > > > > > works > > > > > > > > on them: > > > > > https://bugs.launchpad.net/neutron/+bug/1885261 > > > > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > > > > > > > Regards > > > > > Lajos Katona (lajoskatona) > > > > > > > > > > Ammad Syed ezt ?rta (id?pont: 2021. nov. > 2., > > > > > > K, > > > > > > > > 9:00): > > > > >> Hi, > > > > >> > > > > >> I have upgraded my lab to latest xena release and ovn 21.09 and > ovs > > > > > > 2.16. > > > > > > > >> I am trying to create stateless security group. But its getting > > > > > > failed > > > with > > > > > > > >> below error message. 
> > > > >> > > > > >> # openstack security group create --stateless sec02-stateless > > > > >> Error while executing command: BadRequestException: 400, > Unrecognized > > > > >> attribute(s) 'stateful' > > > > >> > > > > >> I see below logs in neutron server logs. > > > > >> > > > > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) > accepted > > > > >> ('172.16.40.45', 41272) server > > > > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > > > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > > > > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > > > >> 'description': 'sec02-stateless'}} prepare_request_body > > > > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > > > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] An exception > > > > > > happened > > > > > > > >> while processing the request body. 
The exception message is > > > > > > [Unrecognized > > > > > > > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > > > >> attribute(s) 'stateful' > > > > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] create failed > > > > > > (client > > > > > > > >> error): Unrecognized attribute(s) 'stateful' > > > > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 > "POST > > > > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: > 0.2455938 > > > > >> > > > > >> Any advice on how to fix it ? > > > > >> > > > > >> Ammad > > > > > > -- > > > Slawek Kaplonski > > > Principal Software Engineer > > > Red Hat > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 2 11:14:26 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 12:14:26 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: <2724584.mvXUDI8C0e@p1> Message-ID: <2836220.mvXUDI8C0e@p1> Hi, On wtorek, 2 listopada 2021 11:24:20 CET Ammad Syed wrote: > Hi, > > I have reported the bug but not sure how to propose that change. Any guide > to propose change would be highly appreciated. Please go through https://docs.openstack.org/contributors/code-and-documentation/quick-start.html as it should be good start :) If You will have any questions, You can reach out to me on IRC. I'm slaweq there and You can catch me on the #openstack-neutron channel. 
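Concretely, the Gerrit workflow that the quick-start guide walks through looks roughly like the sketch below. The branch name and commit message here are illustrative, not taken from the actual fix, and the `run` wrapper prints each command instead of running it, since cloning and pushing require network access and a Gerrit account.

```shell
# Hedged sketch of proposing an OpenStack change via Gerrit (git-review).
run() { echo "+ $*"; }   # print-only; replace the echo with "$@" to actually execute

run git clone https://opendev.org/openstack/neutron
run git checkout -b bug/1949451        # illustrative topic branch name
# ...edit neutron/common/ovn/extensions.py to add the missing extension alias...
run git commit -a -m "Add stateful-security-group to supported OVN extensions"
# A "Closes-Bug: #1949451" trailer in the commit message links the change
# to the Launchpad bug automatically.
run git review                         # needs the git-review tool installed
```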
> > https://bugs.launchpad.net/neutron/+bug/1949451 Thx > > On Tue, Nov 2, 2021 at 2:45 PM Slawek Kaplonski wrote: > > Hi, > > > > On wtorek, 2 listopada 2021 10:04:40 CET Ammad Syed wrote: > > > Hi Slawek, > > > > > > Yes, after adding extension, SG created with stateful=false. > > > > That's good. Can You report an Launchpad bug for that? And You can also > > propose that change as fix for that bug too :) > > > > > # neutron ext-list | grep stateful-security-group > > > neutron CLI is deprecated and will be removed in the Z cycle. Use > > > > openstack > > > > > CLI instead. > > > > > > | stateful-security-group | Stateful security group > > > > > > # openstack security group create --stateless sec02-stateless > > > +----------------- > > > > +----------------------------------------------------------- > > > > > > ---------------------------------------------------------------------------- > > --> > > > ---------------------------------------+ > > > > > > | Field | Value > > > > > > +----------------- > > > > +----------------------------------------------------------- > > > > > > ---------------------------------------------------------------------------- > > --> > > > ---------------------------------------+ > > > > > > | created_at | 2021-11-02T09:02:42Z > > > | > > > | > > > | description | sec02-stateless > > > | > > > | > > > | id | 29c28678-9a03-496c-8157-4afbcdc8f2af > > > | > > > | > > > | name | sec02-stateless > > > | > > > | > > > | project_id | 98687873a146418eaeeb54a01693669f > > > | > > > | > > > | revision_number | 1 > > > | > > > | > > > | rules | created_at='2021-11-02T09:02:42Z', > > > > direction='egress', > > > > > ethertype='IPv6', id='17079c04-dc1d-4fbd-9f15-e79c6e585932', > > > standard_attr_id='2863', updated_at='2021-11-02T09:02:42Z' | > > > > > > | | created_at='2021-11-02T09:02:42Z', > > > > direction='egress', > > > > > ethertype='IPv4', id='fadfbf09-f759-453d-b493-e6f73077113a', > > > standard_attr_id='2860', 
updated_at='2021-11-02T09:02:42Z' | > > > > > > | stateful | False > > > | > > > | > > > | tags | [] > > > | > > > | > > > | updated_at | 2021-11-02T09:02:42Z > > > > > > +----------------- > > > > +----------------------------------------------------------- > > > > > > ---------------------------------------------------------------------------- > > --> > > > ---------------------------------------+ > > > > > > Let me test this feature further. > > > > > > Ammad > > > > > > On Tue, Nov 2, 2021 at 1:54 PM Slawek Kaplonski > > > > wrote: > > > > Hi, > > > > > > > > On wtorek, 2 listopada 2021 09:29:13 CET Ammad Syed wrote: > > > > > Thanks Lajos, > > > > > > > > > > I was checking the release notes and found that stateless acl is > > > > > > > > supported > > > > > > > > > by ovn in xena. > > > > > > > > > > https://docs.openstack.org/releasenotes/neutron/ > > > > > > > > xena.html#:~:text=Support%20st > > > > > > > > > ateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B. > > > > > > > > %20The%20st > > > > ateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80% > > > > > > 9C> > > > > > > > > > allow-stateless%E2%80%9D%20OVN%20ACL%20verb . > > > > > > > > It should be supported by the OVN driver now IIRC. 
Maybe we forgot > > > > about > > > > > > adding this extension to the list: > > > > https://github.com/openstack/neutron/blob/ > > > > master/neutron/common/ovn/extensions.py#L93 > > > > > > > extensi > > > > > > ons.py#L93> Can You try to add it there and see if the extension will > > > > be > > > > > > loaded then?> > > > > > > > > > Ammad > > > > > > > > > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona > > > > > > > > wrote: > > > > > > Hi, > > > > > > statefull security-groups are only available with iptables based > > > > > > > > drivers: > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/ > > > > > > note > > > > > > > > > > s/stateful-security-group-04b2902ed9c44e4f.yaml > > > > > > > > > > > > For OVS and OVN we have open RFE, nut as I know at the moment > > > > nobody > > > > > > works > > > > > > > > > > on them: > > > > > > https://bugs.launchpad.net/neutron/+bug/1885261 > > > > > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > > > > > > > > > Regards > > > > > > Lajos Katona (lajoskatona) > > > > > > > > > > > > Ammad Syed ezt ?rta (id?pont: 2021. nov. > > > > 2., > > > > > > K, > > > > > > > > > > 9:00): > > > > > >> Hi, > > > > > >> > > > > > >> I have upgraded my lab to latest xena release and ovn 21.09 and > > > > ovs > > > > > > 2.16. > > > > > > > > > >> I am trying to create stateless security group. But its getting > > > > > > > > failed > > > > with > > > > > > > > > >> below error message. > > > > > >> > > > > > >> # openstack security group create --stateless sec02-stateless > > > > > >> Error while executing command: BadRequestException: 400, > > > > Unrecognized > > > > > > > >> attribute(s) 'stateful' > > > > > >> > > > > > >> I see below logs in neutron server logs. 
> > > > > >> > > > > > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) > > > > accepted > > > > > > > >> ('172.16.40.45', 41272) server > > > > > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > > > > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > > > > > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > > > > >> 'description': 'sec02-stateless'}} prepare_request_body > > > > > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > > > > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] An exception > > > > > > > > happened > > > > > > > > > >> while processing the request body. The exception message is > > > > > > > > [Unrecognized > > > > > > > > > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > > > > >> attribute(s) 'stateful' > > > > > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] create failed > > > > > > > > (client > > > > > > > > > >> error): Unrecognized attribute(s) 'stateful' > > > > > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 > > > > "POST > > > > > > > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: > > 0.2455938 > > > > > > > >> Any advice on how to fix it ? 
> > > > > >> > > > > > >> Ammad > > > > > > > > -- > > > > Slawek Kaplonski > > > > Principal Software Engineer > > > > Red Hat > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From juliaashleykreger at gmail.com Tue Nov 2 13:00:24 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 2 Nov 2021 07:00:24 -0600 Subject: Re: [ironic] Proposing Aija Jauntēva for sushy-core In-Reply-To: References: Message-ID: +2 from me! On Tue, Nov 2, 2021 at 1:57 AM Iury Gregory wrote: > > Hello everyone! > > I would like to propose Aija Jauntēva (irc: ajya) to be added to the sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > Att[]'s > Iury Gregory Melo Ferreira > MSc in Computer Science at UFCG > Part of the ironic-core and puppet-manager-core team in OpenStack > Software Engineer at Red Hat Czech > Social: https://www.linkedin.com/in/iurygregory > E-mail: iurygregory at gmail.com From abraden at verisign.com Tue Nov 2 13:19:34 2021 From: abraden at verisign.com (Braden, Albert) Date: Tue, 2 Nov 2021 13:19:34 +0000 Subject: [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria In-Reply-To: References: Message-ID: I'd like to experiment with migrating VMs between clusters. Where can I find a document that describes the procedure? From: Swogat Pradhan Sent: Tuesday, November 2, 2021 5:59 AM To: OpenStack Discuss Subject: [EXTERNAL] [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria Caution: This email originated from outside the organization. 
Do not click links or open attachments unless you recognize the sender and know the content is safe. Hi, I have 2 openstack setups Setup1. Openstack queens + ceph nautilaus Setup2. Openstack victoria + ceph octopus So, I am trying to migrate some VM's (windows and linux) from Setup1 to Setup2. For migrating the VM's i am using rbd export on setup1 and then rbd import on setup2. I have successfully migrated 21 VM's. i am now facing an issue in the 22nd vm which, after migrating the VM the vm is stuck in the windows logo screen and not moving forward, and i can't seem to understand how to approach it. Attached the instance.xml files of both queens and victoria setup's of the same VM. With regards, Swogat pradhan -------------- next part -------------- An HTML attachment was scrubbed... URL: From openinfradn at gmail.com Tue Nov 2 13:43:37 2021 From: openinfradn at gmail.com (open infra) Date: Tue, 2 Nov 2021 19:13:37 +0530 Subject: Accessing Openstack using v3 API Message-ID: Hi, I was trying to use the openstack environment using the v3 client API as mentioned in [2] documentation but ended up with an error as mentioned in [1]. I can access the same environment v3 API using curl. Am I missing something? [1] https://paste.opendev.org/show/810336/ [2] https://docs.openstack.org/python-keystoneclient/latest/using-api-v3.html Regards, Danishka -------------- next part -------------- An HTML attachment was scrubbed... URL: From bkslash at poczta.onet.pl Tue Nov 2 14:01:22 2021 From: bkslash at poczta.onet.pl (Adam Tomas) Date: Tue, 2 Nov 2021 15:01:22 +0100 Subject: [monasca][kolla-ansible] HTTPUnprocessableEntity: Dimension value must be 255 characters or less Message-ID: <69CC3C6B-4C50-4735-B65D-79C05E258B79@poczta.onet.pl> Hi, in my test deployment I get following error messages for each metric at each metric collection (every 60s in my case). Of course whole URL (i.e. 
/v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id) is longer than 255 characters, but shouldn't this URL be processed in smaller chunks? As I see it, the validation is done by monasca_api, so when it fails no data is processed by monasca_log_transformer, right? Limiting the metric name length won't change this situation (in this example the metric name is 22 characters long, while the whole URL is 285 characters long, so 285-22 is still > 255). How can I avoid this error? Best regards Adam Tomas 2021-11-02 14:19:44.935 693 ERROR monasca_api.v2.common.bulk_processor [req-a251xxxx 60bexxxx 27bexxxxx - default default] Log transformation failed, rejecting log: monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor [req-a251xxxx 60bexxxx 27bexxxx - default default] Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less: monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value 
/v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor Traceback (most recent call last): 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/v2/common/bulk_processor.py", line 81, in _transform_message 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor dimensions=self._get_dimensions(log_element, 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/v2/common/bulk_processor.py", line 129, in _get_dimensions 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor validation.validate_dimensions(local_dims) 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py", line 138, in validate_dimensions 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor _validate_dimension_value(dim_value) 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py", line 114, in _validate_dimension_value 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor raise exceptions.HTTPUnprocessableEntity( 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value 
/v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor From fungi at yuggoth.org Tue Nov 2 14:01:37 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 2 Nov 2021 14:01:37 +0000 Subject: Accessing Openstack using v3 API In-Reply-To: References: Message-ID: <20211102140136.dyinvai5jbsqxzjy@yuggoth.org> On 2021-11-02 19:13:37 +0530 (+0530), open infra wrote: > I was trying to use the openstack environment using the v3 client > API as mentioned in [2] documentation but ended up with an error > as mentioned in [1]. > > I can access the same environment v3 API using curl. Am I missing > something? [...] While I'm not sure what the cause of the HTTP Unauthorized exception might be (perhaps turning on debugging options might help narrow it down?), I suspect you may have an easier time developing against the unified OpenStackSDK rather than the individual per-service client libraries. For some simple examples of configuring and connecting, see the Getting Started chapter: https://docs.openstack.org/openstacksdk/xena/user/guides/intro.html Hope that helps! -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Tue Nov 2 14:49:11 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 02 Nov 2021 09:49:11 -0500 Subject: [all][tc] Technical Committee next weekly meeting on Nov 4th at 1500 UTC Message-ID: <17ce1201f97.f3ced5a3438820.2030079520319769825@ghanshyammann.com> Hello Everyone, Technical Committee's next weekly meeting is scheduled for Nov 4th at 1500 UTC. 
If you would like to add topics for discussion, please add them to the below wiki page by Wednesday, Nov 3rd, at 2100 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting -gmann From openinfradn at gmail.com Tue Nov 2 15:55:01 2021 From: openinfradn at gmail.com (open infra) Date: Tue, 2 Nov 2021 21:25:01 +0530 Subject: Accessing Openstack using v3 API In-Reply-To: <20211102140136.dyinvai5jbsqxzjy@yuggoth.org> References: <20211102140136.dyinvai5jbsqxzjy@yuggoth.org> Message-ID: Thank you Jeremy! On Tue, Nov 2, 2021 at 7:42 PM Jeremy Stanley wrote: > On 2021-11-02 19:13:37 +0530 (+0530), open infra wrote: > > I was trying to use the openstack environment using the v3 client > > API as mentioned in [2] documentation but ended up with an error > > as mentioned in [1]. > > > > I can access the same environment v3 API using curl. Am I missing > > something? > [...] > > While I'm not sure what the cause of the HTTP Unauthorized exception > might be (perhaps turning on debugging options might help narrow it > down?), I suspect you may have an easier time developing against the > unified OpenStackSDK rather than the individual per-service client > libraries. For some simple examples of configuring and connecting, > see the Getting Started chapter: > > https://docs.openstack.org/openstacksdk/xena/user/guides/intro.html > > Hope that helps! > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From swogatpradhan22 at gmail.com Wed Nov 3 05:13:03 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Wed, 3 Nov 2021 10:43:03 +0530 Subject: [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria In-Reply-To: References: Message-ID: Hi, I think there is a document for cinder migration which shows how to migrate the Volumes from one cluster to another. 
But that is not the process I am following; I am using rbd export and import in Ceph to migrate the volumes and then recreate instances from the said volumes. Aside, I would like some input from the community on the issue I am currently facing. Thank you On Tue, Nov 2, 2021 at 6:49 PM Braden, Albert wrote: > I'd like to experiment with migrating VMs between clusters. Where can I > find a document that describes the procedure? > > > > > > *From:* Swogat Pradhan > *Sent:* Tuesday, November 2, 2021 5:59 AM > *To:* OpenStack Discuss > *Subject:* [EXTERNAL] [Openstack-victoria] [Openstack-queens] migration > of VM from openstack queens to victoria > > > > *Caution:* This email originated from outside the organization. Do not > click links or open attachments unless you recognize the sender and know > the content is safe. > > Hi, > > I have 2 openstack setups > > Setup1. Openstack queens + ceph nautilus > > Setup2. Openstack victoria + ceph octopus > > > > So, I am trying to migrate some VM's (windows and linux) from Setup1 to > Setup2. > > For migrating the VM's I am using rbd export on setup1 and then rbd import > on setup2. > > > > I have successfully migrated 21 VM's. > > > > I am now facing an issue with the 22nd VM: after migrating it, the > VM is stuck at the Windows logo screen and not moving forward, and I can't > seem to understand how to approach it. > > > > > > Attached the instance.xml files of both queens and victoria setup's of the > same VM. > > > > With regards, > > Swogat pradhan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmohankumar1011 at gmail.com Wed Nov 3 06:24:03 2021 From: nmohankumar1011 at gmail.com (Mohan Kumar) Date: Wed, 3 Nov 2021 11:54:03 +0530 Subject: [neutron][ovs] Different Cookie with Remote Security group Message-ID: Team, I've created a security group with a remote security group attached to it.
openstack security group rule create --ethertype=IPv4 --protocol tcp > --remote-group server-grp --ingress --dst-port=22 application-grp when I create VM with server-grp , I can see two cookies in br-int ovs bridge, one with default cookie and another new cookie with conjunction flows, is it expected behavior? ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | awk '{print $1}' | sort -n | uniq cookie=0x12b7adbe73614620, cookie=0x207d02d6bbcf499e, NXST_FLOW ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | grep "0x207d02d6bbcf499e" cookie=0x207d02d6bbcf499e, duration=335.421s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(8,1/2) cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(9,1/2) cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(16,1/2) cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(17,1/2) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Wed Nov 3 07:40:14 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Wed, 3 Nov 2021 08:40:14 +0100 Subject: [neutron][ovs] Different Cookie with Remote Security group In-Reply-To: References: Message-ID: Hello Mohan: Each OVS agent extension (QoS) or service (firewall), will have its own "OVSCookieBridge" instance of a specific bridge; in this case br-int. That means any service or extension will write OF rules using its own cookie. 
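The distinct cookies Mohan extracted with his `awk`/`sort`/`uniq` pipeline above can also be listed with a small self-contained variant. The sketch below runs against captured sample flow lines (so it works without a live bridge); on a compute node you would feed it `sudo ovs-ofctl dump-flows br-int` instead:

```shell
# List the distinct OpenFlow cookies in a flow dump, one per line.
# The sample lines below stand in for live `ovs-ofctl dump-flows br-int`
# output; the cookie values are taken from this thread.
sample='cookie=0x12b7adbe73614620, duration=12.3s, table=0, n_packets=4, priority=0 actions=NORMAL
cookie=0x207d02d6bbcf499e, duration=335.421s, table=82, n_packets=0, priority=70 actions=conjunction(8,1/2)
cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, priority=70 actions=conjunction(9,1/2)'

printf '%s\n' "$sample" | grep -o 'cookie=0x[0-9a-f]*' | sort -u
# prints:
# cookie=0x12b7adbe73614620
# cookie=0x207d02d6bbcf499e
```

Compared to `awk '{print $1}'`, the `grep -o` form drops the trailing comma, so the cookie values compare cleanly between runs.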
By default, if you are only using OVS FW (no QoS or any other extension), only one cookie should be present in br-int. Any new rule added to the FW should have the same cookie as the other present rules. Did you restart the agent? When the OVS agent is restarted, all rules are set again, deleting the previous ones. The OVS agent generates new cookies each time. No stale flows should remain after the restart. Regards. On Wed, Nov 3, 2021 at 7:33 AM Mohan Kumar wrote: > Team, > > I've created a security group with a remote security group attached to > it. > > openstack security group rule create --ethertype=IPv4 --protocol tcp >> --remote-group server-grp --ingress --dst-port=22 application-grp > > > when I create VM with server-grp , I can see two cookies in br-int ovs > bridge, one with default cookie and another new cookie with conjunction > flows, is it expected behavior? > > ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | awk '{print $1}' | sort -n | uniq > cookie=0x12b7adbe73614620, > cookie=0x207d02d6bbcf499e, > NXST_FLOW > ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | grep "0x207d02d6bbcf499e" > cookie=0x207d02d6bbcf499e, duration=335.421s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(8,1/2) > cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(9,1/2) > cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(16,1/2) > cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(17,1/2) > > -------------- next part 
-------------- An HTML attachment was scrubbed... URL: From nmohankumar1011 at gmail.com Wed Nov 3 07:52:57 2021 From: nmohankumar1011 at gmail.com (Mohan Kumar) Date: Wed, 3 Nov 2021 13:22:57 +0530 Subject: [neutron][ovs] Different Cookie with Remote Security group In-Reply-To: References: Message-ID: Hi Rodolfo, Thanks for the reply; restarting the agent fixed the issue. The cookies were replaced with a single new cookie. It looked weird to me when I saw multiple cookies on the bridge. Do you think it should be fixed? Thanks, Mohankumar N On Wed, Nov 3, 2021 at 1:10 PM Rodolfo Alonso Hernandez wrote: > Hello Mohan: > > Each OVS agent extension (QoS) or service (firewall), will have its own > "OVSCookieBridge" instance of a specific bridge; in this case br-int. That > means any service or extension will write OF rules using its own cookie. > > By default, if you are only using OVS FW (no QoS or any other extension), > only one cookie should be present in br-int. Any new rule added to the FW > should have the same cookie as the other present rules. > > Did you restart the agent? When the OVS agent is restarted, all rules are > set again, deleting the previous ones. The OVS agent generates new cookies > each time. No stale flows should remain after the restart. > > Regards. > > On Wed, Nov 3, 2021 at 7:33 AM Mohan Kumar > wrote: > >> Team, >> >> I've created a security group with a remote security group attached >> to it. >> >> openstack security group rule create --ethertype=IPv4 --protocol tcp >>> --remote-group server-grp --ingress --dst-port=22 application-grp >> >> >> when I create a VM with server-grp, I can see two cookies in br-int ovs >> bridge, one with default cookie and another new cookie with conjunction >> flows, is it expected behavior?
>> >> ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | awk '{print $1}' | sort -n | uniq >> cookie=0x12b7adbe73614620, >> cookie=0x207d02d6bbcf499e, >> NXST_FLOW >> ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | grep "0x207d02d6bbcf499e" >> cookie=0x207d02d6bbcf499e, duration=335.421s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(8,1/2) >> cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(9,1/2) >> cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(16,1/2) >> cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(17,1/2) >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bkslash at poczta.onet.pl Wed Nov 3 08:00:59 2021 From: bkslash at poczta.onet.pl (Adam Tomas) Date: Wed, 3 Nov 2021 09:00:59 +0100 Subject: [monasca][kolla-ansible] HTTPUnprocessableEntity: Dimension value must be 255 characters or less In-Reply-To: <69CC3C6B-4C50-4735-B65D-79C05E258B79@poczta.onet.pl> References: <69CC3C6B-4C50-4735-B65D-79C05E258B79@poczta.onet.pl> Message-ID: <7DE1CD9A-E1D0-48C6-A3C5-ED4B69F337D7@poczta.onet.pl> Hi again, I did some experiment: in /var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py I changed the default value from 255 to 300: DIMENSION_VALUE_CONSTRAINTS = { 'MAX_LENGTH': 300 } and restarted the container. 
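For context, the check that this constant feeds amounts to something like the following minimal sketch. It is paraphrased from the traceback in this thread, not the actual monasca-api code, which may differ in detail:

```python
# Minimal sketch of monasca-api's dimension-value length check
# (the real code lives in monasca_api/api/core/log/validation.py).

DIMENSION_VALUE_CONSTRAINTS = {'MAX_LENGTH': 255}  # the constant raised to 300 above


class HTTPUnprocessableEntity(Exception):
    """Stand-in for monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity."""


def validate_dimension_value(value):
    max_length = DIMENSION_VALUE_CONSTRAINTS['MAX_LENGTH']
    if len(value) > max_length:
        raise HTTPUnprocessableEntity(
            'Dimension value %s must be %d characters or less'
            % (value, max_length))


# A 255-character value passes; a 285-character one (the URL length from
# the original report) is rejected.
validate_dimension_value('x' * 255)
try:
    validate_dimension_value('x' * 285)
except HTTPUnprocessableEntity:
    print('rejected')  # prints "rejected"
```

Note that raising `MAX_LENGTH` only relaxes this API-side gate; whether the downstream services tolerate longer values is a separate question.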
And now there are no value validation errors, but how can I check that this max value does not cause problems in monasca_log_transformer or any other service in the stack (kafka/monasca_persister/elasticsearch)? Thanks in advance for any help with this issue. Best regards Adam Tomas > Message written by Adam Tomas on 02.11.2021, at 15:01: > > Hi, > > in my test deployment I get the following error messages for each metric at each metric collection (every 60s in my case). Of course the whole URL (i.e. /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id) is longer than 255 characters, but shouldn't this URL be processed in smaller chunks? As I see it, the validation is done by monasca_api, so in case it fails no data is processed by monasca_log_transformer, right? Limiting the metric name length won't change this situation (in this example the metric name is 22 characters long, while the whole URL is 285 characters long, so 285-22 is still > 255). > > How can I avoid this error?
> Best regards > > Adam Tomas > > > 2021-11-02 14:19:44.935 693 ERROR monasca_api.v2.common.bulk_processor [req-a251xxxx 60bexxxx 27bexxxxx - default default] Log transformation failed, rejecting log: monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor [req-a251xxxx 60bexxxx 27bexxxx - default default] Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less: monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor Traceback (most recent call last): > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/v2/common/bulk_processor.py", line 81, in _transform_message > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor dimensions=self._get_dimensions(log_element, > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/v2/common/bulk_processor.py", line 129, in _get_dimensions > 
2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor validation.validate_dimensions(local_dims) > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py", line 138, in validate_dimensions > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor _validate_dimension_value(dim_value) > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py", line 114, in _validate_dimension_value > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor raise exceptions.HTTPUnprocessableEntity( > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor From tkajinam at redhat.com Wed Nov 3 10:49:25 2021 From: tkajinam at redhat.com (Takashi Kajinami) Date: Wed, 3 Nov 2021 19:49:25 +0900 Subject: [puppet] Propose retiring puppet-senlin Message-ID: Hello, I remember I raised a similar discussion recently[1] but we need the same for a different module. puppet-senlin was introduced back in 2018, but the module has had only the portion made by cookiecutter and has no capability to manage fundamental resources yet. Because we haven't seen any interest in creating implementations to support even basic usage, I'll propose retiring this module. I'll be open for any feedback for a while, and will propose a series of patches for retirement if no concern is raised here for one week.
Thank you, Takashi [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-September/024687.html [2] https://opendev.org/openstack/puppet-senlin -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.com Wed Nov 3 11:05:34 2021 From: tobias.urdin at binero.com (Tobias Urdin) Date: Wed, 3 Nov 2021 11:05:34 +0000 Subject: [puppet] Propose retiring puppet-senlin In-Reply-To: References: Message-ID: +1 for retiring Best regards Tobias On 3 Nov 2021, at 11:49, Takashi Kajinami > wrote: Hello, I remember I raised a similar discussion recently[1] but we need the same for a different module. puppet-senlin was introduced back in 2018, but the module has had only the portion made by cookiecutter and has no capability to manage fundamental resources yet. Because we haven't seen any interest in creating implementations to support even basic usage, I'll propose retiring this module. I'll be open for any feedback for a while, and will propose a series of patches for retirement if no concern is raised here for one week. Thank you, Takashi [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-September/024687.html [2] https://opendev.org/openstack/puppet-senlin -------------- next part -------------- An HTML attachment was scrubbed... URL: From senrique at redhat.com Wed Nov 3 13:08:15 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 3 Nov 2021 10:08:15 -0300 Subject: [cinder] Bug deputy report for week of 11-03-2021 Message-ID: This is a bug report from 10-27-2021-15-09 to 11-03-2021. Agenda: https://etherpad.opendev.org/p/cinder-bug-squad-meeting ----------------------------------------------------------------------------------------- Medium - https://bugs.launchpad.net/cinder/+bug/1949313 'Cinder backup tries to restore to the wrong host'. Assigned to Sam Morrison. - https://bugs.launchpad.net/cinder/+bug/1949189 'NfsDriver' object has no attribute 'manage_existing_get_size'. Unassigned.
- https://bugs.launchpad.net/cinder/+bug/1946483 'Error when deleting encrypted volume backup from another project'. Assigned to Pavlo Shchelokovskyy. - https://bugs.launchpad.net/cinder/+bug/1948962 'Quotas: Some operations fail with a 255 volume type name'. Assigned to Gorka. - https://bugs.launchpad.net/cinder/+bug/1949061 '[Storwize] Retype of a non-rep mirror volume to mirror-volume-type with different mirror_pool is failing'. Assigned to Mounika Sreeram. Low - https://bugs.launchpad.net/cinder/+bug/1798589 'No available space check for image_conversion_dir in cinder-volume on upload-to-image'. Unassigned. Cheers -- Sofía Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From opensrloo at gmail.com Wed Nov 3 13:26:10 2021 From: opensrloo at gmail.com (Ruby Loo) Date: Wed, 3 Nov 2021 09:26:10 -0400 Subject: Re: [ironic] Proposing Aija Jauntēva for sushy-core In-Reply-To: References: Message-ID: Yay Aija! +1 :) --ruby On Tue, Nov 2, 2021 at 3:54 AM Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jauntēva (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > > > *Att[]'s Iury Gregory Melo Ferreira* > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From pbasaras at gmail.com Wed Nov 3 13:26:25 2021 From: pbasaras at gmail.com (Pavlos Basaras) Date: Wed, 3 Nov 2021 15:26:25 +0200 Subject: Attach Nvidia Jetson as compute node to openstack (aarch64) Message-ID: Hello everyone, I am relatively new to the community, thanks in advance for your time and help. I have an openstack cluster ready, working with several compute nodes based on x86 architecture. My current installation is based on Ussuri. I have recently acquired a couple of nvidia Jetson devices ( https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-agx-xavier/) which I want to connect to the cluster. The arm cpu model name is ARMv8 Processor rev 0 (v8l). CPU flags on the arm architecture are different and hence egrep -c '(vmx|svm)' /proc/cpuinfo is empty to the best of my knowledge. However, I rebuilt the custom Linux kernel (i.e., Tegra) of the Jetson device, enabling the KVM module. So once I used: *sudo kvm-ok* INFO: /dev/kvm exists KVM acceleration can be used *dmesg | grep -i kvm* [ 1.372478] kvm [1]: 16-bit VMID [ 1.372489] kvm [1]: IDMAP page: 80fa1000 [ 1.372498] kvm [1]: HYP VA range: 4000000000:7fffffffff [ 1.374185] kvm [1]: Hyp mode initialized successfully [ 1.374299] kvm [1]: vgic-v2 at 3884000 [ 1.374763] kvm [1]: vgic interrupt IRQ1 [ 1.374790] kvm [1]: virtual timer IRQ4 *dmesg | grep -i 'CPU features'* [ 0.687366] CPU features: detected feature: Privileged Access Never [ 0.687372] CPU features: detected feature: LSE atomic instructions [ 0.687378] CPU features: detected feature: User Access Override [ 0.687385] CPU features: detected feature: 32-bit EL0 Support Does this suffice to say that I can use KVM with libvirt for nova? The version of libvirt is 6.0.0. From the nova logs I see the following with *virt_type = qemu* WARNING nova.virt.libvirt.driver [-] The libvirt driver is not tested on qemu/aarch64 by the OpenStack project and thus its quality can not be ensured.
For more information, see: https://docs.openstack.org/nova/latest/user/support-matrix.html And as it detects the aarch64 device: CPU mode "host-passthrough" was chosen If I try to launch an instance I get the following error with qemu libvirt.libvirtError: unsupported configuration: CPU mode 'host-passthrough' for aarch64 qemu domain on aarch64 host is not supported by hypervisor whereas if I use *virt_type = kvm* the error after trying to launch an instance is libvirt.libvirtError: unsupported configuration: Emulator '/usr/bin/qemu-system-aarch64' does not support virt type 'kvm' Any advice on how to proceed? All the best, Pavlos. -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Wed Nov 3 16:13:03 2021 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 3 Nov 2021 12:13:03 -0400 Subject: [sdk][tc] A new home for Gophercloud Message-ID: Hi everyone, The Gophercloud project is a stable OpenStack SDK in Golang, which has been widely used by many communities now, including Terraform and all the Kubernetes on OpenStack projects. The project was initiated by Rackspace in 2013 and has already successfully managed Rackspace's departure as a principal contributor. Fast forward to 2021, Joe Topjian (major maintainer), who used to be an active contributor in Puppet modules and also a voice in the OpenStack operators space, has reached out to me so we can discuss a transition for maintenance: https://github.com/gophercloud/gophercloud/issues/2246 We have discussed this internally and here are our notes: https://github.com/gophercloud/gophercloud/issues/2246#issuecomment-957589400 . To make it short, we are figuring out whether it would make sense to find a new home for the project and, if yes, where.
The main reason we're reaching out to the opendev community first is because we think this is the most logical place to host the project, alongside OpenStack: Some ideas: - The project would potentially have more visibility in the community; it's an SDK, therefore strongly relying on OpenStack API stability. - Use (some) opendev tools, mainly Zuul & nodepool resources - integrate with other projects. - Governance: - Gerrit / not Gerrit: we don't think we would move to Gerrit yet, as the existing contributors are probably more used to the GitHub workflow, and we clearly don't want to lose anyone in the process. - IRC: we could have an IRC channel, potentially Slack as well. Many things to figure out, and before we answer these questions we would like to poll the community: what do you think? Have you contributed to the project? What's your feeling about the ideas here? We welcome any feedback and will take it into consideration in our discussions. Thanks everyone, -- Emilien Macchi, on behalf of the Gophercloud contributors -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Wed Nov 3 16:22:41 2021 From: elod.illes at est.tech (Előd Illés) Date: Wed, 3 Nov 2021 17:22:41 +0100 Subject: [ptl][release][stable][EM] Extended Maintenance - Ussuri In-Reply-To: References: Message-ID: Hi, This is a reminder: end of next week is the planned deadline for the transition. (I've updated the lists of open & unreleased changes per project where there were any changes. See [2]) Thanks, Előd On 2021. 10. 11. 18:26, Előd Illés wrote: > Hi, > > As Xena was released last week and we are in a less busy period, now > it is a good time to call your attention to the following: > > In a month Ussuri is planned to transition to Extended Maintenance > phase [1] (planned date: 2021-11-12).
> I have generated the list of the current *open* and *unreleased* > changes in stable/ussuri for the follows-policy tagged repositories > [2] (where there are such patches). These lists could help the teams > who are planning to do a *final* release on Ussuri before moving > stable/ussuri branches to Extended Maintenance. Feel free to edit and > extend these lists to track your progress! > > * At the transition date the Release Team will tag the *latest* Ussuri > releases of repositories with *ussuri-em* tag. > * After the transition stable/ussuri will still be open for bug fixes, > but there won't be official releases anymore. > > *NOTE*: teams, please focus on wrapping up your libraries first if > there is any concern about the changes, in order to avoid broken > (final!) releases! > > Thanks, > > Előd > > [1] https://releases.openstack.org/ > [2] https://etherpad.opendev.org/p/ussuri-final-release-before-em > > > From artem.goncharov at gmail.com Wed Nov 3 16:25:29 2021 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Wed, 3 Nov 2021 17:25:29 +0100 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: Message-ID: <6E557306-481C-4A71-B6C8-886CCA76CCED@gmail.com> Hey, With openstacksdk maintainer hat and without any hat I would clearly welcome this move. Regards, Artem > On 3. Nov 2021, at 17:13, Emilien Macchi wrote: > > Hi everyone, > > The Gophercloud project is a stable OpenStack SDK in Golang, which has been widely used by many communities now, including Terraform and all the Kubernetes on OpenStack projects. > The project was initiated by Rackspace in 2013 and has already successfully managed their departure as a principal contributor.
Fast forward to 2021, Joe Topjian (major maintainer) who used to be an active contributor in Puppet modules and also a voice in the OpenStack operators space, has reached out to me so we can discuss a transition for maintenance: https://github.com/gophercloud/gophercloud/issues/2246 > We have discussed this internally and here are our notes: https://github.com/gophercloud/gophercloud/issues/2246#issuecomment-957589400 . > > To make it short, we are figuring out whether it would make sense to find a new home for the project and if yes, where. > The main reason we're reaching out to the opendev community first is because we think this is the most logical place to host the project, alongside OpenStack: > > Some ideas: > The project would potentially have more visibility in the community, it's a SDK therefore strongly relying on OpenStack APIs stability. > Use (some) opendev tools, mainly Zuul & nodepool resources - integrate with other projects. > Governance: > Gerrit / not Gerrit: we don't think we would move to Gerrit yet, as the existing contributors are probably more used to the Github workflow, and we clearly don't want to lose anyone in the process). > IRC: we could have an IRC channel, potentially Slack as well. > > Many things to figure out and before we answer these questions, we would like to poll the community: what do you think? Have you contributed to the project? What's your feeling about the ideas here? > We welcome any feedback and will take it in consideration in our discussions. > > Thanks everyone, > -- > Emilien Macchi, on behalf of the Gophercloud contributors -------------- next part -------------- An HTML attachment was scrubbed...
URL: From fungi at yuggoth.org Wed Nov 3 16:40:54 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 3 Nov 2021 16:40:54 +0000 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: Message-ID: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> On 2021-11-03 12:13:03 -0400 (-0400), Emilien Macchi wrote: [...] > To make it short, we are figuring out whether it would make sense > to find a new home for the project and if yes, where. > > The main reason we're reaching out to the opendev community first > is because we think this is the most logical place to host the > project, alongside OpenStack: [...] > Use (some) opendev tools, mainly Zuul & nodepool resources - integrate > with other projects. > - > > Governance: > - > > Gerrit / not Gerrit: we don't think we would move to Gerrit yet, as the > existing contributors are probably more used to the Github workflow, and we > clearly don't want to lose anyone in the process). [...] Hosting the project in the OpenDev Collaboratory would mean having its source code in our review.opendev.org Gerrit service as the primary reference copy and updating it through change proposals using a Gerrit workflow. You could replicate code to GitHub as is done for OpenStack's repositories, but the code copy on GitHub would merely serve as a read-only mirror.
While the Zuul software does have a GitHub driver and OpenDev connects their zuul.opendev.org deployment to GitHub in order to provide advisory testing to dependencies of projects hosted in OpenDev, the OpenDev sysadmins concluded that gating projects hosted outside of the OpenDev Collaboratory's Gerrit instance (e.g., on GitHub) is not something we were able to support sustainably: http://lists.openstack.org/pipermail/openstack-infra/2019-January/006269.html This was based on experiences trying to work with the Kata community, and the "experiment" referenced in that mailing list post eventually concluded with the removal of remaining Kata project configuration when https://review.opendev.org/744687 merged approximately 15 months ago. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From james.slagle at gmail.com Wed Nov 3 16:52:35 2021 From: james.slagle at gmail.com (James Slagle) Date: Wed, 3 Nov 2021 12:52:35 -0400 Subject: [TripleO] Core team cleanup Message-ID: Hello, I took a look at our core team, "tripleo-core" in gerrit. We have a few individuals who I feel have moved on from TripleO in their focus. I looked at the reviews from stackalytics.io for the last 180 days[1]. These individuals have fewer than 6 reviews, which is about 1 review a month: Bob Fournier Dan Sneddon Dmitry Tantsur Jiří Stránský Juan Antonio Osorio Robles Marius Cornea These individuals have publicly expressed that they are moving on from TripleO: Michele Baldessari wes hayutin I'd like to propose we remove these folks from our core team, while thanking them for their contributions. I'll also note that I'd still value +1/-1 from these folks with a lot of significance, and encourage them to review their areas of expertise!
If anyone on the list plans to start reviewing in TripleO again, then I also think we can postpone the removal for the time being and re-evaluate later. Please let me know if that's the case. Please reply and let me know any agreements or concerns with this change. Thank you! [1] https://www.stackalytics.io/report/contribution?module=tripleo-group&project_type=openstack&days=180 -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.slagle at gmail.com Wed Nov 3 17:02:29 2021 From: james.slagle at gmail.com (James Slagle) Date: Wed, 3 Nov 2021 13:02:29 -0400 Subject: [TripleO] Branching our documentation Message-ID: Hello TripleO Owls, Our documentation, particularly the deploy guide, has become overly complex with all the branches that TripleO has supported over the years. I see notes that document functionality specific to releases all the way back to Mitaka! In ancient history, we made the decision to not branch our documentation because the work to maintain multiple branches outweighed the effort to just document all releases at once in the same branch. I think the scale has now tipped in the other direction. I propose that we create a stable/wallaby in tripleo-docs, and begin making the master branch specific to Yoga. This would also mean we could clean up all the old notes and admonitions about previous releases on the master branch. Going forward, if you make a change to the docs that applies to Yoga and Wallaby, you would need to backport that change to stable/wallaby. If you needed to make a change that applied only to Wallaby (or a prior release), you would make that change only on stable/wallaby. I'm not sure of all the plumbing required to make it so that the OpenStack Wallaby deploy guide builds from stable/wallaby, etc, but I wanted to get the idea out there first for feedback. Please let me know your thoughts. Thank you! 
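For reference, the branch-and-backport flow described above can be sketched with plain git. This is a minimal local sketch only — the repo content, file name, and commit messages are hypothetical, and the real process would of course go through Gerrit review rather than direct commits:

```shell
set -e
# Stand-in for a tripleo-docs checkout (hypothetical content).
git init -q tripleo-docs && cd tripleo-docs
git config user.name "Example" && git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/master   # ensure the branch is named master
git commit -q --allow-empty -m "Existing docs history"

# One-time: cut stable/wallaby from the current master.
git branch stable/wallaby

# Later, a doc fix lands on master (now Yoga-focused)...
echo "Applies to Yoga and Wallaby" > deploy-note.rst
git add deploy-note.rst
git commit -q -m "Clarify overcloud deploy steps"

# ...and, because it also applies to Wallaby, it is backported:
git checkout -q stable/wallaby
git cherry-pick -x master   # -x records the original commit id in the message
```

A change that applies only to Wallaby (or a prior release) would instead be committed directly on stable/wallaby and never touch master.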
-- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Nov 3 17:13:10 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 03 Nov 2021 12:13:10 -0500 Subject: [all][tc] Continuing the RBAC PTG discussion In-Reply-To: <17cdc3e5964.e42274e1372366.1164114504252831335@ghanshyammann.com> References: <17cc3068765.11a9586ce162263.9148179786177865922@ghanshyammann.com> <17cdc3e5964.e42274e1372366.1164114504252831335@ghanshyammann.com> Message-ID: <17ce6ca4f3c.11a0d934542096.8078141879504981729@ghanshyammann.com> Hello Everyone, We figured out a lot of things in today call and for other open queries or goal stuff, we will continue the discussion next week on Wed Nov 10th, 15:00 - 16:00 UTC. Below is the link to join the call: https://meet.google.com/uue-adpp-xsm -gmann ---- On Mon, 01 Nov 2021 11:04:06 -0500 Ghanshyam Mann wrote ---- > ---- On Wed, 27 Oct 2021 13:32:37 -0500 Ghanshyam Mann wrote ---- > > Hello Everyone, > > > > As decided in PTG, we will continue the RBAC discussion from where we left in PTG. We will have a video > > call next week based on the availability of most of the interested members. > > > > Please vote your available time in below doodle vote by Thursday (or Friday morning central time). > > > > - https://doodle.com/poll/6xicntb9tu657nz7 > > As per doodle voting, I have schedule it on Nov 3rd Wed, 15:00 - 16:00 UTC. > > Below is the link to join the call: > > https://meet.google.com/uue-adpp-xsm > > We will be taking notes in this etherpad https://etherpad.opendev.org/p/policy-popup-yoga-ptg > > -gmann > > > > > NOTE: this is not specific to TC or people working in RBAC work but more to wider community to > > get feedback and finalize the direction (like what we did in PTG session). 
> > > > Meanwhile, feel free to review the lance's updated proposal for community-wide goal > > - https://review.opendev.org/c/openstack/governance/+/815158 > > > > -gmann > > > > > > From smooney at redhat.com Wed Nov 3 17:33:45 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 03 Nov 2021 17:33:45 +0000 Subject: [TripleO] Branching our documentation In-Reply-To: References: Message-ID: <60d72e92da6bab4ded806ea05232e5c05c476eb8.camel@redhat.com> On Wed, 2021-11-03 at 13:02 -0400, James Slagle wrote: > Hello TripleO Owls, > > Our documentation, particularly the deploy guide, has become overly complex > with all the branches that TripleO has supported over the years. I see > notes that document functionality specific to releases all the way back to > Mitaka! > > In ancient history, we made the decision to not branch our documentation > because the work to maintain multiple branches outweighed the effort to > just document all releases at once in the same branch. > > I think the scale has now tipped in the other direction. I propose that we > create a stable/wallaby in tripleo-docs, and begin making the master branch > specific to Yoga. This would also mean we could clean up all the old notes > and admonitions about previous releases on the master branch. +1 as some one that very really uses ooo and who always need to look at the documentation when i do try to use it i find the current docs very hard to parse due to all the differnt release annotations inline. the ooo docs themselve are not actully that extensive upstream and you can read all or most of them in one afternoon but parsing them and the parts that apply to the relase you are trying to deploy is a lot more effort then the branached docs in other projects. i think this definetly help the new user and might also help those that are more expirnce with ooo too. 
> > Going forward, if you make a change to the docs that applies to Yoga and > Wallaby, you would need to backport that change to stable/wallaby. > > If you needed to make a change that applied only to Wallaby (or a prior > release), you would make that change only on stable/wallaby. > > I'm not sure of all the plumbing required to make it so that the OpenStack > Wallaby deploy guide builds from stable/wallaby, etc, but I wanted to get > the idea out there first for feedback. Please let me know your thoughts. > Thank you! > From johfulto at redhat.com Wed Nov 3 18:15:25 2021 From: johfulto at redhat.com (John Fulton) Date: Wed, 3 Nov 2021 14:15:25 -0400 Subject: [TripleO] Branching our documentation In-Reply-To: <60d72e92da6bab4ded806ea05232e5c05c476eb8.camel@redhat.com> References: <60d72e92da6bab4ded806ea05232e5c05c476eb8.camel@redhat.com> Message-ID: On Wed, Nov 3, 2021 at 1:35 PM Sean Mooney wrote: > > On Wed, 2021-11-03 at 13:02 -0400, James Slagle wrote: > > Hello TripleO Owls, > > > > Our documentation, particularly the deploy guide, has become overly complex > > with all the branches that TripleO has supported over the years. I see > > notes that document functionality specific to releases all the way back to > > Mitaka! > > > > In ancient history, we made the decision to not branch our documentation > > because the work to maintain multiple branches outweighed the effort to > > just document all releases at once in the same branch. > > > > I think the scale has now tipped in the other direction. I propose that we > > create a stable/wallaby in tripleo-docs, and begin making the master branch > > specific to Yoga. This would also mean we could clean up all the old notes > > and admonitions about previous releases on the master branch. > +1 as some one that very really uses ooo and who always need to look at the documentation > when i do try to use it i find the current docs very hard to parse due to all the differnt release annotations > inline. 
the ooo docs themselve are not actully that extensive upstream and you can read all or most of them in one afternoon > but parsing them and the parts that apply to the relase you are trying to deploy is a lot more effort then the branached > docs in other projects. i think this definetly help the new user and might also help those that are more expirnce with ooo too. So stable/wallaby would have the notes and admonitions about previous releases (which are useful if you're using an older version)? Then we could then make a smaller main branch which is leaner and focussed on Yoga? That sounds good to me. I'd be happy to help with the clean up after the branch is made. John > > > > Going forward, if you make a change to the docs that applies to Yoga and > > Wallaby, you would need to backport that change to stable/wallaby. > > > > If you needed to make a change that applied only to Wallaby (or a prior > > release), you would make that change only on stable/wallaby. > > > > I'm not sure of all the plumbing required to make it so that the OpenStack > > Wallaby deploy guide builds from stable/wallaby, etc, but I wanted to get > > the idea out there first for feedback. Please let me know your thoughts. > > Thank you! > > > > From beagles at redhat.com Wed Nov 3 18:45:55 2021 From: beagles at redhat.com (Brent Eagles) Date: Wed, 3 Nov 2021 16:15:55 -0230 Subject: [TripleO] Branching our documentation In-Reply-To: References: Message-ID: On Wed, Nov 03, 2021 at 01:02:29PM -0400, James Slagle wrote: > Hello TripleO Owls, > > Our documentation, particularly the deploy guide, has become overly complex > with all the branches that TripleO has supported over the years. I see > notes that document functionality specific to releases all the way back to > Mitaka! > > In ancient history, we made the decision to not branch our documentation > because the work to maintain multiple branches outweighed the effort to > just document all releases at once in the same branch. 
> > I think the scale has now tipped in the other direction. I propose that we > create a stable/wallaby in tripleo-docs, and begin making the master branch > specific to Yoga. This would also mean we could clean up all the old notes > and admonitions about previous releases on the master branch. Agreed! As it stands, even wallaby is quite different than victoria and earlier for the overcloud deployment and https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployment/install_overcloud.html isn't particularly helpful after a point. I've tried to update it a few times with the "bare essential" steps but it feels so badly forked, I couldn't follow it. > Going forward, if you make a change to the docs that applies to Yoga and > Wallaby, you would need to backport that change to stable/wallaby. > > If you needed to make a change that applied only to Wallaby (or a prior > release), you would make that change only on stable/wallaby. > > I'm not sure of all the plumbing required to make it so that the OpenStack > Wallaby deploy guide builds from stable/wallaby, etc, but I wanted to get > the idea out there first for feedback. Please let me know your thoughts. > Thank you! > > -- > -- James Slagle > -- I think it's a great idea! Cheers, Brent -- Brent Eagles Principal Software Engineer Red Hat Inc. From gauurav.sabharwal at in.ibm.com Wed Nov 3 04:50:25 2021 From: gauurav.sabharwal at in.ibm.com (Gauurav Sabharwal1) Date: Wed, 3 Nov 2021 10:20:25 +0530 Subject: [cinder] : SAN migration Message-ID: Hi Experts, I need some expert advice on one scenario. I have multiple isolated OpenStack clusters running the Train and Rocky releases. Each OpenStack cluster environment has its own isolated SAN (Cisco fabric) and storage (HP, EMC & IBM) infrastructure. The company is now planning to refresh its SAN infrastructure by procuring new Brocade SAN switches, but there are some migration-related challenges. 
As we understand, under one Cinder instance only one type of FC zone manager is supported, and the customer currently has Cisco configured and managed. Is it possible to configure two different vendors' FC zone managers under one Cinder instance? The SAN zoning migration is planned to happen offline from OpenStack's point of view: we will be migrating all ports of each existing Cisco fabric to Brocade, applying the zone configuration with the Brocade CLI. Our main concern is how, after the migration, the Cinder DB will be updated with the new zone info and paths via the Brocade SAN. Regards Gauurav Sabharwal IBM India Pvt. Ltd. IBM towers Ground floor, Block -A , Plot number 26, Sector 62, Noida Gautam budhnagar UP-201307. Email:gauurav.sabharwal at in.ibm.com Mobile No.: +91-9910159277 -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.slagle at gmail.com Wed Nov 3 19:01:16 2021 From: james.slagle at gmail.com (James Slagle) Date: Wed, 3 Nov 2021 15:01:16 -0400 Subject: [TripleO] Branching our documentation In-Reply-To: References: <60d72e92da6bab4ded806ea05232e5c05c476eb8.camel@redhat.com> Message-ID: On Wed, Nov 3, 2021 at 2:15 PM John Fulton wrote: > On Wed, Nov 3, 2021 at 1:35 PM Sean Mooney wrote: > > > > On Wed, 2021-11-03 at 13:02 -0400, James Slagle wrote: > > > Hello TripleO Owls, > > > > > > Our documentation, particularly the deploy guide, has become overly > complex > > > with all the branches that TripleO has supported over the years. I see > > > notes that document functionality specific to releases all the way > back to > > > Mitaka! > > > > > > In ancient history, we made the decision to not branch our > documentation > > > because the work to maintain multiple branches outweighed the effort to > > > just document all releases at once in the same branch. > > > > > > I think the scale has now tipped in the other direction. I propose > that we > > > create a stable/wallaby in tripleo-docs, and begin making the master > branch > > > specific to Yoga. 
This would also mean we could clean up all the old > notes > > > and admonitions about previous releases on the master branch. > > +1 as some one that very really uses ooo and who always need to look at > the documentation > > when i do try to use it i find the current docs very hard to parse due > to all the differnt release annotations > > inline. the ooo docs themselve are not actully that extensive upstream > and you can read all or most of them in one afternoon > > but parsing them and the parts that apply to the relase you are trying > to deploy is a lot more effort then the branached > > docs in other projects. i think this definetly help the new user and > might also help those that are more expirnce with ooo too. > > So stable/wallaby would have the notes and admonitions about previous > releases (which are useful if you're using an older version)? > Then we could then make a smaller main branch which is leaner and > focussed on Yoga? > That is correct. stable/wallaby would be for Wallaby and all prior versions. Essentially, as the docs are now. Master would be Yoga only. When Yoga is done, we'd branch stable/yoga and master would become Z*. -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Nov 3 21:06:53 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 03 Nov 2021 16:06:53 -0500 Subject: [all][tc] Technical Committee next weekly meeting on Nov 4th at 1500 UTC In-Reply-To: <17ce1201f97.f3ced5a3438820.2030079520319769825@ghanshyammann.com> References: <17ce1201f97.f3ced5a3438820.2030079520319769825@ghanshyammann.com> Message-ID: <17ce7a0489a.f7f41b7b51390.8778486736180024566@ghanshyammann.com> Hello Everyone, Below is the agenda for Tomorrow's TC video meeting schedule at 1500 UTC. 
https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting == Agenda for tomorrow's TC meeting == * Roll call * Follow up on past action items * Yoga tracker ** https://etherpad.opendev.org/p/tc-yoga-tracker * Gate health check * Release management team position on a longer release cycle (ttx) ** https://etherpad.opendev.org/p/relmgt-position-1y-releases * Stable team process change ** https://review.opendev.org/c/openstack/governance/+/810721 * Adjutant need PTLs and maintainers ** http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html * Office hour: continue or stop? ** https://meetings.opendev.org/#Technical_Committee_Office_hours * newsletter ** https://etherpad.opendev.org/p/newsletter-openstack-news * Pain Point targeting ** https://etherpad.opendev.org/p/pain-point-elimination * Open Reviews ** https://review.opendev.org/q/projects:openstack/governance+is:open -gmann ---- On Tue, 02 Nov 2021 09:49:11 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > Technical Committee's next weekly meeting is scheduled for Nov 4th at 1500 UTC. > > If you would like to add topics for discussion, please add them to the below wiki page by > Wednesday, Nov 3rd, at 2100 UTC. > > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > -gmann > > From akanevsk at redhat.com Wed Nov 3 22:12:56 2021 From: akanevsk at redhat.com (Arkady Kanevsky) Date: Wed, 3 Nov 2021 17:12:56 -0500 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: +1 On Wed, Nov 3, 2021 at 8:28 AM Ruby Loo wrote: > Yay Aija! +1 :) > > --ruby > > On Tue, Nov 2, 2021 at 3:54 AM Iury Gregory wrote: > >> Hello everyone! >> >> I would like to propose Aija Jauntēva (irc: ajya) to be added to the >> sushy-core group. >> Aija has been in the ironic community for a long time, she has a lot of >> knowledge about redfish and is always providing good reviews. 
>> >> ironic-cores please vote with +/- 1. >> >> -- >> >> >> *Att[]'sIury Gregory Melo Ferreira * >> *MSc in Computer Science at UFCG* >> *Part of the ironic-core and puppet-manager-core team in OpenStack* >> *Software Engineer at Red Hat Czech* >> *Social*: https://www.linkedin.com/in/iurygregory >> *E-mail: iurygregory at gmail.com * >> > -- Arkady Kanevsky, Ph.D. Phone: 972 707-6456 Corporate Phone: 919 729-5744 ext. 8176456 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ildiko.vancsa at gmail.com Thu Nov 4 01:12:56 2021 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Wed, 3 Nov 2021 18:12:56 -0700 Subject: [neutron][networking][ipv6][dns][ddi] Upcoming OpenInfra Edge Computing Group sessions Message-ID: Hi, I'm reaching out to you to draw your attention to the amazing lineup of discussion topics for the OpenInfra Edge Computing Group weekly calls up until the end of this year with industry experts to present and participate in the discussions! I would like to invite and encourage you to join the working group sessions to discuss edge related challenges and solutions in the below areas and more! Some of the sessions to highlight will be continuing the discussions we started at the recent PTG: * November 29th - Networking and DNS discussion with Cricket Liu and Andrew Wertkin * December 6th - Networking and IPv6 discussion with Ed Horley Beyond the above topics we will have presentations from groups and projects such as Smart Edge and the Industry IoT Consortium (Formerly Industrial Internet Consortium). For the full schedule please see the edge WG's wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group#Upcoming_Topics Our calls happen every Monday at 6am Pacific Time / 1400 UTC on Zoom. You can find all the meeting details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings Please let me know if you have any questions about the working group or any of the upcoming sessions. 
Thanks and Best Regards, Ildikó From jaosorior at redhat.com Thu Nov 4 06:03:22 2021 From: jaosorior at redhat.com (Juan Osorio Robles) Date: Thu, 4 Nov 2021 08:03:22 +0200 Subject: [TripleO] Core team cleanup In-Reply-To: References: Message-ID: Hey, I meant to send an email some time ago but I didn't. Sorry about that. Yes, please remove me from the core team. My responsibilities have changed to the point where I no longer work with OpenStack. I'd still be happy to review commits if asked :) especially if I'm acquainted with the area. Thanks for updating the list Best regards On Wed, 3 Nov 2021 at 18:55, James Slagle wrote: > Hello, I took a look at our core team, "tripleo-core" in gerrit. We have a > few individuals who I feel have moved on from TripleO in their focus. I > looked at the reviews from stackalytics.io for the last 180 days[1]. > > These individuals have less than 6 reviews, which is about 1 review a > month: > Bob Fournier > Dan Sneddon > Dmitry Tantsur > Jiří Stránský > Juan Antonio Osorio Robles > Marius Cornea > > These individuals have publicly expressed that they are moving on from > TripleO: > Michele Baldessari > wes hayutin > > I'd like to propose we remove these folks from our core team, while > thanking them for their contributions. I'll also note that I'd still value > +1/-1 from these folks with a lot of significance, and encourage them to > review their areas of expertise! > > If anyone on the list plans to start reviewing in TripleO again, then I > also think we can postpone the removal for the time being and re-evaluate > later. Please let me know if that's the case. > > Please reply and let me know any agreements or concerns with this change. > > Thank you! > > [1] > https://www.stackalytics.io/report/contribution?module=tripleo-group&project_type=openstack&days=180 > > -- > -- James Slagle > -- > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jichengke2011 at gmail.com Thu Nov 4 06:23:15 2021 From: jichengke2011 at gmail.com (chengke ji) Date: Thu, 4 Nov 2021 14:23:15 +0800 Subject: [barbican] Simple Crypto Plugin kek issue In-Reply-To: References: Message-ID: You should remove the old data (the project KEKs) from the kek_data table in the barbican database; your project KEKs will then be issued under your new master KEK. On Oct 29, 2021 at 4:04, Ammad Syed wrote: > Hi, > > I have installed barbican and using it with openstack magnum. When I am > using the default kek describe in document below, works fine and magnum > cluster creation goes successful. > > https://docs.openstack.org/barbican/latest/install/barbican-backend.html > > But when I generate a new kek with below command. > > python3 -c "from cryptography.fernet import Fernet ; key = Fernet.generate_key(); print(key)" > > > and put it in barbican.conf, the magnum cluster failed to create and I see > below logs in barbican. > > 2021-10-29 12:53:28.932 568554 INFO barbican.plugin.crypto.simple_crypto > [req-aaac01e9-82af-421b-b85a-ff998d904972 ad702ac807f44c73a32a9b7a795b693c > d782069f335041138f0cb141fde9933f - default default] Software Only Crypto > initialized > 2021-10-29 12:53:28.932 568554 DEBUG barbican.model.repositories > [req-aaac01e9-82af-421b-b85a-ff998d904972 ad702ac807f44c73a32a9b7a795b693c > d782069f335041138f0cb141fde9933f - default default] Getting session... 
> get_session > /usr/lib/python3/dist-packages/barbican/model/repositories.py:364 > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > [req-aaac01e9-82af-421b-b85a-ff998d904972 ad702ac807f44c73a32a9b7a795b693c > d782069f335041138f0cb141fde9933f - default default] Secret creation failure > seen - please contact site administrator.: cryptography.fernet.InvalidToken > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers Traceback > (most recent call last): > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/fernet.py", line 113, in > _verify_signature > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > h.verify(data[-32:]) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/hazmat/primitives/hmac.py", > line 70, in verify > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > ctx.verify(signature) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/hazmat/backends/openssl/hmac.py", > line 76, in verify > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers raise > InvalidSignature("Signature did not match digest.") > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > cryptography.exceptions.InvalidSignature: Signature did not match digest. 
> 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers During > handling of the above exception, another exception occurred: > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers Traceback > (most recent call last): > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/api/controllers/__init__.py", line > 102, in handler > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > fn(inst, *args, **kwargs) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/api/controllers/__init__.py", line > 88, in enforcer > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > fn(inst, *args, **kwargs) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/api/controllers/__init__.py", line > 150, in content_types_enforcer > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > fn(inst, *args, **kwargs) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/api/controllers/secrets.py", line > 456, in on_post > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > new_secret, transport_key_model = plugin.store_secret( > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/resources.py", line 108, in > store_secret > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > secret_metadata = _store_secret_using_plugin(store_plugin, secret_dto, > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/resources.py", line 279, in > _store_secret_using_plugin > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > 
secret_metadata = store_plugin.store_secret(secret_dto, context) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/store_crypto.py", line 96, > in store_secret > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > response_dto = encrypting_plugin.encrypt( > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/crypto/simple_crypto.py", > line 76, in encrypt > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers kek = > self._get_kek(kek_meta_dto) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/crypto/simple_crypto.py", > line 73, in _get_kek > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > encryptor.decrypt(kek_meta_dto.plugin_meta) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/fernet.py", line 76, in decrypt > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > self._decrypt_data(data, timestamp, ttl, int(time.time())) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/fernet.py", line 125, in > _decrypt_data > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > self._verify_signature(data) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/fernet.py", line 115, in > _verify_signature > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers raise > InvalidToken > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > cryptography.fernet.InvalidToken > > Any advise how to fix it ? > > - Ammad > -------------- next part -------------- An HTML attachment was scrubbed... 
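The traceback reflects a property of Fernet itself: tokens are authenticated with the key that produced them, so once the master KEK in barbican.conf changes, the project KEKs already wrapped and stored in the kek_data table can no longer be verified. A minimal sketch outside Barbican, using only the cryptography package (the byte strings are placeholders, not Barbican's real storage format):

```shell
python3 - <<'EOF'
from cryptography.fernet import Fernet, InvalidToken

old_master = Fernet(Fernet.generate_key())       # key originally set in barbican.conf
stored_kek = old_master.encrypt(b"project kek")  # wrapped KEK, as stored in the DB

new_master = Fernet(Fernet.generate_key())       # freshly generated replacement key
try:
    new_master.decrypt(stored_kek)               # what the plugin attempts after the swap
except InvalidToken:
    print("InvalidToken: stored project KEKs are unreadable with the new master KEK")

# The data is still recoverable with the original key:
assert old_master.decrypt(stored_kek) == b"project kek"
EOF
```

Hence the two ways out: restore the previous master KEK in barbican.conf, or (as suggested above) delete the stale project-KEK rows so new ones are created under the new key — accepting that secrets wrapped by the old project KEKs are lost.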
URL: From stendulker at gmail.com Thu Nov 4 09:06:56 2021 From: stendulker at gmail.com (Shivanand Tendulker) Date: Thu, 4 Nov 2021 14:36:56 +0530 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: +1. Great job Aija !! On Tue, Nov 2, 2021 at 1:28 PM Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jauntēva (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > > > *Att[]'sIury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arne.wiebalck at cern.ch Thu Nov 4 09:18:00 2021 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Thu, 4 Nov 2021 10:18:00 +0100 Subject: [baremetal-sig][ironic] Tue Nov 9, 2021, 2pm UTC: Hardware burn-in Message-ID: Dear all, The Bare Metal SIG will meet next week on Tue Nov 9, 2021, at 2pm UTC on zoom. The meeting will feature a "topic-of-the-day" presentation by Emmanouil Bagakis (CERN) on "Hardware Burn-in with Ironic" As usual, all details on https://etherpad.opendev.org/p/bare-metal-sig Everyone is welcome! Cheers, Arne From ashrodri at redhat.com Thu Nov 4 15:15:13 2021 From: ashrodri at redhat.com (Ashley Rodriguez) Date: Thu, 4 Nov 2021 11:15:13 -0400 Subject: [docs] Double headings on every page In-Reply-To: References: Message-ID: Hi! Recently I've noticed that the h1 text on the manila API ref no longer shows up in the main text body, though it is still listed in the navigation bar on the left hand side. 
I've noticed a similar issue on the Ironic, Nova, Neutron API guides as well. I think the change you mentioned caused this, though I'm not completely sure since not all API guides are affected. For example, Swift's API guide still works as expected with visible header 1s identifying the resource, and each method being listed in the nav bar as well. Cinder's API guide makes use of h2 instead, and thus doesn't have each method listed in the nav bar, though the titles are visible. What I'd like to see in the manila API guide is each resource shown as h1 both in the main body and in the navigation bar, along with each corresponding method. If I were to change the resource titles to h2 I would be able to see the title itself but lose the methods in the nav bar, and yet keeping them as h1 means the side bar works but the main text is missing titles. I don't have much experience with front-end work so I'd really appreciate any guidance you can offer to fix this. Thanks, Ashley On Tue, Aug 10, 2021 at 8:05 PM Peter Matulis wrote: > I'm now seeing this symptom on several projects in the 'latest' release > branch. Such as: > > https://docs.openstack.org/nova/latest/ > > My browser exposes the following error: > > [image: image.png] > > We patched [1] our project as a workaround. > > Could one of Stephen's commits [2] be involved? > > [1]: > https://review.opendev.org/c/openstack/charm-deployment-guide/+/803531 > [2]: > https://opendev.org/openstack/openstackdocstheme/commit/08461c5311aa692088a27eb40a87965fd8515aba > > On Thu, Jun 10, 2021 at 3:51 PM Peter Matulis > wrote: > >> Hi Stephen. Did you ever get to circle back to this? >> >> On Fri, May 14, 2021 at 7:34 AM Stephen Finucane >> wrote: >> >>> On Tue, 2021-05-11 at 11:14 -0400, Peter Matulis wrote: >>> >>> Hi, I'm hitting an oddity in one of my projects where the titles of all >>> pages show up twice. 
>>> >>> Example: >>> >>> >>> https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/wallaby/app-nova-cells.html >>> >>> Source file is here: >>> >>> >>> https://opendev.org/openstack/charm-deployment-guide/src/branch/master/deploy-guide/source/app-nova-cells.rst >>> >>> Does anyone see what can be causing this? It appears to happen only for >>> the current stable release ('wallaby') and 'latest'. >>> >>> Thanks, >>> Peter >>> >>> >>> I suspect you're bumping into issues introduced by a new version of >>> Sphinx or docutils (new versions of both were released recently). >>> >>> Comparing the current nova docs [1] to what you have, I see the >>> duplicate

>>> <h1> element is present but hidden by the following CSS rule:
>>>
>>>     .docs-body .section h1 {
>>>         display: none;
>>>     }
>>>
>>> That works because we have the following HTML in the nova docs:
>>>
>>>     <div class="section" id="extra-specs">
>>>       <h1>Extra Specs¶</h1>
>>>       ...
>>>     </div>
>>>
>>> while the docs you linked are using the HTML5 semantic '<section>' tag:
>>>
>>>     <section id="nova-cells">
>>>       <h1>Nova Cells¶</h1>
>>>       ...
>>>     </section>
>>> >>> >>> So to fix this, we'll have to update the openstackdocstheme to handle >>> these changes. I can try to take a look at this next week but I really >>> wouldn't mind if someone beat me to it. >>> >>> Stephen >>> >>> [1] >>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 7527 bytes Desc: not available URL: From artem.goncharov at gmail.com Thu Nov 4 15:32:31 2021 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Thu, 4 Nov 2021 16:32:31 +0100 Subject: [docs] Double headings on every page In-Reply-To: References: Message-ID: openstackdocstheme has fixed this issue already some months ago [1]. If you still see this issue on fresh docs this would hint to me that you use Sphinx>=4 but openstackdocstheme <2.3.1 [1]: https://review.opendev.org/c/openstack/openstackdocstheme/+/798897 Regards, Artem > On 4. Nov 2021, at 16:15, Ashley Rodriguez wrote: > > Hi! > > Recently I've noticed that the h1 text on the manila API ref no longer shows up in the main text body, though it is still listed in the navigation bar on the left hand side. > I've noticed a similar issue on the Ironic, Nova, Neutron API guides as well. > I think the change you mentioned caused this, though I'm not completely sure since not all API guides are affected. > For example, Swift's API guide still works as expected with visible header 1s identifying the resource, and each method being listed in the nav bar as well. > Cinder's API guide makes use of h2 instead, and thus doesn't have each method listed in the nav bar, though the titles are visible. > What I'd like to see in the manila API guide is each resource shown as h1 both in the main body and in the navigation bar, along with each corresponding method. 
> If I were to change the resource titles to h2 I would be able to see the title itself but lose the methods in the nav bar, and yet keeping them as h1 means the side bar works but the main text is missing titles. > I don't have much experience with front-end work so I'd really appreciate any guidance you can offer to fix this. > > Thanks, > Ashley > > On Tue, Aug 10, 2021 at 8:05 PM Peter Matulis > wrote: > I'm now seeing this symptom on several projects in the 'latest' release branch. Such as: > > https://docs.openstack.org/nova/latest/ > > My browser exposes the following error: > > > > We patched [1] our project as a workaround. > > Could one of Stephen's commits [2] be involved? > > [1]: https://review.opendev.org/c/openstack/charm-deployment-guide/+/803531 > [2]: https://opendev.org/openstack/openstackdocstheme/commit/08461c5311aa692088a27eb40a87965fd8515aba > On Thu, Jun 10, 2021 at 3:51 PM Peter Matulis > wrote: > Hi Stephen. Did you ever get to circle back to this? > > On Fri, May 14, 2021 at 7:34 AM Stephen Finucane > wrote: > On Tue, 2021-05-11 at 11:14 -0400, Peter Matulis wrote: >> Hi, I'm hitting an oddity in one of my projects where the titles of all pages show up twice. >> >> Example: >> >> https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/wallaby/app-nova-cells.html >> >> Source file is here: >> >> https://opendev.org/openstack/charm-deployment-guide/src/branch/master/deploy-guide/source/app-nova-cells.rst >> >> Does anyone see what can be causing this? It appears to happen only for the current stable release ('wallaby') and 'latest'. >> >> Thanks, >> Peter > > I suspect you're bumping into issues introduced by a new version of Sphinx or docutils (new versions of both were released recently). > > Comparing the current nova docs [1] to what you have, I see the duplicate

> <h1> element is present but hidden by the following CSS rule:
>
>     .docs-body .section h1 {
>         display: none;
>     }
>
> That works because we have the following HTML in the nova docs:
>
>     <div class="section" id="extra-specs">
>       <h1>Extra Specs¶</h1>
>       ...
>     </div>
>
> while the docs you linked are using the HTML5 semantic '<section>' tag:
>
>     <section id="nova-cells">
>       <h1>Nova Cells¶</h1>
>       ...
>     </section>
> > So to fix this, we'll have to update the openstackdocstheme to handle these changes. I can try to take a look at this next week but I really wouldn't mind if someone beat me to it. > > Stephen > > [1] https://docs.openstack.org/nova/latest/configuration/extra-specs.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Thu Nov 4 16:17:00 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 4 Nov 2021 17:17:00 +0100 Subject: [neutron] Drivers meeting - Friday 5.11.2021 - cancelled Message-ID: Hi Neutron Drivers! Due to the lack of agenda, let's cancel tomorrow's drivers meeting. See You on the meeting next week. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Thu Nov 4 16:23:58 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 4 Nov 2021 16:23:58 +0000 Subject: [docs] Double headings on every page In-Reply-To: References: Message-ID: <20211104162357.36phjz3mj6czoxjp@yuggoth.org> On 2021-11-04 11:15:13 -0400 (-0400), Ashley Rodriguez wrote: > Recently I've noticed that the h1 text on the manila API ref no > longer shows up in the main text body, though it is still listed > in the navigation bar on the left hand side. [...] Can you provide specific URLs? It's a little hard to be sure I understand what you're describing. For example, if you're talking about https://docs.openstack.org/api-ref/shared-file-system/ then when I pull it up in my browser I see "Shared File Systems API" at the top of the main text column. That is H1 level text. I think you may be saying that when there's more than one H1 element in the page, all subsequent H1 elements are hidden? For example, I notice that "API Versions" is the second H1 element, and while it appears in the navigation list in the left column, it does not show up inline in the main text column. 
This is because OpenStackDocsTheme intentionally hides H1 in the Sphinx output so that it can present a custom styled version of it at the top of the text. This works just fine when there is one and only one top-level (H1) heading element, but breaks down if a document contains more than one. > What I'd like to see in the manila API guide is each resource > shown as h1 both in the main body and in the navigation bar, along > with each corresponding method. If I were to change the resource > titles to h2 I would be able to see the title itself but lose the > methods in the nav bar, and yet keeping them as h1 means the side > bar works but the main text is missing titles. [...] Whether they're H1 or H2 level seems more like an implementation detail. What you actually seem to want is just for the document section titles *and* the methods within them to both appear at different levels in the navigation column, right? That seems like something we should be able to solve in the sidebar here: https://opendev.org/openstack/openstackdocstheme/src/branch/master/openstackdocstheme/theme/openstackdocs/sidebartoc.html You might want to play with setting theme_sidebar_mode to "toc" since it looks like "toctree" implicitly limits the maxdepth to 2 heading levels (H1 and H2 essentially). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rosmaita.fossdev at gmail.com Thu Nov 4 19:19:00 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 4 Nov 2021 15:19:00 -0400 Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc Message-ID: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> By popular demand (really!), I'm scheduling a RBD driver review festival for next week. 
It's a community driver, and we've got a backlog of patches:

https://review.opendev.org/q/project:openstack/cinder+status:open+file:cinder/volume/drivers/rbd.py

If your patch is currently in merge conflict, it would be helpful if you could get conflicts resolved before the festival. Also, if you have questions about comments that have been left on your patch, this would be a good time to get them answered.

who: Everyone!
what: The Cinder Festival of RBD Driver Reviews
when: Thursday 11 November 2021 from 1500-1600 UTC
where: https://meet.google.com/fsb-qkfc-qun
etherpad: https://etherpad.opendev.org/p/cinder-festival-of-driver-reviews

(Note that we're trying google meet for this session.)

From thierry at openstack.org Fri Nov 5 14:26:13 2021
From: thierry at openstack.org (Thierry Carrez)
Date: Fri, 5 Nov 2021 15:26:13 +0100
Subject: [all][tc] Relmgt team position on release cadence
Message-ID: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org>

Hi everyone,

The (long) document below reflects the current position of the release management team on a popular question: should the OpenStack release cadence be changed? Please note that we only address the release management / stable branch management facet of the problem. There are other dimensions to take into account (governance, feature deprecation, supported distros...) to get a complete view of the debate.

Introduction
------------

The subject of how often OpenStack should be released has been regularly debated in the OpenStack community. OpenStack started with a 3-month release cycle, then switched to a 6-month release cycle starting with Diablo. It is often thought of as a release management decision, but it is actually a much larger topic: a release cadence is a trade-off between pressure to release more often and pressure to release less often, coming in from a lot of different stakeholders. In OpenStack, it is ultimately a Technical Committee decision.
But that decision is informed by the position of a number of stakeholders. This document gives historical context and describes the current release management team position.

The current trade-off
---------------------

The main pressure to release more often is to make features available to users faster. Developers get a faster feedback loop, hardware vendors ensure software is compatible with their latest products, and users get exciting new features. "Release early, release often" is a best practice in our industry -- we should generally aim at releasing as often as possible.

But that is counterbalanced by pressure to release less often. From a development perspective, each release cycle comes with some process overhead. On the integrators' side, a new release means packaging and validation work. On the users' side, it means pressure to upgrade. To justify that cost, there needs to be enough user-visible benefit (like new features) in a given release.

For the last 10 years for OpenStack, that balance has been around six months. Six months let us accumulate enough new development that it was worth upgrading to / integrating the new version, while giving enough time to actually do the work. It also aligned well with the Foundation events cadence, allowing in-person developer meeting dates to be synchronized with the start of cycles.

What changed
------------

The major recent change affecting this trade-off is that the pace of new development in OpenStack slowed down. The rhythm of changes was divided by 3 between 2015 and 2021, reflecting that OpenStack is now a mature and stable solution, where accessing the latest features is no longer a major driver. That reduces some of the pressure for releasing more often. At the same time, we have more users every day, with larger and larger deployments, and keeping those clusters constantly up to date is an operational challenge. That increases the pressure to release less often.
In essence, OpenStack is becoming much more like an LTS distribution than a web browser -- something users like to move slowly.

Over the past years, project teams also increasingly decoupled individual components from the "coordinated release". More and more components opted for an independent or intermediary-released model, where they can put out releases in the middle of a cycle, making new features available to their users. This increasingly opens up the possibility of a longer "coordinated release" which would still allow development teams to follow "release early, release often" best practices. All that recent evolution means it is (again) time to reconsider if the 6-month cadence is what serves our community best, and in particular if a longer release cadence would not suit us better.

The release management team position on the debate
--------------------------------------------------

While releasing less often would definitely reduce the load on the release management team, most of the team's work being automated, we do not think it should be a major factor in motivating the decision. We should not adjust the cadence too often though, as there is a one-time cost in switching our processes. In terms of impact, we expect that a switch to a longer cycle will encourage more project teams to adopt a "with-intermediary" release model (rather than the traditional "with-rc" single release per cycle), which may lead to abandoning the latter, hence simplifying our processes. Longer cycles might also discourage people from committing to PTL or release liaison work. We'd probably need to manage expectations there, and encourage more frequent switches (or create alternate models).

If the decision is made to switch to a longer cycle, the release management team recommends switching to one year directly. That would avoid changing it again anytime soon, and synchronizing on a calendar year is much simpler to follow and communicate.
We also recommend announcing the change well in advance. We currently have an opportunity of making the switch when we reach the end of the release naming alphabet, which would also greatly simplify the communications around the change.

Finally, it is worth mentioning the impact on the stable branch work. Releasing less often would likely impact the number of stable branches that we keep on maintaining, so that we do not go too far into the past (and hit unmaintained distributions or long-gone dependencies). We currently maintain releases for 18 months before they switch to extended maintenance, which results in between 3 and 4 releases being maintained at the same time. We'd recommend switching to maintaining one-year releases for 24 months, which would result in between 2 and 3 releases being maintained at the same time. Such a change would lead to longer maintenance for our users while reducing backporting work for our developers.

--
Thierry Carrez (ttx)
On behalf of the OpenStack Release Management team

From pbasaras at gmail.com Fri Nov 5 14:43:38 2021
From: pbasaras at gmail.com (Pavlos Basaras)
Date: Fri, 5 Nov 2021 16:43:38 +0200
Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available"
Message-ID:

Hello,

I have an Openstack cluster, with basic services and functionality working based on the ussuri release.

I am trying to install the Zun service to be able to deploy containers, following
[controller] -- https://docs.openstack.org/zun/ussuri/install/controller-install.html and
[compute] -- https://docs.openstack.org/zun/ussuri/install/compute-install.html
I used the git branch based on ussuri for all components.

I verified kuryr-libnetwork operation by issuing from the compute node

# docker network create --driver kuryr --ipam-driver kuryr --subnet 10.10.0.0/16 --gateway=10.10.0.1 test_net

and seeing the network created successfully, etc.

I am not very sure about the zun.conf file.
What is the "endpoint_type = internalURL" parameter? Do I need to change internalURL?

From sudo systemctl status zun-compute I see:

Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     return Retrying(*dargs, **dkw).call(f, *args, **kw)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     return attempt.get(self._wrap_exception)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     six.reraise(self.value[0], self.value[1], self.value[2])
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     raise value
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", line 350, in _update_to_placement
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     context, node_rp_uuid, name=compute_node.hostname)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 846, in get_provider_tree_and_ensure_r
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     parent_provider_uuid=parent_provider_uuid)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 628, in _ensure_resource_provider
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     parent_provider_uuid=parent_provider_uuid)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 514, in _create_resource_provider
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     global_request_id=context.global_id)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 225, in post
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     headers=headers, logger=LOG)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     return self.request(url, 'POST', **kwargs)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in request
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     return self.session.request(url, method, **kwargs)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task   File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in request
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task     raise exceptions.from_response(resp, method, url)
*Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded)*

What is this problem? any advice?

I used the default configuration values ([keystone_auth] and [keystone_authtoken]) based on the configuration from the above links.
Also from the controller *openstack appcontainer service list*

+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+
| Id | Host     | Binary      | State | Disabled | Disabled Reason | Updated At                 | Availability Zone |
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+
| 1  | compute5 | zun-compute | up    | False    | None            | 2021-11-05T14:39:01.000000 | nova              |
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+

*openstack appcontainer host show compute5*

+----------------------+---------------------------------------------------------------------+
| Field                | Value                                                               |
+----------------------+---------------------------------------------------------------------+
| uuid                 | ee3a5b44-8ffa-463e-939d-0c61868a596f                                |
| links                | [{'href': 'http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'self'}, {'href': 'http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'bookmark'}] |
| hostname             | compute5                                                            |
| mem_total            | 7975                                                                |
| mem_used             | 0                                                                   |
| total_containers     | 1                                                                   |
| cpus                 | 10                                                                  |
| cpu_used             | 0.0                                                                 |
| architecture         | x86_64                                                              |
| os_type              | linux                                                               |
| os                   | Ubuntu 18.04.6 LTS                                                  |
| kernel_version       | 4.15.0-161-generic                                                  |
| labels               | {}                                                                  |
| disk_total           | 63                                                                  |
| disk_used            | 0                                                                   |
| disk_quota_supported | False                                                               |
| runtimes             | ['io.containerd.runc.v2', 'io.containerd.runtime.v1.linux', 'runc'] |
| enable_cpu_pinning   | False                                                               |
+----------------------+---------------------------------------------------------------------+

seems to work fine.

However when I issue e.g.

openstack appcontainer run --name container --net network=$NET_ID cirros ping 8.8.8.8

I get the error:

| status_reason | There are not enough hosts available.

Any ideas?

One final thing is that I did see in the Horizon dashboard the container tab, to be able to deploy containers from horizon. Is there an extra configuration for this?

sorry for the long mail.

best,
Pavlos
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sbauza at redhat.com Fri Nov 5 15:17:48 2021
From: sbauza at redhat.com (Sylvain Bauza)
Date: Fri, 5 Nov 2021 16:17:48 +0100
Subject: [nova][placement] Spec review day on Nov 16th
Message-ID:

As agreed on our last Nova meeting [1], please sharpen your pen and prepare your specs ahead of time as we'll have a spec review day on Nov 16th.

Reminder: the idea of a spec review day is to ensure that contributors and reviewers are available on the same day for prioritizing Gerrit comments and IRC discussions about specs in order to facilitate and accelerate the reviewing of open specs.

If you care about some fancy new feature, please make sure your spec is ready for review on time and you are somehow joinable so reviewers can ping you, or you are able to quickly reply on their comments and ideally propose a new revision if needed.

Nova cores, I appreciate your dedication about specs on this particular day.

-Sylvain

[1] https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-02-16.00.log.html

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From mnaser at vexxhost.com Fri Nov 5 15:53:25 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 5 Nov 2021 11:53:25 -0400 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: On Fri, Nov 5, 2021 at 10:39 AM Thierry Carrez wrote: > > Hi everyone, > > The (long) document below reflects the current position of the release > management team on a popular question: should the OpenStack release > cadence be changed? Please note that we only address the release > management / stable branch management facet of the problem. There are > other dimensions to take into account (governance, feature deprecation, > supported distros...) to get a complete view of the debate. > > Introduction > ------------ > > The subject of how often OpenStack should be released has been regularly > debated in the OpenStack community. OpenStack started with a 3-month > release cycle, then switched to 6-month release cycle starting with > Diablo. It is often thought of a release management decision, but it is > actually a much larger topic: a release cadence is a trade-off between > pressure to release more often and pressure to release less often, > coming in from a lot of different stakeholders. In OpenStack, it is > ultimately a Technical Committee decision. But that decision is informed > by the position of a number of stakeholders. This document gives > historical context and describes the current release management team > position. > > The current trade-off > --------------------- > > The main pressure to release more often is to make features available to > users faster. Developers get a faster feedback loop, hardware vendors > ensure software is compatible with their latest products, and users get > exciting new features. 
"Release early, release often" is a best practice > in our industry -- we should generally aim at releasing as often as > possible. > > But that is counterbalanced by pressure to release less often. From a > development perspective, each release cycle comes with some process > overhead. On the integrators side, a new release means packaging and > validation work. On the users side, it means pressure to upgrade. To > justify that cost, there needs to be enough user-visible benefit (like > new features) in a given release. > > For the last 10 years for OpenStack, that balance has been around six > months. Six months let us accumulate enough new development that it was > worth upgrading to / integrating the new version, while giving enough > time to actually do the work. It also aligned well with Foundation > events cadence, allowing to synchronize in-person developer meetings > date with start of cycles. > > What changed > ------------ > > The major recent change affecting this trade-off is that the pace of new > development in OpenStack slowed down. The rhythm of changes was divided > by 3 between 2015 and 2021, reflecting that OpenStack is now a mature > and stable solution, where accessing the latest features is no longer a > major driver. That reduces some of the pressure for releasing more > often. At the same time, we have more users every day, with larger and > larger deployments, and keeping those clusters constantly up to date is > an operational challenge. That increases the pressure to release less > often. In essence, OpenStack is becoming much more like a LTS > distribution than a web browser -- something users like moving slow. > > Over the past years, project teams also increasingly decoupled > individual components from the "coordinated release". More and more > components opted for an independent or intermediary-released model, > where they can put out releases in the middle of a cycle, making new > features available to their users. 
This increasingly opens up the > possibility of a longer "coordinated release" which would still allow > development teams to follow "release early, release often" best > practices. All that recent evolution means it is (again) time to > reconsider if the 6-month cadence is what serves our community best, and > in particular if a longer release cadence would not suit us better. > > The release management team position on the debate > -------------------------------------------------- > > While releasing less often would definitely reduce the load on the > release management team, most of the team work being automated, we do > not think it should be a major factor in motivating the decision. We > should not adjust the cadence too often though, as there is a one-time > cost in switching our processes. In terms of impact, we expect that a > switch to a longer cycle will encourage more project teams to adopt a > "with-intermediary" release model (rather than the traditional "with-rc" > single release per cycle), which may lead to abandoning the latter, > hence simplifying our processes. Longer cycles might also discourage > people to commit to PTL or release liaison work. We'd probably need to > manage expectations there, and encourage more frequent switches (or > create alternate models). > > If the decision is made to switch to a longer cycle, the release > management team recommends to switch to one year directly. That would > avoid changing it again anytime soon, and synchronizing on a calendar > year is much simpler to follow and communicate. We also recommend > announcing the change well in advance. We currently have an opportunity > of making the switch when we reach the end of the release naming > alphabet, which would also greatly simplify the communications around > the change. > > Finally, it is worth mentioning the impact on the stable branch work. 
> Releasing less often would likely impact the number of stable branches > that we keep on maintaining, so that we do not go too much in the past > (and hit unmaintained distributions or long-gone dependencies). We > currently maintain releases for 18 months before they switch to extended > maintenance, which results in between 3 and 4 releases being maintained > at the same time. We'd recommend switching to maintaining one-year > releases for 24 months, which would result in between 2 and 3 releases > being maintained at the same time. Such a change would lead to longer > maintenance for our users while reducing backporting work for our > developers. > Thanks for the write up Thierry. I wonder what are the thoughts of the community of having LTS + normal releases so that we can have the power of both? I guess that is essentially what we have with EM, but I guess we could introduce a way to ensure that operators can just upgrade LTS to LTS. It can complicate things a bit from a CI and project management side, but I think it could solve the problem for both sides that need want new features + those who want stability? > -- > Thierry Carrez (ttx) > On behalf of the OpenStack Release Management team > -- Mohammed Naser VEXXHOST, Inc. From fungi at yuggoth.org Fri Nov 5 16:18:31 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 5 Nov 2021 16:18:31 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: <20211105161830.2fiv6jjdn52fwrxf@yuggoth.org> On 2021-11-05 11:53:25 -0400 (-0400), Mohammed Naser wrote: [...] > I wonder what are the thoughts of the community of having LTS + > normal releases so that we can have the power of both? I guess > that is essentially what we have with EM, but I guess we could > introduce a way to ensure that operators can just upgrade LTS to > LTS. 
> > It can complicate things a bit from a CI and project management > side, but I think it could solve the problem for both sides that > need want new features + those who want stability? This is really just another way of suggesting we solve the skip-level upgrades problem, since we can't really test fast-forward upgrades through so-called "non-LTS" versions once we abandon them. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From smooney at redhat.com Fri Nov 5 17:47:13 2021 From: smooney at redhat.com (Sean Mooney) Date: Fri, 05 Nov 2021 17:47:13 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: On Fri, 2021-11-05 at 11:53 -0400, Mohammed Naser wrote: > On Fri, Nov 5, 2021 at 10:39 AM Thierry Carrez wrote: > > > > Hi everyone, > > > > The (long) document below reflects the current position of the release > > management team on a popular question: should the OpenStack release > > cadence be changed? Please note that we only address the release > > management / stable branch management facet of the problem. There are > > other dimensions to take into account (governance, feature deprecation, > > supported distros...) to get a complete view of the debate. > > > > Introduction > > ------------ > > > > The subject of how often OpenStack should be released has been regularly > > debated in the OpenStack community. OpenStack started with a 3-month > > release cycle, then switched to 6-month release cycle starting with > > Diablo. It is often thought of a release management decision, but it is > > actually a much larger topic: a release cadence is a trade-off between > > pressure to release more often and pressure to release less often, > > coming in from a lot of different stakeholders. 
> > [...] > > Thanks for the write up Thierry. > > I wonder what are the thoughts of the community of having LTS + normal releases > so that we can have the power of both? I guess that is essentially what we have > with EM, but I guess we could introduce a way to ensure that operators can just > upgrade LTS to LTS.
If we were to introduce LTS releases we would have to agree on what they were as a community, and we would need to support rolling upgrades between LTS versions. That would require all distributed projects like Nova to ensure that LTS-to-LTS RPC and DB compatibility is maintained, instead of the current N+1 guarantees we have today. I know that would make some downstream happy, as perhaps we could align our FFU support with the LTS cadence, but I wouldn't hold my breath on that. As a developer I would personally prefer to have shorter cycles upstream with upgrades supported across more than N+1, e.g. release every 2 months but keep rolling upgrade compatibility for at least 12 months or something like that. The release-with-intermediary lifecycle can enable that while still allowing us to have a longer or shorter planning horizon depending on the project and its velocity. > > It can complicate things a bit from a CI and project management side, > but I think it > could solve the problem for both sides that need want new features + > those who want > stability? It might, but I suspect that it will still not align with distros: Canonical has a new LTS every 2 years, and Red Hat has a new release every ~18 months or so, based on every 3rd release. The LTS idea has merit I think, but we would likely have to maintain at least 2 LTS releases in parallel to make it work. So something like 1 LTS release a year maintained for 2 years, with normal releases every 6 months that are only maintained for 6 months; each project would keep rolling upgrade compatibility, ideally between LTS releases rather than N+1, as the new minimum. The implication of this is that we would want to have grenade jobs testing latest-LTS-to-master upgrade compatibility in addition to N to N+1 where those differ.
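To make the grenade idea concrete, such a job might look roughly like the sketch below. This is an illustration only: the job name is invented, and the `grenade` parent job and `grenade_from_branch` variable are assumptions about how the existing grenade jobs are parameterized, so the real knobs should be checked against the openstack/grenade repo.

```yaml
# Hypothetical Zuul job: deploy the latest LTS branch, then upgrade
# straight to master, alongside the usual N to N+1 grenade job.
# Job and variable names are assumptions, not the actual grenade interface.
- job:
    name: nova-grenade-lts-to-master
    parent: grenade
    description: |
      Upgrade from the most recent LTS branch directly to master,
      in addition to the regular N to N+1 upgrade test.
    vars:
      # Whichever branch is the current LTS would go here.
      grenade_from_branch: stable/xena
```

Projects would carry one such job per supported upgrade span, bumping the from-branch each time a new LTS is cut.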
> > > -- > > Thierry Carrez (ttx) > > On behalf of the OpenStack Release Management team > > > > From smooney at redhat.com Fri Nov 5 17:53:41 2021 From: smooney at redhat.com (Sean Mooney) Date: Fri, 05 Nov 2021 17:53:41 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <20211105161830.2fiv6jjdn52fwrxf@yuggoth.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <20211105161830.2fiv6jjdn52fwrxf@yuggoth.org> Message-ID: <375451085e161c0a1af037576ba35540a5ddae41.camel@redhat.com> On Fri, 2021-11-05 at 16:18 +0000, Jeremy Stanley wrote: > On 2021-11-05 11:53:25 -0400 (-0400), Mohammed Naser wrote: > [...] > > I wonder what are the thoughts of the community of having LTS + > > normal releases so that we can have the power of both? I guess > > that is essentially what we have with EM, but I guess we could > > introduce a way to ensure that operators can just upgrade LTS to > > LTS. > > > > It can complicate things a bit from a CI and project management > > side, but I think it could solve the problem for both sides that > > need want new features + those who want stability? > > This is really just another way of suggesting we solve the > skip-level upgrades problem, since we can't really test fast-forward > upgrades through so-called "non-LTS" versions once we abandon them. Well, realistically I don't think the customers that are pushing us to support skip-level upgrades or fast-forward upgrades will be able to work with a cadence of 1 release a year, so I would expect us to still need to consider skip-level upgrades between LTS-2 and a new LTS. We have several customers that need at least 12 months to complete certification of all of their workloads on a new cloud, so OpenStack distros will still have to support those customers that really need a 2-yearly or longer upgrade cadence even if we had an LTS release every year.
There are many other users of OpenStack that can effectively live at head; CERN and Vexxhost are two examples that the 1-year cycle might suit well, but for our telco and financial customers 12 months is still a short upgrade horizon. From dms at danplanet.com Fri Nov 5 18:21:51 2021 From: dms at danplanet.com (Dan Smith) Date: Fri, 05 Nov 2021 11:21:51 -0700 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: (Sean Mooney's message of "Fri, 05 Nov 2021 17:47:13 +0000") References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: > i know that would make some downstream happy as perhaps we could align > our FFU support with THE LTS cadance but i would hold my breath on > that. Except any downstream that is unable to align on the LTS schedule either permanently or temporarily would have to wait a full extra year to resync, which would make them decidedly unhappy I think. I'm sure some distros have had to realign downstream releases to "the next" upstream one more than once, so... :) > as a developer i woudl presonally prefer to have shorter cycle > upstream with uprades supporte aross a more then n+1 e.g. release > every 2 months but keep rolling upgrade compatiablty for at least 12 > months or someting like that. the release with intermeiday lifecyle > can enable that while still allowign use to have a longer or shorter > planing horizon depending on the project and its veliocity. This has the same problem as you highlighted above, which is that we all have to agree on the same 12 months that we're supporting that span, otherwise this collapses to just the intersection of any two projects' windows.
--Dan From fungi at yuggoth.org Fri Nov 5 18:25:29 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 5 Nov 2021 18:25:29 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: <20211105182528.fgpjp6ax6cgficz2@yuggoth.org> On 2021-11-05 17:47:13 +0000 (+0000), Sean Mooney wrote: [...] > if we were to intoduce LTS release we would have to agree on what > they were as a compunity and we would need to support roling > upgrade between LTS versions [...] Yes, but what about upgrades between LTS and non-LTS versions (from or to)? Do we test all those as well? And if we don't, are users likely to want to use the non-LTS versions at all knowing they might be unable to cleanly update from them to an LTS version later on? > so something like 1 lts release a year maintained for 2 years > with normal release every 6 months that are only maintianed for 6 > months [...] To restate what I said in my other reply, this assumes a future where skip-level upgrades are possible. Otherwise what happens with a series of releases like A,b,C,d,E where A/C/E are the LTS releases and b/d are the non-LTS releases and someone who's using A wants to upgrade to C but we've already stopped maintaining b and can't guarantee it's even installable any longer? If the LTS idea is interesting to people, then we should take a step back and work on switching from FFU to SLU first. If we can't solve that, then there's no point to having non-LTS releases. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Fri Nov 5 19:07:47 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 05 Nov 2021 14:07:47 -0500 Subject: [all][tc] What's happening in Technical Committee: summary 5th Nov, 21: Reading: 10 min Message-ID: <17cf17ff56d.dca4cf29182418.6179612577638751365@ghanshyammann.com> Hello Everyone, Here is this week's summary of the Technical Committee activities. 1. TC Meetings: ============ * This week's TC IRC meeting was held on Thursday, Nov 4th. * Most of the meeting discussions are summarized below (Completed or in-progress activities section). The full meeting recording is available @ - https://www.youtube.com/watch?v=4bt6iNaR3oI - IRC chat summary (topics and links): https://meetings.opendev.org/meetings/tc/2021/tc.2021-11-04-15.02.log.html * Next week's meeting will be on IRC on Nov 11th, Thursday 15:00 UTC; feel free to add topics to the agenda[1] by Nov 10th. 2. What we completed this week: ========================= * Retired kolla-cli [2] * Removed oslo independent deliverables from stable policy[3] * Magnum PTL is changed from Feilong Wang to Spyros[4] 3. Activities In progress: ================== TC Tracker for Yoga cycle ------------------------------ * I have started the etherpad to collect the Yoga cycle targets for TC[5]. Open Reviews ----------------- * Eleven open reviews for ongoing activities[6]. RBAC discussion: continuing from PTG ---------------------------------------------- We had a discussion on Wed and sorted out many points, especially on get-all-project resources and system vs. project scope isolation. Complete notes are in this etherpad[7]. Lance is re-working the goal and its implementation details; feel free to review it and give more feedback[8].
We will continue the discussion next Wed, Nov 10th, at the same time[9]; feel free to join with any additional project-specific queries, and we will also discuss which bits we can do in the Yoga cycle and which to move to later targets. Decouple the community-wide goals from cycle release ----------------------------------------------------------------- * As discussed in the PTG, we are de-coupling the community-wide goals from the release cycle so that we can improve the community-wide goal selection and its completion workflow where goals are multi-cycle, like RBAC. * The proposal is up, please review and add your feedback[10]. Yoga release community-wide goal ----------------------------------------- * As the discussion on RBAC continues, we are re-working the RBAC goal; please wait until we finalize the implementation[11] * There is one more goal proposal for 'FIPS compatibility and compliance'[12]. Adjutant needs maintainers and PTLs ------------------------------------------- You might have seen the email[13] from Adrian calling for maintainers for the Adjutant project; please reply to that email, or here, or on the openstack-tc channel if you are interested in maintaining this project. We will wait for next week as well, and if no-one shows up to help then we will start the retirement process. New project 'Skyline' proposal ------------------------------------ * We discussed it in the TC PTG and there are a few open points about python packaging, repos, and the plugins plan which we are discussing on the ML. * Still waiting for the skyline team to work on the above points[14]. Updating the Yoga testing runtime ---------------------------------------- * As centos stream 9 is released, I have updated the Yoga testing runtime[15] with: 1. Add Debian 11 as tested distro 2. Change centos stream 8 -> centos stream 9 3.
Bump the lowest python version to test to 3.8 and the highest to python 3.9 Stable Core team process change --------------------------------------- * The current proposal is under review[16]. Feel free to provide early feedback if you have any. * This is ready to merge and I will do that early next week; feel free to add feedback if any. Merging 'Technical Writing' SIG Chair/Maintainers to TC ------------------------------------------------------------------ * Work to merge this SIG into TC is up for review[17]. * No response on usage of, or maintenance help for, the openstack/training-labs repo[18], and in the TC meeting we decided to retire it next week. If you use this or would like to maintain it, this is the last chance to raise your hand. TC tags analysis ------------------- * TC agreed to remove the framework and it has been communicated on the ML[19]. Project updates ------------------- * Retiring js-openstack-lib [20] 4. How to contact the TC: ==================== If you would like to discuss or give feedback to TC, you can reach out to us in multiple ways: 1. Email: you can send an email with the tag [tc] on the openstack-discuss ML[21]. 2. Weekly meeting: The Technical Committee conducts a weekly meeting every Thursday at 15 UTC [22] 3. Office hours: The Technical Committee offers a weekly office hour every Tuesday at 0100 UTC [23] 4. Ping us using the 'tc-members' nickname on the #openstack-tc IRC channel.
[1] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [2] https://review.opendev.org/c/openstack/governance/+/814603 [3] https://review.opendev.org/c/openstack/governance/+/816828 [4] https://review.opendev.org/c/openstack/governance/+/816431 [5] https://etherpad.opendev.org/p/tc-yoga-tracker [6] https://review.opendev.org/q/projects:openstack/governance+status:open [7] https://etherpad.opendev.org/p/policy-popup-yoga-ptg [8] https://review.opendev.org/c/openstack/governance/+/815158 [9] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025666.html [10] https://review.opendev.org/c/openstack/governance/+/816387 [11] https://review.opendev.org/c/openstack/governance/+/815158 [12] https://review.opendev.org/c/openstack/governance/+/816587 [13] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html [14] https://review.opendev.org/c/openstack/governance/+/814037 [15] https://review.opendev.org/c/openstack/governance/+/815851 [16] https://review.opendev.org/c/openstack/governance/+/810721 [17] https://review.opendev.org/c/openstack/governance/+/815869 [18] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025586.html [19] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025571.html [20] https://review.opendev.org/c/openstack/governance/+/807163 [21] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss [22] http://eavesdrop.openstack.org/#Technical_Committee_Meeting [23] http://eavesdrop.openstack.org/#Technical_Committee_Office_hours -gmann From jasonanderson at uchicago.edu Fri Nov 5 22:54:31 2021 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Fri, 5 Nov 2021 22:54:31 +0000 Subject: [kuryr] Using kuryr-kubernetes CNI without neutron agent(s)? 
In-Reply-To: References: <9FF2989B-69FA-494B-B60A-B066E5BF13DA@uchicago.edu> <1043CEB9-7386-4842-8912-9DE021DB9BD0@uchicago.edu> Message-ID: <9F45DC15-94D8-4112-B0A2-8C749E64F493@uchicago.edu> Hi Michał, I continue to appreciate the information you are providing. I've been doing some more research into the landscape of systems and had a few follow-up questions. I've also left some clarifying remarks if you are interested. I'm currently evaluating OVN, haven't used it before and there's a bit of a learning curve ;) However it seems like it may solve a good part of the problem by removing RabbitMQ and reducing the privileges of the edge host w.r.t. network config. Now I'm looking at kuryr-kubernetes. 1. What is the difference between kuryr and kuryr-kubernetes? I have used kuryr-libnetwork before, in conjunction with kuryr-server (which I think is provided via the main kuryr project?). I am using Kolla Ansible so was spared some of the details on installation. I understand kuryr-libnetwork is basically "kuryr for Docker" while kuryr-kubernetes is "kuryr for K8s", but that leaves me confused about what exactly the kuryr repo is. 2. A current idea is to have several edge "compute" nodes that will run a lightweight k8s kubelet such as k3s. OVN will provide networking to the edge nodes, controlled from the central site. I would then place kuryr-k8s-controller on the central site and kuryr-cni-daemon on all the edge nodes. My question is: could users create their own Neutron networks (w/ their own Geneve segment) and launch pods connected on that network, and have those pods effectively be isolated from other pods in the topology? As in, can k8s be told that pod A should launch on network A', and pod B on network B'? Or is there an assumption that from Neutron's perspective all pods are always on a single Neutron network? Cheers, and thanks! /Jason On Oct 27, 2021, at 12:03 PM, Michał
Dulko > wrote: Hm, so a mixed OpenStack-K8s edge setup, where edge sites are Kubernetes deployments? We've took a look at some edge use cases with Kuryr and one problem people see is that if an edge site becomes disconnected from the main side, Kuryr will not allow creation of new Pods and Services as it needs connection to Neutron and Octavia APIs for that. If that's not a problem had you gave a thought into running distributed compute nodes [1] as edge sites and then Kubernetes on top of them? This architecture should be doable with Kuryr (probably with minor changes). Sort of! I work in research infrastructure and we are building an IoT/edge testbed for computer science researchers who wish to do research in edge computing. It's a bit mad science-y. We are buying and configuring relatively high-powered edge devices such as Raspberry Pis and Jetson Nanos and making them available for experimentation at a variety of sites. Separately, the platform supports any owner of a supported device to have it managed by the testbed (i.e., they can use our interfaces to launch containers on it and connect it logically to other devices / resources in the cloud.) Distributed compute node looks a bit too heavy for this highly dynamic use-case, but thank you for sharing. Anyways, one might ask why Neutron at all. I am hopeful we can get some interesting properties such as network isolation and the ability to bridge traffic from containers across other layer 2 links such as those provided by AL2S. OVN may help if it can remove the need for RabbitMQ, which is probably the most difficult aspect to remove from OpenStack's dependencies/assumptions, yet also one of the most pernicious from a security angle, as an untrusted worker node can easily corrupt the control plane. It's just Kuryr which needs access to the credentials, so possibly you should be able to isolate them, but I get the point, containers are worse at isolation than VMs.
I'm less worried about the mechanism for isolation on the host and more the amount of privileged information the host must keep secure, and the impact of that information being compromised. Because our experimental target system involves container engines maintained externally to the core site, the risk of compromise on the edge is high. I am searching for an architecture that greatly limits the blast radius of such a compromise. Currently if we use standard Neutron networking + Kuryr, we must give RabbitMQ credentials and others to the container engines on the edge, which papers such as http://seclab.cs.sunysb.edu/seclab/pubs/asiaccs16.pdf have documented as a trivial escalation path. For this reason, narrowing the scope of what state the edge hosts can influence on the core site is paramount. Re: admin creds, maybe it is possible to carefully craft a role that only works for some Neutron operations and put that on the worker nodes. I will explore. I think those settings [2] is what would require highest Neutron permissions in baremetal case. Thanks -- so it will need to create and delete ports. This may be acceptable; without some additional API proxy layer for the edge hosts, a malicious edge host could create bogus ports and delete good ones, but that is a much smaller level of impact. I think we could create a role that only allowed such operations and generate per-host credentials. [1] https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/distributed_compute_node.html [2] https://opendev.org/openstack/kuryr-kubernetes/src/branch/master/kuryr_kubernetes/controller/drivers/neutron_vif.py#L125-L127 Cheers! [1] https://docs.openstack.org/kuryr-kubernetes/latest/nested_vlan_mode.html Thanks, Michał Thanks! Jason Anderson --- Chameleon DevOps Lead Department of Computer Science, University of Chicago Mathematics and Computer Science, Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed...
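The restricted-role idea at the end of this message can be sketched as oslo.policy overrides for Neutron. The `kuryr-edge` role name is invented, and default policy rules vary between Neutron releases, so treat this as an illustration of the approach rather than a drop-in policy file; only the target names (`create_port`, `get_port`, `update_port`, `delete_port`) are the standard Neutron ones.

```yaml
# Sketch of a Neutron policy.yaml override: a dedicated "kuryr-edge"
# role (hypothetical name) may only touch ports, and only within its
# own project. Admins keep their normal access via context_is_admin.
# All other policy targets would be left at (or tightened beyond) the
# defaults so this role cannot reach them.
"create_port": "rule:context_is_admin or (role:kuryr-edge and project_id:%(project_id)s)"
"get_port": "rule:context_is_admin or (role:kuryr-edge and project_id:%(project_id)s)"
"update_port": "rule:context_is_admin or (role:kuryr-edge and project_id:%(project_id)s)"
"delete_port": "rule:context_is_admin or (role:kuryr-edge and project_id:%(project_id)s)"
```

Paired with per-host credentials that carry only this role, a compromised edge node could at worst create or delete ports in its own project, which matches the smaller blast radius discussed above.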
URL: From dtantsur at redhat.com Sat Nov 6 15:22:23 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sat, 6 Nov 2021 16:22:23 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: On Fri, Nov 5, 2021 at 5:04 PM Mohammed Naser wrote: > On Fri, Nov 5, 2021 at 10:39 AM Thierry Carrez > wrote: > > > > Hi everyone, > > > > The (long) document below reflects the current position of the release > > management team on a popular question: should the OpenStack release > > cadence be changed? Please note that we only address the release > > management / stable branch management facet of the problem. There are > > other dimensions to take into account (governance, feature deprecation, > > supported distros...) to get a complete view of the debate. > > > > Introduction > > ------------ > > > > The subject of how often OpenStack should be released has been regularly > > debated in the OpenStack community. OpenStack started with a 3-month > > release cycle, then switched to 6-month release cycle starting with > > Diablo. It is often thought of a release management decision, but it is > > actually a much larger topic: a release cadence is a trade-off between > > pressure to release more often and pressure to release less often, > > coming in from a lot of different stakeholders. In OpenStack, it is > > ultimately a Technical Committee decision. But that decision is informed > > by the position of a number of stakeholders. This document gives > > historical context and describes the current release management team > > position. > > > > The current trade-off > > --------------------- > > > > The main pressure to release more often is to make features available to > > users faster. Developers get a faster feedback loop, hardware vendors > > ensure software is compatible with their latest products, and users get > > exciting new features. 
"Release early, release often" is a best practice > > in our industry -- we should generally aim at releasing as often as > > possible. > > > > But that is counterbalanced by pressure to release less often. From a > > development perspective, each release cycle comes with some process > > overhead. On the integrators side, a new release means packaging and > > validation work. On the users side, it means pressure to upgrade. To > > justify that cost, there needs to be enough user-visible benefit (like > > new features) in a given release. > > > > For the last 10 years for OpenStack, that balance has been around six > > months. Six months let us accumulate enough new development that it was > > worth upgrading to / integrating the new version, while giving enough > > time to actually do the work. It also aligned well with Foundation > > events cadence, allowing to synchronize in-person developer meetings > > date with start of cycles. > > > > What changed > > ------------ > > > > The major recent change affecting this trade-off is that the pace of new > > development in OpenStack slowed down. The rhythm of changes was divided > > by 3 between 2015 and 2021, reflecting that OpenStack is now a mature > > and stable solution, where accessing the latest features is no longer a > > major driver. That reduces some of the pressure for releasing more > > often. At the same time, we have more users every day, with larger and > > larger deployments, and keeping those clusters constantly up to date is > > an operational challenge. That increases the pressure to release less > > often. In essence, OpenStack is becoming much more like a LTS > > distribution than a web browser -- something users like moving slow. > > > > Over the past years, project teams also increasingly decoupled > > individual components from the "coordinated release". 
More and more > > components opted for an independent or intermediary-released model, > > where they can put out releases in the middle of a cycle, making new > > features available to their users. This increasingly opens up the > > possibility of a longer "coordinated release" which would still allow > > development teams to follow "release early, release often" best > > practices. All that recent evolution means it is (again) time to > > reconsider if the 6-month cadence is what serves our community best, and > > in particular if a longer release cadence would not suit us better. > > > > The release management team position on the debate > > -------------------------------------------------- > > > > While releasing less often would definitely reduce the load on the > > release management team, most of the team work being automated, we do > > not think it should be a major factor in motivating the decision. We > > should not adjust the cadence too often though, as there is a one-time > > cost in switching our processes. In terms of impact, we expect that a > > switch to a longer cycle will encourage more project teams to adopt a > > "with-intermediary" release model (rather than the traditional "with-rc" > > single release per cycle), which may lead to abandoning the latter, > > hence simplifying our processes. Longer cycles might also discourage > > people to commit to PTL or release liaison work. We'd probably need to > > manage expectations there, and encourage more frequent switches (or > > create alternate models). > > > > If the decision is made to switch to a longer cycle, the release > > management team recommends to switch to one year directly. That would > > avoid changing it again anytime soon, and synchronizing on a calendar > > year is much simpler to follow and communicate. We also recommend > > announcing the change well in advance. 
We currently have an opportunity
> > of making the switch when we reach the end of the release naming
> > alphabet, which would also greatly simplify the communications around
> > the change.
> >
> > Finally, it is worth mentioning the impact on the stable branch work.
> > Releasing less often would likely impact the number of stable branches
> > that we keep on maintaining, so that we do not reach too far into the
> > past (and hit unmaintained distributions or long-gone dependencies). We
> > currently maintain releases for 18 months before they switch to extended
> > maintenance, which results in between 3 and 4 releases being maintained
> > at the same time. We'd recommend switching to maintaining one-year
> > releases for 24 months, which would result in between 2 and 3 releases
> > being maintained at the same time. Such a change would lead to longer
> > maintenance for our users while reducing backporting work for our
> > developers.
> >
> > Thanks for the write-up Thierry.

++ very well written!

> > I wonder what the community's thoughts are on having LTS + normal
> > releases, so that we can have the power of both? I guess that is
> > essentially what we have with EM, but I guess we could introduce a way
> > to ensure that operators can just upgrade LTS to LTS.

This is basically what Ironic does: we release with the rest of OpenStack, but we also do 2 more releases per cycle with their own bugfix/X.Y branches.

Dmitry

> > It can complicate things a bit from the CI and project management side,
> > but I think it could solve the problem for both sides: those that want
> > new features + those who want stability?

> > --
> > Thierry Carrez (ttx)
> > On behalf of the OpenStack Release Management team

> --
> Mohammed Naser
> VEXXHOST, Inc.
> > --
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dtantsur at redhat.com Sat Nov 6 15:25:56 2021
From: dtantsur at redhat.com (Dmitry Tantsur)
Date: Sat, 6 Nov 2021 16:25:56 +0100
Subject: [TripleO] Core team cleanup
In-Reply-To: 
References: 
Message-ID: 

On Wed, Nov 3, 2021 at 5:57 PM James Slagle wrote:

> Hello, I took a look at our core team, "tripleo-core" in gerrit. We have a
> few individuals who I feel have moved on from TripleO in their focus. I
> looked at the reviews from stackalytics.io for the last 180 days[1].
>
> These individuals have less than 6 reviews, which is about 1 review a
> month:
> Bob Fournier
> Dan Sneddon
> Dmitry Tantsur

+1. yeah, sorry for that. I have been trying to keep an eye on TripleO things, but with my new OpenShift responsibilities it's pretty much impossible. I guess it's the same for Bob.

I'm still available for questions and reviews if someone needs me.

Dmitry

> Jiří Stránský
> Juan Antonio Osorio Robles
> Marius Cornea
>
> These individuals have publicly expressed that they are moving on from
> TripleO:
> Michele Baldessari
> wes hayutin
>
> I'd like to propose we remove these folks from our core team, while
> thanking them for their contributions. I'll also note that I'd still value
> +1/-1 from these folks with a lot of significance, and encourage them to
> review their areas of expertise!
>
> If anyone on the list plans to start reviewing in TripleO again, then I
> also think we can postpone the removal for the time being and re-evaluate
> later. Please let me know if that's the case.
>
> Please reply and let me know any agreements or concerns with this change.
>
> Thank you!
> > [1]
> https://www.stackalytics.io/report/contribution?module=tripleo-group&project_type=openstack&days=180
>
> --
> -- James Slagle
> --

--
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fungi at yuggoth.org Sat Nov 6 16:25:48 2021
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Sat, 6 Nov 2021 16:25:48 +0000
Subject: [all][tc] Relmgt team position on release cadence
In-Reply-To: 
References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org>
Message-ID: <20211106162547.odvooimmcr6r7ql7@yuggoth.org>

On 2021-11-06 16:22:23 +0100 (+0100), Dmitry Tantsur wrote:
[...]
> This is basically what Ironic does: we release with the rest of OpenStack,
> but we also do 2 more releases per cycle with their own bugfix/X.Y branches.
[...]

Do you expect users to be able to upgrade between those, and if so is that tested?
--
Jeremy Stanley

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 963 bytes
Desc: not available
URL: 

From kira034 at 163.com Sat Nov 6 08:24:15 2021
From: kira034 at 163.com (Hongbin Lu)
Date: Sat, 6 Nov 2021 16:24:15 +0800 (CST)
Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available"
In-Reply-To: 
References: 
Message-ID: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com>

Hi,

It looks like zun-compute wants to create a resource provider in placement, but placement returned a 409 response. I would suggest checking placement's logs. My best guess is that a resource provider with the same name was already created, so placement returned 409. If that is the case, simply removing those resources and restarting the zun-compute service should resolve the problem.
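For illustration, the clean-up Hongbin describes might look like the following sketch. This assumes the osc-placement CLI plugin is installed; `compute5` is the compute host name from the report quoted below, and `<provider-uuid>` is a placeholder to fill in from the list output:

```
# Find the resource provider that matches the compute host's name
openstack resource provider list --name compute5

# Inspect it before deleting anything
openstack resource provider show <provider-uuid>

# Remove the stale provider, then restart the agent so it re-registers
openstack resource provider delete <provider-uuid>
systemctl restart zun-compute
```

Note that placement should refuse to delete a provider that still has allocations (another 409), which acts as a safety check against removing a provider that is actually in use.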
Best regards,
Hongbin

At 2021-11-05 22:43:38, "Pavlos Basaras" wrote:

Hello,

I have an OpenStack cluster, with basic services and functionality working, based on the Ussuri release.

I am trying to install the Zun service to be able to deploy containers, following
[controller] -- https://docs.openstack.org/zun/ussuri/install/controller-install.html and
[compute] -- https://docs.openstack.org/zun/ussuri/install/compute-install.html
I used the git branch based on Ussuri for all components.

I verified kuryr-libnetwork operation by issuing from the compute node

# docker network create --driver kuryr --ipam-driver kuryr --subnet 10.10.0.0/16 --gateway=10.10.0.1 test_net

and seeing the network created successfully, etc.

I am not very sure about the zun.conf file. What is the "endpoint_type = internalURL" parameter? Do I need to change internalURL?

From sudo systemctl status zun-compute I see:

Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, *args, **kw)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return attempt.get(self._wrap_exception)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task six.reraise(self.value[0], self.value[1], self.value[2])
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise value
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", line 350, in _update_to_placement
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task context, node_rp_uuid, name=compute_node.hostname)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 846, in get_provider_tree_and_ensure_r
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 628, in _ensure_resource_provider
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 514, in _create_resource_provider
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task global_request_id=context.global_id)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 225, in post
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task headers=headers, logger=LOG)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.request(url, 'POST', **kwargs)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in request
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.session.request(url, method, **kwargs)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in request
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise exceptions.from_response(resp, method, url)
Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded)

What is this problem? Any advice? I used the default configuration values ([keystone_auth] and [keystone_authtoken]) based on the configuration from the above links.
Also, from the controller:

openstack appcontainer service list
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+
| Id | Host     | Binary      | State | Disabled | Disabled Reason | Updated At                 | Availability Zone |
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+
| 1  | compute5 | zun-compute | up    | False    | None            | 2021-11-05T14:39:01.000000 | nova              |
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+

openstack appcontainer host show compute5
+----------------------+---------------------------------------------------------------------+
| Field                | Value                                                               |
+----------------------+---------------------------------------------------------------------+
| uuid                 | ee3a5b44-8ffa-463e-939d-0c61868a596f                                |
| links                | [{'href': 'http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'self'}, {'href': 'http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'bookmark'}] |
| hostname             | compute5                                                            |
| mem_total            | 7975                                                                |
| mem_used             | 0                                                                   |
| total_containers     | 1                                                                   |
| cpus                 | 10                                                                  |
| cpu_used             | 0.0                                                                 |
| architecture         | x86_64                                                              |
| os_type              | linux                                                               |
| os                   | Ubuntu 18.04.6 LTS                                                  |
| kernel_version       | 4.15.0-161-generic                                                  |
| labels               | {}                                                                  |
| disk_total           | 63                                                                  |
| disk_used            | 0                                                                   |
| disk_quota_supported | False                                                               |
| runtimes             | ['io.containerd.runc.v2', 'io.containerd.runtime.v1.linux', 'runc'] |
| enable_cpu_pinning   | False                                                               |
+----------------------+---------------------------------------------------------------------+

This seems to work fine. However, when I issue e.g.

openstack appcontainer run --name container --net network=$NET_ID cirros ping 8.8.8.8

I get the error:

| status_reason | There are not enough hosts available.

Any ideas?

One final thing: I did see the container tab in the Horizon dashboard, to be able to deploy containers from Horizon. Is there an extra configuration for this?

Sorry for the long mail.

best,
Pavlos

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dtantsur at redhat.com Sun Nov 7 14:04:59 2021
From: dtantsur at redhat.com (Dmitry Tantsur)
Date: Sun, 7 Nov 2021 15:04:59 +0100
Subject: [all][tc] Relmgt team position on release cadence
In-Reply-To: <20211106162547.odvooimmcr6r7ql7@yuggoth.org>
References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <20211106162547.odvooimmcr6r7ql7@yuggoth.org>
Message-ID: 

Hi,

On Sat, Nov 6, 2021 at 7:32 PM Jeremy Stanley wrote:

> On 2021-11-06 16:22:23 +0100 (+0100), Dmitry Tantsur wrote:
> [...]
> > This is basically what Ironic does: we release with the rest of
> OpenStack,
> > but we also do 2 more releases per cycle with their own bugfix/X.Y
> branches.
> [...]
>
> Do you expect users to be able to upgrade between those, and if so
> is that tested?

We prefer to think that upgrades are supported, and we're ready to fix bugs when they arise, but we don't actively test that. Not that we don't want to, mostly because of understaffing, CI stability and the fact that grenade is painful enough as it is.
Dmitry

> --
> Jeremy Stanley

--
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From skaplons at redhat.com Sun Nov 7 16:49:54 2021
From: skaplons at redhat.com (Slawek Kaplonski)
Date: Sun, 07 Nov 2021 17:49:54 +0100
Subject: [all][tc] Relmgt team position on release cadence
In-Reply-To: 
References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org>
Message-ID: <2685967.mvXUDI8C0e@p1>

Hi,

On Friday, 5 November 2021 18:47:13 CET Sean Mooney wrote:
> On Fri, 2021-11-05 at 11:53 -0400, Mohammed Naser wrote:
> > On Fri, Nov 5, 2021 at 10:39 AM Thierry Carrez wrote:
> > > Hi everyone,
> > >
> > > The (long) document below reflects the current position of the release
> > > management team on a popular question: should the OpenStack release
> > > cadence be changed? Please note that we only address the release
> > > management / stable branch management facet of the problem. There are
> > > other dimensions to take into account (governance, feature deprecation,
> > > supported distros...) to get a complete view of the debate.
> > >
> > > Introduction
> > > ------------
> > >
> > > The subject of how often OpenStack should be released has been
> > > regularly debated in the OpenStack community. OpenStack started with a
> > > 3-month release cycle, then switched to a 6-month release cycle
> > > starting with Diablo. It is often thought of as a release management
> > > decision, but it is actually a much larger topic: a release cadence is
> > > a trade-off between pressure to release more often and pressure to
> > > release less often, coming in from a lot of different stakeholders. In
> > > OpenStack, it is ultimately a Technical Committee decision. But that
> > > decision is informed by the position of a number of stakeholders.
> > > [...]
> >
> > Thanks for the write up Thierry.
> >
> > I wonder what are the thoughts of the community of having LTS + normal
> > releases so that we can have the power of both? I guess that is
> > essentially what we have with EM, but I guess we could introduce a way to
> > ensure that operators can just upgrade LTS to LTS.
> If we were to introduce LTS releases we would have to agree on what they
> were as a community, and we would need to support rolling upgrades between
> LTS versions. That would require all distributed projects like nova to
> ensure that LTS-to-LTS RPC and DB compatibility is maintained, instead of
> the current N+1 guarantees we have today.

Not only that, but cross-project communication, e.g. nova <-> neutron, also needs to work fine between such LTS releases.

> I know that would make some downstream happy, as perhaps we could align our
> FFU support with the LTS cadence, but I wouldn't hold my breath on that.
>
> As a developer I would personally prefer to have a shorter cycle upstream
> with upgrades supported across more than N+1, e.g. release every 2 months
> but keep rolling upgrade compatibility for at least 12 months or something
> like that. The release-with-intermediary lifecycle can enable that while
> still allowing us to have a longer or shorter planning horizon depending on
> the project and its velocity.
>
> > It can complicate things a bit from a CI and project management side,
> > but I think it could solve the problem for both sides: those that want
> > new features + those who want stability?
>
> It might, but I suspect that it will still not align with distros:
> Canonical have a new LTS every 2 years, and Red Hat has a new release every
> 18 months or so, based on every 3rd upstream release.
>
> The LTS idea I think has merit, but we would likely have to maintain at
> least 2 LTS releases in parallel to make it work.
>
> So, something like 1 LTS release a year maintained for 2 years, with normal
> releases every 6 months that are only maintained for 6 months. Each project
> would keep rolling upgrade compatibility ideally between LTS releases
> rather than N+1 as a new minimum. The implication of this is that we would
> want to have grenade jobs testing latest-LTS-to-master upgrade
> compatibility in addition to N to N+1 where those differ.
> > > -- > > > Thierry Carrez (ttx) > > > On behalf of the OpenStack Release Management team -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From gmann at ghanshyammann.com Sun Nov 7 19:11:28 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 07 Nov 2021 13:11:28 -0600 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> ---- On Fri, 05 Nov 2021 09:26:13 -0500 Thierry Carrez wrote ---- > Hi everyone, > > The (long) document below reflects the current position of the release > management team on a popular question: should the OpenStack release > cadence be changed? Please note that we only address the release > management / stable branch management facet of the problem. There are > other dimensions to take into account (governance, feature deprecation, > supported distros...) to get a complete view of the debate. > > Introduction > ------------ > > The subject of how often OpenStack should be released has been regularly > debated in the OpenStack community. OpenStack started with a 3-month > release cycle, then switched to 6-month release cycle starting with > Diablo. It is often thought of a release management decision, but it is > actually a much larger topic: a release cadence is a trade-off between > pressure to release more often and pressure to release less often, > coming in from a lot of different stakeholders. In OpenStack, it is > ultimately a Technical Committee decision. But that decision is informed > by the position of a number of stakeholders. 
This document gives
> historical context and describes the current release management team
> position.
>
> The current trade-off
> ---------------------
>
> The main pressure to release more often is to make features available to
> users faster. Developers get a faster feedback loop, hardware vendors
> ensure software is compatible with their latest products, and users get
> exciting new features. "Release early, release often" is a best practice
> in our industry -- we should generally aim at releasing as often as
> possible.
>
> But that is counterbalanced by pressure to release less often. From a
> development perspective, each release cycle comes with some process
> overhead. On the integrators side, a new release means packaging and
> validation work. On the users side, it means pressure to upgrade. To
> justify that cost, there needs to be enough user-visible benefit (like
> new features) in a given release.

Thanks Thierry for the detailed write-up. Note that the trade-off cuts both ways: a shorter cycle creates pressure to upgrade often, but each release carries fewer changes/features, which makes each upgrade easier; a longer cycle packs more changes/features into each release, which makes each upgrade more complex.

> For the last 10 years for OpenStack, that balance has been around six
> months. Six months let us accumulate enough new development that it was
> worth upgrading to / integrating the new version, while giving enough
> time to actually do the work. It also aligned well with Foundation
> events cadence, allowing to synchronize in-person developer meetings
> date with start of cycles.
>
> What changed
> ------------
>
> The major recent change affecting this trade-off is that the pace of new
> development in OpenStack slowed down. The rhythm of changes was divided
> by 3 between 2015 and 2021, reflecting that OpenStack is now a mature
> and stable solution, where accessing the latest features is no longer a
> major driver.
That reduces some of the pressure for releasing more > often. At the same time, we have more users every day, with larger and > larger deployments, and keeping those clusters constantly up to date is > an operational challenge. That increases the pressure to release less > often. In essence, OpenStack is becoming much more like a LTS > distribution than a web browser -- something users like moving slow. > > Over the past years, project teams also increasingly decoupled > individual components from the "coordinated release". More and more > components opted for an independent or intermediary-released model, > where they can put out releases in the middle of a cycle, making new > features available to their users. This increasingly opens up the > possibility of a longer "coordinated release" which would still allow > development teams to follow "release early, release often" best > practices. All that recent evolution means it is (again) time to > reconsider if the 6-month cadence is what serves our community best, and > in particular if a longer release cadence would not suit us better. > > The release management team position on the debate > -------------------------------------------------- > > While releasing less often would definitely reduce the load on the > release management team, most of the team work being automated, we do > not think it should be a major factor in motivating the decision. We > should not adjust the cadence too often though, as there is a one-time > cost in switching our processes. In terms of impact, we expect that a > switch to a longer cycle will encourage more project teams to adopt a > "with-intermediary" release model (rather than the traditional "with-rc" > single release per cycle), which may lead to abandoning the latter, > hence simplifying our processes. Longer cycles might also discourage > people to commit to PTL or release liaison work. 
We'd probably need to
> manage expectations there, and encourage more frequent switches (or
> create alternate models).
>
> If the decision is made to switch to a longer cycle, the release
> management team recommends to switch to one year directly. That would
> avoid changing it again anytime soon, and synchronizing on a calendar
> year is much simpler to follow and communicate. We also recommend
> announcing the change well in advance. We currently have an opportunity
> of making the switch when we reach the end of the release naming
> alphabet, which would also greatly simplify the communications around
> the change.
>
> Finally, it is worth mentioning the impact on the stable branch work.
> Releasing less often would likely impact the number of stable branches
> that we keep on maintaining, so that we do not go too much in the past
> (and hit unmaintained distributions or long-gone dependencies). We
> currently maintain releases for 18 months before they switch to extended
> maintenance, which results in between 3 and 4 releases being maintained
> at the same time. We'd recommend switching to maintaining one-year
> releases for 24 months, which would result in between 2 and 3 releases
> being maintained at the same time. Such a change would lead to longer
> maintenance for our users while reducing backporting work for our
> developers.

Yeah, if we switch to a one-year release model then we definitely need to change the stable support policy. For example, do we still need an Extended Maintenance (EM) phase if we support a release for 24 months? And if we keep the EM phase too, it is important to note that an EM branch takes almost the same amount of upstream developer work nowadays in terms of testing and backports (even though we have an agreement to reduce the effort on EM stable branches when needed, I do not see that happening, and we end up doing the same amount of maintenance there as we do for supported stable branches).
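As a sanity check, the branch-count arithmetic quoted above ("between 3 and 4" maintained releases at a 6-month cadence with 18 months of maintenance, "between 2 and 3" at a yearly cadence with 24 months) can be reproduced with a small model. This is only a sketch under the assumption that releases ship exactly on cadence; the inclusive maintenance end models the hand-over day, when the outgoing and incoming branches briefly overlap:

```python
def maintained_at(month, cadence, support):
    """Count releases still in maintenance at a given month.

    Releases ship at months 0, cadence, 2*cadence, ...; each one is
    maintained from its release month through release+support inclusive.
    """
    releases = range(0, month + 1, cadence)
    return sum(1 for r in releases if month <= r + support)

def overlap_range(cadence, support, horizon=240):
    """Min/max number of concurrently maintained releases at steady state."""
    counts = [maintained_at(m, cadence, support) for m in range(support, horizon)]
    return min(counts), max(counts)

# 6-month cadence, 18 months of maintenance -> between 3 and 4 branches
print(overlap_range(6, 18))   # (3, 4)
# yearly cadence, 24 months of maintenance -> between 2 and 3 branches
print(overlap_range(12, 24))  # (2, 3)
```

The same model also shows why keeping a 24-month window at a 6-month cadence would be much heavier: it would mean 4 to 5 concurrently maintained branches.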
As the yearly release model extends the stable support window, and with our stable team shrinking, it is an open question for us whether we as a community will be able to support the new stable release window. Another point we need to consider is how it will impact the contribution support from companies and volunteer contributors (we might not have many volunteer contributors now, so we can ignore them, but let's consider companies' support). For example, the foundation membership contract does not have contribution requirements, so companies' contribution support is always voluntary or based on their customer needs. In that case, we need to think about how we can keep that without any impact. For example, change the foundation membership requirement, or get companies' feedback on whether it impacts their contribution support policy. -gmann > > -- > Thierry Carrez (ttx) > On behalf of the OpenStack Release Management team > > From katonalala at gmail.com Mon Nov 8 08:47:08 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 8 Nov 2021 09:47:08 +0100 Subject: [neutron] Neutron Team meeting - vote on time of the meeting Message-ID: Hi Neutrinos, As we discussed during the PTG, let's start a poll to see which time best fits the needs of the team for the Neutron Team meeting. I prepared a doodle, please check which time slot would be best for you: https://doodle.com/poll/m973q9eyag8k385w?utm_source=poll&utm_medium=link The dates in the doodle are for next week, but please ignore that. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcafarel at redhat.com Mon Nov 8 09:16:07 2021 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 8 Nov 2021 10:16:07 +0100 Subject: [neutron] Bug deputy report (week starting on 2021-11-01) Message-ID: Hey neutrinos, Now that we are past the biannual DST calendar mess, it is also time for a new bug deputy rotation! 
Here is the list of bugs reported for last week, kudos to Rodolfo for both reporting most of them and also having a fix for them! All bugs have assignees, almost all have patches, and some are just waiting for CI Critical * "neutron-tempest-plugin-scenario-ovn" job timing out frequently - https://bugs.launchpad.net/neutron/+bug/1949557 Change merged increasing timeout by ralonsoh: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/816438 * "openstack-tox-py38" CI job is timing out frequently in gate - https://bugs.launchpad.net/neutron/+bug/1949704 Timeout change by ralonsoh in https://review.opendev.org/c/openstack/neutron/+/816631 * [CI] "neutron-ovs-rally-task" job failing due to an authentication problem - https://bugs.launchpad.net/neutron/+bug/1949945 Change to create keystone endpoint merged by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/816792 High * [OVN] Stateless SG Extension Support - https://bugs.launchpad.net/neutron/+bug/1949451 Extension missing in the OVN supported extensions list, fix by Ihar https://review.opendev.org/c/openstack/neutron/+/816612 * "openstack-tox-py39" is timing out frequently - https://bugs.launchpad.net/neutron/+bug/1949476 Similar fix as py38 in https://review.opendev.org/c/openstack/neutron/+/816631 Medium * [OVS][QoS] Dataplane enforcement is limited to minimum bandwidth egress rule - https://bugs.launchpad.net/neutron/+bug/1949607 Patch in progress by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/816530 * FIP ports count into quota as they get a project_id set - https://bugs.launchpad.net/neutron/+bug/1949767 Change in train bulk port binding, suggested fix https://review.opendev.org/c/openstack/neutron/+/816722 * [OVN] OVN mech driver does not map new segments - https://bugs.launchpad.net/neutron/+bug/1949967 WIP patch by ralonsoh https://review.opendev.org/c/openstack/neutron/+/816856 Low * [OVN] Check OVN support for stateless NAT - https://bugs.launchpad.net/neutron/+bug/1949494 Patch 
in progress by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/816376 * [OVN] Check OVN support virtual port type - https://bugs.launchpad.net/neutron/+bug/1949496 Patch in progress by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/816383 * direct-physical port creation fails with QoS minimum bandwidth rule - https://bugs.launchpad.net/neutron/+bug/1949877 Assigned to gibi -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdulko at redhat.com Mon Nov 8 09:40:02 2021 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Mon, 08 Nov 2021 10:40:02 +0100 Subject: [kuryr] Using kuryr-kubernetes CNI without neutron agent(s)? In-Reply-To: <9F45DC15-94D8-4112-B0A2-8C749E64F493@uchicago.edu> References: <9FF2989B-69FA-494B-B60A-B066E5BF13DA@uchicago.edu> <1043CEB9-7386-4842-8912-9DE021DB9BD0@uchicago.edu> <9F45DC15-94D8-4112-B0A2-8C749E64F493@uchicago.edu> Message-ID: <8bf45e0a40568dd5c07e1a15dd1955b6665d056a.camel@redhat.com> On Fri, 2021-11-05 at 22:54 +0000, Jason Anderson wrote: > Hi Michał, > > I continue to appreciate the information you are providing. I've been > doing some more research into the landscape of systems and had a few > follow-up questions. I've also left some clarifying remarks if you are > interested. > > I'm currently evaluating OVN, haven't used it before and there's a bit > of a learning curve ;) However it seems like it may solve a good part > of the problem by removing RabbitMQ and reducing the privileges of the > edge host w.r.t. network config. > > Now I'm looking at kuryr-kubernetes. > > 1. What is the difference between kuryr and kuryr-kubernetes? I have > used kuryr-libnetwork before, in conjunction with kuryr-server (which I > think is provided via the main kuryr project?). I am using Kolla > Ansible so was spared some of the details on installation. I understand > kuryr-libnetwork is basically "kuryr for Docker" 
while kuryr-kubernetes > is "kuryr for K8s", but that leaves me confused about what exactly the > kuryr repo is. In openstack/kuryr we have the kuryr.lib module, which hosts a few things shared by kuryr-libnetwork and kuryr-kubernetes. Nothing to worry about really. ;) > 2. A current idea is to have several edge "compute" nodes that will run > a lightweight k8s kubelet such as k3s. OVN will provide networking to > the edge nodes, controlled from the central site. I would then place > kuryr-k8s-controller on the central site and kuryr-cni-daemon on all > the edge nodes. My question is: could users create their own Neutron > networks (w/ their own Geneve segment) and launch pods connected on > that network, and have those pods effectively be isolated from other > pods in the topology? As in, can k8s be told that pod A should launch > on network A', and pod B on network B'? Or is there an assumption that > from Neutron's perspective all pods are always on a single Neutron > network? Ha, that might be a bit of a tough one. Basically you can easily set Kuryr to create separate subnets for each of the K8s namespaces, but then you'd need to rely on NetworkPolicies to isolate traffic between namespaces, which might not exactly fit your multitenant model. The best way to implement whatever you need might be to write your own custom subnet driver [1] that would choose the subnet e.g. based on a pod or namespace annotation. If there's a clear use case behind it, I think we can include it in the upstream code too. [1] https://opendev.org/openstack/kuryr-kubernetes/src/branch/master/kuryr_kubernetes/controller/drivers 
We've taken a look at some edge use cases with > > Kuryr and one problem people see is that if an edge site becomes > > disconnected from the main site, Kuryr will not allow creation of new > > Pods and Services as it needs connection to Neutron and Octavia APIs > > for that. If that's not a problem, have you given a thought to running > > distributed compute nodes [1] as edge sites and then Kubernetes on > > top > > of them? This architecture should be doable with Kuryr (probably with > > minor changes). > > Sort of! I work in research infrastructure and we are building an > IoT/edge testbed for computer science researchers who wish to do > research in edge computing. It's a bit mad science-y. We are buying and > configuring relatively high-powered edge devices such as Raspberry Pis > and Jetson Nanos and making them available for experimentation at a > variety of sites. Separately, the platform allows any owner of a > supported device to have it managed by the testbed (i.e., they can use > our interfaces to launch containers on it and connect it logically to > other devices / resources in the cloud.) > > Distributed compute node looks a bit too heavy for this highly dynamic > use-case, but thank you for sharing. > > Anyways, one might ask why Neutron at all. I am hopeful we can get some > interesting properties such as network isolation and the ability to > bridge traffic from containers across other layer 2 links such as those > provided by AL2S. > > > > OVN may help if it can remove the need for RabbitMQ, which is > > > probably the > > > most difficult aspect to remove from OpenStack's > > > dependencies/assumptions, > > > yet also one of the most pernicious from a security angle, as an > > > untrusted > > > worker node can easily corrupt the control plane. > > > > It's just Kuryr which needs access to the credentials, so possibly > > you > > should be able to isolate them, but I get the point, containers are > > worse at isolation than VMs. 
> > I'm less worried about the mechanism for isolation on the host and more > about the amount of privileged information the host must keep secure, and the > impact of that information being compromised. Because our experimental > target system involves container engines maintained externally to the > core site, the risk of compromise on the edge is high. I am searching > for an architecture that greatly limits the blast radius of such a > compromise. Currently if we use standard Neutron networking + Kuryr, we > must give RabbitMQ credentials and others to the container engines on > the edge, which papers such > as http://seclab.cs.sunysb.edu/seclab/pubs/asiaccs16.pdf have > documented as a trivial escalation path. > > For this reason, narrowing the scope of what state the edge hosts can > influence on the core site is paramount. > > > > > > Re: admin creds, maybe it is possible to carefully craft a role > > > that only works > > > for some Neutron operations and put that on the worker nodes. I > > > will explore. > > > > I think those settings [2] are what would require the highest Neutron > > permissions in the baremetal case. > > Thanks -- so it will need to create and delete ports. This may be > acceptable; without some additional API proxy layer for the edge hosts, > a malicious edge host could create bogus ports and delete good ones, > but that is a much smaller level of impact. I think we could create a > role that only allowed such operations and generate per-host > credentials. > > > [1] > > https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/distributed_compute_node.html > > [2] > > https://opendev.org/openstack/kuryr-kubernetes/src/branch/master/kuryr_kubernetes/controller/drivers/neutron_vif.py#L125-L127 > > > > > Cheers! > > > > [1] > > > > https://docs.openstack.org/kuryr-kubernetes/latest/nested_vlan_mode.html > > > > > > > > Thanks, > > > > Michał > > > > > > > > > Thanks! 
> > > > > Jason Anderson > > > > > > > > > > --- > > > > > > > > > > Chameleon DevOps Lead > > > > > Department of Computer Science, University of Chicago > > > > > Mathematics and Computer Science, Argonne National Laboratory > From katonalala at gmail.com Mon Nov 8 10:48:21 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 8 Nov 2021 11:48:21 +0100 Subject: [neutron] Neutron drivers Meeting - vote on time of the meeting Message-ID: Hi Neutrinos, As we discussed during the PTG, let's start a poll to see which time best fits the needs of the team for the Neutron drivers meeting. I prepared a doodle, please check which time slot would be best for you: https://doodle.com/poll/6vdugxp7g54smdv6?utm_source=poll&utm_medium=link The dates in the doodle are for next week, but please ignore that. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pbasaras at gmail.com Mon Nov 8 11:02:50 2021 From: pbasaras at gmail.com (Pavlos Basaras) Date: Mon, 8 Nov 2021 13:02:50 +0200 Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available" In-Reply-To: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com> References: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com> Message-ID: Hello, you are right, that fixed the problem. In order to solve the problem I revisited https://docs.openstack.org/placement/ussuri/install/verify.html I executed: openstack resource provider list Then I removed the host that I use for containers, restarted zun-compute on the host, and it works perfectly. Thank you very much for your input. One more thing: I don't see the containers tab in the Horizon dashboard (I just see the Nova compute related tab). Is there any additional configuration for this? By the way, from https://docs.openstack.org/placement/ussuri/install/verify.html, I used pip3 install osc-placement (instead of pip install..) all the best Pavlos. 
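[For anyone hitting the same 409 later, the cleanup Pavlos describes boils down to a few commands. This is a hedged sketch; the provider UUID below is a placeholder, and you should pick the stale container host's entry from the list output:]

```shell
# List the resource providers that Placement knows about
openstack resource provider list

# Delete the conflicting provider for the container host
# (UUID is a placeholder taken from the list output)
openstack resource provider delete ee3a5b44-8ffa-463e-939d-0c61868a596f

# Restart zun-compute on that host so it re-registers cleanly
sudo systemctl restart zun-compute
```

Note that deleting a provider that still has allocations will fail; in a case like this, where the host was never scheduling containers successfully, it is typically empty.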
On Sat, Nov 6, 2021 at 10:24 AM Hongbin Lu wrote: > Hi, > > It looks like zun-compute wants to create a resource provider in placement, but > placement returns a 409 response. I would suggest checking placement's > logs. My best guess is that a resource provider with the same name is already > created, so placement returned 409. If this is the case, simply removing those > resources and restarting the zun-compute service should resolve the problem. > > Best regards, > Hongbin > > > > > > > At 2021-11-05 22:43:38, "Pavlos Basaras" wrote: > > Hello, > > I have an Openstack cluster, with basic services and functionality working > based on ussuri release. > > I am trying to install the Zun service to be able to deploy containers, > following > > [controller] -- > https://docs.openstack.org/zun/ussuri/install/controller-install.html > and > [compute] -- > https://docs.openstack.org/zun/ussuri/install/compute-install.html > > I used the git branch based on ussuri for all components. > > I verified kuryr-libnetwork operation by issuing from the compute node > # docker network create --driver kuryr --ipam-driver kuryr --subnet > 10.10.0.0/16 --gateway=10.10.0.1 test_net > > and seeing the network created successfully, etc. > > I am not very sure about the zun.conf file. > What is the "endpoint_type = internalURL" parameter? > Do I need to change internalURL? 
> > > From sudo systemctl status zun-compute i see: > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, > *args, **kw) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task return > attempt.get(self._wrap_exception) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task six.reraise(self.value[0], > self.value[1], self.value[2]) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/lib/python3/dist-packages/six.py", line 703, in reraise > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task raise value > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), > attempt_number, False) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", > line 350, in 
_update_to_placement > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task context, node_rp_uuid, > name=compute_node.hostname) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", > line 846, in get_provider_tree_and_ensure_r > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task > parent_provider_uuid=parent_provider_uuid) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", > line 628, in _ensure_resource_provider > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task > parent_provider_uuid=parent_provider_uuid) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", > line 514, in _create_resource_provider > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task global_request_id=context.global_id) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", > line 225, in post > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task headers=headers, logger=LOG) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task return 
self.request(url, 'POST', > **kwargs) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in > request > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task return self.session.request(url, > method, **kwargs) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in > request > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task raise exceptions.from_response(resp, > method, url) > *Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: > Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded)* > > What is this problem? any advice? > I used the default configuration values ([keystone_auth] and > [keystone_authtoken]) values based on the configuration from the above > links. 
> > > Aslo from the controller > > > *openstack appcontainer service list* > +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ > | Id | Host | Binary | State | Disabled | Disabled Reason | > Updated At | Availability Zone | > > +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ > | 1 | compute5 | zun-compute | up | False | None | > 2021-11-05T14:39:01.000000 | nova | > > +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ > > > *openstack appcontainer host show compute5* > +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | Field | Value > > | > > +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | uuid | ee3a5b44-8ffa-463e-939d-0c61868a596f > > | > | links | [{'href': ' > http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', > 'rel': 'self'}, {'href': ' > http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', > 'rel': 'bookmark'}] | > | hostname | compute5 > > | > | mem_total | 7975 > > | > | mem_used | 0 > > | > | total_containers | 1 > > | > | cpus | 10 > > | > | cpu_used | 0.0 > > | > | architecture | x86_64 > > | > | os_type | linux > > | > | os | Ubuntu 18.04.6 LTS > > | > | kernel_version | 4.15.0-161-generic > > | > | labels | {} > > | > | disk_total | 63 > > | > | disk_used | 0 > > | > | disk_quota_supported | False > > | > | runtimes | ['io.containerd.runc.v2', > 'io.containerd.runtime.v1.linux', 'runc'] > > | > | enable_cpu_pinning | False > > | > > 
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > > > seems to work fine. > > However when I issue e.g., openstack appcontainer run --name container > --net network=$NET_ID cirros ping 8.8.8.8 > > I get the error: | status_reason | There are not enough hosts > available. > > Any ideas? > > One final thing is that I did not see in the Horizon dashboard the > container tab, to be able to deploy containers from Horizon. Is there an > extra configuration for this? > > Sorry for the long mail. > > best, > Pavlos > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From iurygregory at gmail.com Mon Nov 8 14:18:24 2021 From: iurygregory at gmail.com (Iury Gregory) Date: Mon, 8 Nov 2021 15:18:24 +0100 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: I've added Aija to the sushy-core group =) Congratulations! On Tue, Nov 2, 2021 at 08:51, Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jauntēva (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. 
> > -- > > > *Att[]'s Iury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -- *Att[]'s Iury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the ironic-core and puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashlee at openstack.org Mon Nov 8 15:53:22 2021 From: ashlee at openstack.org (Ashlee Ferguson) Date: Mon, 8 Nov 2021 09:53:22 -0600 Subject: OpenInfra Live - November 11 at 9am CT Message-ID: Hi everyone, This week's OpenInfra Live episode is brought to you by the OpenStack Community. The 2021 User Survey shows that the footprint of OpenStack clouds grew by 66% over the last year, totaling over 25 million cores in production. This increase has come from organizations of all sizes around the world. During this episode of OpenInfra Live, we are going to talk to operators from Kakao, LINE, Schwarz IT, NeCTAR and T-Systems about what's causing this OpenStack growth at their organization. 
Episode: OpenStack Is Alive: Explosive Growth Among Production Deployments Date and time: November 11 at 9am CT (1500 UTC) You can watch us live on: YouTube: https://www.youtube.com/watch?v=RhMJO82lDxc LinkedIn: https://www.linkedin.com/feed/update/urn:li:ugcPost:6862068514526756864 Facebook: https://www.facebook.com/104139126308032/posts/4493822407339660/ WeChat: recording will be posted on OpenStack WeChat after the live stream Speakers: Paul Coddington (ARDC Nectar) Reedip Banerjee (LINE) Yushiro Furukawa (LINE) Andrew Kong (Kakao) Nils Magnus (T-Systems) Adrian Seiffert (Schwarz) Marvin Titus (Schwarz) Carmel Walsh (ARDC Nectar) Have an idea for a future episode? Share it now at ideas.openinfra.live . Register now for OpenInfra Live: Keynotes, a special edition of OpenInfra Live on November 17-18th starting at 1500 UTC: https://openinfralivekeynotes.eventbrite.com/ Thanks, Ashlee OpenInfra Foundation Community & Events ashlee at openinfra.dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon Nov 8 15:58:00 2021 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 8 Nov 2021 16:58:00 +0100 Subject: [largescale-sig] Next meeting: Nov 10th, 15utc Message-ID: Hi everyone, The Large Scale SIG meeting is back this Wednesday in #openstack-operators on OFTC IRC, at 15UTC. It will be chaired by Belmiro Moreira. 
You can doublecheck how that time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20211110T15 Feel free to add topics to our agenda at: https://etherpad.openstack.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From thierry at openstack.org Mon Nov 8 17:40:42 2021 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 8 Nov 2021 18:40:42 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> Message-ID: Ghanshyam Mann wrote: > [...] > Thanks Thierry for the detailed write up. > > At the same time, a shorter release which leads to upgrade-often pressure but > it will have fewer number of changes/features, so make the upgrade easy and > longer-release model will have more changes/features that will make upgrade more > complex. I think that was true a few years ago, but I'm not convinced that still holds. We currently have a third of the changes volume we had back in 2015, so a one-year release in 2022 would contain far less changes than a 6-month release from 2015. Also, thanks to our testing and our focus on stability, the pain linked to the amount of breaking changes in a release is now negligible compared to the basic pain of going through a 1M-core deployment and upgrading the various pieces... every 6 months. I've heard of multiple users claiming it takes them close to 6 months to upgrade their massive deployments to a new version. So when they are done, they have to start again. 
-- Thierry Carrez (ttx) From gmann at ghanshyammann.com Mon Nov 8 18:35:36 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 08 Nov 2021 12:35:36 -0600 Subject: [all][tc] Technical Committee next weekly meeting on Nov 11th at 1500 UTC Message-ID: <17d00d5940e.c605c228290918.1450543438871901915@ghanshyammann.com> Hello Everyone, Technical Committee's next weekly meeting is scheduled for Nov 11th at 1500 UTC. If you would like to add topics for discussion, please add them to the below wiki page by Wednesday, Nov 10th, at 2100 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting -gmann From juliaashleykreger at gmail.com Mon Nov 8 19:43:18 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 8 Nov 2021 12:43:18 -0700 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> Message-ID: On Mon, Nov 8, 2021 at 10:44 AM Thierry Carrez wrote: > > Ghanshyam Mann wrote: > > [...] > > Thanks Thierry for the detailed write up. > > > > At the same time, a shorter release which leads to upgrade-often pressure but > > it will have fewer number of changes/features, so make the upgrade easy and > > longer-release model will have more changes/features that will make upgrade more > > complex. > > I think that was true a few years ago, but I'm not convinced that still > holds. We currently have a third of the changes volume we had back in > 2015, so a one-year release in 2022 would contain far less changes than > a 6-month release from 2015. I concur. Also, in 2015, we were still very much in a "move fast" mode of operation as a community. > Also, thanks to our testing and our focus on stability, the pain linked > to the amount of breaking changes in a release is now negligible > compared to the basic pain of going through a 1M-core deployment and > upgrading the various pieces... 
every 6 months. I've heard of multiple > users claiming it takes them close to 6 months to upgrade their massive > deployments to a new version. So when they are done, they have to start > again. > > -- > Thierry Carrez (ttx) > I've been hearing the exact same messaging from larger operators, as well as operators in environments where they are concerned about managing risk, for at least the past two years. These operators have indicated it is not uncommon for the upgrade projects which consume, test, certify for production, and deploy to production to take *at least* six months to execute. At the same time, they are shy about being the ones to also "find all of the bugs", and so the project doesn't actually start until well after the new coordinated release has occurred. Quickly they become yet another version behind with this pattern. I suspect it is really easy for us as a CI-focused community to think that six months is plenty of time to roll out a fully updated deployment which has been fully tested in every possible way. Except these operators are often trying to do just that on physical hardware, with updated firmware and operating systems bringing in new variables with every single change, which may ripple up the entire stack. These operators then have to apply the lessons they have previously learned once they have worked through all of the variables. In some cases this may involve aspects such as benchmarking, to ensure they don't need to make additional changes which need to be factored into their deployment, sending them back to the start of their testing. All while thinking of phrases like "business/mission critical". 
Or maybe even get them to contribute their patches upstream. Not that all of these issues are easily solved with any level of code, but sometimes they can include contextual disconnects, and resolving those is just as important as shipping a release, IMHO. -Julia From abraden at verisign.com Mon Nov 8 19:49:29 2021 From: abraden at verisign.com (Braden, Albert) Date: Mon, 8 Nov 2021 19:49:29 +0000 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> Message-ID: <9cabb3cb32a7441697f58933df72b514@verisign.com> I didn't have any luck contacting Adrian. Does anyone know where the storyboard is that he mentions in his email? -----Original Message----- From: Braden, Albert Sent: Monday, November 1, 2021 12:36 PM To: adriant at catalystcloud.nz; openstack-discuss at lists.openstack.org Subject: RE: [EXTERNAL] Adjutant needs contributors (and a PTL) to survive! Hi Adrian, I don't think I'm qualified to be PTL but I'm willing to help, and I've asked for permission. We aren't using Adjutant at this time because we're on Train and I learned at my last contract that running Adjutant on Train is a hassle, but I hope to start using it after we get to Ussuri. Has anyone else volunteered? -----Original Message----- From: Adrian Turjak Sent: Wednesday, October 27, 2021 1:41 AM To: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] Adjutant needs contributors (and a PTL) to survive! Hello fellow OpenStackers! 
I'm moving on to a different opportunity and my new role will not involve OpenStack, and there sadly isn't anyone at Catalystcloud who will be able to take over project responsibilities for Adjutant any time soon (not that I've been very onto it lately). As such Adjutant needs people to take over, and lead it going forward. I believe the codebase is in a reasonably good position for others to pick up, and I plan to go through and document a few more of my ideas for where it should go in storyboard so some of those plans exist somewhere should people want to pick up from where I left off before going fairly silent upstream. Plus if people want/need to they can reach out to me or add me to code review and chances are I'll comment/review because I do care about the project. Or I may contract some time to it. There are a few clouds running Adjutant, and people who have previously expressed interest in using it, so if you still are, the project isn't in a bad place at all. The code is stable, and the last few major refactors have cleaned up much of my biggest pain points with it. Best of luck! - adriant From fungi at yuggoth.org Mon Nov 8 20:24:05 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 8 Nov 2021 20:24:05 +0000 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <9cabb3cb32a7441697f58933df72b514@verisign.com> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> <9cabb3cb32a7441697f58933df72b514@verisign.com> Message-ID: <20211108202404.f45g57hnt5aa3db4@yuggoth.org> On 2021-11-08 19:49:29 +0000 (+0000), Braden, Albert wrote: > I didn't have any luck contacting Adrian. Does anyone know where > the storyboard is that he mentions in his email? [...] There's a project group for Adjutant here: https://storyboard.openstack.org/#!/project_group/adjutant -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From hojat.gazestani1 at gmail.com Mon Nov 8 19:13:25 2021 From: hojat.gazestani1 at gmail.com (Hojii GZNI) Date: Mon, 8 Nov 2021 22:43:25 +0330 Subject: Opendaylight integration Openstack Message-ID: Hi everyone I am trying to use Opendaylight as an Openstack SDN controller but have this error for "mechanism_driver". Xena Ubuntu 20.04 Opendaylight 8 /etc/neutron/plugins/ml2/ml2_conf.ini mechanism_drivers = opendaylight Neutron server log: CRITICAL neutron.plugins.ml2.managers [-] The following mechanism drivers were not found: {'opendaylight'} All configuration is in my Github[1] Regards, Hojat. [1]: https://github.com/hojat-gazestani/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Nov 8 21:48:54 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 08 Nov 2021 21:48:54 +0000 Subject: Opendaylight integration Openstack In-Reply-To: References: Message-ID: <74cf8bc2059798f0d9489fb7b19a997255481223.camel@redhat.com> On Mon, 2021-11-08 at 22:43 +0330, Hojii GZNI wrote: > Hi everyone > > I am trying to use Opendaylight as an Openstack SDN controller but have > this error for "mechanism_driver". > > Xena > Ubuntu 20.04 > Opendaylight 8 > > > /etc/neutron/plugins/ml2/ml2_conf.ini > > mechanism_drivers = opendaylight I don't know which release you are using, but the original ODL driver was removed some time ago and replaced by the v2 driver https://opendev.org/openstack/networking-odl/src/branch/master/setup.cfg#L52 so you should be passing opendaylight_v2, as shown in the comment https://opendev.org/openstack/networking-odl/src/branch/master/setup.cfg#L44-L47. That is likely the cause of the error. > > Neutron server log: > CRITICAL neutron.plugins.ml2.managers [-] The following mechanism > drivers were not found: {'opendaylight'} > > All configuration is in my Github[1] > > Regards, > > Hojat. 
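[Editor's note: Sean's pointer above amounts to a one-line change in the ML2 plugin configuration. A minimal sketch, assuming the stock /etc/neutron/plugins/ml2/ml2_conf.ini layout; keep any other mechanism drivers your deployment already lists:

```ini
[ml2]
# Newer networking-odl releases only register the v2 entry point,
# so a plain "opendaylight" value can no longer be resolved by neutron-server.
mechanism_drivers = opendaylight_v2
```

neutron-server needs to be restarted to pick up the change.]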
> > [1]: https://github.com/hojat-gazestani/openstack From arnaud.morin at gmail.com Mon Nov 8 22:57:54 2021 From: arnaud.morin at gmail.com (Arnaud) Date: Mon, 08 Nov 2021 23:57:54 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> Message-ID: <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Hey, I'd like to add my 2 cents. It's hard to upgrade a region, so when it comes to upgrading multiple regions, it's even harder. Some operators also have their own downstream patches / extensions / drivers which make the upgrade process more complex, so it takes more time (for all the reasons already given in the thread: the need to update the CI, the tools, the docs, the people, etc). One more thing is about consistency: when you have to manage multiple regions, it's easier if all of them are pretty identical. Human operations are always the same, and can eventually be automated. This leads to staying on a fixed version of OpenStack to run the business. When scaling, you (we) always choose security and consistency. Also, Julia mentioned something true about contributions from operators. It's difficult for them for multiple reasons: - pushing upstream is a process, which needs to be taken into account when working on an internal fix. - it's usually quicker to push downstream because it's needed. When it comes to upstream, it's challenged by the developers (and that's good), so it takes time and can be discouraging. - operators are not running master, but a stable release. Bugs on stable branches could be fixed differently than on master, which could also be discouraging. - writing unit tests is a job; some tech operators are not necessarily developers, so this could also be a challenge. All of this is to say that helping people who are proposing a patch is a good thing. And as far as I can see, upstream developers are helping most of the time, and we should keep and encourage such behavior IMHO. Finally, I would also vote for fewer releases or LTS releases (but it looks heavier to have this). I think this would help operators keep up to date with stable releases and propose more patches. Cheers, Arnaud. On 8 November 2021 20:43:18 GMT+01:00, Julia Kreger wrote: >On Mon, Nov 8, 2021 at 10:44 AM Thierry Carrez wrote: >> >> Ghanshyam Mann wrote: >> > [...] >> > Thanks Thierry for the detailed write up. >> > >> > At the same time, a shorter release which leads to upgrade-often pressure but >> > it will have fewer number of changes/features, so make the upgrade easy and >> > longer-release model will have more changes/features that will make upgrade more >> > complex. >> >> I think that was true a few years ago, but I'm not convinced that still >> holds. We currently have a third of the changes volume we had back in >> 2015, so a one-year release in 2022 would contain far less changes than >> a 6-month release from 2015. > >I concur. Also, in 2015, we were still very much in a "move fast" mode >of operation as a community. > >> Also, thanks to our testing and our focus on stability, the pain linked >> to the amount of breaking changes in a release is now negligible >> compared to the basic pain of going through a 1M-core deployment and >> upgrading the various pieces... every 6 months. I've heard of multiple >> users claiming it takes them close to 6 months to upgrade their massive >> deployments to a new version. So when they are done, they have to start >> again. >> >> -- >> Thierry Carrez (ttx) >> > >I've been hearing the exact same messaging from larger operators as >well as operators in environments where they are concerned about >managing risk for at least the past two years. 
These operators have >indicated it is not uncommon for the upgrade projects which consume, >test, certify for production, and deploy to production take *at least* >six months to execute. At the same time, they are shy of being the >ones to also "find all of the bugs", and so the project doesn't >actually start until well after the new coordinated release has >occurred. Quickly they become yet another version behind with this >pattern. > >I suspect it is really easy for us as a CI focused community to think >that six months is plenty of time to roll out a fully updated >deployment which has been fully tested in every possible way. Except, >these operators are often trying to do just that on physical hardware, >with updated firmware and operatings systems bringing in new variables >with every single change which may ripple up the entire stack. These >operators then have to apply the lessons they have previously learned >once they have worked through all of the variables. In some cases this >may involve aspects such as benchmarking, to ensure they don't need to >make additional changes which need to be factored into their >deployment, sending them back to the start of their testing. All while >thinking of phrases like "business/mission critical". > >I guess this means I'm in support of revising the release cycle. At >the same time, I think it would be wise for us to see if we can learn >from these operators the pain points they experience, the process they >leverage, and ultimately see if there are opportunities to spread >knowledge or potentially tooling. Or maybe even get them to contribute >their patches upstream. Not that all of these issues are easily solved >with any level of code, but sometimes they can include contextual >disconnects and resolving those are just as important as shipping a >release, IMHO. > >-Julia > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gagehugo at gmail.com Tue Nov 9 00:58:47 2021 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 8 Nov 2021 18:58:47 -0600 Subject: [openstack-helm] No meeting tomorrow Message-ID: Hey team, Since there are no agenda items [0] for the IRC meeting tomorrow, the meeting is cancelled. Our next meeting will be November 16th. Thanks [0] https://etherpad.opendev.org/p/openstack-helm-weekly-meeting -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew at etc.gen.nz Tue Nov 9 10:54:36 2021 From: andrew at etc.gen.nz (Andrew Ruthven) Date: Tue, 09 Nov 2021 23:54:36 +1300 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <9cabb3cb32a7441697f58933df72b514@verisign.com> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> <9cabb3cb32a7441697f58933df72b514@verisign.com> Message-ID: <1e4045dd6edf572a8a32ca639056bc06a8782fe8.camel@etc.gen.nz> On Mon, 2021-11-08 at 19:49 +0000, Braden, Albert wrote: > I didn't have any luck contacting Adrian. Does anyone know where the > storyboard is that he mentions in his email? I'll check in with Adrian to see if he has heard from anyone. Cheers, Andrew Catalyst Cloud -- Andrew Ruthven, Wellington, New Zealand andrew at etc.gen.nz | Catalyst Cloud: | This space intentionally left blank https://catalystcloud.nz | -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From moreira.belmiro.email.lists at gmail.com Tue Nov 9 13:50:31 2021 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Tue, 9 Nov 2021 14:50:31 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Message-ID: Hi, It's time again to discuss the release cycle... Just considering the number of times that we have been discussing the release cycle lately, we should acknowledge that we really have a problem, or at least that we have very different opinions in the community, and we should discuss it openly. Thanks Thierry for bringing the topic up again. Looking into the last user survey we see that 23% of the deployments are running the last two releases, and then we have a long... long... tail with older releases. Honestly, I have mixed feelings about it! As an operator I relate more with having an LTS release and giving the possibility to upgrade between LTS releases. But having the possibility to upgrade every 6 months is also very interesting for the small and fast moving projects. Maybe a 1-year release cycle would provide the middle ground here. In our cloud infrastructure we run different releases, from Stein to Victoria. There are projects that we can easily upgrade (and we do it!) and other projects that are much more complicated (because of feature deprecations, Operating System dependencies, internal patches, or simply because it is too risky considering the current workloads). For those we definitely need more than 6 months for the upgrade. If again we don't reach a consensus to change the release cycle, at least we should continue to work on improving the upgrade experience (and don't get me wrong... the upgrade experience has been improved tremendously over the years). There are small things that change in the projects (most of them are good refactors) but can be a big headache for upgrades. Let me enumerate some: DB schema changes usually translate into offline upgrades, configuration changes (options that move to different configuration groups without bringing anything new, changed defaults, policy changes), architecture changes (new projects that are now mandatory), ... In my opinion, if we reduce those, or at least are more aware of the challenges that they impose on operators, we will make upgrades easier and hopefully see deployments move much faster, whatever the release cycle is. cheers, Belmiro On Tue, Nov 9, 2021 at 12:04 AM Arnaud wrote: > Hey, > I'd like to add my 2 cents. > > It's hard to upgrade a region, so when it comes to upgrade multiples > regions, it's even harder. > > Some operators also have their own downstream patchs / extensions / > drivers which make the upgrade process more complex, so it take more time > (for all reasons already given in the thread, need to update the CI, the > tools, the doc, the people, etc). > > One more thing is about consistency, when you have to manage multiple > regions, it's easier if all of them are pretty identical. Human operation > are always the same, and can eventually be automated. > This leads to keep going on with a fixed version of OpenStack to run the > business. > When scaling, you (we) always chose security and consistency. > > Also, Julia mentioned something true about contribution from operators. > It's difficult for them for multiple reasons: > - pushing upstream is a process, which need to be taken into account when > working on an internal fix. > - it's usually quicker to push downstream because it's needed. When it > comes to upstream, it's challenged by the developers (and it's good), so it > take time and can be discouraging. > - operators are not running master, but a stable release. 
Bugs on stables > could be fixed differently than on master, which could also be discouraging. > - writing unit tests is a job, some tech operators are not necessarily > developers, so this could also be a challenge. > > All of these to say that helping people which are proposing a patch is a > good thing. And as far as I can see, upstream developers are helping most > of the time, and we should keep and encourage such behavior IMHO. > > Finally, I would also vote for less releases or LTS releases (but it looks > heavier to have this). I think this would help keeping up to date with > stables and propose more patches from operators. > > Cheers, > Arnaud. > > > Le 8 novembre 2021 20:43:18 GMT+01:00, Julia Kreger < > juliaashleykreger at gmail.com> a ?crit : >> >> On Mon, Nov 8, 2021 at 10:44 AM Thierry Carrez wrote: >> >>> >>> Ghanshyam Mann wrote: >>> >>>> [...] >>>> Thanks Thierry for the detailed write up. >>>> >>>> At the same time, a shorter release which leads to upgrade-often pressure but >>>> it will have fewer number of changes/features, so make the upgrade easy and >>>> longer-release model will have more changes/features that will make upgrade more >>>> complex. >>>> >>> >>> I think that was true a few years ago, but I'm not convinced that still >>> holds. We currently have a third of the changes volume we had back in >>> 2015, so a one-year release in 2022 would contain far less changes than >>> a 6-month release from 2015. >>> >> >> I concur. Also, in 2015, we were still very much in a "move fast" mode >> of operation as a community. >> >> Also, thanks to our testing and our focus on stability, the pain linked >>> to the amount of breaking changes in a release is now negligible >>> compared to the basic pain of going through a 1M-core deployment and >>> upgrading the various pieces... every 6 months. I've heard of multiple >>> users claiming it takes them close to 6 months to upgrade their massive >>> deployments to a new version. 
So when they are done, they have to start >>> again. >>> >>> -- >>> Thierry Carrez (ttx) >>> >>> >> I've been hearing the exact same messaging from larger operators as >> well as operators in environments where they are concerned about >> managing risk for at least the past two years. These operators have >> indicated it is not uncommon for the upgrade projects which consume, >> test, certify for production, and deploy to production take *at least* >> six months to execute. At the same time, they are shy of being the >> ones to also "find all of the bugs", and so the project doesn't >> actually start until well after the new coordinated release has >> occurred. Quickly they become yet another version behind with this >> pattern. >> >> I suspect it is really easy for us as a CI focused community to think >> that six months is plenty of time to roll out a fully updated >> deployment which has been fully tested in every possible way. Except, >> these operators are often trying to do just that on physical hardware, >> with updated firmware and operatings systems bringing in new variables >> with every single change which may ripple up the entire stack. These >> operators then have to apply the lessons they have previously learned >> once they have worked through all of the variables. In some cases this >> may involve aspects such as benchmarking, to ensure they don't need to >> make additional changes which need to be factored into their >> deployment, sending them back to the start of their testing. All while >> thinking of phrases like "business/mission critical". >> >> I guess this means I'm in support of revising the release cycle. At >> the same time, I think it would be wise for us to see if we can learn >> from these operators the pain points they experience, the process they >> leverage, and ultimately see if there are opportunities to spread >> knowledge or potentially tooling. Or maybe even get them to contribute >> their patches upstream. 
Not that all of these issues are easily solved >> with any level of code, but sometimes they can include contextual >> disconnects and resolving those are just as important as shipping a >> release, IMHO. >> >> -Julia >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue Nov 9 14:45:38 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 9 Nov 2021 07:45:38 -0700 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Message-ID: On Tue, Nov 9, 2021 at 6:50 AM Belmiro Moreira wrote: > > Hi, > It's time again to discuss the release cycle... > Just considering the number of times that lately we have been > discussing the release cycle we should acknowledge that we really > have a problem or at least that we have very different opinions in > the community and we should discuss it openly. > > Thanks Thierry to bring the topic again. > > Looking into the last user survey we see that 23% of the deployments > are running the last two releases and then we have a long... long... > tail with older releases. > > Honestly, I have mixed feelings about it! > > As an operator I relate more with having a LTS release and give the > possibility to upgrade between LTS releases. But having the possibility > to upgrade every 6 months is also very interesting for the small and fast > moving projects. > > Maybe an 1 year release cycle would provide the mid term here. > > In our cloud infrastructure we run different releases, from Stein to Victoria. > There are projects that we can easily upgrade (and we do it!) 
and other projects > that are much more complicated (because feature deprecations, Operating System > dependencies, internal patches, or simply because is too risky considering the > current workloads). > For those we need definitely more than 6 months for the upgrade. > > If again we don't reach a consensus to change the release cycle at least we > should continue to work in improving the upgrade experience (and don't let me wrong... > the upgrade experience has been improved tremendously over the years). > > There are small things that change in the projects (most of them are good refactors) > but can be a big headache for upgrades. > Let me enumerate some: DB schema changes usually translates into offline upgrades, > configuration changes (options that move to different configuration groups without > bringing anything new, change defaults, policy changes), architecture changes > (new projects that are now mandatory), ... > This is the kind of contextual reminder that needs to come up frequently. Is there any chance of conveying how long the outages are with a deployment size in your experience, with your level of risk tolerance? Same goes for the human/operational impact of working through aspects like configuration options changing/moving, policy changes, architectural changes, new projects being mandatory? My hope is that we convey some sense of "what it really takes" to help provide context in which contributors making changes understand how, at least at a high level, their changes may impact others. > In my opinion if we reduce those or at least are more aware of the challenges > that they impose to operators, we will make upgrades easier and hopefully see > deployments move much faster whatever is the release cycle. > > cheers, > Belmiro > > On Tue, Nov 9, 2021 at 12:04 AM Arnaud wrote: >> >> Hey, >> I'd like to add my 2 cents. >> >> It's hard to upgrade a region, so when it comes to upgrade multiples regions, it's even harder. 
>> >> Some operators also have their own downstream patchs / extensions / drivers which make the upgrade process more complex, so it take more time (for all reasons already given in the thread, need to update the CI, the tools, the doc, the people, etc). >> >> One more thing is about consistency, when you have to manage multiple regions, it's easier if all of them are pretty identical. Human operation are always the same, and can eventually be automated. >> This leads to keep going on with a fixed version of OpenStack to run the business. >> When scaling, you (we) always chose security and consistency. >> >> Also, Julia mentioned something true about contribution from operators. It's difficult for them for multiple reasons: >> - pushing upstream is a process, which need to be taken into account when working on an internal fix. >> - it's usually quicker to push downstream because it's needed. When it comes to upstream, it's challenged by the developers (and it's good), so it take time and can be discouraging. >> - operators are not running master, but a stable release. Bugs on stables could be fixed differently than on master, which could also be discouraging. >> - writing unit tests is a job, some tech operators are not necessarily developers, so this could also be a challenge. >> >> All of these to say that helping people which are proposing a patch is a good thing. And as far as I can see, upstream developers are helping most of the time, and we should keep and encourage such behavior IMHO. >> >> Finally, I would also vote for less releases or LTS releases (but it looks heavier to have this). I think this would help keeping up to date with stables and propose more patches from operators. >> >> Cheers, >> Arnaud. >> >> >> Le 8 novembre 2021 20:43:18 GMT+01:00, Julia Kreger a ?crit : >>> >>> On Mon, Nov 8, 2021 at 10:44 AM Thierry Carrez wrote: >>>> >>>> >>>> Ghanshyam Mann wrote: >>>>> >>>>> [...] >>>>> Thanks Thierry for the detailed write up. 
>>>>> >>>>> At the same time, a shorter release which leads to upgrade-often pressure but >>>>> it will have fewer number of changes/features, so make the upgrade easy and >>>>> longer-release model will have more changes/features that will make upgrade more >>>>> complex. >>>> >>>> >>>> I think that was true a few years ago, but I'm not convinced that still >>>> holds. We currently have a third of the changes volume we had back in >>>> 2015, so a one-year release in 2022 would contain far less changes than >>>> a 6-month release from 2015. >>> >>> >>> I concur. Also, in 2015, we were still very much in a "move fast" mode >>> of operation as a community. >>> >>>> Also, thanks to our testing and our focus on stability, the pain linked >>>> to the amount of breaking changes in a release is now negligible >>>> compared to the basic pain of going through a 1M-core deployment and >>>> upgrading the various pieces... every 6 months. I've heard of multiple >>>> users claiming it takes them close to 6 months to upgrade their massive >>>> deployments to a new version. So when they are done, they have to start >>>> again. >>>> >>>> -- >>>> Thierry Carrez (ttx) >>>> >>> >>> I've been hearing the exact same messaging from larger operators as >>> well as operators in environments where they are concerned about >>> managing risk for at least the past two years. These operators have >>> indicated it is not uncommon for the upgrade projects which consume, >>> test, certify for production, and deploy to production take *at least* >>> six months to execute. At the same time, they are shy of being the >>> ones to also "find all of the bugs", and so the project doesn't >>> actually start until well after the new coordinated release has >>> occurred. Quickly they become yet another version behind with this >>> pattern. 
>>> I suspect it is really easy for us as a CI focused community to think >>> that six months is plenty of time to roll out a fully updated >>> deployment which has been fully tested in every possible way. Except, >>> these operators are often trying to do just that on physical hardware, >>> with updated firmware and operatings systems bringing in new variables >>> with every single change which may ripple up the entire stack. These >>> operators then have to apply the lessons they have previously learned >>> once they have worked through all of the variables. In some cases this >>> may involve aspects such as benchmarking, to ensure they don't need to >>> make additional changes which need to be factored into their >>> deployment, sending them back to the start of their testing. All while >>> thinking of phrases like "business/mission critical". >>> >>> I guess this means I'm in support of revising the release cycle. At >>> the same time, I think it would be wise for us to see if we can learn >>> from these operators the pain points they experience, the process they >>> leverage, and ultimately see if there are opportunities to spread >>> knowledge or potentially tooling. Or maybe even get them to contribute >>> their patches upstream. Not that all of these issues are easily solved >>> with any level of code, but sometimes they can include contextual >>> disconnects and resolving those are just as important as shipping a >>> release, IMHO. >>> >>> -Julia >>> From mkopec at redhat.com Tue Nov 9 14:55:32 2021 From: mkopec at redhat.com (Martin Kopec) Date: Tue, 9 Nov 2021 15:55:32 +0100 Subject: [qa] Moving office hour one hour later Message-ID: Hi everyone, we have decided to move the weekly office hour (held on Tuesdays) one hour later to 15:00 UTC effective November 16th [1]. It was discussed during our last office hour [2]. 
[1] https://review.opendev.org/c/opendev/irc-meetings/+/817224 [2] https://meetings.opendev.org/meetings/qa/2021/qa.2021-11-09-14.00.log.html#l-74 Regards, -- Martin Kopec Senior Software Quality Engineer Red Hat EMEA -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Tue Nov 9 15:17:37 2021 From: dms at danplanet.com (Dan Smith) Date: Tue, 09 Nov 2021 07:17:37 -0800 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> (Arnaud's message of "Mon, 08 Nov 2021 23:57:54 +0100") References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Message-ID: > - it's usually quicker to push downstream because it's needed. When it > comes to upstream, it's challenged by the developers (and it's good), > so it take time and can be discouraging. I'm sure many operators push downstream first, and then chuck a patch into upstream gerrit in hopes of it landing upstream so they don't have to maintain it long-term. Do you think the possibility of it not landing for a year (if they make it in the first one) or two (if it goes into the next one) is a disincentive to pushing upstream? I would think it might push it past the event horizon making downstream patches more of a constant. > - writing unit tests is a job, some tech operators are not necessarily > developers, so this could also be a challenge. Yep, and my experience is that this sort of "picking up the pieces" of good fixes that need help from another developer happens mostly at the end of the release, post-FF in a lot of cases. This is the time when the pressure of the pending release is finally on and we get around to this sort of task. Expanding the release window increases the number of these things collected per cycle, and delays them being in a release by a long time. 
I know, we should just "do better" for the earlier parts of the cycle, but realistically that won't happen :) --Dan From zigo at debian.org Tue Nov 9 18:07:39 2021 From: zigo at debian.org (Thomas Goirand) Date: Tue, 9 Nov 2021 19:07:39 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Message-ID: On 11/9/21 4:17 PM, Dan Smith wrote: >> - it's usually quicker to push downstream because it's needed. When it >> comes to upstream, it's challenged by the developers (and it's good), >> so it take time and can be discouraging. > > I'm sure many operators push downstream first, and then chuck a patch > into upstream gerrit in hopes of it landing upstream so they don't have > to maintain it long-term. Do you think the possibility of it not landing > for a year (if they make it in the first one) or two (if it goes into > the next one) is a disincentive to pushing upstream? Don't ask your colleagues upstream about what we do! :) With my Debian package maintainer hat on: I will continue to send patches upstream whenever I can. > Yep, and my experience is that this sort of "picking up the pieces" of > good fixes that need help from another developer happens mostly at the > end of the release, post-FF in a lot of cases. This is the time when the > pressure of the pending release is finally on and we get around to this > sort of task. It used to be that I was told to add unit tests, open a bug and close it, etc. and if I was not doing it, the patch would just stay open forever. Those were the early days of OpenStack... 
Mostly, upstream OpenStack people are nice, and understand that we (package maintainers) just jump from one package to another, and can't afford more than 15 minutes per package upgrade (considering upgrading to Xena meant upgrading 220 packages...). I've seen numerous times upstream projects taking over one of my patches, finishing the work (sometimes adding unit tests) and making the patch land (sometimes, even backporting it to earlier releases). I don't think switching to a 1 year release cycle will change anything regarding the distro <-> upstream relationship. Hopefully, OpenStack people will continue to be awesome and nice to work with... :) Cheers, Thomas Goirand (zigo) From dmeng at uvic.ca Mon Nov 8 21:37:56 2021 From: dmeng at uvic.ca (dmeng) Date: Mon, 08 Nov 2021 13:37:56 -0800 Subject: [sdk]: Get server fault message Message-ID: <135e55fafeebd0c6e8328ea2f4da4713@uvic.ca> Hello there, Hope everything is going well. I'm wondering if there is any method that could show the error message of a server whose status is "ERROR"? Like from openstack cli, "openstack server show server_name", if the server is in "ERROR" status, this will return a field "fault" with a message that shows the error. I tried the compute service get_server and find_server, but neither of them shows the error messages. Thanks and have a great day! Catherine From tonyliu0592 at hotmail.com Tue Nov 9 23:50:57 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 9 Nov 2021 23:50:57 +0000 Subject: [keystone] failed auth takes a while Message-ID: Hi, I am using Keystone 17.0.0 with Ussuri. When issuing a request via the openstack CLI (e.g. network list) with a wrong password, it takes 1+ minute to get the auth failure 401 back. After correcting the password, for the first request, it still takes 1+ minute to return success. After that, all following requests are as fast as usual, around 1s.
"openstack user password set" also takes that much time to return success. Any clues? Thanks! Tony From emilien at redhat.com Wed Nov 10 01:36:00 2021 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 9 Nov 2021 20:36:00 -0500 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley wrote: [...] This was based on experiences trying to work with the Kata > community, and the "experiment" referenced in that mailing list post > eventually concluded with the removal of remaining Kata project > configuration when https://review.opendev.org/744687 merged > approximately 15 months ago. > ack I haven't seen much pushback from moving to Gerrit, but pretty much all feedback I got was from folks who worked (is working) on OpenStack, so a bit biased in my opinion (myself included). Beside that, if we would move to opendev, I want to see some incentives in our roadmap, not just "move our project here because it's cool". Some ideas: * Consider it as a subproject from OpenStack SDK? Or part of a SIG? * CI coverage for API regression testing (e.g. gophercloud/acceptance/compute running in Nova CI) * Getting more exposure of the project and potentially more contributors * Consolidate the best practices in general, for contributions to the project, getting started, dev environments, improving CI jobs (current jobs use OpenLab zuul, with a fork of zuul jobs). Is there any concern would we have to discuss? -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From juliaashleykreger at gmail.com Wed Nov 10 01:48:41 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 9 Nov 2021 18:48:41 -0700 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: On Tue, Nov 9, 2021 at 6:40 PM Emilien Macchi wrote: > > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley wrote: > [...] > >> This was based on experiences trying to work with the Kata >> community, and the "experiment" referenced in that mailing list post >> eventually concluded with the removal of remaining Kata project >> configuration when https://review.opendev.org/744687 merged >> approximately 15 months ago. > > > ack > > I haven't seen much pushback from moving to Gerrit, but pretty much all feedback I got was from folks who worked (is working) on OpenStack, so a bit biased in my opinion (myself included). > Beside that, if we would move to opendev, I want to see some incentives in our roadmap, not just "move our project here because it's cool". > > Some ideas: > * Consider it as a subproject from OpenStack SDK? Or part of a SIG? My $0.01 opinion is to move the OpenStack SDK "project" to be a generic "SDK" project, in which gophercloud could live. Mainly for a point of contact perspective, but I think "whatever works" may be best in the end. > * CI coverage for API regression testing (e.g. gophercloud/acceptance/compute running in Nova CI) I strongly suspect it wouldn't be hard to convince ironic contributors to do something similar for Ironic's CI. > * Getting more exposure of the project and potentially more contributors > * Consolidate the best practices in general, for contributions to the project, getting started, dev environments, improving CI jobs (current jobs use OpenLab zuul, with a fork of zuul jobs). On a plus side, the official SDK list could be updated... This would be kind of epic, actually. > > Is there any concern would we have to discuss? 
> -- > Emilien Macchi From tkajinam at redhat.com Wed Nov 10 02:32:11 2021 From: tkajinam at redhat.com (Takashi Kajinami) Date: Wed, 10 Nov 2021 11:32:11 +0900 Subject: [puppet] Propose retiring puppet-senlin In-Reply-To: References: Message-ID: Thanks Tobias for your feedback. Because I've heard no objections for a week while there is a positive response, I've proposed the patches to retire the project[1]. [1] https://review.opendev.org/q/topic:retire-puppet-senlin On Wed, Nov 3, 2021 at 8:09 PM Tobias Urdin wrote: > +1 for retiring > > Best regards > Tobias > > On 3 Nov 2021, at 11:49, Takashi Kajinami wrote: > > Hello, > > > I remember I raised a similar discussion recently[1] but > we need the same for a different module. > > puppet-senlin was introduced back in 2018, but the module has had > only the portion made by cookiecutter and has no capability to manage > fundamental resources yet. > Because we haven't seen any interest in creating implementations > to support even basic usage, I'll propose retiring this module. > > I'll be open for any feedback for a while, and will propose a series > of patches for retirement if no concern is raised here for one week. > > Thank you, > Takashi > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2021-September/024687.html > [2] https://opendev.org/openstack/puppet-senlin > > > From tjoen at dds.nl Wed Nov 10 08:00:44 2021 From: tjoen at dds.nl (tjoen) Date: Wed, 10 Nov 2021 09:00:44 +0100 Subject: [sdk]: Get server fault message In-Reply-To: <135e55fafeebd0c6e8328ea2f4da4713@uvic.ca> References: <135e55fafeebd0c6e8328ea2f4da4713@uvic.ca> Message-ID: On 11/8/21 22:37, dmeng wrote: > I'm wondering if there is any method that could show the error message > of a server whose status is "ERROR"?
> Like from openstack cli, "openstack journalctl From iurygregory at gmail.com Wed Nov 10 08:12:47 2021 From: iurygregory at gmail.com (Iury Gregory) Date: Wed, 10 Nov 2021 09:12:47 +0100 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: On Wed, Nov 10, 2021 at 02:49, Julia Kreger < juliaashleykreger at gmail.com> wrote: > On Tue, Nov 9, 2021 at 6:40 PM Emilien Macchi wrote: > > > > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley > wrote: > > [...] > > > >> This was based on experiences trying to work with the Kata > >> community, and the "experiment" referenced in that mailing list post > >> eventually concluded with the removal of remaining Kata project > >> configuration when https://review.opendev.org/744687 merged > >> approximately 15 months ago. > > > > > > ack > > > > I haven't seen much pushback from moving to Gerrit, but pretty much all > feedback I got was from folks who worked (is working) on OpenStack, so a > bit biased in my opinion (myself included). > > Beside that, if we would move to opendev, I want to see some incentives > in our roadmap, not just "move our project here because it's cool". > > > > Some ideas: > > * Consider it as a subproject from OpenStack SDK? Or part of a SIG? > > My $0.01 opinion is to move the OpenStack SDK "project" to be a > generic "SDK" project, in which gophercloud could live. Mainly for a > point of contact perspective, but I think "whatever works" may be best > in the end. > I agree with Julia's comment here. > * CI coverage for API regression testing (e.g. > gophercloud/acceptance/compute running in Nova CI) > > I strongly suspect it wouldn't be hard to convince ironic contributors > to do something similar for Ironic's CI. > ++ I will be more than happy to help with this since I've contributed to gophercloud, and I've also helped them to add a job that runs Ironic so we could run acceptance tests.
> * Getting more exposure of the project and potentially more contributors > > * Consolidate the best practices in general, for contributions to the > project, getting started, dev environments, improving CI jobs (current jobs > use OpenLab zuul, with a fork of zuul jobs). > > On a plus side, the official SDK list could be updated... This would > be kind of epic, actually. > ++ I think this would be very good. > > > > > Is there any concern would we have to discuss? > > -- > > Emilien Macchi > > -- *Att[]'s Iury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the ironic-core and puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * From nabeel.tariq at rapidcompute.com Wed Nov 10 08:50:34 2021 From: nabeel.tariq at rapidcompute.com (nabeel.tariq at rapidcompute.com) Date: Wed, 10 Nov 2021 13:50:34 +0500 Subject: OVN with SSL Message-ID: <000401d7d610$081b59f0$18520dd0$@rapidcompute.com> Hi, We have configured OVN with SSL, which works when using a certificate from a third party. When we use a self-signed certificate, certificate verification fails in the logs. ERROR LOG: 2021-11-10 10:11:14.703 88854 ERROR neutron.service OpenSSL.SSL.Error: [('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')] Using the openssl verify command, the self-signed server certificate is reported as verified. openssl verify -verbose -CAfile .pem
.crt From mdulko at redhat.com Wed Nov 10 09:43:54 2021 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Wed, 10 Nov 2021 10:43:54 +0100 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: On Tue, 2021-11-09 at 20:36 -0500, Emilien Macchi wrote: > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley > wrote: > [...] > > > This was based on experiences trying to work with the Kata > > community, and the "experiment" referenced in that mailing list > > post > > eventually concluded with the removal of remaining Kata project > > configuration when https://review.opendev.org/744687 merged > > approximately 15 months ago. > > > > > ack > > I haven't seen much pushback from moving to Gerrit, but pretty much > all feedback I got was from folks who worked (is working) on > OpenStack, so a bit biased in my opinion (myself included). > Beside that, if we would move to opendev, I want to see some > incentives in our roadmap, not just "move our project here because > it's cool". > > Some ideas: > * Consider it as a subproject from OpenStack SDK? Or part of a SIG? > * CI coverage for API regression testing (e.g. > gophercloud/acceptance/compute running in Nova CI) > * Getting more exposure of the project and potentially more > contributors > * Consolidate the best practices in general, for contributions to the > project, getting started, dev environments, improving CI jobs > (current jobs use OpenLab zuul, with a fork of zuul jobs). > > Is there any concern would we have to discuss? Besides that, with DevStack and its stable branches you have an easy way to test Gophercloud against various OpenStack versions.
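The DevStack-based flow mentioned above is essentially the whole recipe for the API-regression idea discussed earlier in this thread. A rough sketch of what such a job would run, where the branch name, checkout path and test selection are illustrative assumptions rather than an existing job definition:

```shell
# Bring up an OpenStack cloud from the stable branch to test against
# (stable/xena here is only an example).
git clone https://opendev.org/openstack/devstack -b stable/xena
cd devstack
./stack.sh

# DevStack writes an openrc that exports the OS_* variables the
# acceptance tests read from the environment.
source openrc admin admin

# Run one service's acceptance suite from a gophercloud checkout;
# the tests are gated behind the "acceptance" build tag.
cd ~/go/src/github.com/gophercloud/gophercloud
go test -v -tags acceptance ./acceptance/openstack/compute/v2/...
```

Pointing the same recipe at a different stable/* branch is what would give the version matrix mentioned here.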
From senrique at redhat.com Wed Nov 10 13:11:23 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 10 Nov 2021 10:11:23 -0300 Subject: [cinder] Bug deputy report for week of 11-10-2021 Message-ID: This is a bug report from 10-03-2021 to 11-10-2021. Agenda: https://etherpad.opendev.org/p/cinder-bug-squad-meeting ----------------------------------------------------------------------------------------- Medium - https://bugs.launchpad.net/cinder/+bug/1950291 tempest-integrated-compute-centos-8-stream fails with version conflict in boto. Unassigned. Incomplete - https://bugs.launchpad.net/cinder/+bug/1950134 InstanceLocality filter use results in AttributeError: 'Client' object has no attribute 'list_extensions'. Unassigned. Invalid - https://bugs.launchpad.net/cinder/+bug/1950128 NFS backend initialisation fails because an immutable directory exists. Unassigned. -- Sofía Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat From gmann at ghanshyammann.com Wed Nov 10 15:00:50 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 09:00:50 -0600 Subject: [all][tc] Continuing the RBAC PTG discussion In-Reply-To: <17ce6ca4f3c.11a0d934542096.8078141879504981729@ghanshyammann.com> References: <17cc3068765.11a9586ce162263.9148179786177865922@ghanshyammann.com> <17cdc3e5964.e42274e1372366.1164114504252831335@ghanshyammann.com> <17ce6ca4f3c.11a0d934542096.8078141879504981729@ghanshyammann.com> Message-ID: <17d0a5daab0.f08965b2428404.1434605436511830260@ghanshyammann.com>
> > Below is the link to join the call: > > https://meet.google.com/uue-adpp-xsm > > -gmann > > ---- On Mon, 01 Nov 2021 11:04:06 -0500 Ghanshyam Mann wrote ---- > > ---- On Wed, 27 Oct 2021 13:32:37 -0500 Ghanshyam Mann wrote ---- > > > Hello Everyone, > > > > > > As decided in PTG, we will continue the RBAC discussion from where we left in PTG. We will have a video > > > call next week based on the availability of most of the interested members. > > > > > > Please vote your available time in below doodle vote by Thursday (or Friday morning central time). > > > > > > - https://doodle.com/poll/6xicntb9tu657nz7 > > > > As per doodle voting, I have schedule it on Nov 3rd Wed, 15:00 - 16:00 UTC. > > > > Below is the link to join the call: > > > > https://meet.google.com/uue-adpp-xsm > > > > We will be taking notes in this etherpad https://etherpad.opendev.org/p/policy-popup-yoga-ptg > > > > -gmann > > > > > > > > NOTE: this is not specific to TC or people working in RBAC work but more to wider community to > > > get feedback and finalize the direction (like what we did in PTG session). > > > > > > Meanwhile, feel free to review the lance's updated proposal for community-wide goal > > > - https://review.opendev.org/c/openstack/governance/+/815158 > > > > > > -gmann > > > > > > > > > > > > From moreira.belmiro.email.lists at gmail.com Wed Nov 10 15:48:17 2021 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Wed, 10 Nov 2021 16:48:17 +0100 Subject: [largescale-sig] Next meeting: Nov 10th, 15utc In-Reply-To: References: Message-ID: Hi, we held our meeting today. We discussed the topic of our next "Large Scale OpenStack" episode on OpenInfra.Live, which should happen on Dec 9. The episode will be around tricks and tools that large deployments use for day to day ops. 
You can read the meeting logs at: https://meetings.opendev.org/meetings/large_scale_sig/2021/large_scale_sig.2021-11-10-15.00.html Our next IRC meeting will be Nov 24, at 1500utc on #openstack-operators on OFTC. best regards, Belmiro On Mon, Nov 8, 2021 at 5:04 PM Thierry Carrez wrote: > Hi everyone, > > The Large Scale SIG meeting is back this Wednesday in > #openstack-operators on OFTC IRC, at 15UTC. It will be chaired by > Belmiro Moreira. > > You can doublecheck how that time translates locally at: > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20211110T15 > > Feel free to add topics to our agenda at: > https://etherpad.openstack.org/p/large-scale-sig-meeting > > Regards, > > -- > Thierry Carrez > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Wed Nov 10 18:40:08 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Wed, 10 Nov 2021 19:40:08 +0100 Subject: [cinder][kolla][OSA][release] Xena cycle-trailing release deadline Message-ID: <95058549-f349-024e-37b2-97a1a9aa62fc@est.tech> Hello teams with deliverables following the cycle-trailing release model! This is just a reminder to wrap up your trailing deliverables for Xena. A few cycles ago the deadline for cycle-trailing projects was extended to give more time. The deadline for Xena is *16 December, 2021* [1]. If things are ready sooner than that though, all the better for our downstream consumers. For reference, the following cycle-trailing deliverables will need final releases at some point until the above deadline: cinderlib kayobe kolla-ansible kolla openstack-ansible-roles openstack-ansible Thanks! 
Előd Illés irc: elodilles [1] https://releases.openstack.org/yoga/schedule.html#y-cycle-trail From gmann at ghanshyammann.com Wed Nov 10 19:03:36 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 13:03:36 -0600 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> ---- On Tue, 09 Nov 2021 19:36:00 -0600 Emilien Macchi wrote ---- > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley wrote: > [...] > This was based on experiences trying to work with the Kata > community, and the "experiment" referenced in that mailing list post > eventually concluded with the removal of remaining Kata project > configuration when https://review.opendev.org/744687 merged > approximately 15 months ago. > > ack > I haven't seen much pushback from moving to Gerrit, but pretty much all feedback I got was from folks who worked (is working) on OpenStack, so a bit biased in my opinion (myself included). Beside that, if we would move to opendev, I want to see some incentives in our roadmap, not just "move our project here because it's cool". > Some ideas: * Consider it as a subproject from OpenStack SDK? Or part of a SIG? * CI coverage for API regression testing (e.g. gophercloud/acceptance/compute running in Nova CI) * Getting more exposure of the project and potentially more contributors * Consolidate the best practices in general, for contributions to the project, getting started, dev environments, improving CI jobs (current jobs use OpenLab zuul, with a fork of zuul jobs). > Is there any concern would we have to discuss? -- +1, Thanks Emilien for putting up the roadmap, which makes it more clear to understand the long term benefits.
Looks good to me, especially CI part is cool to have from API testing perspective and to know where we break things (we run client jobs in many projects CI so it should not be something special we need to do) >* Consider it as a subproject from OpenStack SDK? Or part of a SIG? Just to be more clear on this. Does this mean, once we setup the things in opendev then we can migrate it under openstack/ namespace under OpenStack SDK umbrella? or you mean keep it in opendev with non-openstack namespace but collaborative effort with SDK team. -gmann > Emilien Macchi > From radoslaw.piliszek at gmail.com Wed Nov 10 19:18:31 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 10 Nov 2021 20:18:31 +0100 Subject: [cinder][kolla][OSA][release] Xena cycle-trailing release deadline In-Reply-To: <95058549-f349-024e-37b2-97a1a9aa62fc@est.tech> References: <95058549-f349-024e-37b2-97a1a9aa62fc@est.tech> Message-ID: Thanks, Előd, for the reminder. Unless something unexpected comes up, Kolla projects release next week around Nov 16. -yoctozepto On Wed, 10 Nov 2021 at 19:41, Előd Illés wrote: > > Hello teams with deliverables following the cycle-trailing release model! > > This is just a reminder to wrap up your trailing deliverables for Xena. > A few cycles ago the deadline for cycle-trailing projects was extended > to give more time. The deadline for Xena is *16 December, 2021* [1]. > > If things are ready sooner than that though, all the better for our > downstream consumers. > > For reference, the following cycle-trailing deliverables will need > final releases at some point until the above deadline: > > cinderlib > kayobe > kolla-ansible > kolla > openstack-ansible-roles > openstack-ansible > > Thanks!
> > Előd Illés > irc: elodilles > > [1] https://releases.openstack.org/yoga/schedule.html#y-cycle-trail > > > > From gmann at ghanshyammann.com Wed Nov 10 23:56:08 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 17:56:08 -0600 Subject: [all][tc] Technical Committee next weekly meeting on Nov 11th at 1500 UTC In-Reply-To: <17d00d5940e.c605c228290918.1450543438871901915@ghanshyammann.com> References: <17d00d5940e.c605c228290918.1450543438871901915@ghanshyammann.com> Message-ID: <17d0c47bee0.12ba6e056450219.3324779231364314039@ghanshyammann.com> Hello Everyone, Below is the agenda for Tomorrow's TC IRC meeting scheduled at 1500 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting == Agenda for tomorrow's TC meeting == * Roll call * Follow up on past action items * Gate health check * Updates on community-wide goal ** Decoupling goal from release cycle *** https://review.opendev.org/c/openstack/governance/+/816387 ** RBAC goal rework *** https://review.opendev.org/c/openstack/governance/+/815158 ** Proposed community goal for FIPS compatibility and compliance *** https://review.opendev.org/c/openstack/governance/+/816587 * Adjutant needs PTLs and maintainers ** http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html * Pain Point targeting ** https://etherpad.opendev.org/p/pain-point-elimination * Open Reviews ** https://review.opendev.org/q/projects:openstack/governance+is:open -gmann ---- On Mon, 08 Nov 2021 12:35:36 -0600 Ghanshyam Mann wrote ---- > Hello Everyone, > > Technical Committee's next weekly meeting is scheduled for Nov 11th at 1500 UTC. > > If you would like to add topics for discussion, please add them to the below wiki page by > Wednesday, Nov 10th, at 2100 UTC.
> > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > -gmann > > From gmann at ghanshyammann.com Thu Nov 11 00:41:21 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 18:41:21 -0600 Subject: [all] Stable Core Team Process changed Message-ID: <17d0c712625.1134a0c9e450546.8685029299925787442@ghanshyammann.com> Hello Everyone, We discussed this in many PTG (Shanghai and in Xena) to decentralize the stable branch process and make it more distributed towards projects. The Technical Committee has merged the resolution on this and I thought of putting it on ML in case you have not read it. With the new process, the individual project teams can manage their own stable core team in the same way as the regular core team. Also, project teams will be empowered to create/enforce policies that best meet the needs of that project. Existing stable policies stay valid and will serve the purpose as they do currently. And if you have any questions or are seeking guidance regarding stable policies, you can reach out to the existing "Extended Maintenance" SIG which will be renamed to "Stable Maintenance". [1] https://governance.openstack.org/tc/resolutions/20210923-stable-core-team.html -gmann From gmann at ghanshyammann.com Thu Nov 11 02:30:52 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 20:30:52 -0600 Subject: [all] Anyone use or would like to maintain openstack/training-labs repo In-Reply-To: <17cc7e09369.ddfa2076225205.6942386596020382900@ghanshyammann.com> References: <17cc7e09369.ddfa2076225205.6942386596020382900@ghanshyammann.com> Message-ID: <17d0cd56981.d71e48a7451122.1077594163909534877@ghanshyammann.com> ---- On Thu, 28 Oct 2021 12:09:16 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > During the TC weekly meeting and PTG discussion to merge the 'Technical Writing' SIG into TC, we found > that openstack/training-labs is not maintained nowadays. Even we do not know who uses this repo for training. > > I have checked that upstream institute training or CoA are not using this repo in their training. If you are using > it for your training please reply to this email or ping us on the #openstack-tc IRC OFTC channel, otherwise we will start > the retirement process. As there is no response on maintaining it, I have started the retirement proposal - https://review.opendev.org/c/openstack/governance/+/817511 -gmann > > - https://opendev.org/openstack/training-labs > > -gmann > > >
From pbasaras at gmail.com Thu Nov 11 12:05:50 2021 From: pbasaras at gmail.com (Pavlos Basaras) Date: Thu, 11 Nov 2021 14:05:50 +0200 Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available" In-Reply-To: References: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com> Message-ID: Hello, no worries, million thanks for the support and the link for the UI. One more thing to point out is that in ( https://docs.openstack.org/zun/xena/install/compute-install.html) at step 9, the curl command is fixed to the amd64 architecture -- since I have an aarch64 system I fetched the appropriate version from https://github.com/containernetworking/plugins/releases/, and it worked fine. best, Pavlos. On Thu, Nov 11, 2021 at 9:00 AM Hongbin Lu wrote: > Sorry for the late reply. I missed this email. > > This is the Zun UI installation guide: > https://docs.openstack.org/zun-ui/latest/install/index.html#manual-installation . > About your suggestion of pip3 installation, I will add a note to clarify that. > Thanks for pointing it out. > > Best regards, > Hongbin > > > > > > > At 2021-11-08 19:02:50, "Pavlos Basaras" wrote: > > Hello, > > you are right, that fixed the problem.
> > In order to solve the problem i revisited > https://docs.openstack.org/placement/ussuri/install/verify.html > I executed: openstack resource provider list > Then removed the host that i use for containers, restarted the zun-compute > at the host and works perfectly. Thank you very much for your input. > > One more thing, I don't see at the horizon dashboard the tab for the > containers (i just see the nova compute related tab). Is there any > additional configuration for this? > > btw, from https://docs.openstack.org/placement/ussuri/install/verify.html, > i used pip3 install osc-placement (instead of pip install..) > > all the best > Pavlos. > > On Sat, Nov 6, 2021 at 10:24 AM Hongbin Lu wrote: >> Hi, >> >> It looks like zun-compute wants to create a resource provider in placement, >> but placement returns a 409 response. I would suggest you check >> placement's logs. My best guess is that a resource provider with the same name >> is already created so placement returned 409. If this is the case, simply >> removing those resources and restarting the zun-compute service should resolve >> the problem. >> >> Best regards, >> Hongbin >> >> >> >> >> >> >> At 2021-11-05 22:43:38, "Pavlos Basaras" wrote: >> >> Hello, >> >> I have an OpenStack cluster, with basic services and >> functionality working based on ussuri release. >> >> I am trying to install the Zun service to be able to deploy containers, >> following >> >> [controller] -- >> https://docs.openstack.org/zun/ussuri/install/controller-install.html >> and >> [compute] -- >> https://docs.openstack.org/zun/ussuri/install/compute-install.html >> >> I used the git branch based on ussuri for all components. >> >> I verified kuryr-libnetwork operation issuing from the compute node >> # docker network create --driver kuryr --ipam-driver kuryr --subnet >> 10.10.0.0/16 --gateway=10.10.0.1 test_net >> >> and seeing the network created successfully, etc. >> >> I am not very sure about the zun.conf file.
>> What is the "endpoint_type = internalURL" parameter? >> Do I need to change internalURL? >> >> >> From sudo systemctl status zun-compute i see: >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, >> *args, **kw) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task return >> attempt.get(self._wrap_exception) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task six.reraise(self.value[0], >> self.value[1], self.value[2]) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/lib/python3/dist-packages/six.py", line 703, in reraise >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task raise value >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), >> attempt_number, False) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR 
oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", >> line 350, in _update_to_placement >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task context, node_rp_uuid, >> name=compute_node.hostname) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", >> line 846, in get_provider_tree_and_ensure_r >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task >> parent_provider_uuid=parent_provider_uuid) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", >> line 628, in _ensure_resource_provider >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task >> parent_provider_uuid=parent_provider_uuid) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", >> line 514, in _create_resource_provider >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task global_request_id=context.global_id) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", >> line 225, in post >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task headers=headers, logger=LOG) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> 
"/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task return self.request(url, 'POST', >> **kwargs) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in >> request >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task return self.session.request(url, >> method, **kwargs) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in >> request >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task raise exceptions.from_response(resp, >> method, url) >> *Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: >> Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded)* >> >> What is this problem? any advice? >> I used the default configuration values ([keystone_auth] and >> [keystone_authtoken]) values based on the configuration from the above >> links. 
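[For context on the two zun.conf sections mentioned just above, and the endpoint_type question from earlier in the thread: endpoint_type selects which Keystone endpoint interface (internal vs. public URL) Zun uses when calling other OpenStack services, and internalURL is the documented default that normally does not need changing. A rough sketch of the Ussuri-era install-guide layout follows; the controller hostname and password are placeholders, not values confirmed in this thread.]

```ini
# Sketch of the zun.conf auth sections (Ussuri-era layout; values are placeholders).
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = zun
password = ZUN_PASS

[keystone_auth]
auth_url = http://controller:5000
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = zun
password = ZUN_PASS
# Which Keystone endpoint interface Zun uses when talking to other
# services (internal vs. public URL); internalURL is the usual value.
endpoint_type = internalURL
```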
>> >> >> Aslo from the controller >> >> >> *openstack appcontainer service list* >> +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ >> | Id | Host | Binary | State | Disabled | Disabled Reason | >> Updated At | Availability Zone | >> >> +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ >> | 1 | compute5 | zun-compute | up | False | None | >> 2021-11-05T14:39:01.000000 | nova | >> >> +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ >> >> >> *openstack appcontainer host show compute5* >> +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ >> | Field | Value >> >> | >> >> +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ >> | uuid | ee3a5b44-8ffa-463e-939d-0c61868a596f >> >> | >> | links | [{'href': ' >> http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', >> 'rel': 'self'}, {'href': ' >> http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', >> 'rel': 'bookmark'}] | >> | hostname | compute5 >> >> | >> | mem_total | 7975 >> >> | >> | mem_used | 0 >> >> | >> | total_containers | 1 >> >> | >> | cpus | 10 >> >> | >> | cpu_used | 0.0 >> >> | >> | architecture | x86_64 >> >> | >> | os_type | linux >> >> | >> | os | Ubuntu 18.04.6 LTS >> >> | >> | kernel_version | 4.15.0-161-generic >> >> | >> | labels | {} >> >> | >> | disk_total | 63 >> >> | >> | disk_used | 0 >> >> | >> | disk_quota_supported | False >> >> | >> | runtimes | ['io.containerd.runc.v2', >> 'io.containerd.runtime.v1.linux', 'runc'] >> >> 
| >> | enable_cpu_pinning | False >> >> | >> >> +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ >> >> >> seems to work fine. >> >> However when I issue e.g., openstack appcontainer run --name container >> --net network=$NET_ID cirros ping 8.8.8.8 >> >> I get the error: | status_reason | There are not enough hosts >> available. >> >> Any ideas? >> >> One final thing is that I didn't see in the Horizon dashboard the >> container tab, to be able to deploy containers from Horizon. Is there an >> extra configuration for this? >> >> sorry for the long mail. >> >> best, >> Pavlos >> >> >> >> >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kira034 at 163.com Thu Nov 11 07:00:04 2021 From: kira034 at 163.com (Hongbin Lu) Date: Thu, 11 Nov 2021 15:00:04 +0800 (CST) Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available" In-Reply-To: References: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com> Message-ID: Sorry for the late reply. I missed this email. This is the Zun UI installation guide: https://docs.openstack.org/zun-ui/latest/install/index.html#manual-installation . About your suggestion of pip3 installation, I will add a note to clarify that. Thanks for pointing it out. Best regards, Hongbin At 2021-11-08 19:02:50, "Pavlos Basaras" wrote: Hello, you are right, that fixed the problem. In order to solve the problem I revisited https://docs.openstack.org/placement/ussuri/install/verify.html I executed: openstack resource provider list Then I removed the host that I use for containers, restarted zun-compute on the host, and it works perfectly. Thank you very much for your input. One more thing: I don't see in the Horizon dashboard the tab for the containers (I just see the nova compute related tab).
Is there any additional configuration for this? By the way, from https://docs.openstack.org/placement/ussuri/install/verify.html, I used pip3 install osc-placement (instead of pip install). All the best, Pavlos. On Sat, Nov 6, 2021 at 10:24 AM Hongbin Lu wrote: Hi, It looks like zun-compute wants to create a resource provider in placement, but placement returns a 409 response. I would suggest you check placement's logs. My best guess is that a resource provider with the same name was already created, so placement returned 409. If that is the case, simply removing those resources and restarting the zun-compute service should resolve the problem. Best regards, Hongbin At 2021-11-05 22:43:38, "Pavlos Basaras" wrote: Hello, I have an OpenStack cluster with basic services and functionality working, based on the Ussuri release. I am trying to install the Zun service to be able to deploy containers, following [controller] -- https://docs.openstack.org/zun/ussuri/install/controller-install.html and [compute] -- https://docs.openstack.org/zun/ussuri/install/compute-install.html I used the git branch based on Ussuri for all components. I verified kuryr-libnetwork operation by issuing from the compute node # docker network create --driver kuryr --ipam-driver kuryr --subnet 10.10.0.0/16 --gateway=10.10.0.1 test_net and seeing the network created successfully, etc. I am not very sure about the zun.conf file. What is the "endpoint_type = internalURL" parameter? Do I need to change internalURL?
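[The fix Hongbin suggests in this thread, deleting the stale placement resource provider for the host and restarting zun-compute, can be sketched as below. The UUID and hostname are taken from the appcontainer host show output later in the thread; the second provider line is invented for contrast, and only the UUID-selection step is actually demonstrated.]

```shell
# Against a real cloud the cleanup is roughly:
#   openstack resource provider list
#   openstack resource provider delete <uuid-of-stale-provider>
#   sudo systemctl restart zun-compute    # on the compute host
# Below we only demonstrate picking out the UUID for a given hostname from
# "openstack resource provider list -f value"-style output (uuid name generation).
sample="ee3a5b44-8ffa-463e-939d-0c61868a596f compute5 2
11111111-2222-3333-4444-555555555555 compute1 5"
host=compute5
stale_uuid=$(printf '%s\n' "$sample" | awk -v h="$host" '$2 == h {print $1}')
echo "$stale_uuid"
```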
From sudo systemctl status zun-compute i see: Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, *args, **kw) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return attempt.get(self._wrap_exception) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task six.reraise(self.value[0], self.value[1], self.value[2]) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise value Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), attempt_number, False) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", line 350, in _update_to_placement Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 
8962 ERROR oslo_service.periodic_task context, node_rp_uuid, name=compute_node.hostname) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 846, in get_provider_tree_and_ensure_r Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 628, in _ensure_resource_provider Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 514, in _create_resource_provider Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task global_request_id=context.global_id) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 225, in post Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task headers=headers, logger=LOG) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.request(url, 'POST', **kwargs) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File 
"/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in request Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.session.request(url, method, **kwargs) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in request Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise exceptions.from_response(resp, method, url) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded) What is this problem? Any advice? I used the default configuration values ([keystone_auth] and [keystone_authtoken]) based on the configuration from the above links. Also, from the controller:

openstack appcontainer service list
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+
| Id | Host     | Binary      | State | Disabled | Disabled Reason | Updated At                 | Availability Zone |
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+
| 1  | compute5 | zun-compute | up    | False    | None            | 2021-11-05T14:39:01.000000 | nova              |
+----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+

openstack appcontainer host show compute5
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                | Value |
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| uuid                 | ee3a5b44-8ffa-463e-939d-0c61868a596f |
| links                | [{'href': 'http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'self'}, {'href': 'http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'bookmark'}] |
| hostname             | compute5 |
| mem_total            | 7975 |
| mem_used             | 0 |
| total_containers     | 1 |
| cpus                 | 10 |
| cpu_used             | 0.0 |
| architecture         | x86_64 |
| os_type              | linux |
| os                   | Ubuntu 18.04.6 LTS |
| kernel_version       | 4.15.0-161-generic |
| labels               | {} |
| disk_total           | 63 |
| disk_used            | 0 |
| disk_quota_supported | False |
| runtimes             | ['io.containerd.runc.v2', 'io.containerd.runtime.v1.linux', 'runc'] |
| enable_cpu_pinning   | False |
+----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
seems to work fine. However when I issue e.g., openstack appcontainer run --name container --net network=$NET_ID cirros ping 8.8.8.8 I get the error: | status_reason | There are not enough hosts available. Any ideas? One final thing is that I didn't see in the Horizon dashboard the container tab, to be able to deploy containers from Horizon. Is there an extra configuration for this? sorry for the long mail. best, Pavlos -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Thu Nov 11 15:34:34 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 11 Nov 2021 16:34:34 +0100 Subject: [neutron] Drivers meeting agenda - 12.11.2021 Message-ID: Hi Neutron Drivers, The agenda for tomorrow's drivers meeting is at [1].
We have 1 RFE to discuss: * https://bugs.launchpad.net/neutron/+bug/1950454 : [RFE] GW IP and FIP QoS to inherit from network [1] https://wiki.openstack.org/wiki/Meetings/NeutronDrivers#Agenda See you at the meeting tomorrow. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From DHilsbos at performair.com Thu Nov 11 16:25:13 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Thu, 11 Nov 2021 16:25:13 +0000 Subject: [ops][keystone][cinder][swift][neutron][nova] Rename Availability Zone Message-ID: <0670B960225633449A24709C291A525251D4538F@COM03.performair.local> All; We recently decided to change our naming convention. As such I'd like to rename our current availability zone. The configuration files are the obvious places to do so. Is there anything I need to change in the database? Properties of images, volumes, servers, etc.? Should I just give this up as a bad deal? Maybe rebuild the cluster from scratch? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From gmann at ghanshyammann.com Thu Nov 11 16:40:52 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 11 Nov 2021 10:40:52 -0600 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <1e4045dd6edf572a8a32ca639056bc06a8782fe8.camel@etc.gen.nz> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> <9cabb3cb32a7441697f58933df72b514@verisign.com> <1e4045dd6edf572a8a32ca639056bc06a8782fe8.camel@etc.gen.nz> Message-ID: <17d0fdf9cef.bfbf002c504513.1457558466289860904@ghanshyammann.com> ---- On Tue, 09 Nov 2021 04:54:36 -0600 Andrew Ruthven wrote ---- > On Mon, 2021-11-08 at 19:49 +0000, Braden, Albert wrote: I didn't have any luck contacting Adrian. Does anyone know where the storyboard is that he mentions in his email?
> I'll check in with Adrian to see if he has heard from anyone. > Cheers, Andrew -- Andrew Ruthven, Wellington, New Zealand, andrew at etc.gen.nz | Catalyst Cloud: | This space intentionally left blank https://catalystcloud.nz | @Andrew: not sure, but please let us know if someone from Catalyst is planning to maintain it. We are still waiting for volunteers to lead/maintain this project; if you are interested, please reply here or ping us on the #openstack-tc IRC channel. -gmann From james.slagle at gmail.com Thu Nov 11 18:26:41 2021 From: james.slagle at gmail.com (James Slagle) Date: Thu, 11 Nov 2021 13:26:41 -0500 Subject: [TripleO] directord to opendev Message-ID: Hello TripleO'ers, We are proposing to move the directord repo[1] from GitHub to opendev as a top-level organization. Moving to the opendev[2] hosting will make it easier to integrate with the existing tripleo-core team, our existing CI, and developer processes. The benefits seem to outweigh leaving it on GitHub. At the same time, we want to encourage external contribution from outside TripleO, so we felt that making it its own organization under opendev would be better suited to that than putting it under the openstack organization. If there are concerns, please raise them. Thank you! [1] https://github.com/Directord/ [2] https://opendev.org/ -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.slagle at gmail.com Thu Nov 11 18:41:59 2021 From: james.slagle at gmail.com (James Slagle) Date: Thu, 11 Nov 2021 13:41:59 -0500 Subject: [TripleO] directord to opendev In-Reply-To: References: Message-ID: On Thu, Nov 11, 2021 at 1:26 PM James Slagle wrote: > Hello TripleO'ers, > > We are proposing to move the directord repo[1] from GitHub to opendev as a > top level organization. Moving to the opendev[2] hosting will make it > easier to integrate with the existing tripleo-core team, our existing CI, > and developer processes.
The benefits seem to outweigh leaving it on GitHub. > > At the same time, we want to encourage external contribution from outside > TripleO, so we felt that making it its own organization under opendev would > be better suited towards that than under the openstack organization. > > If there are concerns, please raise them. Thank you! > > [1] https://github.com/Directord/ > [2] https://opendev.org/ > Note that this would also include putting task-core under the directord org. -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Nov 11 19:00:31 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Nov 2021 20:00:31 +0100 Subject: [Openstack][cinder] scheduler filters Message-ID: Hello All, I read that the capacity filter is the default for cinder, so, if I understood well, a volume is placed on the backend where more space is available. Since my two backends are on storage with the same features, I wonder whether I must specify a default storage backend in cinder.conf or not. Must I create a cinder volume without a volume type and let the scheduler evaluate where there is more space available? Thanks Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Nov 11 19:23:03 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Nov 2021 20:23:03 +0100 Subject: [Openstack][cinder] scheduler filters In-Reply-To: References: Message-ID: Hello again, probably I must use the same backend name for both, with a cinder type associated to it, and the scheduler will use the backend with more space available? Ignazio On Thu, 11 Nov 2021 at 20:00, Ignazio Cassano wrote: > Hello All, > I read that the capacity filter is the default for cinder, so, if I > understood well, a volume is placed on the backend where more space is > available.
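[Ignazio's guess matches the usual cinder multi-backend setup: both backends share one volume_backend_name, one volume type points at it, and the capacity weigher then prefers the backend reporting the most free space. A rough sketch follows; the section names and the LVM driver are illustrative, not from this thread.]

```ini
[DEFAULT]
enabled_backends = backend1,backend2

[backend1]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_backend_name = shared_pool

[backend2]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_backend_name = shared_pool

# Then, once:
#   openstack volume type create shared
#   openstack volume type set --property volume_backend_name=shared_pool shared
# Volumes created with type "shared" can land on either backend; the
# capacity weigher picks the one with the most free space.
```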
> Since my two backends are on storage with same features, I wonder if I > must specify a default storage backend in cinder.conf or not. > Must I create a cinder volume without cinder type and scheduler evaluate > where there is more space available? > Thanks > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From DHilsbos at performair.com Thu Nov 11 22:59:18 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Thu, 11 Nov 2021 22:59:18 +0000 Subject: [ops][nova] Error trying to migrate Message-ID: <0670B960225633449A24709C291A525251D45985@COM03.performair.local> All; I'm running into a strange issue when I try to migrate a server from one host to another. The only messages I'm seeing are in nova-conductor.log: 2021-11-11 15:16:49.118 1887742 WARNING nova.scheduler.utils [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Failed to compute_task_migrate_server: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b 2021-11-11 15:16:49.122 1887742 WARNING nova.scheduler.utils [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Setting instance to STOPPED state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Exception during message handling: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b 2021-11-11 15:16:49.291 1887742 ERROR 
oslo_messaging.rpc.server Traceback (most recent call last): 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 433, in get 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._queues[msg_id].get(block=True, timeout=timeout) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 322, in get 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return waiter.wait() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 141, in wait 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return get_hub().switch() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self.greenlet.switch() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server queue.Empty 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred: 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server Traceback (most recent call last): 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return 
self._do_dispatch(endpoint, method, ctxt, args) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 241, in inner 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return func(*args, **kwargs) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 99, in wrapper 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return fn(self, context, *args, **kwargs) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/utils.py", line 1434, in decorated_function 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 303, in migrate_server 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server host_list) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 397, in _cold_migrate 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server else 'cold migrate', instance=instance) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.force_reraise() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2021-11-11 15:16:49.291 
1887742 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server raise value 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 354, in _cold_migrate 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server task.execute() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 26, in wrap 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.rollback(ex) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.force_reraise() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server raise value 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 23, in wrap 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return original(self) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 40, in execute 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._execute() 2021-11-11 15:16:49.291 1887742 ERROR 
oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/migrate.py", line 384, in _execute 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server selection = self._schedule() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/migrate.py", line 434, in _schedule 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return_objects=True, return_alternates=True) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/client/query.py", line 42, in select_destinations 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server instance_uuids, return_objects, return_alternates) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/rpcapi.py", line 160, in select_destinations 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return cctxt.call(ctxt, 'select_destinations', **msg_args) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 179, in call 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=self.transport_options) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 128, in _send 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=transport_options) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 682, in send 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=transport_options) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 670, in _send 2021-11-11 
15:16:49.291 1887742 ERROR oslo_messaging.rpc.server call_monitor_timeout) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in wait 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server message = self.waiters.get(msg_id, timeout=timeout) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 437, in get 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server 'to message ID %s' % msg_id) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b No messages are generated in nova-scheduler, or in any of the nova-compute logs on the hosts. Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From mnaser at vexxhost.com Fri Nov 12 03:13:28 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 11 Nov 2021 22:13:28 -0500 Subject: [ops][nova] Error trying to migrate In-Reply-To: <0670B960225633449A24709C291A525251D45985@COM03.performair.local> References: <0670B960225633449A24709C291A525251D45985@COM03.performair.local> Message-ID: It sounds like you've got an issue with your RabbitMQ infrastructure that's causing messages not to show up. I suggest focusing your troubleshooting there. I've seen these issues resolved by simply rebuilding the RabbitMQ cluster. On Thu, Nov 11, 2021 at 6:07 PM wrote: > > All; > > I'm running into a strange issue when I try to migrate a server from one host to another.
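[The RabbitMQ-side checks implied by Mohammed's suggestion can be sketched like this; the queue names and counts below are fabricated for illustration, and only the output-filtering step is actually demonstrated.]

```shell
# On the RabbitMQ node you would typically run:
#   rabbitmqctl cluster_status
#   rabbitmqctl list_queues name messages consumers
# A reply_* queue with messages piling up and zero consumers is a common
# sign of the reply-timeout situation described in this thread.
# Demonstrated on fabricated "name messages consumers" output:
sample="scheduler 0 2
reply_237323d0824a43588f6394e5a505248b 14 0
conductor 1 4"
printf '%s\n' "$sample" | awk '$2 > 0 && $3 == 0 {print $1}'
```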
> > The only messages I'm seeing are in nova-conductor.log: > 2021-11-11 15:16:49.118 1887742 WARNING nova.scheduler.utils [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Failed to compute_task_migrate_server: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b > 2021-11-11 15:16:49.122 1887742 WARNING nova.scheduler.utils [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Setting instance to STOPPED state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Exception during message handling: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server Traceback (most recent call last): > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 433, in get > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._queues[msg_id].get(block=True, timeout=timeout) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 322, in get > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return waiter.wait() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 141, in wait > 
2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return get_hub().switch() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self.greenlet.switch() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server queue.Empty > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred: > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server Traceback (most recent call last): > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 241, in inner > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return func(*args, **kwargs) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 99, in 
wrapper > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return fn(self, context, *args, **kwargs) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/utils.py", line 1434, in decorated_function > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 303, in migrate_server > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server host_list) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 397, in _cold_migrate > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server else 'cold migrate', instance=instance) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.force_reraise() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server raise value > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 354, in _cold_migrate > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server task.execute() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 
26, in wrap > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.rollback(ex) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.force_reraise() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server raise value > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 23, in wrap > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return original(self) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 40, in execute > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._execute() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/migrate.py", line 384, in _execute > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server selection = self._schedule() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/migrate.py", line 434, in _schedule > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return_objects=True, return_alternates=True) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/client/query.py", line 42, in 
select_destinations > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server instance_uuids, return_objects, return_alternates) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/rpcapi.py", line 160, in select_destinations > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return cctxt.call(ctxt, 'select_destinations', **msg_args) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 179, in call > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=self.transport_options) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 128, in _send > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=transport_options) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 682, in send > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=transport_options) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 670, in _send > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server call_monitor_timeout) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in wait > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server message = self.waiters.get(msg_id, timeout=timeout) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 437, in get > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server 'to message ID %s' % 
msg_id)
> 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b
>
> No messages are generated in nova-schedule, or any of the nova-compute logs on the hosts.
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Vice President - Information Technology
> Perform Air International Inc.
> DHilsbos at PerformAir.com
> www.PerformAir.com

--
Mohammed Naser
VEXXHOST, Inc.

From arnaud.morin at gmail.com Fri Nov 12 08:51:46 2021
From: arnaud.morin at gmail.com (Arnaud Morin)
Date: Fri, 12 Nov 2021 08:51:46 +0000
Subject: [neutron][large-scale][ops] openflow rules tools
In-Reply-To: <6212551.lOV4Wx5bFT@p1>
References: <6212551.lOV4Wx5bFT@p1>
Message-ID: 

Hey all,

We have been working on this subject recently and we pushed this:
https://review.opendev.org/c/openstack/osops/+/817715

Feel free to comment and tag [ops][large-scale].

On 11.10.21 - 22:40, Slawek Kaplonski wrote:
> Hi,
>
> For OVN we have a small tool, ml2ovn-trace, in the neutron repo
> (https://docs.openstack.org/neutron/latest/ovn/ml2ovn_trace.html),
> but that will not be helpful for ML2/OVS at all.
>
> On poniedziałek, 11 października 2021 20:05:40 CEST Arnaud wrote:
> > That would be awesome!
> >
> > We also built a tool which looks for openflow rules related to a tap
> > interface, but since we upgraded and enabled security rules in ovs, the tool
> > isn't working anymore.
>
> Yes, for ML2/OVS with the ovs firewall driver it is really painful to debug all
> those OF rules.
>
> > So before rewriting everything from scratch, I was wondering if the community
> > was also dealing with the same issue.
>
> If You have anything like that, please share it with the community :)
>
> > So I am glad to hear from you!
> > Let me know :)
> >
> > Cheers
> >
> > Le 11 octobre 2021 17:52:52 GMT+02:00, Laurent Dumont a écrit :
> >Also interested in this. Reading rules in dump-flows is an absolute pain.
> >In an ideal world, I would never have to.
> >
> >We have some stuff on our side that I'll see if I can share.
> >
> >On Mon, Oct 11, 2021 at 9:41 AM Arnaud Morin wrote:
> >> Hello,
> >>
> >> When using native ovs in neutron, we end up with a lot of openflow rules
> >> on the ovs side.
> >>
> >> Debugging it with regular ovs-ofctl --color dump-flows is kind of
> >> painful.
> >>
> >> Is there any tool that the community is using to manage that?
> >>
> >> Thanks in advance!
> >>
> >> Arnaud.

--
Slawek Kaplonski
Principal Software Engineer
Red Hat

From zigo at debian.org Fri Nov 12 11:42:06 2021
From: zigo at debian.org (Thomas Goirand)
Date: Fri, 12 Nov 2021 12:42:06 +0100
Subject: [ops][keystone][cinder][swift][neutron][nova] Rename Availability Zone
In-Reply-To: <0670B960225633449A24709C291A525251D4538F@COM03.performair.local>
References: <0670B960225633449A24709C291A525251D4538F@COM03.performair.local>
Message-ID: <55338d20-aa80-0f74-e663-b6e859d58fc0@debian.org>

On 11/11/21 5:25 PM, DHilsbos at performair.com wrote:
> All;
>
> We recently decided to change our naming convention. As such I'd like to rename our current availability zone. The configuration files are the obvious places to do so. Is there anything I need to change in the database? Properties of images, volumes, servers, etc.?
>
> Should I just give this up as a bad deal? Maybe rebuild the cluster from scratch?
>
> Thank you,

Hi,

You don't need to rebuild your cluster from scratch. Just be aware that instances cannot change availability zone unless you really go deep into the nova database (and change the nova spec there). I'm not sure about existing volumes. Unless I'm mistaken, there's no notion of AZ for images.
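For readers following the rename question: below is a minimal, hedged sketch of where an availability-zone name typically lives in the configuration files this thread refers to. The option names are taken from the nova and cinder configuration references; the value `my-new-az` is a placeholder, so verify the exact options against your own release before changing anything.

```ini
# nova.conf (controller/API nodes) -- the default AZ name shown for
# instances not pinned to an aggregate-defined zone (placeholder value):
[DEFAULT]
default_availability_zone = my-new-az
default_schedule_zone = my-new-az

# cinder.conf (each volume node) -- the AZ this backend reports:
[DEFAULT]
storage_availability_zone = my-new-az
default_availability_zone = my-new-az
```

Note that compute hosts get their actual zone from the `availability_zone` metadata on the host aggregate they belong to, so on the nova side a rename is usually done through the aggregate API rather than in nova.conf.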
FYI, the nova availability zones are controlled through the compute aggregate API. The cinder one comes from cinder.conf on each volume node.

Cheers,

Thomas Goirand (zigo)

From elod.illes at est.tech Fri Nov 12 14:46:19 2021
From: elod.illes at est.tech (Előd Illés)
Date: Fri, 12 Nov 2021 15:46:19 +0100
Subject: [release] Release countdown for week R-19, Nov 15-19
Message-ID: <2b213596-6530-e3d8-52b7-19038a553a52@est.tech>

Development Focus
-----------------

The Yoga-1 milestone is next week, on November 18, 2021! Project team plans for the Yoga cycle should now be solidified.

General Information
-------------------

Libraries need to be released at least once per milestone period. Next week, the release team will propose releases for any library which had changes but has not been otherwise released since the Xena release. PTLs or release liaisons, please watch for these and give a +1 to acknowledge them. If there is some reason to hold off on a release, let us know that as well, by posting a -1. If we do not hear anything at all by the end of the week, we will assume things are OK to proceed.

NB: If one of your libraries is still releasing 0.x versions, start thinking about when it will be appropriate to do a 1.0 version. The version number does signal the state, real or perceived, of the library, so we strongly encourage going to a full major version once things are in a good and usable state.

Upcoming Deadlines & Dates
--------------------------

Yoga-1 milestone: November 18, 2021
Yoga final release: March 30, 2022

Előd Illés
irc: elodilles

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From elod.illes at est.tech Fri Nov 12 15:26:25 2021
From: elod.illes at est.tech (Előd Illés)
Date: Fri, 12 Nov 2021 16:26:25 +0100
Subject: [openstack-ansible] release job failure for ansible-collections-openstack
Message-ID: 

Hi Openstack-Ansible team!
This mail is just to inform you that there was a release job failure [1] yesterday, and the job could not be re-run because part of the job finished successfully in the 1st run (so the 2nd attempt failed [2]).

Could you please review whether everything is OK with the release?

Thanks,

Előd (elodilles @ #openstack-release)

[1] http://lists.openstack.org/pipermail/release-job-failures/2021-November/001576.html
[2] http://lists.openstack.org/pipermail/release-job-failures/2021-November/001577.html

From dpawlik at redhat.com Fri Nov 12 16:14:08 2021
From: dpawlik at redhat.com (Daniel Pawlik)
Date: Fri, 12 Nov 2021 17:14:08 +0100
Subject: [all] Openstack CI Log Processing project
Message-ID: 

Hello Everyone,

With the move of the Opendev Elasticsearch to the Openstack Elasticsearch service (in the future), a new repository was created in the Openstack project: ci-log-processing. The new repository will be used to store configuration related to the Opensearch service and all tools required to process logs from the Zuul CI system to Opensearch.

With the move to the new Elasticsearch system, we would like to take this opportunity to use a new service to replace the legacy submit-logstash-jobs system [1][2].

I would like to ask for volunteers to review changes in the Openstack ci-log-processing repository [3]. Please reply to this ML or ping me on the #openstack-infra IRC channel so that we can plan to expand the core member list of this repo.

Dan

[1] https://opendev.org/opendev/base-jobs/src/branch/master/roles/submit-logstash-jobs
[2] https://docs.opendev.org/opendev/system-config/latest/logstash.html
[3] https://review.opendev.org/q/project:openstack%252Fci-log-processing+status:open

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From skaplons at redhat.com Fri Nov 12 12:01:32 2021
From: skaplons at redhat.com (Slawek Kaplonski)
Date: Fri, 12 Nov 2021 13:01:32 +0100
Subject: [neutron] Average number of rechecks in Neutron compared to other projects
Message-ID: <3257902.aeNJFYEL58@p1>

Hi neutrinos,

As we discussed during the last PTG, I spent some time today getting the average number of rechecks needed on the last patch set of a change before it is merged. In theory that number should be close to 0, as a patch should merge on its first CI run once it is approved by reviewers :)

All data are only from the master branch. I didn't check the same for stable branches. A file with a graph and the raw data in csv format are in the attachments.

Basically my conclusion is that Neutron's CI is really bad at this. We have to recheck many, many times before patches are merged. We really need to think about how to improve that in Neutron.

--
Slawek Kaplonski
Principal Software Engineer
Red Hat

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Average_number_of_rechecks_before_patch_was_merged_in_various_projects-2021.csv
Type: text/csv
Size: 1352 bytes
Desc: not available
URL: 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Average_number_of_rechecks_in_various_projects_before_patch_was_merged.png
Type: image/png
Size: 45823 bytes
Desc: not available
URL: 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: This is a digitally signed message part.
URL: 

From gmann at ghanshyammann.com Fri Nov 12 17:22:11 2021
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Fri, 12 Nov 2021 11:22:11 -0600
Subject: [all][tc][policy] RBAC discussions: policy popup team meeting time/place change
Message-ID: <17d152bcd80.12992a51e572168.3448628568417682192@ghanshyammann.com>

Hello Everyone,

As you may be aware, we had a post-PTG discussion on the new secure RBAC[1] and figured out a lot of things, but a lot of things are still pending to be figured out :). We are at least in good shape on "what to target in the Yoga cycle" (this proposal - https://review.opendev.org/c/openstack/governance/+/815158).

We will use the policy popup team's biweekly meeting, with a video call on meetpad, to continue the discussion on the open questions.

Details of the meeting: https://wiki.openstack.org/wiki/Consistent_and_Secure_Default_Policies_Popup_Team#Meeting
Ical: https://meetings.opendev.org/#Secure_Default_Policies_Popup-Team_Meeting

The next meeting is on Thursday, 18th Nov at 18:00 UTC, and then biweekly.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025619.html

-gmann

From gmann at ghanshyammann.com Fri Nov 12 17:58:18 2021
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Fri, 12 Nov 2021 11:58:18 -0600
Subject: [all][tc] What's happening in Technical Committee: summary 12th Nov, 21: Reading: 10 min
Message-ID: <17d154cdc55.10db9d53f573599.1949859880627332898@ghanshyammann.com>

Hello Everyone,

Here is this week's summary of the Technical Committee activities.

1. TC Meetings:
============
* This week's TC IRC meeting was held on Thursday, Nov 11th.
* Most of the meeting discussions are summarized below (Completed or in-progress activities section). Meeting full logs are available @ https://meetings.opendev.org/meetings/tc/2021/tc.2021-11-11-15.00.log.html
* Next week's meeting is canceled due to the OpenInfra Keynotes, and we will continue the weekly meeting on 25th Nov, Thursday 15:00 UTC; feel free to add topics to the agenda[1] by Nov 24th.

2.
What we completed this week:
=========================
* Changes in stable core team process[2]
* Decouple the community-wide goals from cycle release[3] (I will write up a summary on this once we have the RBAC goal merged as an example)
* Merged 'Technical Writing' SIG into TC[4]

3. Activities In progress:
==================

TC Tracker for Yoga cycle
------------------------------
* This etherpad includes the Yoga cycle targets/working items for TC[5].

Open Reviews
-----------------
* 12 open reviews for ongoing activities[6].

Review volunteer required for Openstack CI Log Processing project
--------------------------------------------------------------------------------
dpawlik sent an email on openstack-discuss[7] seeking help to review the new Openstack CI Log Processing project. Please respond there if you can help to some extent, even if not full time.

Remove office hours in favor of weekly meetings
-----------------------------------------------------------
We decided to remove the inactive TC office hours[8] in favor of the weekly meetings, which are serving the purpose of office hours.

RBAC discussion: continuing from PTG
----------------------------------------------
We had a discussion on Wed 10th too and agreed on what we can target for the Yoga cycle. Complete notes are in this etherpad[9]. Please review the goal rework[10]. Discussion on open items will be continued in the policy popup team meeting[11].

Community-wide goal updates
------------------------------------
* Continuing the discussion on RBAC, we are reworking the RBAC goal; please wait until we finalize the implementation[8].
* There is one more goal proposal, for 'FIPS compatibility and compliance'[12].

Adjutant needs maintainers and PTLs
-------------------------------------------
No volunteer to lead/maintain the Adjutant project; I have sent another reminder to the email[13].
New project 'Skyline' proposal
------------------------------------
* We discussed it in the TC PTG, and there are a few open points about Python packaging, repos, and the plugins plan, which we are discussing on the ML.
* Still waiting for the Skyline team to work on the above points[14].

Updating the Yoga testing runtime
----------------------------------------
* As centos stream 9 is released, I have updated the Yoga testing runtime[15] with:
1. Add Debian 11 as a tested distro
2. Change centos stream 8 -> centos stream 9
3. Bump the lowest python version to test to 3.8 and the highest to python 3.9

TC tags analysis
-------------------
* TC agreed to remove the framework, and it is communicated on the ML[16].

Project updates
-------------------
* Rename 'Extended Maintenance' SIG to the 'Stable Maintenance'[17]
* Retire training-labs repo[18]
* Retire puppet-senlin[19]
* Add ProxySQL repository for OpenStack-Ansible[20]
* Retire js-openstack-lib [21]

4. How to contact the TC:
====================
If you would like to discuss or give feedback to TC, you can reach out to us in multiple ways:
1. Email: you can send the email with tag [tc] on openstack-discuss ML[22].
2. Weekly meeting: The Technical Committee conducts a weekly meeting every Thursday 15 UTC [23]
3. Ping us using the 'tc-members' nickname on the #openstack-tc IRC channel.
[1] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting
[2] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025741.html
[3] https://review.opendev.org/c/openstack/governance/+/816387
[4] https://review.opendev.org/c/openstack/governance/+/815869
[5] https://etherpad.opendev.org/p/tc-yoga-tracker
[6] https://review.opendev.org/q/projects:openstack/governance+status:open
[7] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025758.html
[8] https://review.opendev.org/c/openstack/governance/+/817493
[9] https://etherpad.opendev.org/p/policy-popup-yoga-ptg
[10] https://review.opendev.org/c/openstack/governance/+/815158
[11] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025760.html
[12] https://review.opendev.org/c/openstack/governance/+/816587
[13] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html
[14] https://review.opendev.org/c/openstack/governance/+/814037
[15] https://review.opendev.org/c/openstack/governance/+/815851
[16] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025571.html
[17] https://review.opendev.org/c/openstack/governance-sigs/+/817499
[18] https://review.opendev.org/c/openstack/governance/+/817511
[19] https://review.opendev.org/c/openstack/governance/+/817329
[20] https://review.opendev.org/c/openstack/governance/+/817245
[21] https://review.opendev.org/c/openstack/governance/+/807163
[22] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss
[23] http://eavesdrop.openstack.org/#Technical_Committee_Meeting

-gmann

From radoslaw.piliszek at gmail.com Fri Nov 12 18:32:17 2021
From: radoslaw.piliszek at gmail.com (Radosław Piliszek)
Date: Fri, 12 Nov 2021 19:32:17 +0100
Subject: [openstack-ansible] release job failure for ansible-collections-openstack
In-Reply-To: 
References: 
Message-ID: 

Please note ansible-collections-openstack is part of Ansible SIG, not OpenStack Ansible.
[1]

[1] https://opendev.org/openstack/governance/src/commit/bf1b5848934ab209dea2255c22ca0f177719db3b/reference/sigs-repos.yaml

-yoctozepto

On Fri, 12 Nov 2021 at 16:27, Előd Illés wrote:
>
> Hi Openstack-Ansible team!
>
> This mail is just to inform you that there was a release job failure [1]
> yesterday and the job could not be re-run as part of the job was
> finished successfully in the 1st run (so the 2nd attempt failed [2]).
>
> Could you please review if everything is OK with the release?
>
> Thanks,
>
> Előd (elodilles @ #openstack-release)
>
> [1] http://lists.openstack.org/pipermail/release-job-failures/2021-November/001576.html
> [2] http://lists.openstack.org/pipermail/release-job-failures/2021-November/001577.html

From fungi at yuggoth.org Fri Nov 12 18:36:37 2021
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Fri, 12 Nov 2021 18:36:37 +0000
Subject: [openstack-ansible] release job failure for ansible-collections-openstack
In-Reply-To: 
References: 
Message-ID: <20211112183636.dx5rgj3467royvgc@yuggoth.org>

On 2021-11-12 19:32:17 +0100 (+0100), Radosław Piliszek wrote:
> Please note ansible-collections-openstack is part of Ansible SIG,
> not OpenStack Ansible.
[...]

Interesting, I thought the Release Management team explicitly avoided handling releases for SIG repos, focusing solely on project team deliverables.
--
Jeremy Stanley

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 963 bytes
Desc: not available
URL: 

From james.slagle at gmail.com Fri Nov 12 20:56:17 2021
From: james.slagle at gmail.com (James Slagle)
Date: Fri, 12 Nov 2021 15:56:17 -0500
Subject: [TripleO] Core team cleanup
In-Reply-To: 
References: 
Message-ID: 

Thanks for the replies everyone. I have made these changes in gerrit.
On Sat, Nov 6, 2021 at 11:28 AM Dmitry Tantsur wrote:
>
> On Wed, Nov 3, 2021 at 5:57 PM James Slagle wrote:
>
>> Hello, I took a look at our core team, "tripleo-core" in gerrit. We have
>> a few individuals who I feel have moved on from TripleO in their focus. I
>> looked at the reviews from stackalytics.io for the last 180 days[1].
>>
>> These individuals have less than 6 reviews, which is about 1 review a
>> month:
>> Bob Fournier
>> Dan Sneddon
>> Dmitry Tantsur
>
> +1. yeah, sorry for that. I have been trying to keep an eye on TripleO
> things, but with my new OpenShift responsibilities it's pretty much
> impossible. I guess it's the same for Bob.
>
> I'm still available for questions and reviews if someone needs me.
>
> Dmitry
>
>> Jiří Stránský
>> Juan Antonio Osorio Robles
>> Marius Cornea
>>
>> These individuals have publicly expressed that they are moving on from
>> TripleO:
>> Michele Baldessari
>> wes hayutin
>>
>> I'd like to propose we remove these folks from our core team, while
>> thanking them for their contributions. I'll also note that I'd still value
>> +1/-1 from these folks with a lot of significance, and encourage them to
>> review their areas of expertise!
>>
>> If anyone on the list plans to start reviewing in TripleO again, then I
>> also think we can postpone the removal for the time being and re-evaluate
>> later. Please let me know if that's the case.
>>
>> Please reply and let me know any agreements or concerns with this change.
>>
>> Thank you!
>>
>> [1] https://www.stackalytics.io/report/contribution?module=tripleo-group&project_type=openstack&days=180
>>
>> --
>> -- James Slagle
>> --
>
> --
> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
> Commercial register: Amtsgericht Muenchen, HRB 153243,
> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill

--
-- James Slagle
--

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From noonedeadpunk at ya.ru Sat Nov 13 06:59:15 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Sat, 13 Nov 2021 08:59:15 +0200 Subject: [openstack-ansible] release job failure for ansible-collections-openstack In-Reply-To: References: Message-ID: <272321636786145@mail.yandex.ru> An HTML attachment was scrubbed... URL: From rlandy at redhat.com Sun Nov 14 23:53:23 2021 From: rlandy at redhat.com (Ronelle Landy) Date: Sun, 14 Nov 2021 18:53:23 -0500 Subject: [all] Openstack CI Log Processing project In-Reply-To: References: Message-ID: On Fri, Nov 12, 2021 at 11:19 AM Daniel Pawlik wrote: > Hello Everyone, > > By moving the Opendev Elasticsearch to Openstack Elasticsearch service (in > the future), > a new repository was created in the Openstack project: ci-log-processing. > The new repository will be used to store configuration related to the > Opensearch > service and all tools required to process logs from Zuul CI system to > Opensearch. > > By moving to the new Elasticsearch system, we would like to take this > opportunity to use a new service to replace the legacy > submit-logstash-jobs system [1][2]. > > > I would like to ask for volunteers to review changes in the Openstack > ci-log-processing repository? [3] > Please reply to this ML or ping me on #openstack-infra IRC channel so that > we can plan to > expand the core member list of this repo. > Any of the TripleO Ci cores could help out here. Thanks. > > Dan > > [1] > https://opendev.org/opendev/base-jobs/src/branch/master/roles/submit-logstash-jobs > [2] https://docs.opendev.org/opendev/system-config/latest/logstash.html > [3] > https://review.opendev.org/q/project:openstack%252Fci-log-processing+status:open > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jordanansell at catalystcloud.nz Mon Nov 15 00:53:05 2021 From: jordanansell at catalystcloud.nz (Jordan Ansell) Date: Mon, 15 Nov 2021 13:53:05 +1300 Subject: [barbican][sdk] Barbican's quota API is missing CLI support? Message-ID: <291f1b4e-8765-bc42-7cd3-92ceb8c963c8@catalystcloud.nz> Hello, I was wondering why the quota API for Barbican doesn't have a CLI command in python-barbicanclient? Thanks, Jordan Ansell From skaplons at redhat.com Mon Nov 15 09:28:47 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 15 Nov 2021 10:28:47 +0100 Subject: [neutron] CI meeting time slot In-Reply-To: <13828406.O9o76ZdvQC@p1> References: <13828406.O9o76ZdvQC@p1> Message-ID: <5771601.lOV4Wx5bFT@p1> Hi, On czwartek, 28 pa?dziernika 2021 11:10:18 CET Slawek Kaplonski wrote: > Hi, > > As per PTG discussion, I prepared doodle to check what would be the best time > slot for most of the people. > Doodle is at [1]. Please fill it in if You are interested attending the weekly > Neutron CI meeting. Meeting is on the #openstack-neutron irc channel, but we > are also planning to do it on video from time to time. > > The timeslots in doodle have dates for next week, but please ignore them. It's > just to pick the best time slot for the meeting to use it weekly. Next week > meeting will be for sure still in the current time slot, which is Tuesday > 1500 UTC. > > [1] https://doodle.com/poll/3n2im4ebyxhs45ne?utm_source=poll&utm_medium=link Thx for all who participated in the Doodle. After all it seems that the best slot for all who were interested in that meeting and attend it usually is the existing one. So nothing will change there and we will still have Neutron CI meeting on Tuesday at 1500 UTC -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. 
URL: From srelf at ukcloud.com Mon Nov 15 10:21:13 2021 From: srelf at ukcloud.com (Steven Relf) Date: Mon, 15 Nov 2021 10:21:13 +0000 Subject: [Nova] - OS_TYPE - Setting after the fact. Message-ID: Hello List, I have the following situation and would be thankful for any help or suggestions. We provide images to our customers, upon which we set the os_type metadata. We use this to help schedule instances to host aggregates, so we can control where certain OS's land. This works great for new instances that are created from our images. We have the following problems though. 1. For instances created prior to the introduction of the os_type metadata on the image, this metadata is not flowed down, and as such these instances have NULL. 2. Instances created from a snapshot do not seem to get the os_type flowed down. 3. Instances imported using our migration tool do not end up with the os_type being set either. Currently the only way I can see to set os_type on an instance is to manually (yuk) update the database for each and every instance that is missing it. Does anyone have any ideas how this can be updated without manually modifying the database? My second thought was to maybe make use of the instance metadata that you can set via the CLI or API, but then I run into the issue that there is not a nova filter that is able to use instance metadata, and in my brief play with the code, it looks like (user) metadata is not passed in as part of the instance spec dict that is used to schedule instances. I was thinking of writing a custom filter, but I'm not sure how compliant it would be to have a function making a call to try and collect the instance metadata. In summary, I need to solve two things. 1. How do I set os_type on instances in a way that doesn't involve editing the database. 2. Or, how can I use another piece of metadata that I can set via the API, which is also exposed to the filter scheduler. Rgds Steve. The future has already arrived. 
It's just not evenly distributed yet - William Gibson -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Nov 15 11:28:17 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 15 Nov 2021 12:28:17 +0100 Subject: [neutron] Bug deputy report - week of November 8th Message-ID: <11865446.O9o76ZdvQC@p1> Hi, I was bug deputy last week. Here is the list of new issues opened in Neutron: ## Critical * https://bugs.launchpad.net/neutron/+bug/1950275 - openstack-tox-py36-with-neutron-lib-master job is failing since 05.11.2021 - gate failure in periodic job, Lajos is working on that, * https://bugs.launchpad.net/neutron/+bug/1950346 - [stable/train] neutron-tempest-plugin-designate-scenario-train fails with AttributeError: 'Manager' object has no attribute 'zones_client' - gate failure, in progress, assigned to Lajos, * https://bugs.launchpad.net/neutron/+bug/1950795 - neutron-tempest-plugin-scenario jobs on stable/rocky and stable/queens are failing with POST_FAILURE every time - assigned to slaweq ## High * https://bugs.launchpad.net/neutron/+bug/1950273 - Error 500 during log update - unassigned, happened in the CI, ovn related * https://bugs.launchpad.net/neutron/+bug/1950679 - [ovn] neutron_ovn_db_sync_util hangs on sync_routers_and_rports - Assigned to Daniel Speichert, fix in progress https://review.opendev.org/c/openstack/neutron/+/817637 ## Medium * https://bugs.launchpad.net/neutron/+bug/1950686 - [OVN] dns-nameserver=0.0.0.0 for a subnet isn't treated properly - unassigned, ovn related * https://bugs.launchpad.net/neutron/+bug/1899207 - [OVN][Docs] admin/config-dns-res.html should be updated for OVN - unassigned, docs bug, ovn related ## Low * https://bugs.launchpad.net/neutron/+bug/1950662 - [DHCP] Improve RPC server methods - assigned to ralonsoh, fix proposed https://review.opendev.org/c/openstack/neutron/+/816850 ## Wishlist (RFEs) * https://bugs.launchpad.net/neutron/+bug/1950454 - 
[RFE] GW IP and FIP QoS to inherit from network - assigned to ralonsoh -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From rlandy at redhat.com Mon Nov 15 11:28:11 2021 From: rlandy at redhat.com (Ronelle Landy) Date: Mon, 15 Nov 2021 06:28:11 -0500 Subject: [Triple0] Gate blocker - standalone failures on master and wallaby Message-ID: Hello All, We have a check/gate blocker on master and wallaby that started on Saturday. Standalone jobs are failing tempest tests. The related bug is linked below: https://bugs.launchpad.net/tripleo/+bug/1950916 The networking team is helping debug this. Please don't recheck for now. We will update this list when we have more info/a fix. Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlandy at redhat.com Mon Nov 15 11:52:15 2021 From: rlandy at redhat.com (Ronelle Landy) Date: Mon, 15 Nov 2021 06:52:15 -0500 Subject: [Triple0] Gate blocker - standalone failures on master and wallaby In-Reply-To: References: Message-ID: tripleo-ci-centos-8-containers-multinode, and the wallaby job, are also impacted. On Mon, Nov 15, 2021 at 6:28 AM Ronelle Landy wrote: > Hello All, > > We have a check/gate blocker on master and wallaby that started on > Saturday. > Standalone jobs are failing tempest tests. The related bug is linked below: > > https://bugs.launchpad.net/tripleo/+bug/1950916 > > The networking team is helping debug this. Please don't recheck for now. > We will update this list when we have more info/a fix. > > Thank you! > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arxcruz at redhat.com Mon Nov 15 13:33:10 2021 From: arxcruz at redhat.com (Arx Cruz) Date: Mon, 15 Nov 2021 14:33:10 +0100 Subject: [all] Openstack CI Log Processing project In-Reply-To: References: Message-ID: Hi Daniel, I can help you on this :) Kind regards, Arx Cruz On Mon, Nov 15, 2021 at 12:57 AM Ronelle Landy wrote: > > > On Fri, Nov 12, 2021 at 11:19 AM Daniel Pawlik wrote: > >> Hello Everyone, >> >> By moving the Opendev Elasticsearch to Openstack Elasticsearch service >> (in the future), >> a new repository was created in the Openstack project: ci-log-processing. >> The new repository will be used to store configuration related to the >> Opensearch >> service and all tools required to process logs from Zuul CI system to >> Opensearch. >> > >> By moving to the new Elasticsearch system, we would like to take this >> opportunity to use a new service to replace the legacy >> submit-logstash-jobs system [1][2]. >> >> >> I would like to ask for volunteers to review changes in the Openstack >> ci-log-processing repository? [3] >> Please reply to this ML or ping me on #openstack-infra IRC channel so >> that we can plan to >> expand the core member list of this repo. >> > > Any of the TripleO Ci cores could help out here. > > Thanks. > >> >> Dan >> >> [1] >> https://opendev.org/opendev/base-jobs/src/branch/master/roles/submit-logstash-jobs >> [2] https://docs.opendev.org/opendev/system-config/latest/logstash.html >> [3] >> https://review.opendev.org/q/project:openstack%252Fci-log-processing+status:open >> > -- Arx Cruz Software Engineer Red Hat EMEA arxcruz at redhat.com @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marios at redhat.com Mon Nov 15 14:16:07 2021 From: marios at redhat.com (Marios Andreou) Date: Mon, 15 Nov 2021 16:16:07 +0200 Subject: [all] Openstack CI Log Processing project In-Reply-To: References: Message-ID: On Fri, Nov 12, 2021 at 6:19 PM Daniel Pawlik wrote: > > Hello Everyone, > > By moving the Opendev Elasticsearch to Openstack Elasticsearch service (in the future), > a new repository was created in the Openstack project: ci-log-processing. > The new repository will be used to store configuration related to the Opensearch > service and all tools required to process logs from Zuul CI system to > Opensearch. > > By moving to the new Elasticsearch system, we would like to take this > opportunity to use a new service to replace the legacy submit-logstash-jobs system [1][2]. > > > I would like to ask for volunteers to review changes in the Openstack ci-log-processing repository? [3] > Please reply to this ML or ping me on #openstack-infra IRC channel so that we can plan to > expand the core member list of this repo. > Hi Daniel - count me in if you still need more eyes - adding the repo to my review list regards, marios > Dan > > [1] https://opendev.org/opendev/base-jobs/src/branch/master/roles/submit-logstash-jobs > [2] https://docs.opendev.org/opendev/system-config/latest/logstash.html > [3] https://review.opendev.org/q/project:openstack%252Fci-log-processing+status:open From smooney at redhat.com Mon Nov 15 15:17:37 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 Nov 2021 15:17:37 +0000 Subject: [Nova] - OS_TYPE - Setting after the fact. In-Reply-To: References: Message-ID: <9eefc93efd057a857802fd8dda7898245399b29e.camel@redhat.com> On Mon, 2021-11-15 at 10:21 +0000, Steven Relf wrote: > Hello List, > > I have the following situation and would be thankful for any help or suggestions. > > We provide images to our customers, upon which we set the os_type metadata. 
We use this to help schedule instances to host aggregates, so we can control where certain OS's land. This works great for new instances that are created from our images. > > We have the following problems though. > > > 1. For instances created prior to the introduction of the os_type metadata on the image, this metadata is not flowed down, and as such these instances have NULL This is the expected behavior. we snapshot/copy the image metadata at the time the instance was created into the instance_system_metadata table to ensure that changes to the image after the fact do not affect existing vms. > 2. Instances created from a snapshot do not seem to get the os_type flowed down > 3. Instances imported using our migration tool do not end up with the os_type being set either. likely because the metadata was not set on the image or volume before the vm was created. > > Currently the only way I can see to set os_type on an instance is to manually (yuk) update the database for each and every instance that is missing it. os_type on the instance is not used anymore and should not be set at all https://github.com/openstack/nova/blob/master/nova/objects/instance.py#L170 the os_type field in the instance table was replaced with the os_type field in the image metadata https://github.com/openstack/nova/blob/master/nova/objects/image_meta.py#L552-L555 which is stored in the instance_system_metadata table in the db. > > Does anyone have any ideas how this can be updated without manually modifying the database? > > My second thought was to maybe make use of the instance metadata that you can set via the CLI or API, but then I run into the issue that there is not a nova filter that is able to use instance metadata, and in my brief play with the code, it looks like (user) metadata is not passed in as part of the instance spec dict that is used to schedule instances. 
we do have a nova-manage command that allows operators to set some of the image metadata; that could be extended to allow any image metadata to be set, however the operator would then be responsible for ensuring that the change to the metadata does not invalidate the placement of the current vm, or for rectifying it if it would. The only way i see to do this via the api would be a resize to the same flavor, which would update the flavor and image metadata and find a new host to move the vm to that is valid for the new requirements. we had discussed adding a new recreate api for this use case previously but the feedback was to just extend resize. > > I was thinking of writing a custom filter, but I'm not sure how compliant it would be to have a function making a call to try and collect the instance metadata. > > In summary, I need to solve two things. > > > 1. How do I set os_type on instances in a way that doesn't involve editing the database. the only way to do this today would be a rebuild; a nova-manage command would be in line with what we have done for machine_type. allowing the image metadata to be changed via an api is likely not appropriate unless it's a new instance action like recreate, which would use the updated image metadata and move the instance (optionally to the same host). https://github.com/openstack/nova/commit/c70cde057d20bb2c05a133e52b9ec560bd792698 for now i think that is the best approach to take: ensure that all images have the correct os_type set, perhaps using a glance import plugin to also update user-submitted images, then via an sql script or a new nova-manage command update all existing images. > 2. Or, how can I use another piece of metadata that I can set via the api, which is also exposed to the filter scheduler. you won't be able to use any metadata in the flavor or image since both are cached at instance create. you might be able to use server metadata or a server tag. there is no in-tree filter however that would work. 
in-tree filters are not allowed to make api calls or RPC calls and should avoid making db queries in the filter. the request spec does not currently contain any instance tags or server metadata https://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L60-L116 as such there is no efficient way to implement this as a custom filter. > > Rgds > Steve. > > The future has already arrived. It's just not evenly distributed yet - William Gibson > From srelf at ukcloud.com Mon Nov 15 15:35:19 2021 From: srelf at ukcloud.com (Steven Relf) Date: Mon, 15 Nov 2021 15:35:19 +0000 Subject: [Nova] - OS_TYPE - Setting after the fact. In-Reply-To: <9eefc93efd057a857802fd8dda7898245399b29e.camel@redhat.com> References: <9eefc93efd057a857802fd8dda7898245399b29e.camel@redhat.com> Message-ID: Hey Sean, Thanks for responding. It sounds like a nova-manage command is the way forward to avoid editing the DB directly; the operator would then need to ensure no breaches of aggregate requirements. I'll have a look at the nova-manage command you have referenced. We do now have the os_type set on all our provided images, but this doesn't stop a customer uploading an image without it set. I guess that's where a glance image plugin would come in; can you point me at some documentation for me to have a read around? I don't like the idea of having to resize, as these aren't our instances, as we are operating a public cloud. On an aside, out of curiosity, why was the decision made to not cascade these types of changes, is it simply to provide immutability to instances? Rgds Steve. 
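The aggregate-based scheduling discussed in this thread can be pictured with a small standalone sketch. This is not nova's actual filter code — the function, host names, and data shapes are made up for illustration — but it follows the idea of nova's AggregateImagePropertiesIsolation-style matching of image properties against host aggregate metadata, and it also shows why instances whose os_type is NULL (the original problem) pass every host:

```python
# Simplified, illustrative sketch of aggregate/image-property isolation.
# NOT nova's implementation; names and data shapes are invented for this example.

def host_passes(aggregate_metadata, image_properties):
    """A host passes if every isolation key on its aggregates matches the
    corresponding image property. A key the image does not define passes,
    which is exactly why instances with a NULL os_type land anywhere."""
    for key, allowed_values in aggregate_metadata.items():
        if key in image_properties and image_properties[key] not in allowed_values:
            return False
    return True

# Hypothetical hosts with their (merged) aggregate metadata.
hosts = {
    "linux-host": {"os_type": {"linux"}},
    "windows-host": {"os_type": {"windows"}},
    "generic-host": {},  # no isolation metadata: accepts any instance
}

# Image metadata cached on the instance at create time.
image = {"os_type": "linux"}

eligible = [name for name, meta in hosts.items() if host_passes(meta, image)]
print(eligible)  # ['linux-host', 'generic-host']
```

Note that with an empty `image` dict (the NULL os_type case from the thread) every host is eligible, which matches the behaviour being reported; nova's real in-tree filter implements a more complete version of this matching, so this sketch is only meant to make the mechanics concrete.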
From emilien at redhat.com Mon Nov 15 15:39:56 2021 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 15 Nov 2021 10:39:56 -0500 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: On Wed, Nov 10, 2021 at 2:06 PM Ghanshyam Mann wrote: > ---- On Tue, 09 Nov 2021 19:36:00 -0600 Emilien Macchi < > emilien at redhat.com> wrote ---- > > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley > wrote: > > [...] > > This was based on experiences trying to work with the Kata > > community, and the "experiment" referenced in that mailing list post > > eventually concluded with the removal of remaining Kata project > > configuration when https://review.opendev.org/744687 merged > > approximately 15 months ago. > > > > ack > > I haven't seen much pushback from moving to Gerrit, but pretty much all > feedback I got was from folks who worked (is working) on OpenStack, so a > bit biased in my opinion (myself included).Beside that, if we would move to > opendev, I want to see some incentives in our roadmap, not just "move our > project here because it's cool". > > Some ideas:* Consider it as a subproject from OpenStack SDK? Or part of > a SIG?* CI coverage for API regression testing (e.g. > gophercloud/acceptance/compute running in Nova CI)* Getting more exposure > of the project and potentially more contributors* Consolidate the best > practices in general, for contributions to the project, getting started, > dev environments, improving CI jobs (current jobs use OpenLab zuul, with a > fork of zuul jobs). > > Is there any concern would we have to discuss?-- > > > +1, Thanks Emilien for putting the roadmap which is more clear to > understand the logn term benefits. 
> > Looks good to me, especially CI part is cool to have from API testing > perspective and to know where > we break things (we run client jobs in many projects CI so it should not > be something special we need to do) > > >* Consider it as a subproject from OpenStack SDK? Or part of a SIG? > > Just to be more clear on this. Does this mean, once we setup the things in > opendev then we can migrate it under > openstack/ namespace under OpenStack SDK umbrella? or you mean keep it in > opendev with non-openstack > namespace but collaborative effort with SDK team. > I think we would move the project under opendev with a non openstack namespace, and of course collaborate with everyone. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Nov 15 16:28:00 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 15 Nov 2021 10:28:00 -0600 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: <17d246d4494.11dd8ebf8679316.6998852912939343738@ghanshyammann.com> ---- On Mon, 15 Nov 2021 09:39:56 -0600 Emilien Macchi wrote ---- > > > On Wed, Nov 10, 2021 at 2:06 PM Ghanshyam Mann wrote: > ---- On Tue, 09 Nov 2021 19:36:00 -0600 Emilien Macchi wrote ---- > > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley wrote: > > [...] > > This was based on experiences trying to work with the Kata > > community, and the "experiment" referenced in that mailing list post > > eventually concluded with the removal of remaining Kata project > > configuration when https://review.opendev.org/744687 merged > > approximately 15 months ago. 
> > > > ack > > I haven't seen much pushback from moving to Gerrit, but pretty much all feedback I got was from folks who worked (is working) on OpenStack, so a bit biased in my opinion (myself included).Beside that, if we would move to opendev, I want to see some incentives in our roadmap, not just "move our project here because it's cool". > > Some ideas:* Consider it as a subproject from OpenStack SDK? Or part of a SIG?* CI coverage for API regression testing (e.g. gophercloud/acceptance/compute running in Nova CI)* Getting more exposure of the project and potentially more contributors* Consolidate the best practices in general, for contributions to the project, getting started, dev environments, improving CI jobs (current jobs use OpenLab zuul, with a fork of zuul jobs). > > Is there any concern would we have to discuss?-- > > > +1, Thanks Emilien for putting the roadmap which is more clear to understand the logn term benefits. > > Looks good to me, especially CI part is cool to have from API testing perspective and to know where > we break things (we run client jobs in many projects CI so it should not be something special we need to do) > > >* Consider it as a subproject from OpenStack SDK? Or part of a SIG? > > Just to be more clear on this. Does this mean, once we setup the things in opendev then we can migrate it under > openstack/ namespace under OpenStack SDK umbrella? or you mean keep it in opendev with non-openstack > namespace but collaborative effort with SDK team. > > I think we would move the project under opendev with a non openstack namespace, and of course collaborate with everyone. Thanks for the clarification. +1, sounds like a good plan. 
-gmann > > -- > Emilien Macchi > From sbauza at redhat.com Mon Nov 15 16:29:41 2021 From: sbauza at redhat.com (Sylvain Bauza) Date: Mon, 15 Nov 2021 17:29:41 +0100 Subject: [nova][placement] Spec review day on Nov 16th In-Reply-To: References: Message-ID: On Fri, Nov 5, 2021 at 4:17 PM Sylvain Bauza wrote: > As agreed on our last Nova meeting [1], please sharpen your pen and > prepare your specs ahead of time as we'll have a spec review day on Nov 16th > Reminder: the idea of a spec review day is to ensure that contributors and > reviewers are available on the same day for prioritizing Gerrit comments > and IRC discussions about specs in order to facilitate and accelerate the > reviewing of open specs. > If you care about some fancy new feature, please make sure your spec is > ready for review on time and you are somehow joinable so reviewers can ping > you, or you are able to quickly reply on their comments and ideally propose > a new revision if needed. > Nova cores, I appreciate your dedication about specs on this particular > day. > > Kind reminder that tomorrow will be our Spec review day for the Yoga release. Please prepare your specs if you want us to review them, and we also appreciate any review people could make. The more contributors are reviewing our features, the better Nova release will be :-) -Sylvain -Sylvain > > [1] > https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-02-16.00.log.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Nov 15 16:53:47 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 Nov 2021 16:53:47 +0000 Subject: [Nova] - OS_TYPE - Setting after the fact. In-Reply-To: References: <9eefc93efd057a857802fd8dda7898245399b29e.camel@redhat.com> Message-ID: On Mon, 2021-11-15 at 15:35 +0000, Steven Relf wrote: > Hey Sean, > > Thanks for responding. 
> > It sounds like a nova-manage command is the way forward to avoid editing the DB directly; the operator would then need to ensure no breaches of aggregate requirements. I'll have a look at the nova-manage command you have referenced. > > We do now have the os_type set on all our provided images, but this doesn't stop a customer uploading an image without it set. I guess that's where a glance image plugin would come in; can you point me at some documentation for me to have a read around? > > I don't like the idea of having to resize, as these aren't our instances, as we are operating a public cloud. hi yes so there is an image property injection plugin but that is mainly to inject static metadata https://docs.openstack.org/glance/latest/admin/interoperable-image-import.html#the-image-property-injection-plugin you would probably need to modify that to use libguestfs to inspect the image and then add the os_type based on what it finds. > > On an aside, out of curiosity, why was the decision made to not cascade these types of changes, is it simply to provide immutability to instances? immutability is one reason, and maintaining the validity of its current placement is another. adding a cpu pinning extra spec for example would invalidate the placement of any vm that did not have it set also via the flavor. you could add traits requests as another example that would make some vms invalid for their current host. so in general, modifying the image extra specs on a running instance can invalidate the current placement of the vm due to changes in the behavior of filters, or alter the guest abi, which might break workloads. that is why we said this would have to be an operation that you opted into (propagation of updated extra specs) rather than an operation that happened by default. it becomes especially problematic for images with shared/public or community visibility, as a minor change to address a support request from one customer might impact other customers. 
so to be safe we made it immutable for the lifetime of the instance unless you change the image via a rebuild in the case of image properties, or resize in the case of flavor extra specs. > > Rgds > Steve. > From kennelson11 at gmail.com Mon Nov 15 16:58:34 2021 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 15 Nov 2021 08:58:34 -0800 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: Is there a reason why you don't want it to be under the openstack namespace? -Kendall On Mon, Nov 15, 2021 at 7:40 AM Emilien Macchi wrote: > > > On Wed, Nov 10, 2021 at 2:06 PM Ghanshyam Mann > wrote: > >> ---- On Tue, 09 Nov 2021 19:36:00 -0600 Emilien Macchi < >> emilien at redhat.com> wrote ---- >> > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley >> wrote: >> > [...] >> > This was based on experiences trying to work with the Kata >> > community, and the "experiment" referenced in that mailing list post >> > eventually concluded with the removal of remaining Kata project >> > configuration when https://review.opendev.org/744687 merged >> > approximately 15 months ago. >> > >> > ack >> > I haven't seen much pushback from moving to Gerrit, but pretty much >> all feedback I got was from folks who worked (is working) on OpenStack, so >> a bit biased in my opinion (myself included).Beside that, if we would move >> to opendev, I want to see some incentives in our roadmap, not just "move >> our project here because it's cool". >> > Some ideas:* Consider it as a subproject from OpenStack SDK? Or part >> of a SIG?* CI coverage for API regression testing (e.g. 
>> gophercloud/acceptance/compute running in Nova CI)* Getting more exposure >> of the project and potentially more contributors* Consolidate the best >> practices in general, for contributions to the project, getting started, >> dev environments, improving CI jobs (current jobs use OpenLab zuul, with a >> fork of zuul jobs). >> > Is there any concern would we have to discuss?-- >> >> >> +1, Thanks Emilien for putting the roadmap which is more clear to >> understand the logn term benefits. >> >> Looks good to me, especially CI part is cool to have from API testing >> perspective and to know where >> we break things (we run client jobs in many projects CI so it should not >> be something special we need to do) >> >> >* Consider it as a subproject from OpenStack SDK? Or part of a SIG? >> >> Just to be more clear on this. Does this mean, once we setup the things >> in opendev then we can migrate it under >> openstack/ namespace under OpenStack SDK umbrella? or you mean keep it in >> opendev with non-openstack >> namespace but collaborative effort with SDK team. >> > > I think we would move the project under opendev with a non openstack > namespace, and of course collaborate with everyone. > > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From senrique at redhat.com Mon Nov 15 17:34:06 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Mon, 15 Nov 2021 14:34:06 -0300 Subject: [Openstack][cinder] scheduler filters In-Reply-To: References: Message-ID: Greetings, > probably I must use the same backend name for both and a cinder type associated to it and the scheduler will use the backend with more space available ? I'm not familiar with your deployment but there's an example in the documentation that I think may help you: In a multiple-storage back-end configuration, each back end has a name (volume_backend_name). Several back ends can have the same name. 
In that case, the scheduler properly decides which back end the volume has to be created in. e.g. [1] In this configuration, lvmdriver-1 and lvmdriver-2 have the same volume_backend_name. If a volume creation requests the LVM back end name, the scheduler uses the capacity filter scheduler to choose the most suitable driver, which is either lvmdriver-1 or lvmdriver-2. The capacity filter scheduler is enabled by default. The next section provides more information. In addition, this example presents an lvmdriver-3 back end. Cheers, Sofia [1] https://docs.openstack.org/cinder/xena/admin/blockstorage-multi-backend.html On Thu, Nov 11, 2021 at 4:25 PM Ignazio Cassano wrote: > Hello again, probably I must use the same backend name for both and a > cinder type associated to it and the scheduler will use the backend with > more space available ? > Ignazio > > On Thu, 11 Nov 2021 at 20:00, Ignazio Cassano > wrote: > >> Hello All, >> I read that capacity filters for cinder is the default, so, if I >> understood well, a volume is placed on the backend where more space is >> available. >> Since my two backends are on storage with same features, I wonder if I >> must specify a default storage backend in cinder.conf or not. >> Must I create a cinder volume without cinder type and scheduler evaluate >> where there is more space available? 
Ignazio Il Lun 15 Nov 2021, 18:34 Sofia Enriquez ha scritto: > Greetings, > > probably I must use the same backend name for both and a cinder type > associated to it and the scheduler will use the backend with more space > available ? > I'm not familiar with your deployment but there's a example in the > documentation that I think It may help you: > > In a multiple-storage back-end configuration, each back end has a name ( > volume_backend_name). Several back ends can have the same name. In that > case, the scheduler properly decides which back end the volume has to be > created in. i.e [1] In this configuration, lvmdriver-1 and lvmdriver-2 > have the same volume_backend_name. If a volume creation requests the LVM > back end name, the scheduler uses the capacity filter scheduler to choose > the most suitable driver, which is either lvmdriver-1 or lvmdriver-2. The > capacity filter scheduler is enabled by default. The next section provides > more information. In addition, this example presents a lvmdriver-3 back > end. > > Cheers, > Sofia > [1] > https://docs.openstack.org/cinder/xena/admin/blockstorage-multi-backend.html > > On Thu, Nov 11, 2021 at 4:25 PM Ignazio Cassano > wrote: > >> Hello again, probably I must use the same backend name for both and a >> cinder type associated to it and the scheduler will use the backend with >> more space available ? >> Ignazio >> >> Il Gio 11 Nov 2021, 20:00 Ignazio Cassano ha >> scritto: >> >>> Hello All, >>> I read that capacity filters for cinder is the default, so, if I >>> understood well, a volume is placed on the backend where more space is >>> available. >>> Since my two backends are on storage with same features, I wonder if I >>> must specify a default storage backend in cinder.conf or not. >>> Must I create a cinder volume without cinder type and scheduler evaluate >>> where there is more space available? 
>>> Thanks >>> Ignazio >>> >>> >>> > > -- > > Sof?a Enriquez > > she/her > > Software Engineer > > Red Hat PnT > > IRC: @enriquetaso > @RedHat Red Hat > Red Hat > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faisal.sheikh at rapidcompute.com Mon Nov 15 16:11:39 2021 From: faisal.sheikh at rapidcompute.com (Muhammad Faisal) Date: Mon, 15 Nov 2021 21:11:39 +0500 Subject: Error while getting network agents Message-ID: <007a01d7da3b$78f1d800$6ad58800$@rapidcompute.com> Hi, While executing openstack network agent list we are getting below mentioned error. We have run "ovn-sbctl chassis-del 650be87c-b581-467a-b523-ce454e753780" command on controller node. OS: Ubuntu 20 Openstack version: Wallaby Number of controller/network node: 1 (172.16.30.46) Number of compute node: 2 (172.16.30.1, 172.16.30.3) OVN Version: 21.09.0 /var/log/ovn/ovn-northd.log: 2021-11-15T15:29:22.184Z|00053|northd|WARN|Dropped 7 log messages in last 127 seconds (most recently, 120 seconds ago) due to excessive rate 2021-11-15T15:29:22.184Z|00054|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 2021-11-15T15:31:11.516Z|00055|northd|WARN|Dropped 14 log messages in last 110 seconds (most recently, 52 seconds ago) due to excessive rate 2021-11-15T15:31:11.516Z|00056|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 2021-11-15T15:35:25.738Z|00057|northd|WARN|Dropped 6 log messages in last 254 seconds (most recently, 243 seconds ago) due to excessive rate 2021-11-15T15:35:25.738Z|00058|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 2021-11-15T15:44:34.663Z|00059|northd|WARN|Dropped 12 log messages in last 549 seconds (most recently, 512 seconds ago) due to excessive rate 2021-11-15T15:44:34.663Z|00060|northd|WARN|Chassis does not exist for Chassis_Private record, name: 
7d07ed32-996f-48da-9ac1-7f0455ea7ec7 2021-11-15T15:47:39.947Z|00061|northd|WARN|Dropped 2 log messages in last 185 seconds (most recently, 185 seconds ago) due to excessive rate 2021-11-15T15:47:39.948Z|00062|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 /var/log/neutron/neutron-server.log: 2021-11-15 20:55:24.025 678149 DEBUG futurist.periodics [req-df627767-441c-4b46-8487-91604cd3033a - - - - -] Submitting periodic callback 'neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.maintenance.HashRingHealt hCheckPeriodics.touch_hash_ring_nodes' _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:641 2021-11-15 20:55:30.531 678146 DEBUG neutron.wsgi [-] (678146) accepted ('172.16.30.46', 49782) server /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 2021-11-15 20:55:30.893 678146 DEBUG ovsdbapp.backend.ovs_idl.transaction [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Running txn n=1 command(idx=0): CheckLivenessCommand() do_commit /usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:90 2021-11-15 20:55:30.925 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: NB_Global) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.927 678147 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:61 2021-11-15 20:55:30.930 678147 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None 
matches /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:61 2021-11-15 20:55:30.931 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: NB_Global) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.934 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: NB_Global) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb 7eb59a51503f48ddbc936c40990e2177 - default default] index failed: No details.: AttributeError: 'Chassis_Private' object has no attribute 'hostname' 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource Traceback (most recent call last): 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/api/v2/resource.py", line 98, in resource 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource result = method(request=request, **args) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource setattr(e, '_RETRY_EXCEEDED', True) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in __exit__ 2021-11-15 
20:55:30.936 678146 ERROR neutron.api.v2.resource self.force_reraise() 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise self.value 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return f(*args, **kwargs) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource ectxt.value = e.inner_exc 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in __exit__ 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource self.force_reraise() 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise self.value 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return f(*args, **kwargs) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource LOG.debug("Retry wrapper got retriable exception: %s", e) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in __exit__ 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource self.force_reraise() 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File 
"/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise self.value 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return f(*dup_args, **dup_kwargs) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 369, in index 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return self._items(request, True, parent_id) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 304, in _items 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource obj_list = obj_getter(request.context, **kwargs) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ mech_driver.py", line 1118, in fn 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return op(results, new_method(*args, _driver=self, **kwargs)) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ mech_driver.py", line 1182, in get_agents 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource agent_dict = agent.as_dict() 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/agent/neutro n_agent.py", line 59, in as_dict 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 'host': self.chassis.hostname, 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource AttributeError: 'Chassis_Private' object has no attribute 'hostname' 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 2021-11-15 20:55:30.937 678146 INFO neutron.wsgi 
[req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb 7eb59a51503f48ddbc936c40990e2177 - default default] 172.16.30.46 "GET /v2.0/agents HTTP/1.1" status: 500 len: 368 time: 0.4032691 2021-11-15 20:55:30.938 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row 2bd437c4-9173-4d1b-ad01-d27cf0a11c8a (table: SB_Global) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.939 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:61 2021-11-15 20:55:30.939 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: Chassis_Private) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.940 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:61 2021-11-15 20:55:30.940 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" 
for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: Chassis_Private) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:37.389 678146 INFO neutron.wsgi [req-85f09a48-f50f-4776-b110-5d07dd1dc39e 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/ports?device_id=21999c50-740c-4a26-970b-8e07ba37b5ac&fields=binding%3A host_id&fields=binding%3Avif_type HTTP/1.1" status: 200 len: 271 time: 0.0680673 2021-11-15 20:55:37.476 678146 DEBUG neutron.pecan_wsgi.hooks.policy_enforcement [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by policy engine: ['standard_attr_id'] _exclude_attributes_by_policy /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.p y:255 2021-11-15 20:55:37.477 678146 INFO neutron.wsgi [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/ports?tenant_id=7eb59a51503f48ddbc936c40990e2177&device_id=21999c50-74 0c-4a26-970b-8e07ba37b5ac HTTP/1.1" status: 200 len: 1175 time: 0.0565202 2021-11-15 20:55:37.618 678146 DEBUG neutron.pecan_wsgi.hooks.policy_enforcement [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by policy engine: ['standard_attr_id', 'vlan_transparent'] _exclude_attributes_by_policy /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.p y:255 2021-11-15 20:55:37.619 678146 INFO neutron.wsgi [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/networks?id=ed4ee148-2562-41ed-93ac-7341948ac4dc HTTP/1.1" status: 200 len: 904 time: 0.1111536 2021-11-15 20:55:37.674 678146 INFO neutron.wsgi 
[req-71885d0c-4aa2-4377-a014-33768c9160ae 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/floatingips?fixed_ip_address=192.168.100.199&port_id=f9c17ae5-c07b-441 1-a0be-551686e6ba8e HTTP/1.1" status: 200 len: 217 time: 0.0442567 2021-11-15 20:55:37.727 678146 DEBUG neutron.pecan_wsgi.hooks.policy_enforcement [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by policy engine: ['standard_attr_id', 'shared'] _exclude_attributes_by_policy /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.p y:255 2021-11-15 20:55:37.731 678146 INFO neutron.wsgi [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/subnets?id=74e3b294-5fcc-4685-8bc4-5200be3af09e HTTP/1.1" status: 200 len: 858 time: 0.0468309 2021-11-15 20:55:37.774 678146 INFO neutron.wsgi [req-7a57b86f-75bf-4381-9447-0b1d7262aae8 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/ports?network_id=ed4ee148-2562-41ed-93ac-7341948ac4dc&device_owner=net work%3Adhcp HTTP/1.1" status: 200 len: 210 time: 0.0335791 2021-11-15 20:55:37.891 678146 INFO neutron.wsgi [req-7cf3c09b-b5ee-4372-84c2-d738abd1a47a 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=segments HTTP/1.1" status: 200 len: 212 time: 0.1059954 2021-11-15 20:55:37.991 678146 INFO neutron.wsgi [req-3ed1edbd-9f33-4b4e-89c8-a2291f23e8ef 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=provider%3Aphysic al_network&fields=provider%3Anetwork_type HTTP/1.1" status: 200 len: 277 time: 0.0903363 2021-11-15 20:55:49.083 678146 
INFO neutron.wsgi [req-0328379e-d381-48be-a209-a94acc297baa 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.1 "GET /v2.0/ports?device_id=82e90d49-bbc4-4c2c-8ae1-7a4bcf84bb7d&fields=binding%3A host_id&fields=binding%3Avif_type HTTP/1.1" status: 200 len: 271 time: 0.2884536 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 33431 bytes Desc: not available URL: From gmann at ghanshyammann.com Mon Nov 15 18:54:03 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 15 Nov 2021 12:54:03 -0600 Subject: [all][tc] Canceling Technical Committee 18th Nov weekly meeting Message-ID: <17d24f2fab5.e49f0c5f688029.678012827550598310@ghanshyammann.com> Hello Everyone, Due to OpenInfra Keynotes happening on 17-18 Nov, this week (18th Nov) TC meeting is cancelled. -gmann From abraden at verisign.com Mon Nov 15 19:19:59 2021 From: abraden at verisign.com (Braden, Albert) Date: Mon, 15 Nov 2021 19:19:59 +0000 Subject: [BULK] Re: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <17d0fdf9cef.bfbf002c504513.1457558466289860904@ghanshyammann.com> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> <9cabb3cb32a7441697f58933df72b514@verisign.com> <1e4045dd6edf572a8a32ca639056bc06a8782fe8.camel@etc.gen.nz> <17d0fdf9cef.bfbf002c504513.1457558466289860904@ghanshyammann.com> Message-ID: <47a580a722ba4e4fbf5a91d6e044447a@verisign.com> I heard back from Adrian, and he showed me where to find everything. I'm still waiting for permission to work on Adjutant. -----Original Message----- From: Ghanshyam Mann Sent: Thursday, November 11, 2021 11:41 AM To: Andrew Ruthven Cc: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] [BULK] Re: Adjutant needs contributors (and a PTL) to survive! 
Caution: This email originated from outside the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe. ---- On Tue, 09 Nov 2021 04:54:36 -0600 Andrew Ruthven wrote ---- > On Mon, 2021-11-08 at 19:49 +0000, Braden, Albert wrote:I didn't have any luck contacting Adrian. Does anyone know where the storyboard is that he mentions in his email? > I'll check in with Adrian to see if he has heard from anyone. > Cheers,AndrewCatalyst Cloud-- Andrew Ruthven, Wellington, New Zealandandrew at etc.gen.nz |Catalyst Cloud: | This space intentionally left blank https://catalystcloud.nz | @Andrew not sure but please let us know if someone from Catalyst is planning to maintain it? We are still waiting for volunteers to lead/maintain this project, if you are interested please reply here or ping us on #openstack-tc IRC channel. -gmann From emilien at redhat.com Tue Nov 16 00:32:15 2021 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 15 Nov 2021 19:32:15 -0500 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: Hey Kendall, On Mon, Nov 15, 2021 at 11:59 AM Kendall Nelson wrote: > Is there a reason why you don't want it to be under the openstack > namespace? > The only reason that comes to my mind is not technical at all. I (not saying we, since we haven't reached consensus yet) think that we want the project in its own organization, rather than under openstack. We want to encourage external contributions from outside of OpenStack, therefore opendev would probably suit better than openstack. This is open for discussion of course, but as I see it going, these are my personal thoughts. Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gagehugo at gmail.com Tue Nov 16 00:34:35 2021 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 15 Nov 2021 18:34:35 -0600 Subject: [openstack-helm] No meeting tomorrow Message-ID: Hey team, Since there are no agenda items [0] for the IRC meeting tomorrow, the meeting is cancelled. Our next meeting will be November 23rd. Thanks [0] https://etherpad.opendev.org/p/openstack-helm-weekly-meeting -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 16 08:32:14 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 16 Nov 2021 09:32:14 +0100 Subject: [neutron] CI meeting - agenda for 16.11.2021 Message-ID: <7988921.T7Z3S40VBb@p1> Hi, Just quick reminder for those interested in the Neutron CI. Today at 1500 UTC we will have our weekly meeting. Agenda for the meeting is in the etherpad: [1]. This week's meeting will be also on the meetpad [2]. See You all there :) [1] https://etherpad.opendev.org/p/neutron-ci-meetings [2] https://meetpad.opendev.org/neutron-ci-meetings -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From swogatpradhan22 at gmail.com Tue Nov 16 10:13:02 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Tue, 16 Nov 2021 15:43:02 +0530 Subject: [Openstack-victoria] [IRONIC] error in network interface | baremetal node validate Message-ID: Hi, I am currently trying to setup openstack ironic using driver IPMI. I followed the official docs of openstack for setting everything up. 
When i run openstack baremetal node validate $NODE_UUID, i am getting the following error: * Unexpected exception, traceback saved into log by ironic conductor service that is running on controller: 'ServiceTokenAuthWrapper' object has no attribute '_discovery_cache' * in the network interface in command output. When i check the ironic conductor logs i see the following messages: > Can anyone suggest a solution or a way forward. With regards Swogat Pradhan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager [req-d2401a0c-b1e6-42a7-9576-fdf7755d2cb2 e3a04390d9a34062be0478e52404d3d2 559f7d28e7354bd398fb70074de53312 - default default] Unexpected exception occurred while validating network driver interface for driver ipmi: 'ServiceTokenAuthWrapper' object has no attribute '_discovery_cache' on node 7b902c2a-7897-4cc4-aec6-e93abfce4adf.: AttributeError: 'ServiceTokenAuthWrapper' object has no attribute '_discovery_cache' 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager Traceback (most recent call last): 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/conductor/manager.py", line 1958, in validate_driver_interfaces 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager iface.validate(task) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/drivers/modules/network/neutron.py", line 62, in validate 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager self.get_cleaning_network_uuid(task) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/common/neutron.py", line 933, in get_cleaning_network_uuid 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager return validate_network( 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File 
"/usr/lib/python3/dist-packages/ironic/common/neutron.py", line 689, in validate_network 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager client = get_client(context=context) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/common/neutron.py", line 79, in get_client 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager return conn.global_request(context.global_id).network 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/openstack/service_description.py", line 87, in __get__ 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager proxy = self._make_proxy(instance) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/openstack/service_description.py", line 223, in _make_proxy 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager data = proxy_obj.get_endpoint_data() 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 312, in get_endpoint_data 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager return self.session.get_endpoint_data(auth or self.auth, **kwargs) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 1250, in get_endpoint_data 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager return auth.get_endpoint_data(self, **kwargs) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/keystoneauth1/plugin.py", line 132, in get_endpoint_data 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager session, cache=self._discovery_cache, 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager AttributeError: 'ServiceTokenAuthWrapper' object has no attribute '_discovery_cache' 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager From mnasiadka at gmail.com Tue Nov 
16 12:49:35 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Tue, 16 Nov 2021 13:49:35 +0100 Subject: [kolla] Proposing Michal Arbet as Kolla core Message-ID: Hi, I would like to propose adding Michal Arbet (kevko) to the kolla-core and kolla-ansible-core groups. Michal did some great work on ProxySQL, is a consistent maintainer of Debian related images and has provided some helpful reviews. Cores - please reply +1/-1 before the end of Friday 26th November. Thanks, Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: From sshnaidm at redhat.com Tue Nov 16 14:12:06 2021 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Tue, 16 Nov 2021 16:12:06 +0200 Subject: [openstack-ansible] release job failure for ansible-collections-openstack In-Reply-To: <272321636786145@mail.yandex.ru> References: <272321636786145@mail.yandex.ru> Message-ID: Hi, thanks for heads up. The first job failed because of the Zuul bug [1] and I didn't expect the job to run the second time, so I just released it manually to Galaxy. When the job tried the second time, it failed because it tried to duplicate the manual release. So I think it's all good now. [1] https://review.opendev.org/c/zuul/zuul/+/817298 On Sat, Nov 13, 2021 at 9:05 AM Dmitriy Rabotyagov wrote: > + sshnaidm@ > - ??? > > Hi! > > It's indeed OpenStack-Ansible SIG repo. > But from what I can see - release is published in Ansible Galaxy which is > the main thing that this job does. So for me things look good, but I guess > the only person who for sure can verify that is Sagi Shnaidman. > > Being able to tag repos is crucial for this sig since it's the only way to > provide Ansible Collections in Ansible Galaxy. > > 12.11.2021, 17:33, "El?d Ill?s" : > > Hi Openstack-Ansible team! 
> > This mail is just to inform you that there was a release job failure [1] > yesterday and the job could not be re-run as part of the job was > finished successfully in the 1st run (so the 2nd attempt failed [2]). > > Could you please review if everything is OK with the release? > > Thanks, > > El?d (elodilles @ #openstack-release) > > [1] > > http://lists.openstack.org/pipermail/release-job-failures/2021-November/001576.html > [2] > > http://lists.openstack.org/pipermail/release-job-failures/2021-November/001577.html > > > > -- > Kind Regards, > Dmitriy Rabotyagov > > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue Nov 16 15:06:00 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 16 Nov 2021 15:06:00 +0000 Subject: [openstack-ansible] release job failure for ansible-collections-openstack In-Reply-To: References: <272321636786145@mail.yandex.ru> Message-ID: <20211116150600.rtuvpyhpgcuxr2u7@yuggoth.org> On 2021-11-16 16:12:06 +0200 (+0200), Sagi Shnaidman wrote: [...] > I didn't expect the job to run the second time, so I just released > it manually to Galaxy. [...] Aha, thanks, that explains it. El?d had asked me to reenqueue it once we got the zuul.tag variable back, but I didn't realize the version had been manually uploaded in the meantime. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rlandy at redhat.com Tue Nov 16 15:58:29 2021 From: rlandy at redhat.com (Ronelle Landy) Date: Tue, 16 Nov 2021 10:58:29 -0500 Subject: [Triple0] Gate blocker - standalone failures on master and wallaby In-Reply-To: References: Message-ID: On Mon, Nov 15, 2021 at 6:52 AM Ronelle Landy wrote: > > tripleo-ci-centos-8-containers-multinode, and the wallaby job, are also > impacted. 
> > On Mon, Nov 15, 2021 at 6:28 AM Ronelle Landy wrote: > >> Hello All, >> >> We have a check/gate blocker on master and wallaby that started on >> Saturday. >> Standalone jobs are failing tempest tests. The related bug is linked >> below: >> >> https://bugs.launchpad.net/tripleo/+bug/1950916 >> >> The networking team is helping debug this. Please don't recheck for now. >> We will update this list when we have more info/a fix. >> > Thanks to Yatin, we have a temporary fix merged to unblock check/gate: Merged openstack/tripleo-quickstart master: Exclude libvirt/qemu from AppStream repo https://review.opendev.org/c/openstack/tripleo-quickstart/+/818043 Please recheck patches that were impacted by these failures. The compute team is still working on a more complete fix here. > >> Thank you! >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.souppart at gmail.com Tue Nov 16 08:21:26 2021 From: alex.souppart at gmail.com (alex souppart) Date: Tue, 16 Nov 2021 09:21:26 +0100 Subject: error "haproxy[]: proxy horizon has no server available!" when internal tls is activated In-Reply-To: References: Message-ID: Hello, I try to deploy an overcloud openstack in victoria version. 
My configuration to deploy is : openstack overcloud deploy --templates -r /home/stack/templates/roles_data.yaml \ -n /home/stack/network_data.yaml \ -e /home/stack/templates/scheduler_hints_env.yaml \ -e /home/stack/templates/network-isolation.yaml \ -e /home/stack/templates/os-net-config-mapping.yaml \ -e /home/stack/templates/node-info.yaml \ -e /home/stack/containers-prepare-parameter.yaml \ -e /home/stack/templates/host-map.yaml \ -e /home/stack/templates/ips-from-pool-all.yaml \ -e /home/stack/templates/network-environment.yaml \ -e /home/stack/templates/net-multiple-nics-vlans.yaml \ -e /home/stack/templates/ceph-ansible-external.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/enable-internal-tls.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/services/haproxy-internal-tls-certmonger.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-everywhere-endpoints-dns.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml \ -e /home/stack/templates/tls-parameters.yaml \ -e /home/stack/templates/inject-trust-anchor.yaml \ The generated configuration of horizon httpd contains SSLVerifyClient. But Haproxy fails to check server available, because haproxy does not send a client certificate when check attempt. 
the generated configuration of haproxy backend is : server host1 ip_host1:5000 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost host1 server host2 ip_host2:5000 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost host2 server host3 ip_host3:5000 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost host3 if i try adding manualy "crt /etc/pki/tls/certs/haproxy/overcloud-haproxy-internal_api.pem" in server configuration in haproxy.conf, horizon/dashboard works via haproxy. But i'm not sure that's the right way. Did I forget an environment file in deploy configuration ? Thank you in advance for your assistance with this. Best regards Souppart Alexandre -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Tue Nov 16 09:42:03 2021 From: katonalala at gmail.com (Lajos Katona) Date: Tue, 16 Nov 2021 10:42:03 +0100 Subject: Error while getting network agents In-Reply-To: <007a01d7da3b$78f1d800$6ad58800$@rapidcompute.com> References: <007a01d7da3b$78f1d800$6ad58800$@rapidcompute.com> Message-ID: Hi Muhammad, >From Wallaby you should be able to delete agent through Neutron API: https://review.opendev.org/c/openstack/neutron/+/752795 To tell the truth I don't know what happens if you execute DELETE /v2.0/agents/{agent_id} after you have deleted the chassis in sbctl. Regards Lajos Katona (lajoskatona) Muhammad Faisal ezt ?rta (id?pont: 2021. nov. 15., H, 19:05): > *Hi,* > > While executing openstack network agent list we are getting below > mentioned error. We have run ?ovn-sbctl chassis-del > 650be87c-b581-467a-b523-ce454e753780? command on controller node. 
> > *OS:* Ubuntu 20 > *Openstack version:* Wallaby > *Number of controller/network node:* 1 (172.16.30.46) > > *Number of compute node:* 2 (172.16.30.1, 172.16.30.3) > *OVN Version:* 21.09.0 > > /var/log/ovn/ovn-northd.log: > > 2021-11-15T15:29:22.184Z|00053|northd|WARN|Dropped 7 log messages in last > 127 seconds (most recently, 120 seconds ago) due to excessive rate > > 2021-11-15T15:29:22.184Z|00054|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > 2021-11-15T15:31:11.516Z|00055|northd|WARN|Dropped 14 log messages in last > 110 seconds (most recently, 52 seconds ago) due to excessive rate > > 2021-11-15T15:31:11.516Z|00056|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > 2021-11-15T15:35:25.738Z|00057|northd|WARN|Dropped 6 log messages in last > 254 seconds (most recently, 243 seconds ago) due to excessive rate > > 2021-11-15T15:35:25.738Z|00058|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > 2021-11-15T15:44:34.663Z|00059|northd|WARN|Dropped 12 log messages in last > 549 seconds (most recently, 512 seconds ago) due to excessive rate > > 2021-11-15T15:44:34.663Z|00060|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > 2021-11-15T15:47:39.947Z|00061|northd|WARN|Dropped 2 log messages in last > 185 seconds (most recently, 185 seconds ago) due to excessive rate > > 2021-11-15T15:47:39.948Z|00062|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > > /var/log/neutron/neutron-server.log: > 2021-11-15 20:55:24.025 678149 DEBUG futurist.periodics > [req-df627767-441c-4b46-8487-91604cd3033a - - - - -] Submitting periodic > callback > 'neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.maintenance.HashRingHealthCheckPeriodics.touch_hash_ring_nodes' > _process_scheduled 
/usr/lib/python3/dist-packages/futurist/periodics.py:641 > > 2021-11-15 20:55:30.531 678146 DEBUG neutron.wsgi [-] (678146) accepted > ('172.16.30.46', 49782) server > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > 2021-11-15 20:55:30.893 678146 DEBUG ovsdbapp.backend.ovs_idl.transaction > [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Running txn n=1 > command(idx=0): CheckLivenessCommand() do_commit > /usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:90 > > 2021-11-15 20:55:30.925 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: > NB_Global) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.927 678147 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] > ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 > > 2021-11-15 20:55:30.930 678147 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] > ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None > matches > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 > > 2021-11-15 20:55:30.931 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: > NB_Global) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 
20:55:30.934 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: > NB_Global) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb > 7eb59a51503f48ddbc936c40990e2177 - default default] index failed: No > details.: AttributeError: 'Chassis_Private' object has no attribute > 'hostname' > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource Traceback > (most recent call last): > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/api/v2/resource.py", line 98, in > resource > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource result = > method(request=request, **args) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > setattr(e, '_RETRY_EXCEEDED', True) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in > __exit__ > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > self.force_reraise() > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in > force_reraise > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise > self.value > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped > > 2021-11-15 
20:55:30.936 678146 ERROR neutron.api.v2.resource return > f(*args, **kwargs) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > ectxt.value = e.inner_exc > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in > __exit__ > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > self.force_reraise() > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in > force_reraise > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise > self.value > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return > f(*args, **kwargs) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > LOG.debug("Retry wrapper got retriable exception: %s", e) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in > __exit__ > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > self.force_reraise() > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in > force_reraise > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise > self.value > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped > > 2021-11-15 20:55:30.936 678146 ERROR 
neutron.api.v2.resource return > f(*dup_args, **dup_kwargs) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 369, in index > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return > self._items(request, True, parent_id) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 304, in _items > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource obj_list > = obj_getter(request.context, **kwargs) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", > line 1118, in fn > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return > op(results, new_method(*args, _driver=self, **kwargs)) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", > line 1182, in get_agents > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > agent_dict = agent.as_dict() > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", > line 59, in as_dict > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 'host': > self.chassis.hostname, > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > AttributeError: 'Chassis_Private' object has no attribute 'hostname' > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > > 2021-11-15 20:55:30.937 678146 INFO neutron.wsgi > [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb > 7eb59a51503f48ddbc936c40990e2177 - default default] 172.16.30.46 "GET > /v2.0/agents HTTP/1.1" status: 500 len: 368 time: 0.4032691 > > 2021-11-15 20:55:30.938 678146 DEBUG > 
neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row 2bd437c4-9173-4d1b-ad01-d27cf0a11c8a (table: > SB_Global) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.939 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] ChassisAgentWriteEvent > : Matched Chassis_Private, update, None None matches > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 > > 2021-11-15 20:55:30.939 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: > Chassis_Private) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.940 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] > ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None > matches > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 > > 2021-11-15 20:55:30.940 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: > Chassis_Private) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 
> > 2021-11-15 20:55:37.389 678146 INFO neutron.wsgi > [req-85f09a48-f50f-4776-b110-5d07dd1dc39e 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/ports?device_id=21999c50-740c-4a26-970b-8e07ba37b5ac&fields=binding%3Ahost_id&fields=binding%3Avif_type > HTTP/1.1" status: 200 len: 271 time: 0.0680673 > > 2021-11-15 20:55:37.476 678146 DEBUG > neutron.pecan_wsgi.hooks.policy_enforcement > [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by > policy engine: ['standard_attr_id'] _exclude_attributes_by_policy > /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 > > 2021-11-15 20:55:37.477 678146 INFO neutron.wsgi > [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/ports?tenant_id=7eb59a51503f48ddbc936c40990e2177&device_id=21999c50-740c-4a26-970b-8e07ba37b5ac > HTTP/1.1" status: 200 len: 1175 time: 0.0565202 > > 2021-11-15 20:55:37.618 678146 DEBUG > neutron.pecan_wsgi.hooks.policy_enforcement > [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by > policy engine: ['standard_attr_id', 'vlan_transparent'] > _exclude_attributes_by_policy > /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 > > 2021-11-15 20:55:37.619 678146 INFO neutron.wsgi > [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/networks?id=ed4ee148-2562-41ed-93ac-7341948ac4dc HTTP/1.1" status: > 200 len: 904 time: 0.1111536 > > 2021-11-15 20:55:37.674 678146 INFO neutron.wsgi > [req-71885d0c-4aa2-4377-a014-33768c9160ae 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - 
default default] 172.16.30.3 "GET > /v2.0/floatingips?fixed_ip_address=192.168.100.199&port_id=f9c17ae5-c07b-4411-a0be-551686e6ba8e > HTTP/1.1" status: 200 len: 217 time: 0.0442567 > > 2021-11-15 20:55:37.727 678146 DEBUG > neutron.pecan_wsgi.hooks.policy_enforcement > [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by > policy engine: ['standard_attr_id', 'shared'] _exclude_attributes_by_policy > /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 > > 2021-11-15 20:55:37.731 678146 INFO neutron.wsgi > [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/subnets?id=74e3b294-5fcc-4685-8bc4-5200be3af09e HTTP/1.1" status: > 200 len: 858 time: 0.0468309 > > 2021-11-15 20:55:37.774 678146 INFO neutron.wsgi > [req-7a57b86f-75bf-4381-9447-0b1d7262aae8 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/ports?network_id=ed4ee148-2562-41ed-93ac-7341948ac4dc&device_owner=network%3Adhcp > HTTP/1.1" status: 200 len: 210 time: 0.0335791 > > 2021-11-15 20:55:37.891 678146 INFO neutron.wsgi > [req-7cf3c09b-b5ee-4372-84c2-d738abd1a47a 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=segments > HTTP/1.1" status: 200 len: 212 time: 0.1059954 > > 2021-11-15 20:55:37.991 678146 INFO neutron.wsgi > [req-3ed1edbd-9f33-4b4e-89c8-a2291f23e8ef 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=provider%3Aphysical_network&fields=provider%3Anetwork_type > HTTP/1.1" status: 200 len: 277 time: 0.0903363 > > 2021-11-15 20:55:49.083 678146 INFO neutron.wsgi > 
[req-0328379e-d381-48be-a209-a94acc297baa 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.1 "GET > /v2.0/ports?device_id=82e90d49-bbc4-4c2c-8ae1-7a4bcf84bb7d&fields=binding%3Ahost_id&fields=binding%3Avif_type > HTTP/1.1" status: 200 len: 271 time: 0.2884536 From cboylan at sapwetik.org Tue Nov 16 16:58:44 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 16 Nov 2021 08:58:44 -0800 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors Message-ID: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> Hello, The OpenStack tenant in Zuul currently has 134 config errors. You can find these errors at https://zuul.opendev.org/t/openstack/config-errors or by clicking the blue bell icon in the top right of https://zuul.opendev.org/t/openstack/status. The vast majority of these errors appear related to project renames that have been requested of OpenDev or project retirements. Can you please look into fixing these as they can be an attractive nuisance when debugging Zuul problems (they also indicate that a number of your jobs are probably not working). 
Project renames creating issues: * openstack/python-tempestconf -> osf/python-tempestconf -> openinfra/python-tempestconf * openstack/refstack -> osf/refstack -> openinfra/refstack * x/tap-as-a-service -> openstack/tap-as-a-service * openstack/networking-l2gw -> x/networking-l2gw Project retirements creating issues: * openstack/neutron-lbaas * recordsansible/ara Projects whose configs have errors: * openinfra/python-tempestconf * openstack/heat * openstack/ironic * openstack/kolla-ansible * openstack/kuryr-kubernetes * openstack/murano-apps * openstack/networking-midonet * openstack/networking-odl * openstack/neutron * openstack/neutron-fwaas * openstack/python-troveclient * openstack/senlin * openstack/tap-as-a-service * openstack/zaqar * x/vmware-nsx * openinfra/openstackid * openstack/barbican * openstack/cookbook-openstack-application-catalog * openstack/heat-dashboard * openstack/manila-ui * openstack/python-manilaclient Let us know if we can help decipher any errors, Clark From radoslaw.piliszek at gmail.com Tue Nov 16 17:06:59 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 16 Nov 2021 18:06:59 +0100 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> Message-ID: @ Kolla folks It seems to affect kolla-ansible in stable/rocky and stable/stein branches due to ansible_python_interpreter being set. We either need to backport https://review.opendev.org/c/openstack/kolla-ansible/+/798205 or drop affected CI config in those. They are EM so it might be fine going either way as CI might be broken by now anyhow. -yoctozepto On Tue, 16 Nov 2021 at 17:59, Clark Boylan wrote: > > Hello, > > The OpenStack tenant in Zuul currently has 134 config errors. 
You can find these errors at https://zuul.opendev.org/t/openstack/config-errors or by clicking the blue bell icon in the top right of https://zuul.opendev.org/t/openstack/status. The vast majority of these errors appear related to project renames that have been requested of OpenDev or project retirements. Can you please look into fixing these as they can be an attractive nuisance when debugging Zuul problems (they also indicate that a number of your jobs are probably not working). > > Project renames creating issues: > > * openstack/python-tempestconf -> osf/python-tempestconf -> openinfra/python-tempestconf > * openstack/refstack -> osf/refstack -> openinfra/refstack > * x/tap-as-a-service -> openstack/tap-as-a-service > * openstack/networking-l2gw -> x/networking-l2gw > > Project retirements creating issues: > > * openstack/neutron-lbaas > * recordsansible/ara > > Projects whose configs have errors: > > * openinfra/python-tempestconf > * openstack/heat > * openstack/ironic > * openstack/kolla-ansible > * openstack/kuryr-kubernetes > * openstack/murano-apps > * openstack/networking-midonet > * openstack/networking-odl > * openstack/neutron > * openstack/neutron-fwaas > * openstack/python-troveclient > * openstack/senlin > * openstack/tap-as-a-service > * openstack/zaqar > * x/vmware-nsx > * openinfra/openstackid > * openstack/barbican > * openstack/cookbook-openstack-application-catalog > * openstack/heat-dashboard > * openstack/manila-ui > * openstack/python-manilaclient > > Let us know if we can help decipher any errors, > Clark > From ralonsoh at redhat.com Tue Nov 16 17:26:31 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 16 Nov 2021 18:26:31 +0100 Subject: Error while getting network agents In-Reply-To: References: <007a01d7da3b$78f1d800$6ad58800$@rapidcompute.com> Message-ID: Hello Muhammad: If the ovn-controller service is enabled in the host (in this case your controller), this service should create again the Chassis register. 
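The traceback in the quoted logs shows where this breaks: neutron_agent.py builds the agent dict from self.chassis.hostname, and once only the Chassis_Private record is left, that attribute lookup raises the AttributeError and the /v2.0/agents request returns a 500. A minimal, self-contained sketch of that failure mode, plus a defensive getattr() variant (class names here are illustrative stand-ins, not Neutron's actual implementation):

```python
# Illustrative sketch only: these classes stand in for the objects in
# Neutron's OVN agent cache; they are not the real neutron.plugins.ml2
# classes.

class ChassisPrivate:
    """Mimics an OVN SB Chassis_Private row: it has a name but, unlike a
    Chassis row, no hostname column."""
    name = "7d07ed32-996f-48da-9ac1-7f0455ea7ec7"


class Agent:
    def __init__(self, chassis):
        self.chassis = chassis

    def as_dict_unsafe(self):
        # Mirrors the failing line in the traceback: raises AttributeError
        # when only the Chassis_Private record remains.
        return {"host": self.chassis.hostname}

    def as_dict_safe(self):
        # Defensive variant: degrades to None instead of a 500 error.
        return {"host": getattr(self.chassis, "hostname", None)}


agent = Agent(ChassisPrivate())
try:
    agent.as_dict_unsafe()
except AttributeError as exc:
    print(exc)  # 'ChassisPrivate' object has no attribute 'hostname'
print(agent.as_dict_safe())  # {'host': None}
```

The getattr() guard is only a sketch of how the listing could avoid crashing; the actual cleanup, as described below, is to remove the stale registers from the OVN SB and delete the agents via the Neutron CLI.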
If the ovn-controller is not enabled, then it is possible to reproduce this issue. I could reproduce it with these steps: - Killing the ovn-controller service. Not stopping it gracefully (that would delete the "chassis" and the "chassis_private" registers) but sending the kill signal. - Deleting the "chassis" register. That will leave the "chassis_private" register with chassis=[]. However, we don't handle "chassis" events. That means the "AgentCache" won't be updated. - Restarting the Neutron server. That will force the OVN SB retrieval with the "chassis_private" register without any chassis. - Listing the agents --> that will trigger the reported error. In any case, remember that the OVN agent deletion is only for clean-up purposes. If by any circumstance the ovn-controller on a host does not stop gracefully, it will leave the "chassis" and "chassis_private" registers undeleted. To properly delete those registers, you should: - Delete both from the OVN SB. - Then delete the agents from Neutron, using the CLI: "openstack network agent delete <agent_id>". I'll open a bug to consider this very specific scenario. Regards. On Tue, Nov 16, 2021 at 5:22 PM Lajos Katona wrote: > Hi Muhammad, > From Wallaby you should be able to delete the agent through the Neutron API: > https://review.opendev.org/c/openstack/neutron/+/752795 > > To tell the truth I don't know what happens if you execute > DELETE /v2.0/agents/{agent_id} after you have deleted the chassis in sbctl. > > Regards > Lajos Katona (lajoskatona) > > Muhammad Faisal wrote (Mon, 15 Nov 2021, 19:05): > >> *Hi,* >> >> While executing "openstack network agent list" we are getting the below >> mentioned error. We have run the "ovn-sbctl chassis-del >> 650be87c-b581-467a-b523-ce454e753780" command on the controller node. 
>> >> *OS:* Ubuntu 20 >> *Openstack version:* Wallaby >> *Number of controller/network node:* 1 (172.16.30.46) >> >> *Number of compute node:* 2 (172.16.30.1, 172.16.30.3) >> *OVN Version:* 21.09.0 >> >> /var/log/ovn/ovn-northd.log: >> >> 2021-11-15T15:29:22.184Z|00053|northd|WARN|Dropped 7 log messages in last >> 127 seconds (most recently, 120 seconds ago) due to excessive rate >> >> 2021-11-15T15:29:22.184Z|00054|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> 2021-11-15T15:31:11.516Z|00055|northd|WARN|Dropped 14 log messages in >> last 110 seconds (most recently, 52 seconds ago) due to excessive rate >> >> 2021-11-15T15:31:11.516Z|00056|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> 2021-11-15T15:35:25.738Z|00057|northd|WARN|Dropped 6 log messages in last >> 254 seconds (most recently, 243 seconds ago) due to excessive rate >> >> 2021-11-15T15:35:25.738Z|00058|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> 2021-11-15T15:44:34.663Z|00059|northd|WARN|Dropped 12 log messages in >> last 549 seconds (most recently, 512 seconds ago) due to excessive rate >> >> 2021-11-15T15:44:34.663Z|00060|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> 2021-11-15T15:47:39.947Z|00061|northd|WARN|Dropped 2 log messages in last >> 185 seconds (most recently, 185 seconds ago) due to excessive rate >> >> 2021-11-15T15:47:39.948Z|00062|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> >> /var/log/neutron/neutron-server.log: >> 2021-11-15 20:55:24.025 678149 DEBUG futurist.periodics >> [req-df627767-441c-4b46-8487-91604cd3033a - - - - -] Submitting periodic >> callback >> 
'neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.maintenance.HashRingHealthCheckPeriodics.touch_hash_ring_nodes' >> _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:641 >> >> 2021-11-15 20:55:30.531 678146 DEBUG neutron.wsgi [-] (678146) accepted >> ('172.16.30.46', 49782) server >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 >> >> 2021-11-15 20:55:30.893 678146 DEBUG ovsdbapp.backend.ovs_idl.transaction >> [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Running txn n=1 >> command(idx=0): CheckLivenessCommand() do_commit >> /usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:90 >> >> 2021-11-15 20:55:30.925 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: >> NB_Global) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.927 678147 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] >> ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 >> >> 2021-11-15 20:55:30.930 678147 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] >> ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None >> matches >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 >> >> 2021-11-15 20:55:30.931 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row 
b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: >> NB_Global) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.934 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: >> NB_Global) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb >> 7eb59a51503f48ddbc936c40990e2177 - default default] index failed: No >> details.: AttributeError: 'Chassis_Private' object has no attribute >> 'hostname' >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource Traceback >> (most recent call last): >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/api/v2/resource.py", line 98, in >> resource >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource result = >> method(request=request, **args) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> setattr(e, '_RETRY_EXCEEDED', True) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in >> __exit__ >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> self.force_reraise() >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in >> force_reraise >> >> 2021-11-15 20:55:30.936 
678146 ERROR neutron.api.v2.resource raise >> self.value >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> f(*args, **kwargs) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> ectxt.value = e.inner_exc >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in >> __exit__ >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> self.force_reraise() >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in >> force_reraise >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise >> self.value >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> f(*args, **kwargs) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> LOG.debug("Retry wrapper got retriable exception: %s", e) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in >> __exit__ >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> self.force_reraise() >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in >> force_reraise >> >> 2021-11-15 
20:55:30.936 678146 ERROR neutron.api.v2.resource raise >> self.value >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> f(*dup_args, **dup_kwargs) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 369, in index >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> self._items(request, True, parent_id) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 304, in _items >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource obj_list >> = obj_getter(request.context, **kwargs) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", >> line 1118, in fn >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> op(results, new_method(*args, _driver=self, **kwargs)) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", >> line 1182, in get_agents >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> agent_dict = agent.as_dict() >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", >> line 59, in as_dict >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 'host': >> self.chassis.hostname, >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> AttributeError: 'Chassis_Private' object has no attribute 'hostname' >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> >> 2021-11-15 
20:55:30.937 678146 INFO neutron.wsgi >> [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb >> 7eb59a51503f48ddbc936c40990e2177 - default default] 172.16.30.46 "GET >> /v2.0/agents HTTP/1.1" status: 500 len: 368 time: 0.4032691 >> >> 2021-11-15 20:55:30.938 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row 2bd437c4-9173-4d1b-ad01-d27cf0a11c8a (table: >> SB_Global) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.939 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] ChassisAgentWriteEvent >> : Matched Chassis_Private, update, None None matches >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 >> >> 2021-11-15 20:55:30.939 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: >> Chassis_Private) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.940 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] >> ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None >> matches >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 >> >> 2021-11-15 20:55:30.940 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> 
[req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: >> Chassis_Private) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:37.389 678146 INFO neutron.wsgi >> [req-85f09a48-f50f-4776-b110-5d07dd1dc39e 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/ports?device_id=21999c50-740c-4a26-970b-8e07ba37b5ac&fields=binding%3Ahost_id&fields=binding%3Avif_type >> HTTP/1.1" status: 200 len: 271 time: 0.0680673 >> >> 2021-11-15 20:55:37.476 678146 DEBUG >> neutron.pecan_wsgi.hooks.policy_enforcement >> [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by >> policy engine: ['standard_attr_id'] _exclude_attributes_by_policy >> /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 >> >> 2021-11-15 20:55:37.477 678146 INFO neutron.wsgi >> [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/ports?tenant_id=7eb59a51503f48ddbc936c40990e2177&device_id=21999c50-740c-4a26-970b-8e07ba37b5ac >> HTTP/1.1" status: 200 len: 1175 time: 0.0565202 >> >> 2021-11-15 20:55:37.618 678146 DEBUG >> neutron.pecan_wsgi.hooks.policy_enforcement >> [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by >> policy engine: ['standard_attr_id', 'vlan_transparent'] >> _exclude_attributes_by_policy >> /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 >> >> 2021-11-15 20:55:37.619 678146 INFO neutron.wsgi >> [req-eebf3e86-206a-45ed-92f3-804007a39cca 
9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/networks?id=ed4ee148-2562-41ed-93ac-7341948ac4dc HTTP/1.1" status: >> 200 len: 904 time: 0.1111536 >> >> 2021-11-15 20:55:37.674 678146 INFO neutron.wsgi >> [req-71885d0c-4aa2-4377-a014-33768c9160ae 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/floatingips?fixed_ip_address=192.168.100.199&port_id=f9c17ae5-c07b-4411-a0be-551686e6ba8e >> HTTP/1.1" status: 200 len: 217 time: 0.0442567 >> >> 2021-11-15 20:55:37.727 678146 DEBUG >> neutron.pecan_wsgi.hooks.policy_enforcement >> [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by >> policy engine: ['standard_attr_id', 'shared'] _exclude_attributes_by_policy >> /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 >> >> 2021-11-15 20:55:37.731 678146 INFO neutron.wsgi >> [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/subnets?id=74e3b294-5fcc-4685-8bc4-5200be3af09e HTTP/1.1" status: >> 200 len: 858 time: 0.0468309 >> >> 2021-11-15 20:55:37.774 678146 INFO neutron.wsgi >> [req-7a57b86f-75bf-4381-9447-0b1d7262aae8 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/ports?network_id=ed4ee148-2562-41ed-93ac-7341948ac4dc&device_owner=network%3Adhcp >> HTTP/1.1" status: 200 len: 210 time: 0.0335791 >> >> 2021-11-15 20:55:37.891 678146 INFO neutron.wsgi >> [req-7cf3c09b-b5ee-4372-84c2-d738abd1a47a 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=segments >> HTTP/1.1" status: 200 len: 212 time: 0.1059954 >> >> 2021-11-15 20:55:37.991 678146 
INFO neutron.wsgi >> [req-3ed1edbd-9f33-4b4e-89c8-a2291f23e8ef 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=provider%3Aphysical_network&fields=provider%3Anetwork_type >> HTTP/1.1" status: 200 len: 277 time: 0.0903363 >> >> 2021-11-15 20:55:49.083 678146 INFO neutron.wsgi >> [req-0328379e-d381-48be-a209-a94acc297baa 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.1 "GET >> /v2.0/ports?device_id=82e90d49-bbc4-4c2c-8ae1-7a4bcf84bb7d&fields=binding%3Ahost_id&fields=binding%3Avif_type >> HTTP/1.1" status: 200 len: 271 time: 0.2884536 >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 33431 bytes Desc: not available URL: From marcin.juszkiewicz at linaro.org Tue Nov 16 17:49:41 2021 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Tue, 16 Nov 2021 18:49:41 +0100 Subject: [kolla] Proposing Michal Arbet as Kolla core In-Reply-To: References: Message-ID: <9747ba17-d3f0-3a35-a3c0-c1fb5317d665@linaro.org> On 16.11.2021 at 13:49, Michał Nasiadka wrote: > Hi, > > I would like to propose adding Michal Arbet (kevko) to the kolla-core > and kolla-ansible-core groups. > Michal did some great work on ProxySQL, is a consistent maintainer > of Debian related images and has provided some helpful > reviews. > > Cores - please reply +1/-1 before the end of Friday 26th November. +1 From yasufum.o at gmail.com Tue Nov 16 18:11:44 2021 From: yasufum.o at gmail.com (Yasufumi Ogawa) Date: Wed, 17 Nov 2021 03:11:44 +0900 Subject: [ Tacker ] Passing a shell script/parameters as a file in cloud config In-Reply-To: References: Message-ID: Hi Lokendra, My apologies, I overlooked your question. We'll confirm the issue soon. 
Thanks,
Yasufumi

On 2021/11/01 17:22, Lokendra Rathour wrote:
> Hello Everyone,
> Any update on this, please.
>
> -Lokendra
>
> On Thu, Oct 28, 2021 at 2:43 PM Lokendra Rathour wrote:
>
> Hi,
> /In Tacker, while deploying VNFD can we pass a file ( parameter
> file) and keep it at a defined path using cloud-config way?/
>
> Like in *generic hot template*s, we have the below-mentioned way to
> pass a file directly as below:
> parameters:
>   foo:
>     default: bar
>
> resources:
>
>   the_server:
>     type: OS::Nova::Server
>     properties:
>       # flavor, image etc
>       user_data:
>         str_replace:
>           template: {get_file: the_server_boot.sh}
>           params:
>             $FOO: {get_param: foo}
>
> *but when using this approach in Tacker BaseHOT it gives an error
> saying *
> "Instantiation wait failed for vnf
> 77693e61-c80e-41e0-af9a-a0f702f3a9a7, error: VNF Create Resource
> CREATE failed: resources.obsvrnnu62mb:
> resources.CAS_0_group.Property error:
> resources.soft_script.properties.config: No content found in the
> "files" section for get_file path: Files/scripts/install.py
> 2021-10-28 00:46:35.677 3853831 ERROR oslo_messaging.rpc.server
> "
> do we have a defined way to use the hot capability in TACKER?
>
> Defined Folder Structure for CSAR:
> .
> ├── BaseHOT
> │   └── default
> │       ├── RIN_vnf_hot.yaml
> │       └── nested
> │           ├── RIN_0.yaml
> │           └── RIN_1.yaml
> ├── Definitions
> │   ├── RIN_df_default.yaml
> │   ├── RIN_top_vnfd.yaml
> │   ├── RIN_types.yaml
> │   ├── etsi_nfv_sol001_common_types.yaml
> │   └── etsi_nfv_sol001_vnfd_types.yaml
> ├── Files
> │   ├── images
> │   └── scripts
> │       └── install.py
> ├── Scripts
> ├── TOSCA-Metadata
> │   └── TOSCA.meta
> └── UserData
>     ├── __init__.py
>     └── lcm_user_data.py
>
> *Objective: *
> To pass a file at a defined path on the VDU after the VDU is
> instantiated/launched. 
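One possible workaround for the get_file error quoted above is to inline the script body in user_data through cloud-config's write_files, so Heat never has to resolve a Files/... path at instantiation time. This is only a sketch, untested against Tacker; the resource name, target path and script content are illustrative:

```yaml
heat_template_version: 2013-05-23
resources:
  VDU1:
    type: OS::Nova::Server
    properties:
      # image, flavor, networks etc. elided
      user_data_format: RAW
      user_data: |
        #cloud-config
        write_files:
          - path: /opt/scripts/install.py
            permissions: '0755'
            content: |
              #!/usr/bin/env python3
              # placeholder for Files/scripts/install.py
              print("installed")
```

With this shape the script travels inside the template itself, at the cost of duplicating the script content in the BaseHOT instead of referencing it from the CSAR's Files directory.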
> > -- > ~ Lokendra > skype: lokendrarathour > > > > > -- > ~ Lokendra > www.inertiaspeaks.com > www.inertiagroups.com > skype: lokendrarathour > > From lokendrarathour at gmail.com Tue Nov 16 18:25:03 2021 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Tue, 16 Nov 2021 23:55:03 +0530 Subject: [ Tacker ] Passing a shell script/parameters as a file in cloud config In-Reply-To: References: Message-ID: Thank you so much for the input. Will wait for the same. -Lokendra On Tue, 16 Nov 2021, 23:50 Yasufumi Ogawa, wrote: > Hi Lokendra, > > My apologies, I overlooked your question. We'll confirm the issue soon. > > Thanks, > Yasufumi > > On 2021/11/01 17:22, Lokendra Rathour wrote: > > Hello Everyone, > > Any update on this, please. > > > > > > -Lokendra > > > > > > On Thu, Oct 28, 2021 at 2:43 PM Lokendra Rathour > > > wrote: > > > > Hi, > > /In Tacker, while deploying VNFD can we pass a file ( parameter > > file) and keep it at a defined path using cloud-config way?/ > > > > Like in *generic hot template*s, we have the below-mentioned way to > > pass a file directly as below: > > parameters: > > foo: > > default: bar > > > > resources: > > > > the_server: > > type: OS::Nova::Server > > properties: > > # flavor, image etc > > user_data: > > str_replace: > > template: {get_file: the_server_boot.sh} > > params: > > $FOO: {get_param: foo} > > > > > > *but when using this approach in Tacker BaseHOT it gives an error > > saying * > > "Instantiation wait failed for vnf > > 77693e61-c80e-41e0-af9a-a0f702f3a9a7, error: VNF Create Resource > > CREATE failed: resources.obsvrnnu62mb: > > resources.CAS_0_group.Property error: > > resources.soft_script.properties.config: No content found in the > > "files" section for get_file path: Files/scripts/install.py > > 2021-10-28 00:46:35.677 3853831 ERROR oslo_messaging.rpc.server > > " > > do we have a defined way to use the hot capability in TACKER? > > > > Defined Folder Structure for CSAR: > > . > > ??? BaseHOT > > ? ??? 
default > > ? ??? RIN_vnf_hot.yaml > > ? ??? nested > > ? ??? RIN_0.yaml > > ? ??? RIN_1.yaml > > ??? Definitions > > ? ??? RIN_df_default.yaml > > ? ??? RIN_top_vnfd.yaml > > ? ??? RIN_types.yaml > > ? ??? etsi_nfv_sol001_common_types.yaml > > ? ??? etsi_nfv_sol001_vnfd_types.yaml > > ??? Files > > ? ??? images > > ? ??? scripts > > ? ??? install.py > > ??? Scripts > > ??? TOSCA-Metadata > > ? ??? TOSCA.meta > > ??? UserData > > ? ??? __init__.py > > ? ??? lcm_user_data.py > > > > *Objective: * > > To pass a file at a defined path on the VDU after the VDU is > > instantiated/launched. > > > > -- > > ~ Lokendra > > skype: lokendrarathour > > > > > > > > > > -- > > ~ Lokendra > > www.inertiaspeaks.com > > www.inertiagroups.com > > skype: lokendrarathour > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Danny.Webb at thehutgroup.com Tue Nov 16 18:33:02 2021 From: Danny.Webb at thehutgroup.com (Danny Webb) Date: Tue, 16 Nov 2021 18:33:02 +0000 Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc In-Reply-To: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> References: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> Message-ID: Hi All, Apologies, I totally missed this meeting. My company is interested in taking up the backend QOS implementation (https://review.opendev.org/c/openstack/cinder/+/762794) of the rbd driver and moving it towards completion. Would anyone be available to walk me through what's needed to finalise this? I can jump onto IRC in whichever openstack channel is required (bearing in mind I'm in GMT). Cheers, Danny ________________________________ From: Brian Rosmaita Sent: 04 November 2021 19:19 To: openstack-discuss at lists.openstack.org Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc CAUTION: This email originates from outside THG By popular demand (really!), I'm scheduling a RBD driver review festival for next week. 
It's a community driver, and we've got a backlog of patches: https://review.opendev.org/q/project:openstack/cinder+status:open+file:cinder/volume/drivers/rbd.py If your patch is currently in merge conflict, it would be helpful if you could get conflicts resolved before the festival. Also, if you have questions about comments that have been left on your patch, this would be a good time to get them answered. who: Everyone! what: The Cinder Festival of RBD Driver Reviews when: Thursday 11 November 2021 from 1500-1600 UTC where: https://meet.google.com/fsb-qkfc-qun etherpad: https://etherpad.opendev.org/p/cinder-festival-of-driver-reviews (Note that we're trying google meet for this session.) Danny Webb Senior Linux Systems Administrator The Hut Group Tel: Email: Danny.Webb at thehutgroup.com For the purposes of this email, the "company" means The Hut Group Limited, a company registered in England and Wales (company number 6539496) whose registered office is at Fifth Floor, Voyager House, Chicago Avenue, Manchester Airport, M90 3DQ and/or any of its respective subsidiaries. Confidentiality Notice This e-mail is confidential and intended for the use of the named recipient only. If you are not the intended recipient please notify us by telephone immediately on +44(0)1606 811888 or return it to us by e-mail. Please then delete it from your system and note that any use, dissemination, forwarding, printing or copying is strictly prohibited. Any views or opinions are solely those of the author and do not necessarily represent those of the company. Encryptions and Viruses Please note that this e-mail and any attachments have not been encrypted. They may therefore be liable to be compromised. Please also note that it is your responsibility to scan this e-mail and any attachments for viruses. 
We do not, to the extent permitted by law, accept any liability (whether in contract, negligence or otherwise) for any virus infection and/or external compromise of security and/or confidentiality in relation to transmissions sent by e-mail. Monitoring Activity and use of the company's systems is monitored to secure its effective use and operation and for other lawful business purposes. Communications using these systems will also be monitored and may be recorded to secure effective use and operation and for other lawful business purposes. hgvyjuv -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Tue Nov 16 18:38:01 2021 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 16 Nov 2021 12:38:01 -0600 Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc In-Reply-To: References: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> Message-ID: <33e02f27-fadd-5c07-a697-b66621dbd03d@gmail.com> On 11/16/2021 12:33 PM, Danny Webb wrote: > Hi All, > > Apologies, I totally missed this meeting.? My company interested in > taking up the backend QOS implementation > (https://review.opendev.org/c/openstack/cinder/+/762794) of rbd driver > and moving it towards completion.? Would anyone be available to walk > me through what's needed to finalise this?? I can jump onto IRC in > whichever openstack channel is required (bearing in mind I'm in GMT). Danny, Best approach would be to join the #openstack-cinder channel on oftc.net and ask questions there.? Also, an opportunity to get help would be by bringing this up in the weekly Cinder meeting [1] . Hope this information helps! 
[1] https://wiki.openstack.org/wiki/CinderMeetings > > Cheers, > Danny > ------------------------------------------------------------------------ > *From:* Brian Rosmaita > *Sent:* 04 November 2021 19:19 > *To:* openstack-discuss at lists.openstack.org > > *Subject:* [cinder][rbd] festival of RBD driver reviews 11 november > 1500 utc > CAUTION: This email originates from outside THG > > By popular demand (really!), I'm scheduling a RBD driver review festival > for next week. It's a community driver, and we've got a backlog of > patches: > > https://review.opendev.org/q/project:openstack/cinder+status:open+file:cinder/volume/drivers/rbd.py > > If your patch is currently in merge conflict, it would be helpful if you > could get conflicts resolved before the festival. Also, if you have > questions about comments that have been left on your patch, this would > be a good time to get them answered. > > who: Everyone! > what: The Cinder Festival of RBD Driver Reviews > when: Thursday 11 November 2021 from 1500-1600 UTC > where: https://meet.google.com/fsb-qkfc-qun > etherpad: https://etherpad.opendev.org/p/cinder-festival-of-driver-reviews > > (Note that we're trying google meet for this session.) > > Danny Webb > Senior Linux Systems Administrator > The Hut Group > > Tel: > Email: Danny.Webb at thehutgroup.com > > > For the purposes of this email, the "company" means The Hut Group > Limited, a company registered in England and Wales (company number > 6539496) whose registered office is at Fifth Floor, Voyager House, > Chicago Avenue, Manchester Airport, M90 3DQ and/or any of its > respective subsidiaries. > > *Confidentiality Notice* > This e-mail is confidential and intended for the use of the named > recipient only. If you are not the intended recipient please notify us > by telephone immediately on +44(0)1606 811888 or return it to us by > e-mail. 
Please then delete it from your system and note that any use, > dissemination, forwarding, printing or copying is strictly prohibited. > Any views or opinions are solely those of the author and do not > necessarily represent those of the company. > > *Encryptions and Viruses* > Please note that this e-mail and any attachments have not been > encrypted. They may therefore be liable to be compromised. Please also > note that it is your responsibility to scan this e-mail and any > attachments for viruses. We do not, to the extent permitted by law, > accept any liability (whether in contract, negligence or otherwise) > for any virus infection and/or external compromise of security and/or > confidentiality in relation to transmissions sent by e-mail. > > *Monitoring* > Activity and use of the company's systems is monitored to secure its > effective use and operation and for other lawful business purposes. > Communications using these systems will also be monitored and may be > recorded to secure effective use and operation and for other lawful > business purposes. > > hgvyjuv -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmeng at uvic.ca Tue Nov 16 17:55:32 2021 From: dmeng at uvic.ca (dmeng) Date: Tue, 16 Nov 2021 09:55:32 -0800 Subject: [sdk]: Check instance error message Message-ID: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> Hello there, Hope everything is going well. I'm wondering if there is any method that could check the error message of an instance whose status is "ERROR"? Like from openstack cli, "openstack server show server_name", if the server is in "ERROR" status, this will return a field "fault" with a message that shows the error. I tried the compute service get_server and find_server, but neither of them show the error messages of an instance. Thanks and have a great day! Catherine -------------- next part -------------- An HTML attachment was scrubbed... 
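A sketch for the question above: recent openstacksdk releases are believed to expose the fault details as an attribute on the server object returned by the compute proxy, so something along the following lines may work — treat the attribute name (`fault`) and its keys (`code`, `message`) as assumptions to verify against the SDK version in use. The stand-in object below only simulates what `conn.compute.get_server()` would return from a real cloud:

```python
from types import SimpleNamespace


def server_fault_message(server):
    """Return the fault message of a server in ERROR state, or None.

    Assumes the server object exposes ``status`` and a ``fault``
    mapping with ``code``/``message`` keys, as openstacksdk's Server
    resource is believed to.
    """
    if getattr(server, "status", None) != "ERROR":
        return None
    fault = getattr(server, "fault", None) or {}
    return fault.get("message")


# Stand-in for: server = conn.compute.get_server("server_name")
server = SimpleNamespace(
    status="ERROR",
    fault={"code": 500, "message": "No valid host was found.",
           "created": "2021-11-16T17:55:32Z"},
)
print(server_fault_message(server))  # -> No valid host was found.
```

If the attribute turns out to be missing on older SDKs, issuing a raw `GET /servers/{id}` through the compute proxy and reading the `fault` key from the JSON body is another option to try.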
URL: From radoslaw.piliszek at gmail.com Tue Nov 16 19:13:31 2021 From: radoslaw.piliszek at gmail.com (Radosław Piliszek) Date: Tue, 16 Nov 2021 20:13:31 +0100 Subject: [kolla] Proposing Michal Arbet as Kolla core In-Reply-To: References: Message-ID: +1 On Tue, 16 Nov 2021 at 13:50, Michał Nasiadka wrote: > > Hi, > > I would like to propose adding Michal Arbet (kevko) to the kolla-core and kolla-ansible-core groups. > Michal did some great work on ProxySQL, is a consistent maintainer of Debian related images and has provided some helpful > reviews. > > Cores - please reply +1/-1 before the end of Friday 26th November. > > Thanks, > Michal From rosmaita.fossdev at gmail.com Tue Nov 16 22:24:39 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 16 Nov 2021 17:24:39 -0500 Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc In-Reply-To: <33e02f27-fadd-5c07-a697-b66621dbd03d@gmail.com> References: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> <33e02f27-fadd-5c07-a697-b66621dbd03d@gmail.com> Message-ID: <0cbc3047-71b9-89fc-ff1d-1f51642c65a9@gmail.com> On 11/16/21 1:38 PM, Jay Bryant wrote: > > On 11/16/2021 12:33 PM, Danny Webb wrote: >> Hi All, >> >> Apologies, I totally missed this meeting. My company is interested in >> taking up the backend QOS implementation >> (https://review.opendev.org/c/openstack/cinder/+/762794) of rbd driver >> and moving it towards completion. Would anyone be available to walk >> me through what's needed to finalise this? I can jump onto IRC in >> whichever openstack channel is required (bearing in mind I'm in GMT). > > Danny, > > Best approach would be to join the #openstack-cinder channel on oftc.net > and ask questions there. Also, an opportunity to get help would be by > bringing this up in the weekly Cinder meeting [1]. > > > Hope this information helps! 
> > > [1] https://wiki.openstack.org/wiki/CinderMeetings > In addition to what Jay said, the history on that patch indicates that the cinder team wanted a spec outlining the design of the feature [2]. You can find info about cinder specs here: https://docs.openstack.org/cinder/latest/contributor/contributing.html#new-feature-planning [2] https://review.opendev.org/c/openstack/cinder/+/762794/3#message-c3cacc61d6a5e7b1229b64e2d72b5b2a2c68404d >> >> Cheers, >> Danny >> ------------------------------------------------------------------------ >> *From:* Brian Rosmaita >> *Sent:* 04 November 2021 19:19 >> *To:* openstack-discuss at lists.openstack.org >> >> *Subject:* [cinder][rbd] festival of RBD driver reviews 11 november >> 1500 utc >> CAUTION: This email originates from outside THG >> >> By popular demand (really!), I'm scheduling a RBD driver review festival >> for next week. It's a community driver, and we've got a backlog of >> patches: >> >> https://review.opendev.org/q/project:openstack/cinder+status:open+file:cinder/volume/drivers/rbd.py >> >> If your patch is currently in merge conflict, it would be helpful if you >> could get conflicts resolved before the festival. Also, if you have >> questions about comments that have been left on your patch, this would >> be a good time to get them answered. >> >> who: Everyone! >> what: The Cinder Festival of RBD Driver Reviews >> when: Thursday 11 November 2021 from 1500-1600 UTC >> where: https://meet.google.com/fsb-qkfc-qun >> etherpad: https://etherpad.opendev.org/p/cinder-festival-of-driver-reviews >> >> (Note that we're trying google meet for this session.) 
>> >> Danny Webb >> Senior Linux Systems Administrator >> The Hut Group >> >> Tel: >> Email: Danny.Webb at thehutgroup.com >> >> >> For the purposes of this email, the "company" means The Hut Group >> Limited, a company registered in England and Wales (company number >> 6539496) whose registered office is at Fifth Floor, Voyager House, >> Chicago Avenue, Manchester Airport, M90 3DQ and/or any of its >> respective subsidiaries. >> >> *Confidentiality Notice* >> This e-mail is confidential and intended for the use of the named >> recipient only. If you are not the intended recipient please notify us >> by telephone immediately on +44(0)1606 811888 or return it to us by >> e-mail. Please then delete it from your system and note that any use, >> dissemination, forwarding, printing or copying is strictly prohibited. >> Any views or opinions are solely those of the author and do not >> necessarily represent those of the company. >> >> *Encryptions and Viruses* >> Please note that this e-mail and any attachments have not been >> encrypted. They may therefore be liable to be compromised. Please also >> note that it is your responsibility to scan this e-mail and any >> attachments for viruses. We do not, to the extent permitted by law, >> accept any liability (whether in contract, negligence or otherwise) >> for any virus infection and/or external compromise of security and/or >> confidentiality in relation to transmissions sent by e-mail. >> >> *Monitoring* >> Activity and use of the company's systems is monitored to secure its >> effective use and operation and for other lawful business purposes. >> Communications using these systems will also be monitored and may be >> recorded to secure effective use and operation and for other lawful >> business purposes. 
>> >> hgvyjuv From skaplons at redhat.com Wed Nov 17 07:26:14 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 17 Nov 2021 08:26:14 +0100 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> Message-ID: <5517904.DvuYhMxLoT@p1> Hi, I just checked neutron related things there and it seems there are 2 major issues there: 1. move of the tap-as-a-service from x/ to openstack/ namespace (that affects networking-midonet) - I will propose patch for that today. 2. remove of the neutron-lbaas repo (that affects much more than only neutron repos - for that I will try to propose patches this week as well. On wtorek, 16 listopada 2021 17:58:44 CET Clark Boylan wrote: > Hello, > > The OpenStack tenant in Zuul currently has 134 config errors. You can find > these errors at https://zuul.opendev.org/t/openstack/config-errors or by > clicking the blue bell icon in the top right of > https://zuul.opendev.org/t/openstack/status. The vast majority of these > errors appear related to project renames that have been requested of OpenDev > or project retirements. Can you please look into fixing these as they can be > an attractive nuisance when debugging Zuul problems (they also indicate that > a number of your jobs are probably not working). 
> > Project renames creating issues: > > * openstack/python-tempestconf -> osf/python-tempestconf -> > openinfra/python-tempestconf * openstack/refstack -> osf/refstack -> > openinfra/refstack > * x/tap-as-a-service -> openstack/tap-as-a-service > * openstack/networking-l2gw -> x/networking-l2gw > > Project retirements creating issues: > > * openstack/neutron-lbaas > * recordsansible/ara > > Projects whose configs have errors: > > * openinfra/python-tempestconf > * openstack/heat > * openstack/ironic > * openstack/kolla-ansible > * openstack/kuryr-kubernetes > * openstack/murano-apps > * openstack/networking-midonet > * openstack/networking-odl > * openstack/neutron > * openstack/neutron-fwaas > * openstack/python-troveclient > * openstack/senlin > * openstack/tap-as-a-service > * openstack/zaqar > * x/vmware-nsx > * openinfra/openstackid > * openstack/barbican > * openstack/cookbook-openstack-application-catalog > * openstack/heat-dashboard > * openstack/manila-ui > * openstack/python-manilaclient > > Let us know if we can help decipher any errors, > Clark -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From skaplons at redhat.com Wed Nov 17 07:36:08 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 17 Nov 2021 08:36:08 +0100 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5517904.DvuYhMxLoT@p1> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> Message-ID: <4359750.LvFx2qVVIh@p1> Hi, On ?roda, 17 listopada 2021 08:26:14 CET Slawek Kaplonski wrote: > Hi, > > I just checked neutron related things there and it seems there are 2 major > issues there: > 1. 
move of the tap-as-a-service from x/ to openstack/ namespace (that affects > networking-midonet) - I will propose patch for that today. > 2. remove of the neutron-lbaas repo (that affects much more than only neutron > repos - for that I will try to propose patches this week as well. There are also some missing job definitions in some of the neutron related repos and also issues with missing openstack/networking-l2gw project. I will take a look into all those issues in next days. > > On wtorek, 16 listopada 2021 17:58:44 CET Clark Boylan wrote: > > Hello, > > > > The OpenStack tenant in Zuul currently has 134 config errors. You can find > > these errors at https://zuul.opendev.org/t/openstack/config-errors or by > > clicking the blue bell icon in the top right of > > https://zuul.opendev.org/t/openstack/status. The vast majority of these > > errors appear related to project renames that have been requested of OpenDev > > or project retirements. Can you please look into fixing these as they can be > > an attractive nuisance when debugging Zuul problems (they also indicate that > > a number of your jobs are probably not working). 
> > > > Project renames creating issues: > > * openstack/python-tempestconf -> osf/python-tempestconf -> > > > > openinfra/python-tempestconf * openstack/refstack -> osf/refstack -> > > openinfra/refstack > > > > * x/tap-as-a-service -> openstack/tap-as-a-service > > * openstack/networking-l2gw -> x/networking-l2gw > > > > Project retirements creating issues: > > * openstack/neutron-lbaas > > * recordsansible/ara > > > > Projects whose configs have errors: > > * openinfra/python-tempestconf > > * openstack/heat > > * openstack/ironic > > * openstack/kolla-ansible > > * openstack/kuryr-kubernetes > > * openstack/murano-apps > > * openstack/networking-midonet > > * openstack/networking-odl > > * openstack/neutron > > * openstack/neutron-fwaas > > * openstack/python-troveclient > > * openstack/senlin > > * openstack/tap-as-a-service > > * openstack/zaqar > > * x/vmware-nsx > > * openinfra/openstackid > > * openstack/barbican > > * openstack/cookbook-openstack-application-catalog > > * openstack/heat-dashboard > > * openstack/manila-ui > > * openstack/python-manilaclient > > > > Let us know if we can help decipher any errors, > > Clark -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From tjoen at dds.nl Wed Nov 17 07:49:47 2021 From: tjoen at dds.nl (tjoen) Date: Wed, 17 Nov 2021 08:49:47 +0100 Subject: [sdk]: Check instance error message In-Reply-To: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> References: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> Message-ID: <8910d566-b5b7-de5e-0a98-36cd464e0d1f@dds.nl> On 11/16/21 18:55, dmeng wrote: > I'm wondering if there is any method that could check the error message > of an instance whose status is "ERROR"? 
Like from openstack cli, > "openstack server show server_name", if the server is in "ERROR" status, > this will return a field "fault" with a message that shows the error. I tried > the compute service get_server and find_server, but neither of them show > the error messages of an instance. Haven't I answered this a week ago? Look in the archives. From katonalala at gmail.com Wed Nov 17 07:52:47 2021 From: katonalala at gmail.com (Lajos Katona) Date: Wed, 17 Nov 2021 08:52:47 +0100 Subject: [edge][neutron][all] Edge related documentation Message-ID: Hi, During Yoga PTG the Neutron team had a very useful discussion together with some Designate folks, see the etherpad [1]. From Neutron team's perspective the task is to document how to set up an edge site with AZ, DNS.... etc. This is a cross-project effort (even if we just do it for Neutron-Designate today), is there a place for such documentation? [1]: https://etherpad.opendev.org/p/octavia-designate-neutron-ptg#L16 -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.vedel at univ-grenoble-alpes.fr Wed Nov 17 07:59:05 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Wed, 17 Nov 2021 08:59:05 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure Message-ID: Hello everyone I have a strange problem and I haven't found the solution yet. Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure". Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and in my opinion, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. I am trying to create a new instance to check general operation. ERROR. Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable"). 
I dig further and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). I create an empty volume: it works. I create a volume from an image: failed. However, I have my list of ten images in glance. I create a new image and create a volume with this new image: it works. I create an instance with this new image: OK. What is the problem? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. Is there a way to fix this, or do we have to reinstall them all? Thanks in advance for your help if this problem speaks to you. Franck VEDEL Dép. Réseaux Informatiques & Télécoms IUT1 - Univ GRENOBLE Alpes 0476824462 Stages, Alternance, Emploi. http://www.rtgrenoble.fr -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Wed Nov 17 08:13:34 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 17 Nov 2021 09:13:34 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming Message-ID: <2165480.iZASKD2KPV@p1> Hi, Recently I spent some time checking how many rechecks we need in Neutron to get a patch merged, and I compared it to some other OpenStack projects (see [1] for details). TL;DR - the results aren't good for us and I think we really need to do something about that. Of course the "easiest" thing to say is that we should fix the issues which we are hitting in the CI to make jobs more stable. But it's not that easy. We have been struggling with those jobs for a very long time. We have a CI-related meeting every week and we are fixing what we can there. Unfortunately there is still a bunch of issues which we can't fix so far because they are intermittent and hard to reproduce locally, or in some cases the issues aren't really related to Neutron, or there are new bugs which we need to investigate and fix :) So this is a never-ending battle for us. The problem is that we have to test various backends, drivers, etc.,
so as a result we have many jobs running on each patch - excluding UT, pep8 and docs jobs we have around 19 jobs in the check queue and 14 jobs in the gate queue. In the past we made a lot of improvements, e.g. we improved the irrelevant-files lists for jobs to run fewer jobs on some of the patches, together with the QA team we created the "integrated-networking" template to run only Neutron and Nova related scenario tests in the Neutron queues, and we removed and consolidated some of the jobs (there is still one patch in progress for that, but it should just remove around 2 jobs from the check queue). All of those are good improvements, but still not enough to make our CI really stable :/ Because of all of that, I would like to ask the community for any other ideas on how we can improve that. If You have any ideas, please send them in this email thread or reach out to me directly on IRC. We want to discuss them in the next video CI meeting, which will be on November 30th. If You have any idea and would like to join that discussion, You are more than welcome in that meeting of course :) [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025759.html -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From zigo at debian.org Wed Nov 17 09:22:51 2021 From: zigo at debian.org (Thomas Goirand) Date: Wed, 17 Nov 2021 10:22:51 +0100 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects Message-ID: Hi, About a year and a half ago, I attempted to add /healthcheck support by default in all projects.
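(For reference — and as a hedged sketch only, since the composite section and its existing entries differ per service — the paste wiring this needs is roughly the following api-paste.ini stanzas, following the oslo.middleware healthcheck documentation:)

```ini
# Illustrative api-paste.ini wiring for the oslo.middleware healthcheck
# app; the composite section name and existing mappings vary per project.
[composite:osapi_compute]
use = call:nova.api.openstack.urlmap:urlmap_factory
/healthcheck: healthcheck
# ... existing version/API mappings stay as they are ...

[app:healthcheck]
paste.app_factory = oslo_middleware:Healthcheck.app_factory
backends = disable_by_file
disable_by_file_path = /etc/nova/healthcheck_disable
```

The disable_by_file backend lets an operator touch the configured file to make the endpoint return 503, so a load-balancer can drain the node gracefully.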
For Nova, this resulted in this patch: https://review.opendev.org/c/openstack/nova/+/724684 For other projects, it's been merged almost everywhere (I'd have to survey all projects to see if that's the case, or if I still have Debian-specific patches somewhere). Though for Nova, this sparked a discussion where it's been said that the current implementation of /healthcheck wasn't good enough. This resulted in threads about how to better do it. Unfortunately, this blocked my patch from being merged in Nova. It is my point of view that we should recognize a failure here. The /healthcheck URL was added in oslo.middleware so one can use it with something like haproxy to verify that the API is up and responds. It was never designed to check, for example, if nova-api has valid connectivity to MySQL and RabbitMQ. Yes, this would be welcome, but in the meantime, operators must tweak the default file to have a valid, usable /etc/nova/api-paste.ini. So I am hereby asking the nova team: can we please move forward and agree that 1.5 years waiting for such a minor patch is too long, and that such a patch should be approved, prior to having a better healthcheck mechanism? I don't think it's a good idea to ask Nova users to wait potentially more development cycles to have a good-by-default api-paste.ini file. At the same time, I am wondering: is anyone even working on a better healthcheck system? I haven't heard that anyone is working on this. Though it would be more than welcome. Currently, to check that a daemon is alive and well, operators are stuck with:

- checking with ss if the daemon is correctly connected to a given port
- checking the logs for rabbitmq and mysql errors (with something like filebeat + elasticsearch and alerting)

Clearly, this doesn't scale. When running many large OpenStack clusters, it is not trivial to have a monitoring system that works and scales. The effort to deploy such a monitoring system is also not trivial at all.
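(To make the haproxy use case concrete, here is a self-contained sketch of the probe semantics: a stub HTTP server stands in for an API process. The /healthcheck path and the disable-by-file behaviour mimic oslo.middleware, but this is illustrative code, not the real middleware:)

```python
# Illustrative sketch of the haproxy-style /healthcheck probe semantics.
# A stub HTTP server stands in for an API process; the disable-by-file
# behaviour mimics oslo.middleware's "disable_by_file" backend, but this
# is NOT the real middleware.
import http.server
import os
import tempfile
import threading
import urllib.error
import urllib.request

DISABLE_FILE = os.path.join(tempfile.gettempdir(), "healthcheck_disable")

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthcheck":
            self.send_error(404)
            return
        if os.path.exists(DISABLE_FILE):
            self.send_response(503)          # drained: LB removes the node
            body = b"DISABLED BY FILE"
        else:
            self.send_response(200)          # alive: LB keeps the node
            body = b"OK"
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):            # keep the demo quiet
        pass

def probe(port):
    """Return the HTTP status code an haproxy check would observe."""
    try:
        with urllib.request.urlopen(f"http://127.0.0.1:{port}/healthcheck") as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

healthy = probe(port)                        # 200 while the file is absent
open(DISABLE_FILE, "w").close()
drained = probe(port)                        # 503 once the file exists
os.remove(DISABLE_FILE)
server.shutdown()
print(healthy, drained)
```

A load-balancer check is exactly this probe run on a timer: 200 keeps the backend in rotation, anything else takes it out.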
So what's been discussed at the time for improving the monitoring would be very much welcome, though not only for the API service: something to check the health of other daemons would be very much welcome. I'd very much would like to participate in a Yoga effort to improve the current situation, and contribute the best I can, though I'm not sure I'd be the best person to drive this... Is there anyone else willing to work on this? Hoping this message is helpful, Cheers, Thomas Goirand (zigo) From balazs.gibizer at est.tech Wed Nov 17 10:18:03 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Wed, 17 Nov 2021 11:18:03 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <2165480.iZASKD2KPV@p1> References: <2165480.iZASKD2KPV@p1> Message-ID: <3MOP2R.O83SZVO0NWN23@est.tech> On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski wrote: > Hi, > > Recently I spent some time to check how many rechecks we need in > Neutron to > get patch merged and I compared it to some other OpenStack projects > (see [1] > for details). > TL;DR - results aren't good for us and I think we really need to do > something > with that. I really like the idea of collecting such stats. Thank you for doing it. I can even imagine to make a public dashboard somewhere with this information as it is a good indication about the health of our projects / testing. > > Of course "easiest" thing to say is that we should fix issues which > we are > hitting in the CI to make jobs more stable. But it's not that easy. > We are > struggling with those jobs for very long time. We have CI related > meeting > every week and we are fixing what we can there. 
> Unfortunately there is still bunch of issues which we can't fix so > far because > they are intermittent and hard to reproduce locally or in some cases > the > issues aren't realy related to the Neutron or there are new bugs > which we need > to investigate and fix :) I have couple of suggestion based on my experience working with CI in nova. 1) we try to open bug reports for intermittent gate failures too and keep them tagged in a list [1] so when a job fail it is easy to check if the bug is known. 2) I offer my help here now that if you see something in neutron runs that feels non neutron specific then ping me with it. Maybe we are struggling with the same problem too. 3) there was informal discussion before about a possibility to re-run only some jobs with a recheck instead for re-running the whole set. I don't know if this is feasible with Zuul and I think this only treat the symptom not the root case. But still this could be a direction if all else fails. Cheers, gibi > So this is never ending battle for us. The problem is that we have > to test > various backends, drivers, etc. so as a result we have many jobs > running on > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs > in check > and 14 jobs in gate queue. > > In the past we made a lot of improvements, like e.g. we improved > irrelevant > files lists for jobs to run less jobs on some of the patches, > together with QA > team we did "integrated-networking" template to run only Neutron and > Nova > related scenario tests in the Neutron queues, we removed and > consolidated some > of the jobs (there is still one patch in progress for that but it > should just > remove around 2 jobs from the check queue). All of that are good > improvements > but still not enough to make our CI really stable :/ > > Because of all of that, I would like to ask community about any other > ideas > how we can improve that. 
If You have any ideas, please send it in > this email > thread or reach out to me directly on irc. > We want to discuss about them in the next video CI meeting which will > be on > November 30th. If You would have any idea and would like to join that > discussion, You are more than welcome in that meeting of course :) > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ > 025759.html [1] https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_last_updated&start=0 > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat From arnaud.morin at gmail.com Wed Nov 17 10:54:58 2021 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Wed, 17 Nov 2021 10:54:58 +0000 Subject: [nova] weird python stacktrace in nova-compute Message-ID: Hey all, We have some python stacktrace in our nova-compute journalctl logs, which looks like that: $ journalctl -u nova-compute ... Nov 01 09:04:03 host1234 python3[161354]: Exception in thread tpool_thread_6: Nov 01 09:04:03 host1234 python3[161354]: Traceback (most recent call last): Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/threading.py", line 916, in _bootstrap_inner Nov 01 09:04:03 host1234 python3[161354]: self.run() Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/threading.py", line 864, in run Nov 01 09:04:03 host1234 python3[161354]: self._target(*self._args, **self._kwargs) Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/site-packages/eventlet/tpool.py", line 96, in tworker Nov 01 09:04:03 host1234 python3[161354]: _wsock.sendall(_bytetosend) Nov 01 09:04:03 host1234 python3[161354]: TimeoutError: [Errno 110] Connection timed out Nov 01 09:04:03 host1234 python3[161354]: Traceback (most recent call last): Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/site-packages/eventlet/hubs/poll.py", line 109, in wait Nov 01 09:04:03 host1234 
python3[161354]: listener.cb(fileno) Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/site-packages/eventlet/tpool.py", line 58, in tpool_trampoline Nov 01 09:04:03 host1234 python3[161354]: assert _c Nov 01 09:04:03 host1234 python3[161354]: AssertionError Nov 01 09:04:03 host1234 python3[161354]: Removing descriptor: 16 ... After this, nova-compute is "stuck" and does nothing more, but still continue to answer on RPC (it is still up in nova services) We am not experts of python threading / eventlet stuff, and we have no idea how to debug this. Our current solution is to restart nova-compute, but it's more a dirty workaround than a real fix. Does it ring a bell to someone in the community? Cheers, Arnaud From pierre at stackhpc.com Wed Nov 17 11:07:58 2021 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 17 Nov 2021 12:07:58 +0100 Subject: [all] git-review broken by git version 2.34.0 Message-ID: Hello, I recently upgraded to the latest git release 2.34.0 and noticed it breaks git-review (output below is with -v): Errors running git rebase -p -i remotes/gerrit/stable/xena fatal: --preserve-merges was replaced by --rebase-merges I submitted a patch [1], but since it would break compatibility with git versions older than 2.18 that don't support the --rebase-merges (-r) option, it may need to be refined before being merged. Cheers, Pierre [1] https://review.opendev.org/c/opendev/git-review/+/818219 From senrique at redhat.com Wed Nov 17 11:12:42 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 17 Nov 2021 08:12:42 -0300 Subject: [cinder] Bug deputy report for week of 11-17-2021 Message-ID: No meeting today :( This is a bug report from 11-10-2021 to 11-17-2021. 
Agenda: https://etherpad.opendev.org/p/cinder-bug-squad-meeting ----------------------------------------------------------------------------------------- High - https://bugs.launchpad.net/cinder/+bug/1950474 "Xena accept transfer policy breaks volume transfer workflow". Unassigned. Medium - https://bugs.launchpad.net/cinder/+bug/1950291 "tempest-integrated-compute-centos-8-stream fails with version conflict in boto3". Unassigned. - https://bugs.launchpad.net/cinder/+bug/1951163 "Unable to import/manage an encrypted volume". Unassigned. - https://bugs.launchpad.net/cinder/+bug/1951046 "DS8000 driver terminates volume connection when there still has volume attached to instances - Ussuri". Unassigned. Cheers, -- Sofía Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Nov 17 13:13:57 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 17 Nov 2021 13:13:57 +0000 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: Message-ID: On Wed, 2021-11-17 at 10:22 +0100, Thomas Goirand wrote: > Hi, > > About a year and a half ago, I attempted to add /healthcheck support by > default in all projects. For Nova, this resulted in this patch: > > https://review.opendev.org/c/openstack/nova/+/724684 > > For other projects, it's been merged almost everywhere (I'd have to > survey all project to see if that's the case, or if I still have debian > specific patches somewhere). > > Though for Nova, this sparked a discussion where it's been said that the > current implementation of /healthcheck wasn't good enough. This resulted > in threads about how to better do it. > > Unfortunately, this blocked my patch from being merged in Nova.
> > It is my point of view to recognize a failure here. The /healthcheck URL > was added in oslo.middleware so one can use it with something like > haproxy to verify that the API is up, and responds. It was never > designed to check, for example, if nova-api has a valid connectivity to > MySQL and RabbitMQ. Yes, this is welcome, but in the mean time, > operators must tweak the default file to have a valid, useable > /etc/nova/api-paste.ini. > > So I am hereby asking the nova team: > > Can we please move forward and agree that 1.5 years waiting for such a > minor patch is too long, and that such patch should be approved, prior > to having a better healtcheck mechanism? I don't think it's a good idea > to ask Nova users to wait potentially more development cycles to have a > good-by-default api-paste.ini file.

I am currently working on an alternative solution for this cycle. I still believe it would be incorrect to add the healthcheck provided by oslo.middleware to nova. We discussed this at the PTG this cycle and still did not think it was the correct way to approach this, but we did agree to work on adding an alternative form of health checks this cycle. I fundamentally believe bad health checks are worse than no health checks, and the oslo middleware provides bad health checks. Since the /healthcheck endpoint can be added via api-paste.ini manually, I don't think we should add it to our defaults, and packagers shouldn't either. One open question in my draft spec is whether, for the nova api in particular, we should support /healthcheck on the normal api port instead of the dedicated health check endpoint.

> > At the same time, I am wondering: is anyone even working on a better > healthcheck system? I haven't heard that anyone is working on this.

Yes, so I need to push the spec for review; I'll see if I can do that today, or at a minimum this week. The tl;dr is as follows.
nova will be extended with 2 additional options to allow a health check endpoint to be exposed on a TCP port and/or a unix socket. These health check endpoints will not be authenticated and will be disabled by default. All nova binaries (nova-api, nova-scheduler, nova-compute, ...) will support exposing the endpoint. The processes will internally update a health check data structure whenever they perform specific operations that can be used as a proxy for the health of the binary (DB query, RPC ping, request to libvirt); these will be binary specific. The overall health will be summarised with a status enum; the exact values are to be determined, but I'm working with (OK, DEGRADED, FAULT) for now. In the degraded and fault states there will also be a message and likely a details field in the response. The message would be human readable, with details being the actual content of the health check data structure. I have not decided if I should use HTTP status codes as part of the way to signal the status; my instincts say no. Parsing the JSON response should be simple, and if you just need to check the status field for ok|degraded|fault, using a 5XX error code in the degraded or fault case would not be semantically correct. The current set of use cases I am using to drive the design of the spec is as follows.

Use Cases
---------

As an operator I want a simple health check I can consume to know if a nova process is OK, Degraded or Faulty.

As an operator I want this health check to not impact performance of the service, so it can be queried frequently at short intervals.

As a deployment tool implementer I want the health check to be local, with no dependencies on other hosts or services to function, so I can integrate it with service managers such as systemd or container runtimes like docker.

As a packager I would like health checks to not require special clients or packages to consume them. curl, socat or netcat should be all that is required to connect to the health check and retrieve the service status.
As an operator I would like to be able to use health checks of the nova api and metadata services to manage the membership of endpoints in my load-balancer or reverse proxy automatically.

> > >> Though it would be more than welcome. Currently, to check that a daemon >> is alive and well, operators are stuck with: >> >> - checking with ss if the daemon is correctly connected to a given port >> - check the logs for rabbitmq and mysql errors (with something like >> filebeat + elastic search and alarming) >> >> Clearly, this doesn't scale. When running many large OpenStack clusters, >> it is not trivial to have a monitoring system that works and scales. The >> effort to deploy such a monitoring system is also not trivial at all. So >> what's been discussed at the time for improving the monitoring would be >> very much welcome, though not only for the API service: something to >> check the health of other daemons would be very much welcome. >> >> I'd very much would like to participate in a Yoga effort to improve the >> current situation, and contribute the best I can, though I'm not sure >> I'd be the best person to drive this... Is there anyone else willing to >> work on this?

Yep, I am. Feel free to ping me on IRC: sean-k-mooney, in case you're wondering, but we have talked before. I have not configured my default channels since the change to OFTC, but I'm always in at least #openstack-nova.

After discussing this in the nova PTG session, the design took a hard right turn: from being based on an RPC-like protocol exposed over a unix socket, with OVOs as the data format and active probes, to an HTTP-based endpoint, available over TCP and/or a unix socket, with JSON as the response format and a semi-global data structure with a TTL for the data. As a result I have had to rethink and rework most of the draft spec I had prepared. The main point of design that we need to agree on is exactly how that data structure is accessed and where it is stored.
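To make the response shape described above concrete, a payload with the status enum plus a message and details could look like the following — all field names are illustrative guesses, not taken from the draft spec:

```python
# Illustrative only: one possible health-check payload for the draft
# design described above (status enum + human-readable message + raw
# details). Field names are guesses, not from the actual spec.
import json
import time

checks = {
    # name -> (healthy?, last_update_timestamp, note)
    "db": (True, time.time(), "last query ok"),
    "rpc": (True, time.time(), "rpc ping ok"),
    "libvirt": (False, time.time(), "connection reset"),
}

failed = [name for name, (ok, _, _) in checks.items() if not ok]
if not failed:
    status = "OK"
elif len(failed) < len(checks):
    status = "DEGRADED"   # some, but not all, checks failing
else:
    status = "FAULT"      # every check failing

payload = {
    "status": status,
    "message": "degraded: " + ", ".join(failed) if failed else "all checks passing",
    "details": {name: {"ok": ok, "updated": ts, "note": note}
                for name, (ok, ts, note) in checks.items()},
}
print(json.dumps(payload, indent=2))
```

A consumer that only needs liveness reads `payload["status"]`; a human debugging a DEGRADED node reads `details`.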
In the original design I proposed, there was no need to store any kind of state or to modify existing functions to add health checks: each nova service manager would just implement a new healthcheck function that would be passed as a callback to the healthcheck manager which exposed the endpoint. With the new approach we will likely add decorators to important functions that will update the health checks based on whether that function completes correctly. If we take the decorator approach, because of how decorators work it can only access module-level variables, class methods/members, or the parameters of the function it is decorating. What that effectively means is that either the health check manager needs to be stored in a module-level "global" variable, it needs to be a singleton accessible via a class method, or it needs to be stored in a data structure that is passed to almost every function, specifically the context object. I am leaning towards the context object, but I need to understand how that will interact with RPC calls, so it might end up being a global/singleton, which sucks from a unit/functional testing perspective, but we can make it work via fixtures. Hopefully this sounds like good news to you, but feel free to give feedback.
> > Hoping this message is helpful, > Cheers, > > Thomas Goirand (zigo) > From cboylan at sapwetik.org Wed Nov 17 14:48:42 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 17 Nov 2021 06:48:42 -0800 Subject: [all] git-review broken by git version 2.34.0 In-Reply-To: References: Message-ID: On Wed, Nov 17, 2021, at 3:07 AM, Pierre Riteau wrote: > Hello, > > I recently upgraded to the latest git release 2.34.0 and noticed it > breaks git-review (output below is with -v): > > Errors running git rebase -p -i remotes/gerrit/stable/xena > fatal: --preserve-merges was replaced by --rebase-merges > > I submitted a patch [1], but since it would break compatibility with > git versions older than 2.18 that don't support the --rebase-merges > (-r) option, it may need to be refined before being merged. Ubuntu Bionic and CentOS7 both have older git versions than 2.18. If it isn't terrible to continue to support older git that may be a good idea, but we can also suggest those installations pin their git-review to the current version instead. As a side note the man page for git rebase says that --preserve-merges and --interactive are incompatible options and yet they are both set in the failed command above. I wonder what sort of behavior we were getting out of git when this "worked". 
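The version gate being discussed boils down to a small pure function; a sketch (not git-review's actual implementation) of choosing the flag from `git --version` output:

```python
# Sketch (not git-review's real code) of picking the merge-preserving
# rebase flag from the installed git version: --rebase-merges exists
# from git 2.18 on, while --preserve-merges was removed in git 2.34.
import re

def rebase_merges_flag(version_output):
    """Return the merge-preserving rebase flag for this git version."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if not match:
        raise ValueError("cannot parse git version: %r" % version_output)
    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) >= (2, 18):
        return "--rebase-merges"
    return "--preserve-merges"

print(rebase_merges_flag("git version 2.34.0"))   # --rebase-merges
print(rebase_merges_flag("git version 2.17.1"))   # --preserve-merges
```

Callers would splice the returned flag into the `git rebase -i` command line instead of hard-coding either option.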
> > Cheers, > Pierre > > [1] https://review.opendev.org/c/opendev/git-review/+/818219 From cboylan at sapwetik.org Wed Nov 17 15:20:13 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 17 Nov 2021 07:20:13 -0800 Subject: [all] git-review broken by git version 2.34.0 In-Reply-To: References: Message-ID: On Wed, Nov 17, 2021, at 6:48 AM, Clark Boylan wrote: > On Wed, Nov 17, 2021, at 3:07 AM, Pierre Riteau wrote: >> Hello, >> >> I recently upgraded to the latest git release 2.34.0 and noticed it >> breaks git-review (output below is with -v): >> >> Errors running git rebase -p -i remotes/gerrit/stable/xena >> fatal: --preserve-merges was replaced by --rebase-merges >> >> I submitted a patch [1], but since it would break compatibility with >> git versions older than 2.18 that don't support the --rebase-merges >> (-r) option, it may need to be refined before being merged. > > Ubuntu Bionic and CentOS7 both have older git versions than 2.18. If it > isn't terrible to continue to support older git that may be a good > idea, but we can also suggest those installations pin their git-review > to the current version instead. I pushed a follwup, https://review.opendev.org/c/opendev/git-review/+/818238, that can be squashed into the parent if this works in testing. It attempts to check the version and set the flag appropriately. Clark From dmendiza at redhat.com Wed Nov 17 15:51:48 2021 From: dmendiza at redhat.com (Douglas Mendizabal) Date: Wed, 17 Nov 2021 09:51:48 -0600 Subject: [barbican] No weekly meeting next week Message-ID: <79495e55-da52-45a3-1d40-0e04fe4683ca@redhat.com> Hi Barbicaneers, I'll be out on PTO next week, so I'm canceling the Barbican weekly meeting for November 23. Meeting will resume the following week on November 30. 
Thanks, - Douglas Mendizábal (redrobot) From cboylan at sapwetik.org Wed Nov 17 15:51:57 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 17 Nov 2021 07:51:57 -0800 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3MOP2R.O83SZVO0NWN23@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> Message-ID: <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: > Snip. I want to respond to a specific suggestion: > 3) there was informal discussion before about a possibility to re-run > only some jobs with a recheck instead for re-running the whole set. I > don't know if this is feasible with Zuul and I think this only treat > the symptom not the root case. But still this could be a direction if > all else fails. > OpenStack has configured its check and gate queues with something we've called "clean check". This refers to the requirement that before an OpenStack project can be gated it must pass check tests first. This policy was instituted because a number of these infrequent but problematic issues were traced back to recheck spamming. Basically changes would show up and were broken. They would fail some percentage of the time. They got rechecked until they finally merged and now their failure rate is added to the whole. This rule was introduced to make it more difficult to get this flakiness into the gate. Locking in test results is in direct opposition to the existing policy and goals. Locking results would make it far more trivial to land such flakiness as you wouldn't need entire sets of jobs to pass before you could land. Instead you could rerun individual jobs until each one passed and then land the result. Potentially introducing significant flakiness with a single merge. Locking results is also not really something that fits well with the speculative gate queues that Zuul runs.
Remember that Zuul constructs a future git state and tests that in parallel. Currently the state for OpenStack looks like:

A - Nova
^
B - Glance
^
C - Neutron
^
D - Neutron
^
F - Neutron

The B glance change is tested as if the A Nova change has already merged and so on down the queue. If we want to keep these speculative states we can't really have humans manually verify a failure can be ignored and retry it. Because we'd be enqueuing job builds at different stages of speculative state. Each job build would be testing a different version of the software. What we could do is implement a retry limit for failing jobs. Zuul could rerun failing jobs X times before giving up and reporting failure (this would require updates to Zuul). The problem with this approach is without some oversight it becomes very easy to land changes that make things worse. As a side note Zuul does do retries, but only for detected network errors or when a pre-run playbook fails. The assumption is that network failures are due to the dangers of the Internet, and that pre-run playbooks are small, self contained, unlikely to fail, and when they do fail the failure should be independent of what is being tested. Where does that leave us? I think it is worth considering the original goals of "clean check". We know that rechecking/rerunning only makes these problems worse in the long term. They represent technical debt. One of the reasons we run these tests is to show us when our software is broken. In the case of flaky results we are exposing this technical debt where it impacts the functionality of our software. The longer we avoid fixing these issues the worse it gets, and this is true even with "clean check". Do we as developers find value in knowing the software needs attention before it gets released to users? Do the users find value in running reliable software?
In the past we have asserted that "yes, there is value in this", and have invested in tracking, investigating, and fixing these problems even if they happen infrequently. But that does require investment, and active maintenance. Clark From zigo at debian.org Wed Nov 17 16:03:20 2021 From: zigo at debian.org (Thomas Goirand) Date: Wed, 17 Nov 2021 17:03:20 +0100 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: Message-ID: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Hi Sean, thanks for your reply! On 11/17/21 2:13 PM, Sean Mooney wrote: > i am currently wokring on an alternitive solution for this cycle. gr8! > i still belive it woudl be incorrect to add teh healtcheck provided by oslo.middelware to nova. > we disucssed this at the ptg this cycel and still did nto think it was the correct way to approch this > but we did agree to work on adding an alternitive form of health checks this cycle. > i fundementally belive bad healthchecks are worse then no helatch checks and the olso midelware provides bad healthchecks. The current implementation is only useful for plugging haproxy to APIs, nothing more, nothing less. > since the /healthcheck denpoint can be added via api-paste.ini manually i dont think we shoudl add it to our > default or that packageagre shoudl either. Like it or not, the current state of things is: - /healthcheck is activated everywhere (I patched that myself) - The nova package at least in Debian has it activated by default (as this is the only project that refused the patch, I carry it in the package). Also, many operators already use the /healthcheck in production, so you really want to keep it. IMO, your implementation should switch to a different endpoint if you wish to not retain compatibility with the older system. 
For this reason, I strongly believe that the Nova team should be revising its view from a year and a half, and accept the imperfect currently implemented /healthcheck. This is not mutually exclusive to a better implementation bound on some other URL. > one open question in my draft spec is for the nova api in particaly should we support /healtcheck on the normal api port instead of > the dedeicated health check endpoint. You should absolutely not break backward compatibility!!! > yes so i need to push the spec for review ill see if i can do that today or at a minium this week. > the tldr is as follows. > > nova will be extended with 2 addtional options to allow a health checks endpoint to be exposed on a tcp port > and/or a unix socket. these heatlth check endpoints will not be authenticated will be disabel by default. > all nova binaries (nova-api, nova-schduler, nova-compute, ...) will supprot exposing the endpoint. > > the process will internally update a heathcheck data structure when ever they perform specific operation that > can be uses as a proxy for the healt of the binary (db query, rpc ping, request to libvirt) these will be binary specific. > > The over all health will be summerised with a status enum, exact values to be determind but im working with (OK, DEGRADED, FAULT) > for now. in the degraded and fault state there will also be a mesage and likely details filed in the respocne. > message would be human readable with detail being the actual content of the health check data structure. > > i have not decided if i should use http status codes as part of the way to singal the status, my instinct are saying no > parsing the json reponce shoudl be simple and if you just need to check the status filed for ok|degreated|falut using a 5XX error code > in the degraded of fault case would not be semanticly correct. All you wrote above is great. 
For the http status codes, please implement it, because it's cheap, and that's how Zabbix (and probably other monitoring systems) works, plus everyone understands them. > Use Cases > --------- > > As an operator i want a simple health-check i can consume to know > if a nova process is OK, Degraded or Faulty. > > As an operator i want this health-check to not impact performance of the > service so it can be queried frequently at short intervals. > > As a deployment tool implementer i want the health check to be local with no > dependencies on other hosts or services to function so i can integrate it with > service managers such as systemd or container runtimes like docker > > As a packager i would like health-checks to not require special clients or > packages to consume them. CURL, socat or netcat should be all that is required to > connect to the health check and retrieve the service status. > > As an operator i would like to be able to use health-checks of the nova api and > metadata services to manage the membership of endpoints in my load-balancer > or reverse proxy automatically. > > >> Though it would be more than welcome. Currently, to check that a daemon >> is alive and well, operators are stuck with: >> >> - checking with ss if the daemon is correctly connected to a given port >> - checking the logs for rabbitmq and mysql errors (with something like >> filebeat + elasticsearch and alarming) >> >> Clearly, this doesn't scale. When running many large OpenStack clusters, >> it is not trivial to have a monitoring system that works and scales. The >> effort to deploy such a monitoring system is also not trivial at all. So >> what's been discussed at the time for improving the monitoring would be >> very much welcome, though not only for the API service: something to >> check the health of other daemons would be very much welcome.
>> >> I'd very much like to participate in a Yoga effort to improve the >> current situation, and contribute the best I can, though I'm not sure >> I'd be the best person to drive this... Is there anyone else willing to >> work on this? > > yep i am, feel free to ping me on irc: sean-k-mooney in case your wondering, but we have talked before. Yes. Feel free to ping me as well, I'll enjoy contributing where I can (though I know you're more skilled than I am in OpenStack's Python code... I'll still do what I can). > i have not configured my default channels since the change to oftc but im always in at least #openstack-nova > after discussing this in the nova ptg session the design took a hard right turn from being based on an rpc-like protocol > exposed over a unix socket with ovos as the data format and active probes, to a http-based endpoint, available over tcp and/or unix socket, > with json as the response format and a semi-global data structure with a TTL for the data. > > as a result i have had to rethink and rework most of the draft spec i had prepared. > The main point of design that we need to agree on is exactly how that data structure is accessed and where it is stored. > > in the original design i proposed there was no need to store any kind of state or modify existing functions to add healthchecks. > each nova service manager would just implement a new healthcheck function that would be passed as a callback to the healthcheck manager which exposed the endpoint. > > With the new approach we will likely add decorators to important functions that will update the healthchecks based on whether that function completes correctly. > if we take the decorator approach, because of how decorators work it can only access module level variables, class methods/members or the parameters to the function it is decorating.
> what that effectively means is either the health check manager needs to be stored in a module level "global" variable, it needs to be a singleton accessible via a class method, > or it needs to be stored in a data structure that is passed to almost every function, specifically the context object. > > i am leaning towards the context object but i need to understand how that will interact with RPC calls, so it might end up being a global/singleton, which sucks from a unit/functional testing > perspective, but we can make it work via fixtures. > > hopefully this sounds like good news to you but feel free to give feedback. I don't like the fact that we're still having this discussion 1.5 years after the proposed patch, and that it still delays Nova from following what all the other projects have approved. Again, what you're doing should not be mutually exclusive with adding what already works, and what is already in production. It was said a year and a half ago, and it's still true. A year and a half ago, we even discussed the fact that it would be a shame if it took more than a year... So can we move forward? Anyway, I'm excited that this is going forward, so thanks again for leading this initiative. Cheers, Thomas Goirand (zigo) From yasufum.o at gmail.com Wed Nov 17 16:26:22 2021 From: yasufum.o at gmail.com (Yasufumi Ogawa) Date: Thu, 18 Nov 2021 01:26:22 +0900 Subject: [tacker] Skip next IRC meeting Message-ID: <5846722b-5675-f949-2947-eecf0e098302@gmail.com> Hi team, Due to my absence from work, I would like to skip the next IRC meeting on November 23.
Thanks, Yasufumi From rosmaita.fossdev at gmail.com Wed Nov 17 17:10:57 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 17 Nov 2021 12:10:57 -0500 Subject: [cinder] yoga R-17 virtual midcycle on 1 december Message-ID: As decided at today's weekly meeting, the Cinder Yoga R-17 virtual midcycle will be held: DATE: Wednesday 1 December 2021 TIME: 1400-1600 UTC LOCATION: https://bluejeans.com/3228528973 The meeting will be recorded. Please add topics to the midcycle etherpad: https://etherpad.opendev.org/p/cinder-yoga-midcycles cheers, brian From cyril at redhat.com Wed Nov 17 20:11:10 2021 From: cyril at redhat.com (Cyril Roelandt) Date: Wed, 17 Nov 2021 21:11:10 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: Message-ID: Hello, On 2021-11-17 08:59, Franck VEDEL wrote: > Hello everyone > > I have a strange problem and I haven't found the solution yet. > Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. > Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. > > I am trying to create a new instance to check general operation. ERROR. > Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). We'd like to see the logs as well, especially the stacktrace. > I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). > > I create an empty volume: it works. > I am creating a volume from an image: Failed. What commands are you running? What's the output? What's in the logs? > > However, I have my list of ten images in glance. > > I create a new image and create a volume with this new image: it works. 
> I create an instance with this new image: OK. > > What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. > Is there a way to fix this, or do we have to reinstall them all? What's your configuration? What version of OpenStack are you running? Cyril From mnaser at vexxhost.com Wed Nov 17 20:53:22 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 17 Nov 2021 15:53:22 -0500 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Message-ID: I don't think we rely on /healthcheck -- there's nothing healthy about an API endpoint blindly returning a 200 OK. You might as well just hit / and accept 300 as a code and that's exactly the same behaviour. I support what Sean is bringing up here and I don't think it makes sense to have a noop /healthcheck that always gives a 200 OK...seems a bit useless imho On Wed, Nov 17, 2021 at 11:09 AM Thomas Goirand wrote: > > Hi Sean, thanks for your reply! > > On 11/17/21 2:13 PM, Sean Mooney wrote: > > i am currently wokring on an alternitive solution for this cycle. > > gr8! > > > i still belive it woudl be incorrect to add teh healtcheck provided by oslo.middelware to nova. > > we disucssed this at the ptg this cycel and still did nto think it was the correct way to approch this > > but we did agree to work on adding an alternitive form of health checks this cycle. > > i fundementally belive bad healthchecks are worse then no helatch checks and the olso midelware provides bad healthchecks. > > The current implementation is only useful for plugging haproxy to APIs, > nothing more, nothing less. > > > since the /healthcheck denpoint can be added via api-paste.ini manually i dont think we shoudl add it to our > > default or that packageagre shoudl either. 
> > Like it or not, the current state of things is: > - /healthcheck is activated everywhere (I patched that myself) > - The nova package at least in Debian has it activated by default (as > this is the only project that refused the patch, I carry it in the package). > > Also, many operators already use the /healthcheck in production, so you > really want to keep it. IMO, your implementation should switch to a > different endpoint if you wish to not retain compatibility with the > older system. > > For this reason, I strongly believe that the Nova team should be > revising its view from a year and a half, and accept the imperfect > currently implemented /healthcheck. This is not mutually exclusive to a > better implementation bound on some other URL. > > > one open question in my draft spec is for the nova api in particaly should we support /healtcheck on the normal api port instead of > > the dedeicated health check endpoint. > > You should absolutely not break backward compatibility!!! > > > yes so i need to push the spec for review ill see if i can do that today or at a minium this week. > > the tldr is as follows. > > > > nova will be extended with 2 addtional options to allow a health checks endpoint to be exposed on a tcp port > > and/or a unix socket. these heatlth check endpoints will not be authenticated will be disabel by default. > > all nova binaries (nova-api, nova-schduler, nova-compute, ...) will supprot exposing the endpoint. > > > > the process will internally update a heathcheck data structure when ever they perform specific operation that > > can be uses as a proxy for the healt of the binary (db query, rpc ping, request to libvirt) these will be binary specific. > > > > The over all health will be summerised with a status enum, exact values to be determind but im working with (OK, DEGRADED, FAULT) > > for now. in the degraded and fault state there will also be a mesage and likely details filed in the respocne. 
> > message would be human readable with detail being the actual content of the health check data structure. > > > > i have not decided if i should use http status codes as part of the way to singal the status, my instinct are saying no > > parsing the json reponce shoudl be simple and if you just need to check the status filed for ok|degreated|falut using a 5XX error code > > in the degraded of fault case would not be semanticly correct. > > All you wrote above is great. For the http status codes, please > implement it, because it's cheap, and that's how Zabbix (and probably > other monitoring systems) works, plus everyone understand them. > > > Use Cases > > --------- > > > > As a operator i want a simple health-check i can consume to know > > if a nova process is OK, Degraded or Faulty. > > > > As an operator i want this health-check to not impact performance of the > > service so it can be queried frequently at short intervals. > > > > As a deployment tool implementer i want the health check to be local with no > > dependencies on other hosts or services to function so i can integrate it with > > service managers such as systemd or container runtime like docker > > > > As a packager i would like health-check to not require special client or > > packages consume them. CURL, socat or netcat should be all that is required to > > connect to the health check and retrieve the service status. > > > > As an operator i would like to be able to use health-check of the nova api and > > metadata services to manage the membership of endpoints in my load-balancer > > or reverse proxy automatically. > > > > > >> Though it would be more than welcome. Currently, to check that a daemon > >> is alive and well, operators are stuck with: > >> > >> - checking with ss if the daemon is correctly connected to a given port > >> - check the logs for rabbitmq and mysql errors (with something like > >> filebeat + elastic search and alarming) > >> > >> Clearly, this doesn't scale. 
When running many large OpenStack clusters, > >> it is not trivial to have a monitoring system that works and scales. The > >> effort to deploy such a monitoring system is also not trivial at all. So > >> what's been discussed at the time for improving the monitoring would be > >> very much welcome, though not only for the API service: something to > >> check the health of other daemons would be very much welcome. > >> > >> I'd very much would like to participate in a Yoga effort to improve the > >> current situation, and contribute the best I can, though I'm not sure > >> I'd be the best person to drive this... Is there anyone else willing to > >> work on this? > > > > yep i am feel free to ping me on irc: sean-k-mooney incase your wondering but we have talked before. > > Yes. Feel free to ping me as well, I'll enjoy contributing were I can > (though I know you're more skilled than I do in OpenStack's Python > code... I'll still do what I can). > > > i have not configured my defualt channels since the change to oftc but im alwasy in at least #openstack-nova > > after discussing this in the nova ptg session the design took a hard right turn from being based on a rpc like protocaol > > exposed over a unix socket with ovos as the data fromat and active probes to a http based endpoint, avaiable over tcp and or unix socket > > with json as the responce format and a semi global data stucutre with TTL for the data. > > > > as a result i have had to rethink and rework most of the draft spec i had prepared. > > The main point of design that we need to agree on is exactuly how that data stucture is accessed and wehre it is stored. > > > > in the orginal desing i proposed there was no need to store any kind of state and or modify existing functions to add healchecks. > > each nova service manager would just implemant a new healthcheck function that would be pass as a callback to the healtcheck manager which exposed the endpoint. 
> > > > With the new approch we will like add decorators to imporant functions that will update the healthchecks based on if that fucntion complete correctly. > > if we take the decorator because of how decorators work it can only access module level varables, class method/memeber or the parmaters to the function it is decorating. > > what that efffectivly means is either the health check manager need to be stored in a module level "global" variable, it need to be a signelton accessable via a class method > > or it need to be stored in a data stucure that is passed to almost ever funciton speicifcally the context object. > > > > i am leaning towards the context object but i need to understand how that will interact with RPC calls so it might end up being a global/singelton which sucks form a unit/fucntional testing > > perspective but we can make it work via fixtures. > > > > hopefully this sould like good news to you but feel free to give feedback. > > I don't like the fact that we're still having the discussion 1.5 years > after the proposed patch, and that still delays having Nova following > what all the other projects have approved. > > Again, what you're doing should not be mutually exclusive with adding > what already works, and what is already in production. It's been said a > year and a half ago, and it's still truth. A year and a half ago, we > even discuss the fact it would be a shame if it took more than a year... > So can we move forward? > > Anyways, I'm excited that this goes forward, so thanks again for leading > this initiative. > > Cheers, > > Thomas Goirand (zigo) > -- Mohammed Naser VEXXHOST, Inc. 
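Mohammed's objection to a no-op probe can be made concrete with a toy comparison. Both functions below are illustrative — neither is any project's actual code, and the `db_ping` callback is a hypothetical stand-in for a real dependency check:

```python
# Contrast between a blind 200-OK probe and one that exercises a real
# dependency. Purely illustrative; not Nova's or oslo's code.

def noop_probe():
    # The objectionable kind: always "healthy", even with the DB down.
    return 200


def informative_probe(db_ping):
    # Only healthy if a real dependency actually responds.
    try:
        db_ping()            # e.g. a "SELECT 1" against the service DB
        return 200
    except Exception:
        return 503           # lets a load balancer evict the backend


def broken_db():
    raise ConnectionError("db unreachable")


print(noop_probe(), informative_probe(broken_db))  # -> 200 503
```

With the database unreachable, the no-op probe still reports 200 — exactly the "nothing healthy about blindly returning a 200 OK" case — while the informative probe lets the caller react.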
From dms at danplanet.com Wed Nov 17 21:54:49 2021 From: dms at danplanet.com (Dan Smith) Date: Wed, 17 Nov 2021 13:54:49 -0800 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: (Mohammed Naser's message of "Wed, 17 Nov 2021 15:53:22 -0500") References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Message-ID: > I don't think we rely on /healthcheck -- there's nothing healthy about > an API endpoint blindly returning a 200 OK. > > You might as well just hit / and accept 300 as a code and that's > exactly the same behaviour. I support what Sean is bringing up here > and I don't think it makes sense to have a noop /healthcheck that > always gives a 200 OK...seems a bit useless imho Yup, totally agree. Our previous concerns over a healthcheck that checked all of nova returning too much info to be useful (for something trying to figure out if an individual worker is healthy) apply in reverse to one that returns too little to be useful. I agree, what Sean is working on is the right balance and that we should focus on that. --Dan From franck.vedel at univ-grenoble-alpes.fr Wed Nov 17 22:15:08 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Wed, 17 Nov 2021 23:15:08 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: Message-ID: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Hello and thank you for the help. I was able to move forward on my problem, without finding a satisfactory solution. Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. 
so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. I don't understand what happened. There is something wrong. Is it normal that after updating the certificates, all instances are turned off? thanks again Franck > Le 17 nov. 2021 ? 21:11, Cyril Roelandt a ?crit : > > Hello, > > > On 2021-11-17 08:59, Franck VEDEL wrote: >> Hello everyone >> >> I have a strange problem and I haven't found the solution yet. >> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. >> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. >> >> I am trying to create a new instance to check general operation. ERROR. >> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). > > We'd like to see the logs as well, especially the stacktrace. > >> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). >> >> I create an empty volume: it works. >> I am creating a volume from an image: Failed. > > What commands are you running? What's the output? What's in the logs? > >> >> However, I have my list of ten images in glance. >> >> I create a new image and create a volume with this new image: it works. >> I create an instance with this new image: OK. >> >> What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. >> Is there a way to fix this, or do we have to reinstall them all? > > What's your configuration? 
What version of OpenStack are you running? > > > > Cyril > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Nov 17 22:20:41 2021 From: melwittt at gmail.com (melanie witt) Date: Wed, 17 Nov 2021 14:20:41 -0800 Subject: [sdk]: Check instance error message In-Reply-To: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> References: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> Message-ID: On Tue Nov 16 2021 09:55:32 GMT-0800 (Pacific Standard Time), dmeng wrote: > I'm wondering if there is any method that could check the error message > of an instance whose status is "ERROR"? Like from openstack cli, > "openstack server show server_name", if the server is in "ERROR" status, > this will return a field "fault" with a message shows the error. I tried > the compute service get_server and find_server, but neither of them show > the error messages of an instance. Hi, it looks like currently the sdk doesn't have the 'fault' field in the Server model [1] so AFAICT you can't get the fault message out-of-the-box. A patch upstream will be needed to add it. This can be hacked around by, for example: import openstack from openstack.compute.v2 import server from openstack import resource class MyServer(server.Server): fault = resource.Body('fault', type=dict) conn = openstack.connect(cloud='devstack') s = conn.compute._get(MyServer, '9282db95-801f-4f43-90fb-e86d9bfb6785') s.fault {'code': 500, 'created': '2021-09-17T02:23:16Z', 'message': 'No valid host was found. 
'} HTH, -melanie [1] https://docs.openstack.org/openstacksdk/latest/user/model.html#server From zigo at debian.org Wed Nov 17 22:47:45 2021 From: zigo at debian.org (Thomas Goirand) Date: Wed, 17 Nov 2021 23:47:45 +0100 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Message-ID: <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> On 11/17/21 10:54 PM, Dan Smith wrote: >> I don't think we rely on /healthcheck -- there's nothing healthy about >> an API endpoint blindly returning a 200 OK. >> >> You might as well just hit / and accept 300 as a code and that's >> exactly the same behaviour. I support what Sean is bringing up here >> and I don't think it makes sense to have a noop /healthcheck that >> always gives a 200 OK...seems a bit useless imho > > Yup, totally agree. Our previous concerns over a healthcheck that > checked all of nova returning too much info to be useful (for something > trying to figure out if an individual worker is healthy) apply in > reverse to one that returns too little to be useful. > > I agree, what Sean is working on is the right balance and that we should > focus on that. > > --Dan > That's not the only thing it does. It also is capable of being disabled, which is useful for maintenance: one can gracefully remove an API node for removal this way, which one cannot do with the root. 
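The disable mechanism zigo refers to is file-based in oslo.middleware's healthcheck: when a configured file exists, the endpoint answers 503 so the load balancer drains the node. The toy sketch below only mimics that idea; the real middleware's option names and plugin machinery are not reproduced here:

```python
# Sketch of file-based draining (inspired by oslo.middleware's
# "disable by file" behaviour; this is a stand-in, not its code).
import os
import tempfile


def healthcheck(disable_file):
    # Touch the file before maintenance: the probe flips to 503 and the
    # load balancer drains the node; remove it and the node rejoins.
    return 503 if os.path.exists(disable_file) else 200


flag = os.path.join(tempfile.mkdtemp(), "drain")
print(healthcheck(flag))   # -> 200, node in service
open(flag, "w").close()    # operator marks the node for maintenance
print(healthcheck(flag))   # -> 503, node drained
```

This is what makes the endpoint usable for graceful removal of an API node, independently of whether the probe itself is deep or shallow.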
Cheers, Thomas Goirand (zigo) From gmann at ghanshyammann.com Thu Nov 18 00:42:35 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 17 Nov 2021 18:42:35 -0600 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Message-ID: <17d307ecada.cc39a56d853416.7543793211146988220@ghanshyammann.com> ---- On Wed, 17 Nov 2021 15:54:49 -0600 Dan Smith wrote ---- > > I don't think we rely on /healthcheck -- there's nothing healthy about > > an API endpoint blindly returning a 200 OK. > > > > You might as well just hit / and accept 300 as a code and that's > > exactly the same behaviour. I support what Sean is bringing up here > > and I don't think it makes sense to have a noop /healthcheck that > > always gives a 200 OK...seems a bit useless imho > > Yup, totally agree. Our previous concerns over a healthcheck that > checked all of nova returning too much info to be useful (for something > trying to figure out if an individual worker is healthy) apply in > reverse to one that returns too little to be useful. True, we can see the example in this old patch: PS1, trying to implement all of Nova_DB_healthcheck, Nova_MQ_healthcheck, and Nova_services_healthcheck, ends up with a lot of info and a time-consuming process - https://review.opendev.org/c/openstack/nova/+/731396/1 and then on RPC call success in PS2 - https://review.opendev.org/c/openstack/nova/+/731396/2 I agree on the point that healthchecks should be 'very confirmed things saying it is healthy'; otherwise, it just solves the HA proxy use case and the rest of the use cases will consider this a bad healthcheck, which is the current case with the oslo middleware. -gmann > > I agree, what Sean is working on is the right balance and that we should focus on that.
> > --Dan > > From mnaser at vexxhost.com Thu Nov 18 01:03:02 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 17 Nov 2021 20:03:02 -0500 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> Message-ID: On Wed, Nov 17, 2021 at 5:52 PM Thomas Goirand wrote: > On 11/17/21 10:54 PM, Dan Smith wrote: > >> I don't think we rely on /healthcheck -- there's nothing healthy about > >> an API endpoint blindly returning a 200 OK. > >> > >> You might as well just hit / and accept 300 as a code and that's > >> exactly the same behaviour. I support what Sean is bringing up here > >> and I don't think it makes sense to have a noop /healthcheck that > >> always gives a 200 OK...seems a bit useless imho > > > > Yup, totally agree. Our previous concerns over a healthcheck that > > checked all of nova returning too much info to be useful (for something > > trying to figure out if an individual worker is healthy) apply in > > reverse to one that returns too little to be useful. > > > > I agree, what Sean is working on is the right balance and that we should > > focus on that. > > > > --Dan > > > > That's not the only thing it does. It also is capable of being disabled, > which is useful for maintenance: one can gracefully remove an API node > for removal this way, which one cannot do with the root. > I feel like this should be handled by whatever layer that needs to drain requests for maintenance, otherwise also it might just be the same as turning off the service, no? > Cheers, > > Thomas Goirand (zigo) > > -- Mohammed Naser VEXXHOST, Inc. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ignaziocassano at gmail.com Thu Nov 18 06:23:53 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 18 Nov 2021 07:23:53 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: Hello, i solved it using the following variables in globals.yml: glance_file_datadir_volume=somedir and glance_backend_file="yes". So if somedir is an NFS mount point, controllers can share images. Remember you have to deploy glance on all controllers. Ignazio On Wed 17 Nov 2021, 23:17 Franck VEDEL wrote: > Hello and thank you for the help. > I was able to move forward on my problem, without finding a satisfactory > solution. > Normally, I have 2 servers with the role [glance] but I noticed that all > my images were on the first server (in /var/lib/docker/volumes/ > glance/_data/images) before the reconfigure, none on the second. But > since the reconfiguration, the images are placed on the second, and no > longer on the first. I do not understand why. I haven't changed anything to > the multinode file. > so, to get out of this situation quickly as I need this openstack for the > students, I modified the multinode file and put only one server in [glance] > (I put server 1, the one that had the images before reconfigure), I did a > reconfigure -t glance and now I have my images usable for instances. > I don't understand what happened. There is something wrong. > > Is it normal that after updating the certificates, all instances are > turned off? > thanks again > > Franck > > On 17 Nov 2021, at 21:11, Cyril Roelandt wrote: > > Hello, > > > On 2021-11-17 08:59, Franck VEDEL wrote: > > Hello everyone > > I have a strange problem and I haven't found the solution yet. > Following a certificate update I had to do a "kolla-ansible -t multinode > reconfigure".
> Well, after several attempts (it is not easy to use certificates with > Kolla-ansible, and from my advice, not documented enough for beginners), I > have my new functional certificates. Perfect ... well almost. > > I am trying to create a new instance to check general operation. ERROR. > Okay, I look in the logs and I see that Cinder is having problems creating > volumes with an error that I never had ("TypeError: 'NoneType' object is > not iterable). > > > We'd like to see the logs as well, especially the stacktrace. > > I dig and then I wonder if it is not the Glance images which cannot be > used, while they are present (openstack image list is OK). > > I create an empty volume: it works. > I am creating a volume from an image: Failed. > > > What commands are you running? What's the output? What's in the logs? > > > However, I have my list of ten images in glance. > > I create a new image and create a volume with this new image: it works. > I create an instance with this new image: OK. > > What is the problem ? The images present before the "reconfigure" are > listed, visible in horizon for example, but unusable. > Is there a way to fix this, or do we have to reinstall them all? > > > What's your configuration? What version of OpenStack are you running? > > > > Cyril > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjoen at dds.nl Thu Nov 18 07:08:34 2021 From: tjoen at dds.nl (tjoen) Date: Thu, 18 Nov 2021 08:08:34 +0100 Subject: [sdk]: Check instance error message In-Reply-To: References: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> Message-ID: On 11/17/21 23:20, melanie witt wrote: > On Tue Nov 16 2021 09:55:32 GMT-0800 (Pacific Standard Time), dmeng > wrote: >> I'm wondering if there is any method that could check the error >> message of an instance whose status is "ERROR"? 
Like from openstack >> cli, "openstack server show server_name", if the server is in "ERROR" >> status, this will return a field "fault" with a message that shows the >> error. I tried the compute service get_server and find_server, but >> neither of them show the error messages of an instance. > > This can be hacked around by, for example: > > import openstack > from openstack.compute.v2 import server > from openstack import resource > > > class MyServer(server.Server): > fault = resource.Body('fault', type=dict) > > > conn = openstack.connect(cloud='devstack') > s = conn.compute._get(MyServer, '9282db95-801f-4f43-90fb-e86d9bfb6785') > s.fault > {'code': 500, 'created': '2021-09-17T02:23:16Z', 'message': 'No valid > host was found. '} In my (not OP) case the problems were mostly Python or sudo errors, so journalctl was still needed. From skaplons at redhat.com Thu Nov 18 07:42:22 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 18 Nov 2021 08:42:22 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3MOP2R.O83SZVO0NWN23@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> Message-ID: <1889698.yKVeVyVuyW@p1> Hi, On Wednesday, 17 November 2021 11:18:03 CET Balazs Gibizer wrote: > On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski > > wrote: > > Hi, > > > > Recently I spent some time to check how many rechecks we need in > > Neutron to > > get a patch merged and I compared it to some other OpenStack projects > > (see [1] > > for details). > > TL;DR - results aren't good for us and I think we really need to do > > something > > with that. > > I really like the idea of collecting such stats. Thank you for doing > it. I can even imagine making a public dashboard somewhere with this > information as it is a good indication about the health of our projects / testing. Thx. So far it's just a simple script which I run from my terminal to get that data. Nothing else.
If You want to use it, it's here https://github.com/slawqo/tools/tree/master/rechecks > > > Of course "easiest" thing to say is that we should fix issues which > > we are > > hitting in the CI to make jobs more stable. But it's not that easy. > > We are > > struggling with those jobs for very long time. We have CI related > > meeting > > every week and we are fixing what we can there. > > Unfortunately there is still bunch of issues which we can't fix so > > far because > > they are intermittent and hard to reproduce locally or in some cases > > the > > issues aren't realy related to the Neutron or there are new bugs > > which we need > > to investigate and fix :) > > I have couple of suggestion based on my experience working with CI in > nova. > > 1) we try to open bug reports for intermittent gate failures too and > keep them tagged in a list [1] so when a job fail it is easy to check > if the bug is known. Thx. We are trying more or less to do that, but TBH I think that in many cases we didn't open LPs for such issues. I added it to the list of ideas :) > > 2) I offer my help here now that if you see something in neutron runs > that feels non neutron specific then ping me with it. Maybe we are > struggling with the same problem too. Thanks a lot. I will for sure ping You when I see something like that. > > 3) there was informal discussion before about a possibility to re-run > only some jobs with a recheck instead for re-running the whole set. I > don't know if this is feasible with Zuul and I think this only treat > the symptom not the root case. But still this could be a direction if > all else fails. yes, I remember that discussion and I totally understand pros and cons of such a solution, but I added it to the list as well. > > Cheers, > gibi > > > So this is never ending battle for us. The problem is that we have > > to test > > various backends, drivers, etc.
so as a result we have many jobs > > running on > > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs > > in check > > and 14 jobs in gate queue. > > > > In the past we made a lot of improvements, like e.g. we improved > > irrelevant > > files lists for jobs to run less jobs on some of the patches, > > together with QA > > team we did "integrated-networking" template to run only Neutron and > > Nova > > related scenario tests in the Neutron queues, we removed and > > consolidated some > > of the jobs (there is still one patch in progress for that but it > > should just > > remove around 2 jobs from the check queue). All of that are good > > improvements > > but still not enough to make our CI really stable :/ > > > > Because of all of that, I would like to ask community about any other > > ideas > > how we can improve that. If You have any ideas, please send it in > > this email > > thread or reach out to me directly on irc. > > We want to discuss about them in the next video CI meeting which will > > be on > > November 30th. If You would have any idea and would like to join that > > discussion, You are more than welcome in that meeting of course :) > > > > [1] > > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ > > 025759.html > > [1] > https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_las > t_updated&start=0 > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. 
URL: From skaplons at redhat.com Thu Nov 18 07:46:11 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 18 Nov 2021 08:46:11 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> Message-ID: <14494355.tv2OnDr8pf@p1> Hi, Thx Clark for detailed explanation about that :) On ?roda, 17 listopada 2021 16:51:57 CET you wrote: > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: > > Snip. I want to respond to a specific suggestion: > > 3) there was informal discussion before about a possibility to re-run > > only some jobs with a recheck instead for re-running the whole set. I > > don't know if this is feasible with Zuul and I think this only treat > > the symptom not the root case. But still this could be a direction if > > all else fails. > > OpenStack has configured its check and gate queues with something we've called > "clean check". This refers to the requirement that before an OpenStack > project can be gated it must pass check tests first. This policy was > instituted because a number of these infrequent but problematic issues were > traced back to recheck spamming. Basically changes would show up and were > broken. They would fail some percentage of the time. They got rechecked until > they finally merged and now their failure rate is added to the whole. This > rule was introduced to make it more difficult to get this flakyness into the > gate. > > Locking in test results is in direct opposition to the existing policy and > goals. Locking results would make it far more trivial to land such flakyness > as you wouldn't need entire sets of jobs to pass before you could land. > Instead you could rerun individual jobs until each one passed and then land > the result. Potentially introducing significant flakyness with a single > merge. 
> > Locking results is also not really something that fits well with the > speculative gate queues that Zuul runs. Remember that Zuul constructs a > future git state and tests that in parallel. Currently the state for > OpenStack looks like: > > A - Nova > ^ > B - Glance > ^ > C - Neutron > ^ > D - Neutron > ^ > F - Neutron > > The B glance change is tested as if the A Nova change has already merged and > so on down the queue. If we want to keep these speculative states we can't > really have humans manually verify a failure can be ignored and retry it. > Because we'd be enqueuing job builds at different stages of speculative > state. Each job build would be testing a different version of the software. > > What we could do is implement a retry limit for failing jobs. Zuul could rerun > failing jobs X times before giving up and reporting failure (this would > require updates to Zuul). The problem with this approach is without some > oversight it becomes very easy to land changes that make things worse. As a > side note Zuul does do retries, but only for detected network errors or when > a pre-run playbook fails. The assumption is that network failures are due to > the dangers of the Internet, and that pre-run playbooks are small, self > contained, unlikely to fail, and when they do fail the failure should be > independent of what is being tested. > > Where does that leave us? > > I think it is worth considering the original goals of "clean check". We know > that rechecking/rerunning only makes these problems worse in the long term. > They represent technical debt. One of the reasons we run these tests is to > show us when our software is broken. In the case of flaky results we are > exposing this technical debt where it impacts the functionality of our > software. The longer we avoid fixing these issues the worse it gets, and this > is true even with "clean check". 
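A quick back-of-the-envelope model makes the point above concrete (the numbers below are made up for illustration; real job failures are not independent, so this is only a sketch):

```python
def merge_probability(pass_rate, n_jobs, retries_per_job=0):
    """Chance a change lands when all `n_jobs` jobs must pass, each
    passing independently with probability `pass_rate`, and each job
    may be rerun `retries_per_job` extra times on failure."""
    per_job = 1.0 - (1.0 - pass_rate) ** (retries_per_job + 1)
    return per_job ** n_jobs


# 19 jobs (the size of the Neutron check queue mentioned in this
# thread) with a hypothetical 5% flakiness per job:
print(round(merge_probability(0.95, 19), 2))     # -> 0.38
print(round(merge_probability(0.95, 19, 2), 2))  # -> 1.0
```

With per-job reruns the flaky change lands almost every time, which is exactly why locking in individual job results makes it trivial to merge flakiness.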
I agree with You on that and I would really like to find a better/other solution for the Neutron problem than rechecking only broken jobs as I'm pretty sure that this would make things much worse quickly. > > Do we as developers find value in knowing the software needs attention before > it gets released to users? Do the users find value in running reliable > software? In the past we have asserted that "yes, there is value in this", > and have invested in tracking, investigating, and fixing these problems even > if they happen infrequently. But that does require investment, and active > maintenance. > > Clark -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From pshchelokovskyy at mirantis.com Thu Nov 18 11:11:02 2021 From: pshchelokovskyy at mirantis.com (Pavlo Shchelokovskyy) Date: Thu, 18 Nov 2021 13:11:02 +0200 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <4359750.LvFx2qVVIh@p1> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> <4359750.LvFx2qVVIh@p1> Message-ID: Hi Clark, Why is the retirement of openstack/neutron-lbaas a problem? The repo is there and accessible under the same URL, it has (potentially working) stable/pike and stable/queens branches, and was not retired at the time of Pike or Queens, so IMO it is a valid request for testing configuration in the same branches of other projects, openstack/heat in this case. Maybe we should leave some minimal zuul configs in retired projects for zuul to find them?
Cheers, On Wed, Nov 17, 2021 at 9:45 AM Slawek Kaplonski wrote: > Hi, > > On ?roda, 17 listopada 2021 08:26:14 CET Slawek Kaplonski wrote: > > Hi, > > > > I just checked neutron related things there and it seems there are 2 > major > > issues there: > > 1. move of the tap-as-a-service from x/ to openstack/ namespace (that > affects > > networking-midonet) - I will propose patch for that today. > > 2. remove of the neutron-lbaas repo (that affects much more than only > neutron > > repos - for that I will try to propose patches this week as well. > > There are also some missing job definitions in some of the neutron related > repos and also issues with missing openstack/networking-l2gw project. I > will > take a look into all those issues in next days. > > > > > On wtorek, 16 listopada 2021 17:58:44 CET Clark Boylan wrote: > > > Hello, > > > > > > The OpenStack tenant in Zuul currently has 134 config errors. You can > find > > > these errors at https://zuul.opendev.org/t/openstack/config-errors or > by > > > clicking the blue bell icon in the top right of > > > https://zuul.opendev.org/t/openstack/status. The vast majority of > these > > > errors appear related to project renames that have been requested of > OpenDev > > > or project retirements. Can you please look into fixing these as they > can > be > > > an attractive nuisance when debugging Zuul problems (they also > indicate > that > > > a number of your jobs are probably not working). 
> > > > > > Project renames creating issues: > > > * openstack/python-tempestconf -> osf/python-tempestconf -> > > > > > > openinfra/python-tempestconf * openstack/refstack -> osf/refstack -> > > > openinfra/refstack > > > > > > * x/tap-as-a-service -> openstack/tap-as-a-service > > > * openstack/networking-l2gw -> x/networking-l2gw > > > > > > Project retirements creating issues: > > > * openstack/neutron-lbaas > > > * recordsansible/ara > > > > > > Projects whose configs have errors: > > > * openinfra/python-tempestconf > > > * openstack/heat > > > * openstack/ironic > > > * openstack/kolla-ansible > > > * openstack/kuryr-kubernetes > > > * openstack/murano-apps > > > * openstack/networking-midonet > > > * openstack/networking-odl > > > * openstack/neutron > > > * openstack/neutron-fwaas > > > * openstack/python-troveclient > > > * openstack/senlin > > > * openstack/tap-as-a-service > > > * openstack/zaqar > > > * x/vmware-nsx > > > * openinfra/openstackid > > > * openstack/barbican > > > * openstack/cookbook-openstack-application-catalog > > > * openstack/heat-dashboard > > > * openstack/manila-ui > > > * openstack/python-manilaclient > > > > > > Let us know if we can help decipher any errors, > > > Clark > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Dr. Pavlo Shchelokovskyy Principal Software Engineer Mirantis Inc www.mirantis.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Thu Nov 18 12:26:44 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 18 Nov 2021 13:26:44 +0100 Subject: [all][sdk][neutronclient][neutron] List of projects that use python-neutronclient as "client" Message-ID: Hi, During the Yoga PTG we discussed the possibility of deprecating python-neutronclient (see [0]). The CLI part of python-neutronclient is already deprecated (see [1]), but we have python bindings for Neutron both in python-neutronclient and in openstacksdk. 
As python-neutronclient's binding code is widely used as "client" code (e.g. in Heat, Horizon and Nova), while python-openstackclient uses openstacksdk's binding code for its "client", this means duplicated work and maintenance for all the bindings. If we have a new API feature the binding code must go both to python-neutronclient to make it available in Heat, and to openstacksdk to have openstackclient support for it. The best would be to have the binding code in openstacksdk and deprecate python-neutronclient, but before we plan anything we would like to have a list of projects that depend on python-neutronclient. We identified a few but for sure with python-neutronclient's long history there are a lot more: * Heat * Horizon * Nova * various Networking projects * ...... Please share Your thoughts about this topic and the projects which use python-neutronclient; it would be really helpful to see how we can move forward. [0]: https://etherpad.opendev.org/p/neutron-yoga-ptg#L372 [1]: https://review.opendev.org/c/openstack/python-neutronclient/+/795475 Regards Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnasiadka at gmail.com Thu Nov 18 12:33:25 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Thu, 18 Nov 2021 13:33:25 +0100 Subject: [kolla] Plan to deprecate binary and unify on single distrubition Message-ID: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Hello Koalas, On the PTG we have discussed two topics: 1) Deprecate and drop binary type of Kolla images 2) Use a common base (single Linux distribution) for Kolla images This is a call for feedback - for people that have not been attending the PTG. What this essentially means for consumers: 1) In Yoga cycle we will deprecate binary type of Kolla images, and in Z cycle those will be dropped.
2) We are not going to support CentOS Stream 9 (cs9) as a base operating system, and the source type build will rely on CentOS Stream 8 in Z release. 3) Beginning from A release Kolla will build only Debian source images - but Kolla-Ansible will still support deployment of those images on CentOS/Ubuntu/Debian Host operating systems (and Rocky Linux to be added in Yoga to that mix). Justification: The Kolla project team is limited in numbers, therefore supporting current broad mix of operating systems (especially with CentOS Stream 9 "on the way") is a significant maintenance burden. By dropping binary type of images - users would be running more tested images (since Kolla/Kolla-Ansible CI runs source images jobs as voting). In Xena we've already changed the default image type Kolla-Ansible uses to source. We also feel that using a unified base OS for Kolla container images is a way to remove some of the maintenance burden (including CI cycles and Request for feedback: If any of those changes is a no go from your perspective - we'd like to hear your opinions. Best regards, Michal Nasiadka From balazs.gibizer at est.tech Thu Nov 18 14:39:40 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 18 Nov 2021 15:39:40 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> Message-ID: <4EVR2R.KF069X977ZIK2@est.tech> On Wed, Nov 17 2021 at 07:51:57 AM -0800, Clark Boylan wrote: > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: >> > > Snip. I want to respond to a specific suggestion: > >> 3) there was informal discussion before about a possibility to >> re-run >> only some jobs with a recheck instead for re-running the whole set. >> I >> don't know if this is feasible with Zuul and I think this only treat >> the symptom not the root case.
But still this could be a direction >> if >> all else fails. >> > > OpenStack has configured its check and gate queues with something > we've called "clean check". This refers to the requirement that > before an OpenStack project can be gated it must pass check tests > first. This policy was instituted because a number of these > infrequent but problematic issues were traced back to recheck > spamming. Basically changes would show up and were broken. They would > fail some percentage of the time. They got rechecked until they > finally merged and now their failure rate is added to the whole. This > rule was introduced to make it more difficult to get this flakyness > into the gate. > > Locking in test results is in direct opposition to the existing > policy and goals. Locking results would make it far more trivial to > land such flakyness as you wouldn't need entire sets of jobs to pass > before you could land. Instead you could rerun individual jobs until > each one passed and then land the result. Potentially introducing > significant flakyness with a single merge. > > Locking results is also not really something that fits well with the > speculative gate queues that Zuul runs. Remember that Zuul constructs > a future git state and tests that in parallel. Currently the state > for OpenStack looks like: > > A - Nova > ^ > B - Glance > ^ > C - Neutron > ^ > D - Neutron > ^ > F - Neutron > > The B glance change is tested as if the A Nova change has already > merged and so on down the queue. If we want to keep these speculative > states we can't really have humans manually verify a failure can be > ignored and retry it. Because we'd be enqueuing job builds at > different stages of speculative state. Each job build would be > testing a different version of the software. > > What we could do is implement a retry limit for failing jobs. Zuul > could rerun failing jobs X times before giving up and reporting > failure (this would require updates to Zuul). 
The problem with this > approach is without some oversight it becomes very easy to land > changes that make things worse. As a side note Zuul does do retries, > but only for detected network errors or when a pre-run playbook > fails. The assumption is that network failures are due to the dangers > of the Internet, and that pre-run playbooks are small, self > contained, unlikely to fail, and when they do fail the failure should > be independent of what is being tested. > > Where does that leave us? > > I think it is worth considering the original goals of "clean check". > We know that rechecking/rerunning only makes these problems worse in > the long term. They represent technical debt. One of the reasons we > run these tests is to show us when our software is broken. In the > case of flaky results we are exposing this technical debt where it > impacts the functionality of our software. The longer we avoid fixing > these issues the worse it gets, and this is true even with "clean > check". > > Do we as developers find value in knowing the software needs > attention before it gets released to users? Do the users find value > in running reliable software? In the past we have asserted that "yes, > there is value in this", and have invested in tracking, > investigating, and fixing these problems even if they happen > infrequently. But that does require investment, and active > maintenance. Thank you Clark! I agree with your view that the current setup provides us with very valuable information about the health of the software we are developing. I also agree that our primary goal should be to fix the flaky tests instead of hiding the results under any kind of rechecks. Still I'm wondering what we will do if it turns out that the existing developer bandwidth has shrunk to the point where we simply do not have the capacity to fix this technical debt.
What the stable team does on stable branches in Extended Maintenance mode in a similar situation is to simply turn off problematic test jobs. So I guess that is also a valid last resort move. Cheers, gibi > > Clark > From cboylan at sapwetik.org Thu Nov 18 14:40:38 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 18 Nov 2021 06:40:38 -0800 Subject: =?UTF-8?Q?Re:_[all][refstack][neutron][kolla][ironic][heat][trove][senli?= =?UTF-8?Q?n][barbican][manila]_Fixing_Zuul_Config_Errors?= In-Reply-To: References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> <4359750.LvFx2qVVIh@p1> Message-ID: <5fea46fa-f304-48fa-a171-ce1114700379@www.fastmail.com> On Thu, Nov 18, 2021, at 3:11 AM, Pavlo Shchelokovskyy wrote: > Hi Clark, > > Why is the retirement of openstack/neutron-lbaas a problem? > > The repo is there and accessible under the same URL, it has > (potentially working) stable/pike and stable/queens branches, and was > not retired at the time of Pike or Queens, so IMO it is a valid request > for testing configuration in the same branches of other projects, > openstack/heat in this case. > > Maybe we should leave some minimal zuul configs in retired projects for > zuul to find them? The reason for this is that one of the steps for project retirement is to remove the repo from Zuul [0]. If the upstream for the project has retired the project I think it is reasonable to remove it from our systems like Zuul. The project isn't retired on a per-branch basis. I do not think the Zuul operators should be responsible for keeping retired projects alive in the CI system if their maintainers have stopped maintaining them. Instead we should remove them from the CI system.
[0] https://opendev.org/openstack/project-config/commit/3832c8fafc4d4e03c306c21f37b2d39bd7c5bd2b From smooney at redhat.com Thu Nov 18 15:04:52 2021 From: smooney at redhat.com (Sean Mooney) Date: Thu, 18 Nov 2021 15:04:52 +0000 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Message-ID: On Thu, 2021-11-18 at 13:33 +0100, Micha? Nasiadka wrote: > Hello Koalas, > > On the PTG we have discussed two topics: > > 1) Deprecate and drop binary type of Kolla images > 2) Use a common base (single Linux distribution) for Kolla images > > This is a call for feedback - for people that have not been attending the PTG. > > What this essentially mean for consumers: > > 1) In Yoga cycle we will deprecate binary type of Kolla images, and in Z cycle those will be dropped. > 2) We are not going to support CentOS Stream 9 (cs9) as a base operating system, and the source type build will rely on CentOS Stream 8 in Z release. > 3) Beginning from A release Kolla will build only Debian source images - but Kolla-Ansible will still support deployment of those images on CentOS/Ubuntu/Debian Host operating systems (and Rocky Linux to be added in Yoga to that mix). > > Justification: > The Kolla project team is limited in numbers, therefore supporting current broad mix of operating systems (especially with CentOS Stream 9 ,,on the way??) is a significant maintenance burden. > By dropping binary type of images - users would be running more tested images (since Kolla/Kolla-Ansible CI runs source images jobs as voting). > In Xena we?ve already changed the default image type Kolla-Ansible uses to source. 
> We also feel that using a unified base OS for Kolla container images is a way to remove some of the maintenance burden (including CI cycles and > > Request for feedback: > If any of those changes is a no go from your perspective - we'd like to hear your opinions. I only have reason to use kolla-ansible on small personal clouds at home, so either way this will have limited direct effect on me, but I just wanted to give some thoughts: Kolla has been my favourite way to deploy OpenStack for a very long time. One of the reasons I liked it was the fact that it was distro independent, simple to use/configure, and it supported source and binary installs. I have almost always defaulted to Ubuntu source, although in the rare cases I used CentOS I always used CentOS binary. I almost never used binary installs on Debian-based distros, and on RPM-based distros I never really used source; I'm not sure if I'm the only one that did that. It's not because I trusted the RPM packages more, by the way; it just seemed to be what was tested more when I chatted with Kolla devs on IRC, so I avoided Ubuntu binary and CentOS source as a result. With that in mind, Debian source is not really controversial to me. I had one question on that, however: will the support for the other distros be kept in the Kolla images but not extended, or will it be dropped? I assume the plan is to remove the templating for the other distros in A, based on point 3 above. The only other thing I wanted to point out is that, while I have had some success getting Ubuntu source images to run on a CentOS host in the past, it will be tricky if Kolla ever wants to support SELinux/AppArmor. That was the main barrier I faced, but there can be others. Specifically, OVS and libvirt can be somewhat picky about the kernel on which they run. Most of the OpenStack services will likely not care that the container OS does not match the host, but some of the "system" dependencies like libvirt/OVS might.
A way to address that would be to support using external images for those services from dockerhub/quay, e.g. use the official upstream mariadb image, or libvirt or rabbit ones, if that is in fact a problem. Anyway, it's totally understandable that if you do not have contributors that are able to support the other distros, you would remove the support, especially with the move away from using Kolla images in TripleO of late and presumably a reduction in Red Hat contributions to keep CentOS support alive. Anyway, the changes, while sad to see just from a project health point of view, would not be enough to prevent me personally from using or recommending Kolla and Kolla-Ansible. If you had proposed the opposite - CentOS binary only - that would be much more concerning to me. There are some advantages to source installs that you are preserving in this change, and I am glad they will not be lost. One last thought: if only one distro will be supported for building images in the future, with only source installs supported, it might be worth considering whether Alpine, or the Alpine/Debian-slim Python 3 image, should be re-evaluated as the base for an even lighter set of containers, rather than the base OS image. I am kind of assuming that, with the above suggestion, at some point the non-Python, non-OpenStack containers would be used from the official images rather than Kolla continuing to maintain them. So this may not be applicable now and would likely be an A or post-A thing.
> > Best regards, > Michal Nasiadka > From gmann at ghanshyammann.com Thu Nov 18 15:12:01 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 18 Nov 2021 09:12:01 -0600 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <1889698.yKVeVyVuyW@p1> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <1889698.yKVeVyVuyW@p1> Message-ID: <17d339ac9bd.107860943903455.71431419368509180@ghanshyammann.com> ---- On Thu, 18 Nov 2021 01:42:22 -0600 Slawek Kaplonski wrote ---- > Hi, > > On ?roda, 17 listopada 2021 11:18:03 CET Balazs Gibizer wrote: > > On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski > > > > wrote: > > > Hi, > > > > > > Recently I spent some time to check how many rechecks we need in > > > Neutron to > > > get patch merged and I compared it to some other OpenStack projects > > > (see [1] > > > for details). > > > TL;DR - results aren't good for us and I think we really need to do > > > something > > > with that. > > > > I really like the idea of collecting such stats. Thank you for doing > > it. I can even imagine to make a public dashboard somewhere with this > > information as it is a good indication about the health of our projects > > / testing. > > Thx. So far it's just simple script which I run from my terminal to get that > data. Nothing else. If You want to use it, it's here https://github.com/ > slawqo/tools/tree/master/rechecks > > > > > > Of course "easiest" thing to say is that we should fix issues which > > > we are > > > hitting in the CI to make jobs more stable. But it's not that easy. > > > We are > > > struggling with those jobs for very long time. We have CI related > > > meeting > > > every week and we are fixing what we can there. 
> > > Unfortunately there is still bunch of issues which we can't fix so > > > far because > > > they are intermittent and hard to reproduce locally or in some cases > > > the > > > issues aren't realy related to the Neutron or there are new bugs > > > which we need > > > to investigate and fix :) > > > > I have couple of suggestion based on my experience working with CI in > > nova. > > > > 1) we try to open bug reports for intermittent gate failures too and > > keep them tagged in a list [1] so when a job fail it is easy to check > > if the bug is known. > > > > Thx. We are trying more or less to do that, but TBH I think that in many cases > > we didn't open LPs for such issues. > > I added it to the list of ideas :) +1, I think opening bugs is the best way to get the project notified and also track the issue. I like Slawek's script to collect the rechecks per project, and that is something we can use in the TC for tracking gate health in the weekly meeting and seeing which projects are having more rechecks. A recheck does not mean that the project has the issue, but at least we will encourage members to open bugs on the corresponding projects. -gmann > > > > > > 2) I offer my help here now that if you see something in neutron runs > > > that feels non neutron specific then ping me with it. Maybe we are > > > struggling with the same problem too. > > > > Thanks a lot. I will for sure ping You when I see something like that. > > > > > > 3) there was informal discussion before about a possibility to re-run > > > only some jobs with a recheck instead for re-running the whole set. I > > > don't know if this is feasible with Zuul and I think this only treat > > > the symptom not the root case. But still this could be a direction if > > > all else fails. > > > > yes, I remember that discussion and I totally understand pros and cons of such > > solution, but I added it to the list as well. > > > > > > Cheers, > > > gibi > > > > > So this is never ending battle for us.
The problem is that we have > > > to test > > > various backends, drivers, etc. so as a result we have many jobs > > > running on > > > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs > > > in check > > > and 14 jobs in gate queue. > > > > > > In the past we made a lot of improvements, like e.g. we improved > > > irrelevant > > > files lists for jobs to run less jobs on some of the patches, > > > together with QA > > > team we did "integrated-networking" template to run only Neutron and > > > Nova > > > related scenario tests in the Neutron queues, we removed and > > > consolidated some > > > of the jobs (there is still one patch in progress for that but it > > > should just > > > remove around 2 jobs from the check queue). All of that are good > > > improvements > > > but still not enough to make our CI really stable :/ > > > > > > Because of all of that, I would like to ask community about any other > > > ideas > > > how we can improve that. If You have any ideas, please send it in > > > this email > > > thread or reach out to me directly on irc. > > > We want to discuss about them in the next video CI meeting which will > > > be on > > > November 30th. 
If You would have any idea and would like to join that > > > discussion, You are more than welcome in that meeting of course :) > > > > > > [1] > > > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ > > > 025759.html > > > > [1] > > https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_las > > t_updated&start=0 > > > -- > > > Slawek Kaplonski > > > Principal Software Engineer > > > Red Hat > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat From smooney at redhat.com Thu Nov 18 15:19:33 2021 From: smooney at redhat.com (Sean Mooney) Date: Thu, 18 Nov 2021 15:19:33 +0000 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <4EVR2R.KF069X977ZIK2@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> Message-ID: On Thu, 2021-11-18 at 15:39 +0100, Balazs Gibizer wrote: > > On Wed, Nov 17 2021 at 07:51:57 AM -0800, Clark Boylan > wrote: > > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: > > > > > > > Snip. I want to respond to a specific suggestion: > > > > > 3) there was informal discussion before about a possibility to > > > re-run > > > only some jobs with a recheck instead for re-running the whole set. > > > I > > > don't know if this is feasible with Zuul and I think this only treat > > > the symptom not the root case. But still this could be a direction > > > if > > > all else fails. > > > > > > > OpenStack has configured its check and gate queues with something > > we've called "clean check". This refers to the requirement that > > before an OpenStack project can be gated it must pass check tests > > first. This policy was instituted because a number of these > > infrequent but problematic issues were traced back to recheck > > spamming. Basically changes would show up and were broken. They would > > fail some percentage of the time. 
They got rechecked until they > > finally merged and now their failure rate is added to the whole. This > > rule was introduced to make it more difficult to get this flakyness > > into the gate. > > > > Locking in test results is in direct opposition to the existing > > policy and goals. Locking results would make it far more trivial to > > land such flakyness as you wouldn't need entire sets of jobs to pass > > before you could land. Instead you could rerun individual jobs until > > each one passed and then land the result. Potentially introducing > > significant flakyness with a single merge. > > > > Locking results is also not really something that fits well with the > > speculative gate queues that Zuul runs. Remember that Zuul constructs > > a future git state and tests that in parallel. Currently the state > > for OpenStack looks like: > > > > A - Nova > > ^ > > B - Glance > > ^ > > C - Neutron > > ^ > > D - Neutron > > ^ > > F - Neutron > > > > The B glance change is tested as if the A Nova change has already > > merged and so on down the queue. If we want to keep these speculative > > states we can't really have humans manually verify a failure can be > > ignored and retry it. Because we'd be enqueuing job builds at > > different stages of speculative state. Each job build would be > > testing a different version of the software. > > > > What we could do is implement a retry limit for failing jobs. Zuul > > could rerun failing jobs X times before giving up and reporting > > failure (this would require updates to Zuul). The problem with this > > approach is without some oversight it becomes very easy to land > > changes that make things worse. As a side note Zuul does do retries, > > but only for detected network errors or when a pre-run playbook > > fails. 
The assumption is that network failures are due to the dangers > > of the Internet, and that pre-run playbooks are small, self > > contained, unlikely to fail, and when they do fail the failure should > > be independent of what is being tested. > > > > Where does that leave us? > > > > I think it is worth considering the original goals of "clean check". > > We know that rechecking/rerunning only makes these problems worse in > > the long term. They represent technical debt. One of the reasons we > > run these tests is to show us when our software is broken. In the > > case of flaky results we are exposing this technical debt where it > > impacts the functionality of our software. The longer we avoid fixing > > these issues the worse it gets, and this is true even with "clean > > check". > > > > Do we as developers find value in knowing the software needs > > attention before it gets released to users? Do the users find value > > in running reliable software? In the past we have asserted that "yes, > > there is value in this", and have invested in tracking, > > investigating, and fixing these problems even if they happen > > infrequently. But that does require investment, and active > > maintenance. > > Thank you Clark! I agree with your view that the current setup provides > us with very valuable information about the health of the software we > are developing. I also agree that our primary goal should be to fix the > flaky tests instead of hiding the results under any kind of rechecks. > > Still I'm wondering what we will do if it turns out that the existing > developer bandwidth shrunk to the point where we simply not have the > capacity for fix these technical debts. What the stable team does on > stable branches in Extended Maintenance mode in a similar situation is > to simply turn off problematic test jobs. So I guess that is also a > valid last resort move. 
One option is to "trust" the core team more and grant them the explicit right to workflow +2 and force-merge a patch. Trust is in quotes because it's not really about trusting that the core teams can restrain themselves from blindly merging broken code; it's more that right now we entrust Zuul to be the final gatekeeper of our repo. When there is a known broken gate failure and we are trying to land a specific patch to, say, nova to fix or unblock the neutron gate, and we can see that the neutron DNM patch that depends on this nova fix passed, then we could entrust the core team in this specific case to override Zuul. I would expect this capability to be used very sparingly, but we do have some intermittent failures that we can tell are unrelated to the patch, like the current issue with volume attach/detach that results in kernel panics in the guest. If that is the only failure and all other tests passed in gate, I think it would be reasonable for the neutron team to approve a neutron patch that modifies security groups, for example; it is very clearly an unrelated failure. That might be an alternative to the recheck we have now, and by reserving it for the core team it limits the scope for abusing this. I do think the original goals of clean check are good, so really I would be suggesting this as an option for when check passed and we get an intermittent failure in gate that we would override. This would not address the issue in check, but it would make intermittent failures in gate much less painful. > > Cheers, > gibi > > > > > > > > Clark > > > > > From balazs.gibizer at est.tech Thu Nov 18 15:30:37 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 18 Nov 2021 16:30:37 +0100 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing Message-ID: <1RXR2R.FUJGJYZK1M7N3@est.tech> Hi, The centos 8 stream job is failing in 100% of the cases with mirror issues [2]. You probably need to hold your recheck until it is resolved. 
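For anyone watching for the job to recover, the failure rate can be computed from the builds list that the Zuul API in [1] returns. A minimal sketch — the "result" field and its values are assumptions about that API's JSON, and the HTTP fetch is omitted so the example stays self-contained:

```python
# Minimal sketch: estimate how often a job fails from a list of Zuul
# build results. The input shape (dicts with a "result" field holding
# values like "SUCCESS"/"FAILURE") is an assumption about the builds
# API JSON linked in this thread; fetching is left out on purpose.

def failure_rate(builds):
    """Fraction of finished builds that did not succeed."""
    finished = [b for b in builds if b.get("result") not in (None, "ABORTED")]
    if not finished:
        return 0.0
    failed = sum(1 for b in finished if b["result"] != "SUCCESS")
    return failed / len(finished)

# Canned sample standing in for a real API response.
sample = [
    {"result": "FAILURE"},
    {"result": "FAILURE"},
    {"result": "SUCCESS"},
    {"result": "FAILURE"},
    {"result": "ABORTED"},  # never finished, so not counted
]

print(failure_rate(sample))  # 3 of 4 finished builds failed -> 0.75
```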
cheers, gibi [1] https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream [2] https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 From cboylan at sapwetik.org Thu Nov 18 15:33:48 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 18 Nov 2021 07:33:48 -0800 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <4EVR2R.KF069X977ZIK2@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> Message-ID: <376b1cc3-5f3f-4e5f-8e4b-97312b033dc2@www.fastmail.com> On Thu, Nov 18, 2021, at 6:39 AM, Balazs Gibizer wrote: > On Wed, Nov 17 2021 at 07:51:57 AM -0800, Clark Boylan > wrote: >> On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: >>> >> >> Snip. I want to respond to a specific suggestion: >> >>> 3) there was informal discussion before about a possibility to >>> re-run >>> only some jobs with a recheck instead for re-running the whole set. >>> I >>> don't know if this is feasible with Zuul and I think this only treat >>> the symptom not the root case. But still this could be a direction >>> if >>> all else fails. >>> >> >> OpenStack has configured its check and gate queues with something >> we've called "clean check". This refers to the requirement that >> before an OpenStack project can be gated it must pass check tests >> first. This policy was instituted because a number of these >> infrequent but problematic issues were traced back to recheck >> spamming. Basically changes would show up and were broken. They would >> fail some percentage of the time. They got rechecked until they >> finally merged and now their failure rate is added to the whole. This >> rule was introduced to make it more difficult to get this flakyness >> into the gate. 
>> >> Locking in test results is in direct opposition to the existing >> policy and goals. Locking results would make it far more trivial to >> land such flakyness as you wouldn't need entire sets of jobs to pass >> before you could land. Instead you could rerun individual jobs until >> each one passed and then land the result. Potentially introducing >> significant flakyness with a single merge. >> >> Locking results is also not really something that fits well with the >> speculative gate queues that Zuul runs. Remember that Zuul constructs >> a future git state and tests that in parallel. Currently the state >> for OpenStack looks like: >> >> A - Nova >> ^ >> B - Glance >> ^ >> C - Neutron >> ^ >> D - Neutron >> ^ >> F - Neutron >> >> The B glance change is tested as if the A Nova change has already >> merged and so on down the queue. If we want to keep these speculative >> states we can't really have humans manually verify a failure can be >> ignored and retry it. Because we'd be enqueuing job builds at >> different stages of speculative state. Each job build would be >> testing a different version of the software. >> >> What we could do is implement a retry limit for failing jobs. Zuul >> could rerun failing jobs X times before giving up and reporting >> failure (this would require updates to Zuul). The problem with this >> approach is without some oversight it becomes very easy to land >> changes that make things worse. As a side note Zuul does do retries, >> but only for detected network errors or when a pre-run playbook >> fails. The assumption is that network failures are due to the dangers >> of the Internet, and that pre-run playbooks are small, self >> contained, unlikely to fail, and when they do fail the failure should >> be independent of what is being tested. >> >> Where does that leave us? >> >> I think it is worth considering the original goals of "clean check". 
>> We know that rechecking/rerunning only makes these problems worse in >> the long term. They represent technical debt. One of the reasons we >> run these tests is to show us when our software is broken. In the >> case of flaky results we are exposing this technical debt where it >> impacts the functionality of our software. The longer we avoid fixing >> these issues the worse it gets, and this is true even with "clean >> check". >> >> Do we as developers find value in knowing the software needs >> attention before it gets released to users? Do the users find value >> in running reliable software? In the past we have asserted that "yes, >> there is value in this", and have invested in tracking, >> investigating, and fixing these problems even if they happen >> infrequently. But that does require investment, and active >> maintenance. > > Thank you Clark! I agree with your view that the current setup provides > us with very valuable information about the health of the software we > are developing. I also agree that our primary goal should be to fix the > flaky tests instead of hiding the results under any kind of rechecks. > > Still I'm wondering what we will do if it turns out that the existing > developer bandwidth shrunk to the point where we simply not have the > capacity for fix these technical debts. What the stable team does on > stable branches in Extended Maintenance mode in a similar situation is > to simply turn off problematic test jobs. So I guess that is also a > valid last resort move. Absolutely reduce scope if necessary. We run a huge assortment of jobs because we've added support for the kitchen sink to OpenStack. If we can't continue to reliably test those features then it should be completely valid to remove testing and probably deprecate and remove the features as well. Historically we've done this for things like postgresql support so this isn't a new problem. 
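For what it's worth, demoting or dropping a job is a small project-config change. A hypothetical .zuul.yaml fragment — the job name is invented — showing a flaky job kept visible in check but made non-voting:

```yaml
# Hypothetical fragment, not from any real project; the job name is
# invented. "voting: false" keeps the job reporting in check while it
# no longer blocks merges; leaving it out of gate removes it there.
- project:
    check:
      jobs:
        - example-flaky-scenario-job:
            voting: false
```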
> > Cheers, > gibi From cboylan at sapwetik.org Thu Nov 18 15:46:00 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 18 Nov 2021 07:46:00 -0800 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> Message-ID: <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> On Thu, Nov 18, 2021, at 7:19 AM, Sean Mooney wrote: > On Thu, 2021-11-18 at 15:39 +0100, Balazs Gibizer wrote: >> >> On Wed, Nov 17 2021 at 07:51:57 AM -0800, Clark Boylan >> wrote: >> > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: >> > > >> > >> > Snip. I want to respond to a specific suggestion: >> > >> > > 3) there was informal discussion before about a possibility to >> > > re-run >> > > only some jobs with a recheck instead for re-running the whole set. >> > > I >> > > don't know if this is feasible with Zuul and I think this only treat >> > > the symptom not the root case. But still this could be a direction >> > > if >> > > all else fails. >> > > >> > >> > OpenStack has configured its check and gate queues with something >> > we've called "clean check". This refers to the requirement that >> > before an OpenStack project can be gated it must pass check tests >> > first. This policy was instituted because a number of these >> > infrequent but problematic issues were traced back to recheck >> > spamming. Basically changes would show up and were broken. They would >> > fail some percentage of the time. They got rechecked until they >> > finally merged and now their failure rate is added to the whole. This >> > rule was introduced to make it more difficult to get this flakyness >> > into the gate. >> > >> > Locking in test results is in direct opposition to the existing >> > policy and goals. 
Locking results would make it far more trivial to >> > land such flakyness as you wouldn't need entire sets of jobs to pass >> > before you could land. Instead you could rerun individual jobs until >> > each one passed and then land the result. Potentially introducing >> > significant flakyness with a single merge. >> > >> > Locking results is also not really something that fits well with the >> > speculative gate queues that Zuul runs. Remember that Zuul constructs >> > a future git state and tests that in parallel. Currently the state >> > for OpenStack looks like: >> > >> > A - Nova >> > ^ >> > B - Glance >> > ^ >> > C - Neutron >> > ^ >> > D - Neutron >> > ^ >> > F - Neutron >> > >> > The B glance change is tested as if the A Nova change has already >> > merged and so on down the queue. If we want to keep these speculative >> > states we can't really have humans manually verify a failure can be >> > ignored and retry it. Because we'd be enqueuing job builds at >> > different stages of speculative state. Each job build would be >> > testing a different version of the software. >> > >> > What we could do is implement a retry limit for failing jobs. Zuul >> > could rerun failing jobs X times before giving up and reporting >> > failure (this would require updates to Zuul). The problem with this >> > approach is without some oversight it becomes very easy to land >> > changes that make things worse. As a side note Zuul does do retries, >> > but only for detected network errors or when a pre-run playbook >> > fails. The assumption is that network failures are due to the dangers >> > of the Internet, and that pre-run playbooks are small, self >> > contained, unlikely to fail, and when they do fail the failure should >> > be independent of what is being tested. >> > >> > Where does that leave us? >> > >> > I think it is worth considering the original goals of "clean check". >> > We know that rechecking/rerunning only makes these problems worse in >> > the long term. 
They represent technical debt. One of the reasons we >> > run these tests is to show us when our software is broken. In the >> > case of flaky results we are exposing this technical debt where it >> > impacts the functionality of our software. The longer we avoid fixing >> > these issues the worse it gets, and this is true even with "clean >> > check". >> > >> > Do we as developers find value in knowing the software needs >> > attention before it gets released to users? Do the users find value >> > in running reliable software? In the past we have asserted that "yes, >> > there is value in this", and have invested in tracking, >> > investigating, and fixing these problems even if they happen >> > infrequently. But that does require investment, and active >> > maintenance. >> >> Thank you Clark! I agree with your view that the current setup provides >> us with very valuable information about the health of the software we >> are developing. I also agree that our primary goal should be to fix the >> flaky tests instead of hiding the results under any kind of rechecks. >> >> Still I'm wondering what we will do if it turns out that the existing >> developer bandwidth shrunk to the point where we simply not have the >> capacity for fix these technical debts. What the stable team does on >> stable branches in Extended Maintenance mode in a similar situation is >> to simply turn off problematic test jobs. So I guess that is also a >> valid last resort move. > > one option is to "trust" the core team more and grant them explict > rigth to workflow +2 and force merge a patch. > > trust is in quotes because its not really about trusting that the core > teams can restrain themselve form blindly merging > broken code but more a case of right now we entrust zuul to be the > final gate keeper of our repo. 
> > When there are known broken gate failure and we are trying to land > specific patch to say nova to fix or unblock the nuetron > gate and we can see the neutron DNM patch that depens on this nova fix > passsed then we could entrust the core team in this specific case > to override zuul. We do already give you this option via the removal of tests that are invalid/flaky/not useful. I do worry that if we give a complete end around the CI system it will be quickly abused. We stopped requiring a bug on rechecks because we quickly realized that no one was actually debugging the failure and identifying the underlying issue. Instead they would just recheck with an arbitrary or completely wrong bug identified. I expect similar would happen here. And the end result would be that CI would simply get more flaky and unreliable for the next change. If instead we fix or remove the flaky tests/jobs we'll end up with a system that is more reliable for the next change. > > i would expect this capablity to be used very spareinly but we do have > some intermitent failures that happen that we can tell? > are unrelated to the patch like the curernt issue with volumne > attach/detach that result in kernel panics in the guest. > if that is the only failure and all other test passed in gate i think > it woudl be reasonable for a the neutron team to approve a neutron patch > that modifies security groups for example. its very clearly an > unrealted failure. As noted above, it would also be reasonable to stop running tests that cannot function. We do need to be careful that we don't remove tests and never fix the underlying issues though. We should also remember that if we have these problems in CI there is a high chance that our users will have these problems in production later (we've helped more than one of the infra donor clouds identify bugs straight out of elastic-recheck information in the past so this does happen). 
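An elastic-recheck fingerprint is essentially a stored log query tied to a tracked bug. A hypothetical example of such an entry — the bug number, log message, and field names are invented for illustration, so check the real queries repository for the exact format:

```yaml
# Hypothetical elastic-recheck style fingerprint; the message and
# field names here are invented for illustration. Entries like this
# are stored per-bug (e.g. queries/1234567.yaml) and matched against
# indexed CI logs to link a recurring failure to its bug report.
query: >-
  message:"Kernel panic - not syncing" AND
  filename:"console.html" AND
  build_status:"FAILURE"
```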
> > that might be an alternivie to the recheck we have now and by resreving > that for the core team it limits the scope for abusing this. > > i do think that the orginal goes of green check are good so really i > would be suggesting this as an option for when check passed and we get > an intermient > failure in gate that we woudl override. > > this would not adress the issue in check but it would make itermitent > failure in gate much less painful. > > I tried to make this point in my previous email, but I think we are still fumbling around it. If we provide mechanisms to end around flaky CI instead of fixing flaky CI the end result will be flakier CI. I'm not convinced that we'll be happier with any mechanism that doesn't remove the -1 from happening in the first place. Instead the problems will accelerate and eventually we'll be unable to rely on CI for anything useful. From zigo at debian.org Thu Nov 18 15:50:58 2021 From: zigo at debian.org (Thomas Goirand) Date: Thu, 18 Nov 2021 16:50:58 +0100 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> Message-ID: On 11/18/21 2:03 AM, Mohammed Naser wrote: > > > On Wed, Nov 17, 2021 at 5:52 PM Thomas Goirand > wrote: > > On 11/17/21 10:54 PM, Dan Smith wrote: > >> I don't think we rely on /healthcheck -- there's nothing healthy > about > >> an API endpoint blindly returning a 200 OK. > >> > >> You might as well just hit / and accept 300 as a code and that's > >> exactly the same behaviour.? I support what Sean is bringing up here > >> and I don't think it makes sense to have a noop /healthcheck that > >> always gives a 200 OK...seems a bit useless imho > > > > Yup, totally agree. 
Our previous concerns over a healthcheck that > > checked all of nova returning too much info to be useful (for > something > > trying to figure out if an individual worker is healthy) apply in > > reverse to one that returns too little to be useful. > > > > I agree, what Sean is working on is the right balance and that we > should > > focus on that. > > > > --Dan > > > > That's not the only thing it does. It also is capable of being disabled, > which is useful for maintenance: one can gracefully remove an API node > for removal this way, which one cannot do with the root. > > > I feel like this should be handled by whatever layer that needs to drain > requests for maintenance, otherwise also it might just be the same as > turning off the service, no? It's not the same. If you just turn off the service, there may well be some requests attempted against the API before it is seen as down. The idea here is to declare the API as down so that haproxy can remove it from the pool *before* the service is really turned off. That's what the oslo.middleware disable file helps with, which the root URL cannot do. Cheers, Thomas Goirand (zigo) From dms at danplanet.com Thu Nov 18 16:15:06 2021 From: dms at danplanet.com (Dan Smith) Date: Thu, 18 Nov 2021 08:15:06 -0800 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> (Clark Boylan's message of "Thu, 18 Nov 2021 07:46:00 -0800") References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> Message-ID: 
Despite "trust" in the core team, and even using a less-loaded word than "abuse," I really don't think that even allowing the option to override flaky tests by force merge is the right solution (at all). > I tried to make this point in my previous email, but I think we are > still fumbling around it. If we provide mechanisms to end around flaky > CI instead of fixing flaky CI the end result will be flakier CI. I'm > not convinced that we'll be happier with any mechanism that doesn't > remove the -1 from happening in the first place. Instead the problems > will accelerate and eventually we'll be unable to rely on CI for > anything useful. Agreed. Either the tests are useful or they aren't. Even if they're not very reliable, they might be useful in causing pain because they continue to highlight flaky behavior until it gets fixed. --Dan From fungi at yuggoth.org Thu Nov 18 16:24:31 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 18 Nov 2021 16:24:31 +0000 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> Message-ID: <20211118162207.nu4ktuikasoxyaq2@yuggoth.org> On 2021-11-18 08:15:06 -0800 (-0800), Dan Smith wrote: [...] > Absolutely agree, humans are not good at making these decisions. > Despite "trust" in the core team, and even using a less-loaded > word than "abuse," I really don't think that even allowing the > option to override flaky tests by force merge is the right > solution (at all). [...] Just about any time we Gerrit admins have decided to bypass testing to merge some change (and to be clear, we really don't like to if we can avoid it), we introduce a new test-breaking bug we then need to troubleshoot and fix. 
It's a humbling reminder that even though you may feel absolutely sure something's safe to merge without passing tests, you're probably wrong. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Thu Nov 18 16:45:17 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 18 Nov 2021 10:45:17 -0600 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: <1RXR2R.FUJGJYZK1M7N3@est.tech> References: <1RXR2R.FUJGJYZK1M7N3@est.tech> Message-ID: <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> ---- On Thu, 18 Nov 2021 09:30:37 -0600 Balazs Gibizer wrote ---- > Hi, > > The centos 8 stream job is failing in 100% of the cases with mirror > issues [2]. You probably need to hold you recheck until it is resolved. did we make it voting too early? In devstack, centos8-stream is still non-voting [1], which I think means we were waiting for it to stabilize first (or maybe yoctozepto knows). 
[1] https://github.com/openstack/devstack/blob/487057de80df936f96f0b7364f4abfc8a7561d55/.zuul.yaml#L620 -gmann > > cheers, > gibi > > [1] > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream > [2] > https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 > > > > From balazs.gibizer at est.tech Thu Nov 18 16:47:08 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 18 Nov 2021 17:47:08 +0100 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> References: <1RXR2R.FUJGJYZK1M7N3@est.tech> <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> Message-ID: On Thu, Nov 18 2021 at 10:45:17 AM -0600, Ghanshyam Mann wrote: > ---- On Thu, 18 Nov 2021 09:30:37 -0600 Balazs Gibizer > wrote ---- > > Hi, > > > > The centos 8 stream job is failing in 100% of the cases with mirror > > issues [2]. You probably need to hold you recheck until it is > resolved. > > did we make it voting too early? In devstack centos8-stream is still > non-voting [1] which I think we were waiting for its stability first > (or > may be yoctozepto knows). OK I made a mistake it is only voting in nova as far as I see[1]. But there was now a green run[1] so maybe the problem is resolved. 
cheers, gibi [1] https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream&voting=1 > > [1] > https://github.com/openstack/devstack/blob/487057de80df936f96f0b7364f4abfc8a7561d55/.zuul.yaml#L620 > > -gmann > > > > cheers, > > gibi > > > > [1] > > > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream > > [2] > > > https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 > > > > > > > > From gmann at ghanshyammann.com Thu Nov 18 16:47:21 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 18 Nov 2021 10:47:21 -0600 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <20211118162207.nu4ktuikasoxyaq2@yuggoth.org> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> <20211118162207.nu4ktuikasoxyaq2@yuggoth.org> Message-ID: <17d33f20e01.11ee7cdc8911479.3734135720231660826@ghanshyammann.com> ---- On Thu, 18 Nov 2021 10:24:31 -0600 Jeremy Stanley wrote ---- > On 2021-11-18 08:15:06 -0800 (-0800), Dan Smith wrote: > [...] > > Absolutely agree, humans are not good at making these decisions. > > Despite "trust" in the core team, and even using a less-loaded > > word than "abuse," I really don't think that even allowing the > > option to override flaky tests by force merge is the right > > solution (at all). > [...] > > Just about any time we Gerrit admins have decided to bypass testing > to merge some change (and to be clear, we really don't like to if we > can avoid it), we introduce a new test-breaking bug we then need to > troubleshoot and fix. It's a humbling reminder that even though you > may feel absolutely sure something's safe to merge without passing > tests, you're probably wrong. Indeed. 
I too agree here; it can lead to the situation of 'hey, my patch was all good, can you just +W this?', which can end up in more unstable tests/code. -gmann > -- > Jeremy Stanley > From gmann at ghanshyammann.com Thu Nov 18 16:49:05 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 18 Nov 2021 10:49:05 -0600 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: References: <1RXR2R.FUJGJYZK1M7N3@est.tech> <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> Message-ID: <17d33f3a647.c62909f7911582.7656301661228191981@ghanshyammann.com> ---- On Thu, 18 Nov 2021 10:47:08 -0600 Balazs Gibizer wrote ---- > > > > On Thu, Nov 18 2021 at 10:45:17 AM -0600, Ghanshyam Mann > wrote: > > ---- On Thu, 18 Nov 2021 09:30:37 -0600 Balazs Gibizer > > wrote ---- > > > Hi, > > > > > > The centos 8 stream job is failing in 100% of the cases with mirror > > > issues [2]. You probably need to hold you recheck until it is > > resolved. > > > > did we make it voting too early? In devstack centos8-stream is still > > non-voting [1] which I think we were waiting for its stability first > > (or > > may be yoctozepto knows). > > OK I made a mistake it is only voting in nova as far as I see[1]. But > there was now a green run[1] so maybe the problem is resolved. +1, as in the Yoga testing runtime we are moving to centos9-stream. Let me try making the devstack job voting as well, so that any concrete job like 'tempest-integrated-compute-centos-8-stream' can be considered voting and we capture the issue in devstack first, before it breaks on the project side. 
-gmann > > cheers, > gibi > > [1] > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream&voting=1 > > > > > [1] > > https://github.com/openstack/devstack/blob/487057de80df936f96f0b7364f4abfc8a7561d55/.zuul.yaml#L620 > > > > -gmann > > > > > > cheers, > > > gibi > > > > > > [1] > > > > > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream > > > [2] > > > > > https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 > > > > > > > > > > > > > > > From balazs.gibizer at est.tech Thu Nov 18 18:39:14 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 18 Nov 2021 19:39:14 +0100 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: <17d33f3a647.c62909f7911582.7656301661228191981@ghanshyammann.com> References: <1RXR2R.FUJGJYZK1M7N3@est.tech> <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> <17d33f3a647.c62909f7911582.7656301661228191981@ghanshyammann.com> Message-ID: Since then we had two green runs so I think the mirror issue has been resolved. cheers, gibi On Thu, Nov 18 2021 at 10:49:05 AM -0600, Ghanshyam Mann wrote: > ---- On Thu, 18 Nov 2021 10:47:08 -0600 Balazs Gibizer > wrote ---- > > > > > > On Thu, Nov 18 2021 at 10:45:17 AM -0600, Ghanshyam Mann > > wrote: > > > ---- On Thu, 18 Nov 2021 09:30:37 -0600 Balazs Gibizer > > > wrote ---- > > > > Hi, > > > > > > > > The centos 8 stream job is failing in 100% of the cases with > mirror > > > > issues [2]. You probably need to hold you recheck until it is > > > resolved. > > > > > > did we make it voting too early? In devstack centos8-stream is > still > > > non-voting [1] which I think we were waiting for its stability > first > > > (or > > > may be yoctozepto knows). > > > > OK I made a mistake it is only voting in nova as far as I see[1]. > But > > there was now a green run[1] so maybe the problem is resolved. 
> > +1, as in yoga testing runtime we are moving to centos9-stream. let me > try the devstack job also voting so that any concrete job like > ' tempest-integrated-compute-centos-8-stream' can be consider as > voting > and we capture the issue in devstack first before it break on project > side. > > -gmann > > > > > cheers, > > gibi > > > > [1] > > > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream&voting=1 > > > > > > > > [1] > > > > https://github.com/openstack/devstack/blob/487057de80df936f96f0b7364f4abfc8a7561d55/.zuul.yaml#L620 > > > > > > -gmann > > > > > > > > cheers, > > > > gibi > > > > > > > > [1] > > > > > > > > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream > > > > [2] > > > > > > > > https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 > > > > > > > > > > > > > > > > > > > > > > From fungi at yuggoth.org Thu Nov 18 18:45:53 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 18 Nov 2021 18:45:53 +0000 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: References: <1RXR2R.FUJGJYZK1M7N3@est.tech> <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> <17d33f3a647.c62909f7911582.7656301661228191981@ghanshyammann.com> Message-ID: <20211118184553.ha6eqxq744vsc6mo@yuggoth.org> On 2021-11-18 19:39:14 +0100 (+0100), Balazs Gibizer wrote: > Since then we had two green runs so I think the mirror issue has been > resolved. [...] Yes, apparently mirror.centos.org was broken. We reported the problem to Red Hat staff, and it was shortly addressed. Our next mirror sync after that solved it across our CI mirror network and CentOS Stream 8 based jobs stopped failing. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From franck.vedel at univ-grenoble-alpes.fr Thu Nov 18 18:58:07 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Thu, 18 Nov 2021 19:58:07 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: <63143C21-F494-4544-86F3-22792BF062E6@univ-grenoble-alpes.fr> ok ... I got it ... and I think I was doing things wrong. Okay, so I have another question. My cinder storage is on an iscsi bay. I have 3 servers, S1, S2, S2. Compute is on S1, S2, S3. Controller is on S1 and S2. Storage is on S3. I have Glance on S1. Building an instance from an image is too long, so you have to make a volume first. If I put the images on the iSCSI bay, I mount a directory in the S1 file system, will the images build faster? Much faster ? Is this a good idea or not? Thank you again for your help and your experience Franck > Le 18 nov. 2021 ? 07:23, Ignazio Cassano a ?crit : > > Hello, i solved using the following variabile in globals.yml: > glance_file_datadir_volume=somedir > and glance_backend_file="yes' > > So if the somedir is a nfs mount point, controllers can share images. Remember you have to deploy glance on all controllers. > Ignazio > > Il Mer 17 Nov 2021, 23:17 Franck VEDEL > ha scritto: > Hello and thank you for the help. > I was able to move forward on my problem, without finding a satisfactory solution. > Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. 
> so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. > I don't understand what happened. There is something wrong. > > Is it normal that after updating the certificates, all instances are turned off? > thanks again > > Franck > >> Le 17 nov. 2021 ? 21:11, Cyril Roelandt > a ?crit : >> >> Hello, >> >> >> On 2021-11-17 08:59, Franck VEDEL wrote: >>> Hello everyone >>> >>> I have a strange problem and I haven't found the solution yet. >>> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. >>> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. >>> >>> I am trying to create a new instance to check general operation. ERROR. >>> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). >> >> We'd like to see the logs as well, especially the stacktrace. >> >>> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). >>> >>> I create an empty volume: it works. >>> I am creating a volume from an image: Failed. >> >> What commands are you running? What's the output? What's in the logs? >> >>> >>> However, I have my list of ten images in glance. >>> >>> I create a new image and create a volume with this new image: it works. >>> I create an instance with this new image: OK. >>> >>> What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. >>> Is there a way to fix this, or do we have to reinstall them all? 
>> >> What's your configuration? What version of OpenStack are you running? >> >> >> >> Cyril >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Thu Nov 18 19:19:51 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 18 Nov 2021 20:19:51 +0100 Subject: [neutron] Drivers meeting - Friday 18.11.2021 - cancelled Message-ID: Hi Neutron Drivers! Due to the lack of agenda, let's cancel tomorrow's drivers meeting. See you at the meeting next week. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Nov 18 19:34:39 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 18 Nov 2021 20:34:39 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: <63143C21-F494-4544-86F3-22792BF062E6@univ-grenoble-alpes.fr> References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> <63143C21-F494-4544-86F3-22792BF062E6@univ-grenoble-alpes.fr> Message-ID: I did not understand very well how your infrastructure is set up. Generally speaking, I prefer to have 3 controllers, n compute nodes and external storage. I think that with iSCSI, images must be downloaded and converted from qcow2 to raw format, and that can take a long time. In that case I used the image cache. You can probably see a download phase when you create a volume from an image. If you use the image cache, the download is executed only the first time a volume is created from that image. Sorry for my bad English. Take a look at https://docs.openstack.org/cinder/latest/admin/blockstorage-image-volume-cache.html#:~:text=Image%2DVolume%20cache%C2%B6,end%20can%20clone%20a%20volume . Ignazio Il Gio 18 Nov 2021, 19:58 Franck VEDEL ha scritto: > ok ... I got it ... and I think I was doing things wrong. > Okay, so I have another question. > My cinder storage is on an iscsi bay. > I have 3 servers, S1, S2, S2.
> Compute is on S1, S2, S3. > Controller is on S1 and S2. > Storage is on S3. > I have Glance on S1. Building an instance from an image is too long, so > you have to make a volume first. > If I put the images on the iSCSI bay, I mount a directory in the S1 file > system, will the images build faster? Much faster ? > Is this a good idea or not? > > Thank you again for your help and your experience > > > Franck > > Le 18 nov. 2021 ? 07:23, Ignazio Cassano a > ?crit : > > Hello, i solved using the following variabile in globals.yml: > glance_file_datadir_volume=somedir > and glance_backend_file="yes' > > So if the somedir is a nfs mount point, controllers can share images. > Remember you have to deploy glance on all controllers. > Ignazio > > Il Mer 17 Nov 2021, 23:17 Franck VEDEL < > franck.vedel at univ-grenoble-alpes.fr> ha scritto: > >> Hello and thank you for the help. >> I was able to move forward on my problem, without finding a satisfactory >> solution. >> Normally, I have 2 servers with the role [glance] but I noticed that all >> my images were on the first server (in / var / lib / docker / volumes / >> glance / _data / images) before the reconfigure, none on the second. But >> since the reconfiguration, the images are placed on the second, and no >> longer on the first. I do not understand why. I haven't changed anything to >> the multinode file. >> so, to get out of this situation quickly as I need this openstack for the >> students, I modified the multinode file and put only one server in [glance] >> (I put server 1, the one that had the images before reconfigure), I did a >> reconfigure -t glance and now I have my images usable for instances. >> I don't understand what happened. There is something wrong. >> >> Is it normal that after updating the certificates, all instances are >> turned off? >> thanks again >> >> Franck >> >> Le 17 nov. 2021 ? 
21:11, Cyril Roelandt a ?crit : >> >> Hello, >> >> >> On 2021-11-17 08:59, Franck VEDEL wrote: >> >> Hello everyone >> >> I have a strange problem and I haven't found the solution yet. >> Following a certificate update I had to do a "kolla-ansible -t multinode >> reconfigure ?. >> Well, after several attempts (it is not easy to use certificates with >> Kolla-ansible, and from my advice, not documented enough for beginners), I >> have my new functional certificates. Perfect ... well almost. >> >> I am trying to create a new instance to check general operation. ERROR. >> Okay, I look in the logs and I see that Cinder is having problems >> creating volumes with an error that I never had ("TypeError: 'NoneType' >> object is not iterable). >> >> >> We'd like to see the logs as well, especially the stacktrace. >> >> I dig and then I wonder if it is not the Glance images which cannot be >> used, while they are present (openstack image list is OK). >> >> I create an empty volume: it works. >> I am creating a volume from an image: Failed. >> >> >> What commands are you running? What's the output? What's in the logs? >> >> >> However, I have my list of ten images in glance. >> >> I create a new image and create a volume with this new image: it works. >> I create an instance with this new image: OK. >> >> What is the problem ? The images present before the "reconfigure" are >> listed, visible in horizon for example, but unusable. >> Is there a way to fix this, or do we have to reinstall them all? >> >> >> What's your configuration? What version of OpenStack are you running? >> >> >> >> Cyril >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Thu Nov 18 20:57:55 2021 From: smooney at redhat.com (Sean Mooney) Date: Thu, 18 Nov 2021 20:57:55 +0000 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <20211118162207.nu4ktuikasoxyaq2@yuggoth.org> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> <20211118162207.nu4ktuikasoxyaq2@yuggoth.org> Message-ID: On Thu, 2021-11-18 at 16:24 +0000, Jeremy Stanley wrote: > On 2021-11-18 08:15:06 -0800 (-0800), Dan Smith wrote: > [...] > > Absolutely agree, humans are not good at making these decisions. > > Despite "trust" in the core team, and even using a less-loaded > > word than "abuse," I really don't think that even allowing the > > option to override flaky tests by force merge is the right > > solution (at all). > [...] > > Just about any time we Gerrit admins have decided to bypass testing > to merge some change (and to be clear, we really don't like to if we > can avoid it), we introduce a new test-breaking bug we then need to > troubleshoot and fix. It's a humbling reminder that even though you > may feel absolutely sure something's safe to merge without passing > tests, you're probably wrong. Well, the example I gave is a failure in the interaction between nova and cinder that shows up in the neutron gate. There is no way the neutron patch under review could cause that failure, and I chose a specific example of the intermittent failures we have with the compute volume detach tests, where it looks like the bug is actually in the tempest test.
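For context on the mechanics being discussed: while a proper fix is pending, a Zuul job can temporarily mask a known-flaky test class with an exclude regex. A sketch of what that looks like follows — the job name is made up, and the `tempest_exclude_regex` variable name is an assumption based on the devstack-tempest base job; this is an illustration of the trade-off (it costs coverage), not a recommendation to actually skip LiveMigrationTest:

```yaml
# .zuul.yaml fragment (illustrative only)
- job:
    name: my-project-tempest-flaky-masked
    parent: devstack-tempest
    vars:
      # Temporarily skip the known-flaky class; remove once fixed.
      tempest_exclude_regex: 'tempest\.api\.compute\.admin\.test_live_migration'
```

This is exactly the "disable the test and lose coverage" option weighed later in this thread, so it only makes sense as a stop-gap with a tracking bug attached.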
https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_67c/810915/2/gate/nova-live-migration/67c89da/testr_results.html It appears that, for some reason, attaching a cinder volume and live migrating the VM while the kernel/OS in the VM is still booting up can result in a kernel panic. This has been an ongoing battle to solve for many weeks, and there is no way that a change in a neutron, glance or keystone patch could have caused the guest kernel to crash. https://bugs.launchpad.net/nova/+bug/1950310 and https://bugs.launchpad.net/nova/+bug/1939108 are two of the related bugs. If other projects run tempest.api.compute.admin.test_live_migration.LiveMigrationTest* in any of their jobs, however, they could have been impacted by this. Lee Yarwood has started implementing a very old tempest spec, https://specs.openstack.org/openstack/qa-specs/specs/tempest/implemented/ssh-auth-strategy.html, for this, and we think that will fix the test failure: https://review.opendev.org/c/openstack/tempest/+/817772/2 I suspect we have many other cases in tempest today with intermittent failures caused by the guest OS not being ready before we do operations on the guest, beyond the current volume attach/detach issues. I did not suggest allowing the CI to be overridden because I think that is generally a good idea (it's not), but sometimes there are failures that we are actively trying to fix and have not found a solution for in months. I'm pretty sure this live migration test prevented patches to the ironic virt driver from landing not so long ago, requiring several retries. The ironic virt driver obviously does not support live migration, and the change was not touching any other part of nova, so the failure was unrelated.
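As a back-of-the-envelope illustration of why a handful of intermittent failures hurts so much at the gate (the numbers below are made up, not measured from any real job): a run only goes green if every flaky test happens to pass, so even low per-test flake rates compound quickly across a large test set.

```python
# Toy model of gate flakiness; all numbers are illustrative.

def run_pass_probability(flake_rate: float, flaky_tests: int) -> float:
    """Probability that one CI run hits no intermittent failure."""
    return (1.0 - flake_rate) ** flaky_tests


def expected_attempts(flake_rate: float, flaky_tests: int) -> float:
    """Expected runs (1 + rechecks) until a fully green result."""
    return 1.0 / run_pass_probability(flake_rate, flaky_tests)


if __name__ == "__main__":
    for n in (10, 50):
        p = run_pass_probability(0.02, n)
        print(f"{n} tests failing 2% of the time: "
              f"run passes {p:.0%} of the time, "
              f"~{expected_attempts(0.02, n):.1f} attempts to merge")
```

With 50 such tests each failing only 2% of the time, roughly one run in three is green, which matches the multi-recheck experience described in this thread.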
https://review.opendev.org/c/openstack/nova/+/799327 is the change I was thinking of; the master version needed 3 rechecks and the backport needed 6 more: https://review.opendev.org/c/openstack/nova/+/799772 That may actually have been caused by https://bugs.launchpad.net/nova/+bug/1931702 which is another bug for a similar kernel panic, but I would not be surprised if it was actually the same root cause. I think that point was lost in my original message. The point I was trying to make is that sometimes the failure is not about the code under review; it is because the test is wrong. We should fix the test, but it can be very frustrating if you recheck something 3-4 times, where it passes in check and fails in gate, for something you know is unrelated, yet you don't want to disable the test because you don't want to lose coverage for something that only fails a small fraction of the time. regards sean. From rlandy at redhat.com Thu Nov 18 23:52:46 2021 From: rlandy at redhat.com (Ronelle Landy) Date: Thu, 18 Nov 2021 18:52:46 -0500 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3MOP2R.O83SZVO0NWN23@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> Message-ID: On Wed, Nov 17, 2021 at 5:22 AM Balazs Gibizer wrote: > > > On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski > wrote: > > Hi, > > > > Recently I spent some time to check how many rechecks we need in > > Neutron to > > get patch merged and I compared it to some other OpenStack projects > > (see [1] > > for details). > > TL;DR - results aren't good for us and I think we really need to do > > something > > with that. > > I really like the idea of collecting such stats. Thank you for doing > it. I can even imagine to make a public dashboard somewhere with this > information as it is a good indication about the health of our projects > / testing.
> > > > > Of course "easiest" thing to say is that we should fix issues which > > we are > > hitting in the CI to make jobs more stable. But it's not that easy. > > We are > > struggling with those jobs for very long time. We have CI related > > meeting > > every week and we are fixing what we can there. > > Unfortunately there is still bunch of issues which we can't fix so > > far because > > they are intermittent and hard to reproduce locally or in some cases > > the > > issues aren't realy related to the Neutron or there are new bugs > > which we need > > to investigate and fix :) > > > I have couple of suggestion based on my experience working with CI in > nova. > We've struggled with unstable tests in TripleO as well. Here are some things we tried and implemented: 1. Created job dependencies so we only ran check tests once we knew we had the resources we needed (example we had pulled containers successfully) 2. Moved some testing to third party where we have easier control of the environment (note that third party cannot stop a change merging) 3. Used dependency pipelines to pre-qualify some dependencies ahead of letting them run wild on our check jobs 4. Requested testproject runs of changes in a less busy environment before running a full set of tests in a public zuul 5. Used a skiplist to keep track of tech debt and skip known failures that we could temporarily ignore to keep CI moving along if we're waiting on an external fix. > > 1) we try to open bug reports for intermittent gate failures too and > keep them tagged in a list [1] so when a job fail it is easy to check > if the bug is known. > > 2) I offer my help here now that if you see something in neutron runs > that feels non neutron specific then ping me with it. Maybe we are > struggling with the same problem too. > > 3) there was informal discussion before about a possibility to re-run > only some jobs with a recheck instead for re-running the whole set. 
I > don't know if this is feasible with Zuul and I think this only treat > the symptom not the root case. But still this could be a direction if > all else fails. > > Cheers, > gibi > > > So this is never ending battle for us. The problem is that we have > > to test > > various backends, drivers, etc. so as a result we have many jobs > > running on > > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs > > in check > > and 14 jobs in gate queue. > > > > In the past we made a lot of improvements, like e.g. we improved > > irrelevant > > files lists for jobs to run less jobs on some of the patches, > > together with QA > > team we did "integrated-networking" template to run only Neutron and > > Nova > > related scenario tests in the Neutron queues, we removed and > > consolidated some > > of the jobs (there is still one patch in progress for that but it > > should just > > remove around 2 jobs from the check queue). All of that are good > > improvements > > but still not enough to make our CI really stable :/ > > > > Because of all of that, I would like to ask community about any other > > ideas > > how we can improve that. If You have any ideas, please send it in > > this email > > thread or reach out to me directly on irc. > > We want to discuss about them in the next video CI meeting which will > > be on > > November 30th. If You would have any idea and would like to join that > > discussion, You are more than welcome in that meeting of course :) > > > > [1] > > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ > > 025759.html > > > [1] > > https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_last_updated&start=0 > > > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gregory.orange at pawsey.org.au Fri Nov 19 00:33:01 2021 From: gregory.orange at pawsey.org.au (Gregory Orange) Date: Fri, 19 Nov 2021 08:33:01 +0800 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Message-ID: <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au> On 18/11/21 8:33 pm, Michał Nasiadka wrote: > 1) Deprecate and drop binary type of Kolla images My globals.yaml has #kolla_install_type: "binary" So will this mean it needs to switch to 'source' from Yoga onward, and the containers will have to be built somehow? Can you point me to something to read on that? > 2) Use a common base (single Linux distribution) for Kolla images I'm not experienced enough with Kolla to know whether this will affect me, so I will roll with it and figure it out as we go. Thanks, Greg. From franck.vedel at univ-grenoble-alpes.fr Fri Nov 19 07:41:21 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Fri, 19 Nov 2021 08:41:21 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> <63143C21-F494-4544-86F3-22792BF062E6@univ-grenoble-alpes.fr> Message-ID: Hello Ignazio, and thank you for all this information. I also think that a structure with 3 servers may not be built properly when, once again, you arrive on such a project without help (human help, that is; you can find documents and documentation, but you have to take them in the right order among many different directions, choose the right OS, not run into a bug (vpnaas for me), do tests, etc.). You have to make choices in order to move forward. I agree that I probably didn't do things the best way, and I regret it. Thank you for this help on how the images work. Yes, in my case the images can be used after the "download" because they are in Qcow2.
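For anyone following along, the image-volume cache Ignazio refers to is enabled per backend in cinder.conf. A minimal sketch — the backend section name, UUIDs and size limits are illustrative placeholders; the option names are those from the cinder admin guide linked earlier in the thread:

```ini
# cinder.conf (illustrative fragment)
[DEFAULT]
# The cache owns its cached volumes under an internal tenant.
cinder_internal_tenant_project_id = <project-uuid>
cinder_internal_tenant_user_id = <user-uuid>

[iscsi-backend-1]
image_volume_cache_enabled = True
# Optional caps; leaving them unset means unlimited.
image_volume_cache_max_size_gb = 200
image_volume_cache_max_count = 50
```

The first volume created from a given image still pays the download-and-convert cost; subsequent volumes are cloned from the cached copy, which is what makes later instance builds fast.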
I will change this; I did not understand it. It is clear that if a professional came to see my OpenStack, they would tell me what is wrong and what I need to change, but hey, in the end, it still works a bit. Thanks Ignazio, really. Franck > Le 18 nov. 2021 à 20:34, Ignazio Cassano a écrit : > > I did not understand very well how your infrastructure is done. > Generally speaking, I prefer to have 3 controllers , n computer nodes and external storage. > I think using iscsi images must be downloaded and converted from qcow2 to raw format and it can takes a long time. In this case I used image cache. Probably when you create a volume from image you can see a download phase. If you use image cache the download is executed only the first time a volume from that image is created. > Sorry for my bad english. > Take a look at > https://docs.openstack.org/cinder/latest/admin/blockstorage-image-volume-cache.html#:~:text=Image%2DVolume%20cache%C2%B6,end%20can%20clone%20a%20volume . > Ignazio > > > Il Gio 18 Nov 2021, 19:58 Franck VEDEL > ha scritto: > ok ... I got it ... and I think I was doing things wrong. > Okay, so I have another question. > My cinder storage is on an iscsi bay. > I have 3 servers, S1, S2, S2. > Compute is on S1, S2, S3. > Controller is on S1 and S2. > Storage is on S3. > I have Glance on S1. Building an instance from an image is too long, so you have to make a volume first. > If I put the images on the iSCSI bay, I mount a directory in the S1 file system, will the images build faster? Much faster ? > Is this a good idea or not? > > Thank you again for your help and your experience > > > Franck > >> Le 18 nov. 2021 à 07:23, Ignazio Cassano > a écrit : >> >> Hello, i solved using the following variabile in globals.yml: >> glance_file_datadir_volume=somedir >> and glance_backend_file="yes' >> >> So if the somedir is a nfs mount point, controllers can share images. Remember you have to deploy glance on all controllers.
>> Ignazio >> >> Il Mer 17 Nov 2021, 23:17 Franck VEDEL > ha scritto: >> Hello and thank you for the help. >> I was able to move forward on my problem, without finding a satisfactory solution. >> Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. >> so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. >> I don't understand what happened. There is something wrong. >> >> Is it normal that after updating the certificates, all instances are turned off? >> thanks again >> >> Franck >> >>> Le 17 nov. 2021 ? 21:11, Cyril Roelandt > a ?crit : >>> >>> Hello, >>> >>> >>> On 2021-11-17 08:59, Franck VEDEL wrote: >>>> Hello everyone >>>> >>>> I have a strange problem and I haven't found the solution yet. >>>> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. >>>> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. >>>> >>>> I am trying to create a new instance to check general operation. ERROR. >>>> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). >>> >>> We'd like to see the logs as well, especially the stacktrace. 
>>> >>>> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). >>>> >>>> I create an empty volume: it works. >>>> I am creating a volume from an image: Failed. >>> >>> What commands are you running? What's the output? What's in the logs? >>> >>>> >>>> However, I have my list of ten images in glance. >>>> >>>> I create a new image and create a volume with this new image: it works. >>>> I create an instance with this new image: OK. >>>> >>>> What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. >>>> Is there a way to fix this, or do we have to reinstall them all? >>> >>> What's your configuration? What version of OpenStack are you running? >>> >>> >>> >>> Cyril >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnasiadka at gmail.com Fri Nov 19 08:18:31 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Fri, 19 Nov 2021 09:18:31 +0100 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Message-ID: <756EB466-CBAF-43C6-8360-4B5B0F0FE873@gmail.com> Hi Sean, > On 18 Nov 2021, at 16:04, Sean Mooney wrote: > > On Thu, 2021-11-18 at 13:33 +0100, Micha? Nasiadka wrote: >> Hello Koalas, >> >> On the PTG we have discussed two topics: >> >> 1) Deprecate and drop binary type of Kolla images >> 2) Use a common base (single Linux distribution) for Kolla images >> >> This is a call for feedback - for people that have not been attending the PTG. >> >> What this essentially mean for consumers: >> >> 1) In Yoga cycle we will deprecate binary type of Kolla images, and in Z cycle those will be dropped. >> 2) We are not going to support CentOS Stream 9 (cs9) as a base operating system, and the source type build will rely on CentOS Stream 8 in Z release. 
>> 3) Beginning from A release Kolla will build only Debian source images - but Kolla-Ansible will still support deployment of those images on CentOS/Ubuntu/Debian Host operating systems (and Rocky Linux to be added in Yoga to that mix). >> >> Justification: >> The Kolla project team is limited in numbers, therefore supporting current broad mix of operating systems (especially with CentOS Stream 9 ,,on the way??) is a significant maintenance burden. >> By dropping binary type of images - users would be running more tested images (since Kolla/Kolla-Ansible CI runs source images jobs as voting). >> In Xena we?ve already changed the default image type Kolla-Ansible uses to source. >> We also feel that using a unified base OS for Kolla container images is a way to remove some of the maintenance burden (including CI cycles and >> >> Request for feedback: >> If any of those changes is a no go from your perspective - we?d like to hear your opinions. > > i only have reason to use kolla-ansible on small personal clouds at home so either way this will have limited direct affect on me but just wanted to give some toughts: > kolla has been my favorit way to deploy openstack for a very long time, one of the reason i liked it was the fact that it was distro independent, simple ot use/configure and it support source and > binary installs. i have almost always defautl to ubuntu source although in the rare case si used centos i always used centos binary. > > i almost never used binary installs on debian based disto and on rpm based distos never really used soruce, im not sure if im the only one that did that. > its not because i trusted the rpm package more by the way it just seam to be waht was tested more when i cahged to kolla devs on irc so i avoid ubuntu bindary and centos source as a result. > > with that in mind debian source is not really contoversial to me, i had one quetion on that however. 
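To make the build side of this plan concrete for readers: with the binary type gone, operators building their own images would end up with settings along these lines. The `base` and `install_type` option names follow kolla-build's configuration as I understand it, and the namespace and values are made up:

```ini
# /etc/kolla/kolla-build.conf (illustrative fragment)
[DEFAULT]
# Single supported distribution, per the plan discussed above
base = debian
install_type = source
namespace = example-registry/kolla
```

The same can be passed on the command line instead of a config file; either way, once binary is dropped the install_type knob effectively disappears and only the base selection remains.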
> will the support for the other distos be kept in the kolla images but not extended or will it be dropped. i asume the plan is to remvoe the templating for other distos in A based on point 3 above. > Yes, currently that?s the plan - to remove templating for other distros and entries for them in kolla-build code. > the only other thing i wanted to point out is that while i have had some succes gettign ubuntu soruce image to run on centos host in the past it will be tricky if kolla every want to supprot > selinx/apparmor. that was the main barrier i faced but there can be other. speficialy ovs and libvirt can be somewhat picky about the kernel on which they run. most of the openstack service will > likely not care that the contaienr os does not match the host but some of the "system" depenciy like libvirt/ovs might. a way to address taht would be to supprot using external images for those servie > form dockerhub/quay e.g. use the offical upstream mariadb image or libvirt or rabbit if that is infact a porblem. > We?ve been discussing about moving OVS and Libvirt deployment to be deployed on OS level (not in containers) - if that will be required. > anyway its totally understandable that if you do not have contirbutor that are able to support the other distors that you would remove the supprot. > espicaly with the move away form using kolla image in ooo of late and presumable a reduction in redhat contibutions to keep centos supprot alive. > > anyway the chagne while sad to see just form a proejct health point of vew are sad would not be enough to prevent me personally form using or recommendign kolla and kolla-ansibel. > if you had prorpsed the opistte of centos binary only that would be much more concerning to me. there are some advantages to source installs that you are preservging in this change > that i am glad will not be lost. > That?s good to see. 
> one last tought is that if only one disto will be supproted for building images in the future with only soruce installs supproted, it might be worth considering if alpine or the alpine/debian-lite > python 3 iamge should be revaluated as the base for an even lighter set of contianers rather then the base os image. i am kind fo assumeing that at some poitn the non python noopenstack contaienr > would be used form the offial images at some point rahter then kolla contining to maintain them with the above suggestion. So this may not be applicable now and really would likely be an A or post A > thing. > That was in the PTG notes, but I don?t think we have considered a separate base OS for python-based Kolla images (and a separate for those that need it or use official Docker Hub/Quay.io images for those services). Thanks for your input! > >> >> Best regards, >> Michal Nasiadka -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnasiadka at gmail.com Fri Nov 19 08:19:33 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Fri, 19 Nov 2021 09:19:33 +0100 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au> References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au> Message-ID: Hi Gregory, > On 19 Nov 2021, at 01:33, Gregory Orange wrote: > > On 18/11/21 8:33 pm, Micha? Nasiadka wrote: >> 1) Deprecate and drop binary type of Kolla images > > My globals.yaml has > #kolla_install_type: "binary" > > So will this mean it needs to switch to 'source' from Yoga onward, and the containers will have to be built some how? Can you point me to something to read on that? > If that?s commented out, that means with Xena you will get source images deployed (during an upgrade or on a fresh install). 
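The setting Gregory and Michał are discussing lives in Kolla-Ansible's globals.yml. A purely illustrative excerpt follows (the values below are examples only; on Xena, leaving kolla_install_type commented out already resolves to "source"):

```yaml
# globals.yml (excerpt) - illustrative values, not a recommendation.
# On Xena the default is already "source"; setting it explicitly
# documents the intent and survives future default changes.
kolla_install_type: "source"

# Base distro of the container images - independent of the host OS.
kolla_base_distro: "ubuntu"
openstack_release: "xena"
```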
>> 2) Use a common base (single Linux distribution) for Kolla images
>
> I'm not experienced enough with Kolla to know whether this will affect me, so I will roll with it and figure it out as we go.
>
> Thanks,
> Greg.

Thanks,
Michal

From ignaziocassano at gmail.com Fri Nov 19 09:50:18 2021
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Fri, 19 Nov 2021 10:50:18 +0100
Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure
In-Reply-To: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr>
References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr>
Message-ID:

Hello Franck,
Glance is not deployed on all nodes by default; I ran into the same problem.
In my case I have 3 controllers. I created an NFS share on a storage server to store the images on.
Before deploying Glance, I create /var/lib/glance/images on the 3 controllers and I mount the NFS share.
This is my fstab on the 3 controllers:

10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime

In my globals.yml I have:

glance_file_datadir_volume: "/var/lib/glance"
glance_backend_file: "yes"

This means the images are in /var/lib/glance, and since it is an NFS share, all my 3 controllers can share the images.
Then you must deploy. To be sure the Glance container is started on all controllers, since I have 3 controllers, I deployed 3 times, changing the order in the inventory.

First time:
[control]
A
B
C

Second time:
[control]
B
C
A

Third time:
[control]
C
B
A

Or you can deploy Glance 3 times using -t glance and -l

As for the instances being stopped: I hit a bug with a version of Kolla.
https://bugs.launchpad.net/kolla-ansible/+bug/1941706
It is now fixed, and with Kolla 12.2.0 it works.

Ignazio

On Wed, 17 Nov 2021 at 23:17, Franck VEDEL <franck.vedel at univ-grenoble-alpes.fr> wrote:

> Hello and thank you for the help.
> I was able to move forward on my problem, without finding a satisfactory solution.
> Normally, I have 2 servers with the role [glance], but I noticed that all my images were on the first server (in /var/lib/docker/volumes/glance/_data/images) before the reconfigure, and none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything in the multinode file.
> So, to get out of this situation quickly, as I need this OpenStack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before the reconfigure), I did a reconfigure -t glance, and now I have my images usable for instances.
> I don't understand what happened. There is something wrong.
>
> Is it normal that after updating the certificates, all instances are turned off?
> Thanks again
>
> Franck
>
> On 17 Nov 2021, at 21:11, Cyril Roelandt wrote:
>
> Hello,
>
> On 2021-11-17 08:59, Franck VEDEL wrote:
>
> Hello everyone
>
> I have a strange problem and I haven't found the solution yet.
> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure".
> Well, after several attempts (it is not easy to use certificates with Kolla-Ansible and, in my opinion, it is not documented enough for beginners), I have my new functional certificates. Perfect... well, almost.
>
> I am trying to create a new instance to check general operation. ERROR.
> Okay, I look in the logs and I see that Cinder is having problems creating volumes, with an error that I never had before ("TypeError: 'NoneType' object is not iterable").
>
> We'd like to see the logs as well, especially the stacktrace.
>
> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK).
>
> I create an empty volume: it works.
> I create a volume from an image: it fails.
>
> What commands are you running? What's the output? What's in the logs?
>
> However, I have my list of ten images in Glance.
>
> I create a new image and create a volume with this new image: it works.
> I create an instance with this new image: OK.
>
> What is the problem? The images present before the "reconfigure" are listed, visible in Horizon for example, but unusable.
> Is there a way to fix this, or do we have to reinstall them all?
>
> What's your configuration? What version of OpenStack are you running?
>
> Cyril

From radoslaw.piliszek at gmail.com Fri Nov 19 11:03:36 2021
From: radoslaw.piliszek at gmail.com (Radosław Piliszek)
Date: Fri, 19 Nov 2021 12:03:36 +0100
Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure
In-Reply-To:
References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr>
Message-ID:

If one sets glance_file_datadir_volume to a non-default value, then glance-api gets deployed on all hosts.

-yoctozepto

On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano wrote:
>
> Hello Franck,
> Glance is not deployed on all nodes by default; I ran into the same problem.
> In my case I have 3 controllers.
> I created an NFS share on a storage server to store the images on.
> Before deploying Glance, I create /var/lib/glance/images on the 3 controllers and I mount the NFS share.
> This is my fstab on the 3 controllers:
>
> 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime
>
> In my globals.yml I have:
> glance_file_datadir_volume: "/var/lib/glance"
> glance_backend_file: "yes"
>
> This means the images are in /var/lib/glance, and since it is an NFS share, all my 3 controllers can share the images.
> Then you must deploy.
> To be sure the Glance container is started on all controllers, since I have 3 controllers, I deployed 3 times, changing the order in the inventory.
> First time:
> [control]
> A
> B
> C
>
> Second time:
> [control]
> B
> C
> A
>
> Third time:
> [control]
> C
> B
> A
>
> Or you can deploy Glance 3 times using -t glance and -l
>
> As for the instances being stopped: I hit a bug with a version of Kolla.
> https://bugs.launchpad.net/kolla-ansible/+bug/1941706
> It is now fixed, and with Kolla 12.2.0 it works.
> Ignazio
>
> On Wed, 17 Nov 2021 at 23:17, Franck VEDEL <franck.vedel at univ-grenoble-alpes.fr> wrote:
>>
>> Hello and thank you for the help.
>> I was able to move forward on my problem, without finding a satisfactory solution.
>> Normally, I have 2 servers with the role [glance], but I noticed that all my images were on the first server (in /var/lib/docker/volumes/glance/_data/images) before the reconfigure, and none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything in the multinode file.
>> So, to get out of this situation quickly, as I need this OpenStack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before the reconfigure), I did a reconfigure -t glance, and now I have my images usable for instances.
>> I don't understand what happened. There is something wrong.
>>
>> Is it normal that after updating the certificates, all instances are turned off?
>> Thanks again
>>
>> Franck
>>
>> On 17 Nov 2021, at 21:11, Cyril Roelandt wrote:
>>
>> Hello,
>>
>> On 2021-11-17 08:59, Franck VEDEL wrote:
>>
>> Hello everyone
>>
>> I have a strange problem and I haven't found the solution yet.
>> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure".
>> Well, after several attempts (it is not easy to use certificates with Kolla-Ansible and, in my opinion, it is not documented enough for beginners), I have my new functional certificates. Perfect... well, almost.
>>
>> I am trying to create a new instance to check general operation. ERROR.
>> Okay, I look in the logs and I see that Cinder is having problems creating volumes, with an error that I never had before ("TypeError: 'NoneType' object is not iterable").
>>
>> We'd like to see the logs as well, especially the stacktrace.
>>
>> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK).
>>
>> I create an empty volume: it works.
>> I create a volume from an image: it fails.
>>
>> What commands are you running? What's the output? What's in the logs?
>>
>> However, I have my list of ten images in Glance.
>>
>> I create a new image and create a volume with this new image: it works.
>> I create an instance with this new image: OK.
>>
>> What is the problem? The images present before the "reconfigure" are listed, visible in Horizon for example, but unusable.
>> Is there a way to fix this, or do we have to reinstall them all?
>>
>> What's your configuration? What version of OpenStack are you running?
>>
>> Cyril

From gregory.orange at pawsey.org.au Fri Nov 19 11:27:56 2021
From: gregory.orange at pawsey.org.au (Gregory Orange)
Date: Fri, 19 Nov 2021 19:27:56 +0800
Subject: [kolla] Plan to deprecate binary and unify on single distribution
In-Reply-To:
References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au>
Message-ID: <402bf683-8c9d-64f9-16c8-aacb4169a7b6@pawsey.org.au>

On 19/11/21 4:19 pm, Michał Nasiadka wrote:
>> My globals.yaml has
>> #kolla_install_type: "binary"
>>
>> So will this mean it needs to switch to 'source' from Yoga onward, and the containers will have to be built somehow? Can you point me to something to read on that?
>>
> If that's commented out, that means with Xena you will get source images deployed (during an upgrade or on a fresh install).
So am I going to need to familiarise myself with some build process, such as https://docs.openstack.org/kolla/train/admin/image-building.html ?

From swogatpradhan22 at gmail.com Fri Nov 19 11:41:58 2021
From: swogatpradhan22 at gmail.com (Swogat Pradhan)
Date: Fri, 19 Nov 2021 17:11:58 +0530
Subject: [Openstack-victoria] [IRONIC] error in network interface | baremetal node validate
In-Reply-To:
References:
Message-ID:

Can someone please suggest a way forward for this issue?

On Tue, Nov 16, 2021 at 3:43 PM Swogat Pradhan wrote:
> Hi,
> I am currently trying to set up OpenStack Ironic using the IPMI driver.
> I followed the official OpenStack docs for setting everything up.
>
> When I run openstack baremetal node validate $NODE_UUID, I get the following error for the network interface in the command output:
>
> * Unexpected exception, traceback saved into log by ironic conductor service that is running on controller: 'ServiceTokenAuthWrapper' object has no attribute '_discovery_cache' *
>
> When I check the ironic-conductor logs I see the following messages:
>
>
> Can anyone suggest a solution or a way forward.
>
> With regards
> Swogat Pradhan

From smooney at redhat.com Fri Nov 19 12:02:01 2021
From: smooney at redhat.com (Sean Mooney)
Date: Fri, 19 Nov 2021 12:02:01 +0000
Subject: [kolla] Plan to deprecate binary and unify on single distribution
In-Reply-To: <402bf683-8c9d-64f9-16c8-aacb4169a7b6@pawsey.org.au>
References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au> <402bf683-8c9d-64f9-16c8-aacb4169a7b6@pawsey.org.au>
Message-ID: <2ca211d9b9429b1358810b671cdc29246b0a7e4d.camel@redhat.com>

On Fri, 2021-11-19 at 19:27 +0800, Gregory Orange wrote:
> On 19/11/21 4:19 pm, Michał Nasiadka wrote:
> > > My globals.yaml has
> > > #kolla_install_type: "binary"
> > >
> > > So will this mean it needs to switch to 'source' from Yoga onward, and the containers will have to be built somehow? Can you point me to something to read on that?
> > >
> > If that's commented out, that means with Xena you will get source images deployed (during an upgrade or on a fresh install).
>
> So am I going to need to familiarise myself with some build process, such as https://docs.openstack.org/kolla/train/admin/image-building.html ?

No - the source images are also published to the Docker Hub/Quay registries, so if you are not building them today and are pulling them from there, it will work the same.
If you are building images today, there is no real difference from a command-line point of view between source and binary; the main difference is that the source images install all the dependencies from pip, and the services from the official tarballs, into a Python virtual env within the container image.
So unless you have explicitly set kolla_install_type: binary, or the distro, you should not need to change anything in your globals.yml, ideally.

From ignaziocassano at gmail.com Fri Nov 19 13:50:30 2021
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Fri, 19 Nov 2021 14:50:30 +0100
Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure
In-Reply-To:
References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr>
Message-ID:

Franck, this will help you a lot.
Thanks Radosław
Ignazio

On Fri, 19 Nov 2021 at 12:03, Radosław Piliszek <radoslaw.piliszek at gmail.com> wrote:

> If one sets glance_file_datadir_volume to a non-default value, then glance-api gets deployed on all hosts.
>
> -yoctozepto
>
> On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano wrote:
> >
> > Hello Franck,
> > Glance is not deployed on all nodes by default; I ran into the same problem.
> > In my case I have 3 controllers.
> > I created an NFS share on a storage server to store the images on.
> > Before deploying Glance, I create /var/lib/glance/images on the 3 controllers and I mount the NFS share.
> > This is my fstab on the 3 controllers:
> >
> > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime
> >
> > In my globals.yml I have:
> > glance_file_datadir_volume: "/var/lib/glance"
> > glance_backend_file: "yes"
> >
> > This means the images are in /var/lib/glance, and since it is an NFS share, all my 3 controllers can share the images.
> > Then you must deploy.
> > To be sure the Glance container is started on all controllers, since I have 3 controllers, I deployed 3 times, changing the order in the inventory.
> > First time:
> > [control]
> > A
> > B
> > C
> >
> > Second time:
> > [control]
> > B
> > C
> > A
> >
> > Third time:
> > [control]
> > C
> > B
> > A
> >
> > Or you can deploy Glance 3 times using -t glance and -l
> >
> > As for the instances being stopped: I hit a bug with a version of Kolla.
> > https://bugs.launchpad.net/kolla-ansible/+bug/1941706
> > It is now fixed, and with Kolla 12.2.0 it works.
> > Ignazio
> >
> > On Wed, 17 Nov 2021 at 23:17, Franck VEDEL <franck.vedel at univ-grenoble-alpes.fr> wrote:
> >>
> >> Hello and thank you for the help.
> >> I was able to move forward on my problem, without finding a satisfactory solution.
> >> Normally, I have 2 servers with the role [glance], but I noticed that all my images were on the first server (in /var/lib/docker/volumes/glance/_data/images) before the reconfigure, and none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything in the multinode file.
> >> So, to get out of this situation quickly, as I need this OpenStack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before the reconfigure), I did a reconfigure -t glance, and now I have my images usable for instances.
> >> I don't understand what happened. There is something wrong.
> >>
> >> Is it normal that after updating the certificates, all instances are turned off?
> >> Thanks again
> >>
> >> Franck
> >>
> >> On 17 Nov 2021, at 21:11, Cyril Roelandt wrote:
> >>
> >> Hello,
> >>
> >> On 2021-11-17 08:59, Franck VEDEL wrote:
> >>
> >> Hello everyone
> >>
> >> I have a strange problem and I haven't found the solution yet.
> >> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure".
> >> Well, after several attempts (it is not easy to use certificates with Kolla-Ansible and, in my opinion, it is not documented enough for beginners), I have my new functional certificates. Perfect... well, almost.
> >>
> >> I am trying to create a new instance to check general operation. ERROR.
> >> Okay, I look in the logs and I see that Cinder is having problems creating volumes, with an error that I never had before ("TypeError: 'NoneType' object is not iterable").
> >>
> >> We'd like to see the logs as well, especially the stacktrace.
> >>
> >> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK).
> >>
> >> I create an empty volume: it works.
> >> I create a volume from an image: it fails.
> >>
> >> What commands are you running? What's the output? What's in the logs?
> >>
> >> However, I have my list of ten images in Glance.
> >>
> >> I create a new image and create a volume with this new image: it works.
> >> I create an instance with this new image: OK.
> >>
> >> What is the problem? The images present before the "reconfigure" are listed, visible in Horizon for example, but unusable.
> >> Is there a way to fix this, or do we have to reinstall them all?
> >>
> >> What's your configuration? What version of OpenStack are you running?
> >>
> >> Cyril

From dtantsur at redhat.com Fri Nov 19 15:00:45 2021
From: dtantsur at redhat.com (Dmitry Tantsur)
Date: Fri, 19 Nov 2021 16:00:45 +0100
Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors
In-Reply-To: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com>
References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com>
Message-ID:

On Tue, Nov 16, 2021 at 6:05 PM Clark Boylan wrote:

> Hello,
>
> The OpenStack tenant in Zuul currently has 134 config errors. You can find these errors at https://zuul.opendev.org/t/openstack/config-errors or by clicking the blue bell icon in the top right of https://zuul.opendev.org/t/openstack/status. The vast majority of these errors appear related to project renames that have been requested of OpenDev, or to project retirements. Can you please look into fixing these, as they can be an attractive nuisance when debugging Zuul problems (they also indicate that a number of your jobs are probably not working).
>
> Project renames creating issues:
>
> * openstack/python-tempestconf -> osf/python-tempestconf -> openinfra/python-tempestconf
> * openstack/refstack -> osf/refstack -> openinfra/refstack
> * x/tap-as-a-service -> openstack/tap-as-a-service
> * openstack/networking-l2gw -> x/networking-l2gw
>
> Project retirements creating issues:
>
> * openstack/neutron-lbaas
> * recordsansible/ara
>
> Projects whose configs have errors:
>
> * openinfra/python-tempestconf
> * openstack/heat
> * openstack/ironic

Ironic failures are missing jobs on Queens and Rocky.
I'd avoid touching these branches unless our problems are breaking someone.

Dmitry

> * openstack/kolla-ansible
> * openstack/kuryr-kubernetes
> * openstack/murano-apps
> * openstack/networking-midonet
> * openstack/networking-odl
> * openstack/neutron
> * openstack/neutron-fwaas
> * openstack/python-troveclient
> * openstack/senlin
> * openstack/tap-as-a-service
> * openstack/zaqar
> * x/vmware-nsx
> * openinfra/openstackid
> * openstack/barbican
> * openstack/cookbook-openstack-application-catalog
> * openstack/heat-dashboard
> * openstack/manila-ui
> * openstack/python-manilaclient
>
> Let us know if we can help decipher any errors,
> Clark

--
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill

From tobias.urdin at binero.com Fri Nov 19 15:22:08 2021
From: tobias.urdin at binero.com (Tobias Urdin)
Date: Fri, 19 Nov 2021 15:22:08 +0000
Subject: [kolla] Plan to deprecate binary and unify on single distribution
In-Reply-To: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com>
References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com>
Message-ID: <19643763-8235-45B3-91A8-B8A0839E31F6@binero.com>

Interesting. This is probably a little bit off-topic, but I find it very interesting that a majority of the big OpenStack clouds out there are running container-based, with a lot of them using "LOKI", which was talked about so much in the OpenInfra Live keynotes.

What I don't understand is that, with all these limited resources, there is no joint effort in the OpenStack ecosystem to solve the container deliverables issue and then just have all deployments/tooling use the same. Maybe that is what they are doing, though, using Kolla images? But then, wouldn't they contribute more, and the below not be a problem? *makes me wonder*

Sorry for the off-topic loud thinking.

Best regards
Tobias

> On 18 Nov 2021, at 13:33, Michał Nasiadka wrote:
>
> Hello Koalas,
>
> At the PTG we discussed two topics:
>
> 1) Deprecate and drop the binary type of Kolla images
> 2) Use a common base (single Linux distribution) for Kolla images
>
> This is a call for feedback - for people that have not attended the PTG.
>
> What this essentially means for consumers:
>
> 1) In the Yoga cycle we will deprecate the binary type of Kolla images, and in the Z cycle those will be dropped.
> 2) We are not going to support CentOS Stream 9 (cs9) as a base operating system, and the source type build will rely on CentOS Stream 8 in the Z release.
> 3) Beginning from the A release Kolla will build only Debian source images - but Kolla-Ansible will still support deployment of those images on CentOS/Ubuntu/Debian host operating systems (and Rocky Linux to be added in Yoga to that mix).
>
> Justification:
> The Kolla project team is limited in numbers, therefore supporting the current broad mix of operating systems (especially with CentOS Stream 9 "on the way") is a significant maintenance burden.
> By dropping the binary type of images, users would be running better-tested images (since Kolla/Kolla-Ansible CI runs the source image jobs as voting).
> In Xena we've already changed the default image type Kolla-Ansible uses to source.
> We also feel that using a unified base OS for Kolla container images is a way to remove some of the maintenance burden (including CI cycles and
>
> Request for feedback:
> If any of those changes is a no-go from your perspective - we'd like to hear your opinions.
I really welcome Sean?s proposal on a real healthcheck framework that would actually tell you that something is not working instead of trying to find for example RabbitMQ connection issues from logs, it really is a pain. I wouldn?t want to have an ?real? healthcheck that does all these things exposed on public API though and think Sean?s proposal is correct and does not break backward capability since oslo.healthcheck middleware will still be there. Best regards Tobias > On 18 Nov 2021, at 16:50, Thomas Goirand wrote: > > On 11/18/21 2:03 AM, Mohammed Naser wrote: >> >> >> On Wed, Nov 17, 2021 at 5:52 PM Thomas Goirand > > wrote: >> >> On 11/17/21 10:54 PM, Dan Smith wrote: >>>> I don't think we rely on /healthcheck -- there's nothing healthy >> about >>>> an API endpoint blindly returning a 200 OK. >>>> >>>> You might as well just hit / and accept 300 as a code and that's >>>> exactly the same behaviour. I support what Sean is bringing up here >>>> and I don't think it makes sense to have a noop /healthcheck that >>>> always gives a 200 OK...seems a bit useless imho >>> >>> Yup, totally agree. Our previous concerns over a healthcheck that >>> checked all of nova returning too much info to be useful (for >> something >>> trying to figure out if an individual worker is healthy) apply in >>> reverse to one that returns too little to be useful. >>> >>> I agree, what Sean is working on is the right balance and that we >> should >>> focus on that. >>> >>> --Dan >>> >> >> That's not the only thing it does. It also is capable of being disabled, >> which is useful for maintenance: one can gracefully remove an API node >> for removal this way, which one cannot do with the root. >> >> >> I feel like this should be handled by whatever layer that needs to drain >> requests for maintenance, otherwise also it might just be the same as >> turning off the service, no? > > It's not the same. 
> > If you just turn off the service, there well may be some requests > attempted to the API before it's seen as down. The idea here, is to > declare the API as down, so that haproxy can remove it from the pool > *before* the service is really turned off. > > That's what the oslo.middleware disable file helps doing, which the root > url cannot do. > > Cheers, > > Thomas Goirand (zigo) > From DHilsbos at performair.com Fri Nov 19 16:29:00 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Fri, 19 Nov 2021 16:29:00 +0000 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Message-ID: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> All; I feel like I've dealt with this issue before, but I can't find any records of it. I've been swapping out the compute nodes in my cluster for newer and better hardware. We also decided to abandon CentOS. All the differences mean that we haven't been able to do live migrations. I now have 2 servers with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like to get live migration working again. I configured passwordless ssh access between the servers for the nova users to get cold migration working. I have also configured passwordless ssh for the root users in accordance with [1]. When I try to do a live migration, the origin server generates this error, in the nova-compute log: 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed At one point, I came across a tutorial on configuring live-migration for libvirt, which included a bunch of user configuration. 
I don't remember having to do that before, but is that what I need to be looking for? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com 1: https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations From gauurav.sabharwal at in.ibm.com Fri Nov 19 03:40:02 2021 From: gauurav.sabharwal at in.ibm.com (Gauurav Sabharwal1) Date: Fri, 19 Nov 2021 09:10:02 +0530 Subject: [cinder] : SAN migration In-Reply-To: References: Message-ID: Hi Sumit, No response. Regards Gauurav Sabharwal IBM India Pvt. Ltd. IBM towers Ground floor, Block -A , Plot number 26, Sector 62, Noida Gautam budhnagar UP-201307. Email:gauurav.sabharwal at in.ibm.com Mobile No.: +91-9910159277 From: "Sumit Marwah" To: "Gauurav Sabharwal1" Cc: openstack-discuss at lists.openstack.org, "Pravin P Kudav" Date: 19-11-2021 07:55 Subject: [EXTERNAL] Re: [cinder] : SAN migration Hi Gauurav, Did you receive any response from OpenStack? Thanks, Sumit Marwah Technology & Sales Director, APJ | Brocade Broadcom Mobile: +6596452240 1 Yishun Ave 7 | Singapore sumit.marwah at broadcom.com | broadcom.com On Wed, Nov 3, 2021 at 12:50 PM Gauurav Sabharwal1 < gauurav.sabharwal at in.ibm.com> wrote: Hi Experts, I need some expert advice on one of our scenarios: I have multiple isolated OpenStack clusters running the Train and Rocky releases. Each OpenStack cluster environment has its own isolated SAN (Cisco fabric) and storage (HP, EMC & IBM) infrastructure. The company is now planning to refresh its SAN infrastructure by procuring new Brocade SAN switches. But there are some migration relevant challenges we have. 
As we understand it, only one type of FC zone manager is supported under one Cinder instance. The customer currently has Cisco configured and managed. Is it possible to configure FC zone managers from two different vendors under one Cinder instance? Migration of the SAN zoning is supposed to happen in an offline way from the OpenStack point of view: we will be migrating all ports of each existing Cisco fabric to Brocade, with the zone configuration applied using the Brocade CLI. Our main concern is how, after the migration, the Cinder DB will be updated with the new zone info and paths via the Brocade SAN. Regards Gauurav Sabharwal IBM India Pvt. Ltd. IBM towers Ground floor, Block -A , Plot number 26, Sector 62, Noida Gautam budhnagar UP-201307. Email:gauurav.sabharwal at in.ibm.com Mobile No.: +91-9910159277 This electronic communication and the information and any files transmitted with it, or attached to it, are confidential and are intended solely for the use of the individual or entity to whom it is addressed and may contain information that is confidential, legally privileged, protected by privacy laws, or otherwise restricted from disclosure to anyone else. If you are not the intended recipient or the person responsible for delivering the e-mail to the intended recipient, you are hereby notified that any use, copying, distributing, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited. If you received this e-mail in error, please return the e-mail to the sender, delete it from your computer, and destroy any printed copy of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From sumit.marwah at broadcom.com Fri Nov 19 02:24:37 2021 From: sumit.marwah at broadcom.com (Sumit Marwah) Date: Fri, 19 Nov 2021 10:24:37 +0800 Subject: [cinder] : SAN migration In-Reply-To: References: Message-ID: Hi Gauurav, Did you receive any response from OpenStack? Thanks, Sumit Marwah Technology & Sales Director, APJ | Brocade Broadcom Mobile: +6596452240 1 Yishun Ave 7 | Singapore sumit.marwah at broadcom.com | broadcom.com On Wed, Nov 3, 2021 at 12:50 PM Gauurav Sabharwal1 < gauurav.sabharwal at in.ibm.com> wrote: > Hi Experts , > > I need some expert advise of one of the scenario, I have multiple isolated > OpenStack cluster running with train & rocky edition. Each OpenStack > cluster environment have it's own isolated infrastructure of SAN ( CISCO > fabric ) & Storage ( HP, EMC & IBM). > > Now company planning to refresh their SAN infrastructure. By procuring new > Brocade SAN switches. But there are some migration relevant challenges we > have. > > 1. As we understand under one cinder instance only one typer of FC > zone manager is supported . Currently customer configured & managing CISCO > .* Is it possible to configure two different vendor FC Zone manager > under one cinder instance.* > 2. Migration of SAN zoning is supposedly going to be happen offline > way from OpenStack point of view. We will be migrating all ports of each > existing cisco fabric to Brocade with zone configuration using brocade CLI. > *Our main concern is that after migration How CINDER DB update new > zone info & path via Brocade SAN.* > > > > Regards > Gauurav Sabharwal > IBM India Pvt. Ltd. > IBM towers > Ground floor, Block -A , Plot number 26, > Sector 62, Noida > Gautam budhnagar UP-201307. 
> Email:gauurav.sabharwal at in.ibm.com > Mobile No.: +91-9910159277 > > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Nov 19 17:44:52 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 19 Nov 2021 11:44:52 -0600 Subject: [all][tc] Yoga testing runtime update. Action needed if py3.9 job failing for your project Message-ID: <17d394d1377.da5099ba980738.5329502106648098442@ghanshyammann.com> Hello Everyone. As discussed in TC PTG, we have updated the testing runtime for the Yoga cycle [1]. Changes are: * Add Debian 11 as the tested distro * Change centos stream 8 -> centos stream 9 * Bump lowest python version to test to 3.8 and highest to python 3.9 ** This removes python 3.6 from testing. I pushed the job template update[2] which will make py3.9 unit test job voting (which is non-voting currently). I do not see any projects failing consistently on py3.9[3] but still, I will keep it as -W until early next week (23rd Nov). If any project needs time to fix the failing py3.9 job, please do it before 23rd Nov. 
[1] https://governance.openstack.org/tc/reference/runtimes/yoga.html [2] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/818609 [3] https://zuul.openstack.org/builds?job_name=openstack-tox-py39&branch=master&result=FAILURE -gmann From mthode at mthode.org Fri Nov 19 17:56:50 2021 From: mthode at mthode.org (Matthew Thode) Date: Fri, 19 Nov 2021 11:56:50 -0600 Subject: [requirements][keystone] pysaml2 and oslo.policy not able to be updated due to keystone test failures Message-ID: <20211119175650.b4rb3nwsv7u34q5k@mthode.org> more or less as the title states the following reviews show the failures pysaml2: https://review.opendev.org/818612 oslo.policy: https://review.opendev.org/815820 -- Matthew Thode From kennelson11 at gmail.com Fri Nov 19 18:10:33 2021 From: kennelson11 at gmail.com (Kendall Nelson) Date: Fri, 19 Nov 2021 10:10:33 -0800 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: Fair enough. Was just curious if there was some technical reason. I think it would make more sense for the SDKs to live together, personally, but I can also see how having it live inside OpenStack can be daunting for a new, external contributor. -Kendall On Mon, Nov 15, 2021 at 4:32 PM Emilien Macchi wrote: > Hey Kendall, > > On Mon, Nov 15, 2021 at 11:59 AM Kendall Nelson > wrote: > >> Is there a reason why you don't want it to be under the openstack >> namespace? >> > > The only reason that comes to my mind is not technical at all. > I (not saying we, since we haven't reached consensus yet) think that we > want the project in its own organization, rather than under openstack. We > want to encourage external contributions from outside of OpenStack, > therefore opendev would probably suit better than openstack. > > This is open for discussion of course, but as I see it going, these are my > personal thoughts. 
> > Thanks, > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Fri Nov 19 18:55:20 2021 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 19 Nov 2021 13:55:20 -0500 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: On Fri, Nov 19, 2021 at 1:11 PM Kendall Nelson wrote: > Fair enough. Was just curious if there was some technical reason. > > I think it would make more sense for the SDKs to live together, > personally, but I can also see how having it live inside OpenStack can be > daunting for a new, external contributor. > If it was me only, Gophercloud would move to opendev right now, I can only think about its benefits by my experience with the community and our amazing tools / workflows. But because I'm biased and the decision is not up to me only, I'm trying to see if this decision would be well welcomed. So far the feedback from non-OpenStack contributors was (in a nutshell, and roughly summarized): "We like the Github ecosystem, don't know much about Gerrit but if this gets too complicated I'll give up my PRs. However I agree we need to make CI better". So this is where we are... In an ideal world we would keep Github for issues & PRs, and use Opendev Infra, but I understand this isn't possible. For now, we're doing nothing except trying to stabilize our CI until we finally make a decision whether we move or not. I hope this email explains well enough why we haven't made any move yet. Emilien > -Kendall > > On Mon, Nov 15, 2021 at 4:32 PM Emilien Macchi wrote: > >> Hey Kendall, >> >> On Mon, Nov 15, 2021 at 11:59 AM Kendall Nelson >> wrote: >> >>> Is there a reason why you don't want it to be under the openstack >>> namespace? >>> >> >> The only reason that comes to my mind is not technical at all. 
>> I (not saying we, since we haven't reached consensus yet) think that we >> want the project in its own organization, rather than under openstack. We >> want to encourage external contributions from outside of OpenStack, >> therefore opendev would probably suit better than openstack. >> >> This is open for discussion of course, but as I see it going, these are >> my personal thoughts. >> >> Thanks, >> -- >> Emilien Macchi >> > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Nov 19 19:48:26 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 19 Nov 2021 13:48:26 -0600 Subject: [requirements][keystone] pysaml2 and oslo.policy not able to be updated due to keystone test failures In-Reply-To: <20211119175650.b4rb3nwsv7u34q5k@mthode.org> References: <20211119175650.b4rb3nwsv7u34q5k@mthode.org> Message-ID: <17d39be3432.fcce414f983958.4576894677399317541@ghanshyammann.com> ---- On Fri, 19 Nov 2021 11:56:50 -0600 Matthew Thode wrote ---- > more or less as the title states the following reviews show the failures > > pysaml2: https://review.opendev.org/818612 > oslo.policy: https://review.opendev.org/815820 oslo policy failing in keystone is due to a change in warning text, I have pushed the fix- https://review.opendev.org/c/openstack/keystone/+/818624 -gmann > > -- > Matthew Thode > > From dmendiza at redhat.com Fri Nov 19 20:32:42 2021 From: dmendiza at redhat.com (Douglas Mendizabal) Date: Fri, 19 Nov 2021 14:32:42 -0600 Subject: [keystone] No weekly meeting next week Message-ID: Hi Keystone friends, I'll be out on PTO next week, so I'm canceling the Keystone weekly meeting for November 23. Meetings will resume the following week on November 30. 
Thanks, - Douglas Mendiz?bal (redrobot) From franck.vedel at univ-grenoble-alpes.fr Fri Nov 19 20:56:09 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Fri, 19 Nov 2021 21:56:09 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: Hello, thanks a lot , you help me to understand a lot of things. in particular that I have a lot of modifications to make to have an operational openstack and with good performance. If my iscsi bay is attached to S3 (I have S1, S2 and S3), I have to put glance on S3 with a mount in the filesystem of S3, and enable the cache. My images are in qcow2. suddenly I do not know if I modify them. Finally, and I don't know if this is the best solution, to make images that work well, I go through virtualbox, then from VDI to RAW (then from RAW to QCOW2 but it was a big mistake if I well understood). For example, I am having trouble with an opnsense image if I create the iinstance from iso and Horizon. If I go through virtualbox on another computer, then copy the files, the image is OK. Weird ?. Ah, I forgot, I didn't realize that order was important in a module [module]. Really not easy to handle all of this. Anyway thank you for your help. I will check the docs again and try to change this next week. Franck > Le 19 nov. 2021 ? 14:50, Ignazio Cassano a ?crit : > > Franck, this help you a lot. > Thanks Radoslaw > Ignazio > > Il giorno ven 19 nov 2021 alle ore 12:03 Rados?aw Piliszek > ha scritto: > If one sets glance_file_datadir_volume to non-default, then glance-api > gets deployed on all hosts. > > -yoctozepto > > On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano > wrote: > > > > Hello Franck, glance is not deployed on all nodes at default. > > I got the same problem > > In my case I have 3 controllers. > > I created an nfs share on a storage server where to store images. 
> > Before deploying glance, I create /var/lib/glance/images on the 3 controllers and I mount the nfs share. > > This is my fstab on the 3 controllers: > > > > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime > > > > In my globals.yml I have: > > glance_file_datadir_volume: "/var/lib/glance" > > glance_backend_file: "yes" > > > > This means images are on /var/lib/glance and since it is a nfs share all my 3 controlles can share images. > > Then you must deploy. > > To be sure the glance container is started on all controllers, since I have 3 controllers, I deployed 3 times changing the order in the inventory. > > First time: > > [control] > > A > > B > > C > > > > Second time: > > [control] > > B > > C > > A > > > > Third time: > > [control] > > C > > B > > A > > > > Or you can deploy glance 3 times using -t glance and -l > > > > As far as the instance stopped, I got I bug with a version of kolla. > > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 > > Now is corrected and with kolla 12.2.0 it works. > > Ignazio > > > > > > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL > ha scritto: > >> > >> Hello and thank you for the help. > >> I was able to move forward on my problem, without finding a satisfactory solution. > >> Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. > >> so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. 
> >> I don't understand what happened. There is something wrong. > >> > >> Is it normal that after updating the certificates, all instances are turned off? > >> thanks again > >> > >> Franck > >> > >> Le 17 nov. 2021 ? 21:11, Cyril Roelandt > a ?crit : > >> > >> Hello, > >> > >> > >> On 2021-11-17 08:59, Franck VEDEL wrote: > >> > >> Hello everyone > >> > >> I have a strange problem and I haven't found the solution yet. > >> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. > >> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. > >> > >> I am trying to create a new instance to check general operation. ERROR. > >> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). > >> > >> > >> We'd like to see the logs as well, especially the stacktrace. > >> > >> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). > >> > >> I create an empty volume: it works. > >> I am creating a volume from an image: Failed. > >> > >> > >> What commands are you running? What's the output? What's in the logs? > >> > >> > >> However, I have my list of ten images in glance. > >> > >> I create a new image and create a volume with this new image: it works. > >> I create an instance with this new image: OK. > >> > >> What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. > >> Is there a way to fix this, or do we have to reinstall them all? > >> > >> > >> What's your configuration? What version of OpenStack are you running? > >> > >> > >> > >> Cyril > >> > >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Fri Nov 19 20:59:45 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 19 Nov 2021 14:59:45 -0600 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5fea46fa-f304-48fa-a171-ce1114700379@www.fastmail.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> <4359750.LvFx2qVVIh@p1> <5fea46fa-f304-48fa-a171-ce1114700379@www.fastmail.com> Message-ID: <17d39ff7f1e.db45f09a985367.5739424682253609439@ghanshyammann.com> ---- On Thu, 18 Nov 2021 08:40:38 -0600 Clark Boylan wrote ---- > On Thu, Nov 18, 2021, at 3:11 AM, Pavlo Shchelokovskyy wrote: > > Hi Clark, > > > > Why is the retirement of openstack/neutron-lbaas being a problem? > > > > The repo is there and accessible under the same URL, it has > > (potentially working) stable/pike and stable/queens branches, and was > > not retired at the time of Pike or Queens, so IMO it is a valid request > > for testing configuration in the same branches of other projects, > > openstack/heat in this case. > > > > Maybe we should leave some minimal zuul configs in retired projects for > > zuul to find them? > > The reason for this is one of the steps for project retirement is to remove the repo from zuul [0]. If the upstream for the project has retired the project I think it is reasonable to remove it from our systems like Zuul. The project isn't retired on a per branch basis. > > I do not think the Zuul operators should be responsible for keeping retired projects alive in the CI system if their maintainers have stopped maintaining them. Instead we should remove them from the CI system. I agree, in case of deprecation we take care of stable branch and in retirement case, it means it is gone completely so usage of retired repo has to be completely cleanup from everywhere (master or stable). 
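For example, the usual cleanup on the consuming project's side is dropping the retired repository from a job's required-projects in that branch's Zuul config (the job definition below is a sketch with an illustrative name, not heat's actual config):

```yaml
# .zuul.yaml on the affected stable branch -- illustrative job definition
- job:
    name: heat-functional
    parent: devstack
    required-projects:
      - openstack/heat
      # removed: openstack/neutron-lbaas -- the repo was retired upstream
      # and deleted from Zuul's tenant config, so any remaining reference
      # to it shows up as a Zuul config error
```
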
-gmann > > [0] https://opendev.org/openstack/project-config/commit/3832c8fafc4d4e03c306c21f37b2d39bd7c5bd2b > > From gmann at ghanshyammann.com Fri Nov 19 21:23:20 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 19 Nov 2021 15:23:20 -0600 Subject: [all][tc] What's happening in Technical Committee: summary 19th Nov, 21: Reading: 10 min Message-ID: <17d3a151547.cb44b8eb985741.7343034494975169538@ghanshyammann.com> Hello Everyone, Here is this week's summary of the Technical Committee activities. 1. TC Meetings: ============ * We cancelled this week's meeting due to Open Infra Keynotes at the same time. * Next week's meeting is on IRC on 25th Nov, Thursday 15:00 UTC, feel free the topic on the agenda[1] by Nov 24th. 2. What we completed this week: ========================= * Updated Yoga testing runtime [2] * Retire puppet-senlin[3] * Removed TC office hours in favour of weekly meetings [4] 3. Activities In progress: ================== TC Tracker for Yoga cycle ------------------------------ * This etherpad includes the Yoga cycle targets/working items for TC[5]. Open Reviews ----------------- * 8 open reviews for ongoing activities[6]. Fixing Zuul config error -------------------------- Clark sent a email tagging projects need action for zuul config error in their projects[7]. Please fix those config error which is not causing failure now but will fail on changes in those repo or at least is not testing what we think they are. Few projects are fixing those but still we need to clean it up completly. RBAC discussion: continuing from PTG ---------------------------------------------- We had another meeting on Wed and figured out the open question and schedule. Meeting notes are in this etherpad[8]. Final design and schedule is up in this goal rework patch[9]. Please review that and provide feedback from your project perspective. 
Community-wide goal updates ------------------------------------ * RBAC goal is pretty much in good shape now, feel free to review it[9] * There is one more goal proposal for 'FIPS compatibility and compliance' which needs feedback from community as well from TC[10]. * TC is working to prepare the final goal(s) soon, please wait for some more time. Adjutant need maintainers and PTLs ------------------------------------------- We received response from Braden, Albert on ML, so let's wait on the final call to help here[11]. New project 'Skyline' proposal ------------------------------------ * We discussed it in TC PTG, we are ok to have Skyline as official project but we have few open queries on gerrit and still waiting from skyline team to respond[12]. TC tags analysis ------------------- * TC agreed to remove the framework and it is communicated in ML[13]. Project updates ------------------- * Rename "Extended Maintenance" SIG to the "Stable Maintenance"[14] * Retire training-labs repo[15] * Add ProxySQL repository for OpenStack-Ansible[16] * Retire js-openstack-lib [17] 4. How to contact the TC: ==================== If you would like to discuss or give feedback to TC, you can reach out to us in multiple ways: 1. Email: you can send the email with tag [tc] on openstack-discuss ML[18]. 2. Weekly meeting: The Technical Committee conduct a weekly meeting every Thursday 15 UTC [19] 3. Ping us using 'tc-members' nickname on #openstack-tc IRC channel. 
[1] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [2] https://review.opendev.org/c/openstack/governance/+/815851 [3] https://review.opendev.org/c/openstack/governance/+/817329 [4] https://review.opendev.org/c/openstack/governance/+/817493 [5] https://etherpad.opendev.org/p/tc-yoga-tracker [6] https://review.opendev.org/q/projects:openstack/governance+status:open [7] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025758.html and http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025797.html [8] https://etherpad.opendev.org/p/policy-popup-yoga-ptg [9] https://review.opendev.org/c/openstack/governance/+/815158 [10] https://review.opendev.org/c/openstack/governance/+/816587 [11] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025786.html [12] https://review.opendev.org/c/openstack/governance/+/814037 [13] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025571.html [14] https://review.opendev.org/c/openstack/governance-sigs/+/817499 [15] https://review.opendev.org/c/openstack/governance/+/817511 [16] https://review.opendev.org/c/openstack/governance/+/817245 [17] https://review.opendev.org/c/openstack/governance/+/807163 [18] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss [19] http://eavesdrop.openstack.org/#Technical_Committee_Meeting -gmann -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mnaser at vexxhost.com Sat Nov 20 01:32:47 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 19 Nov 2021 20:32:47 -0500 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> Message-ID: Just a heads up even if you get things working you're not going to be able to live migrate from centos to ubuntu and vice versa since there's going to be things like apparmor and SELinux issues On Fri, Nov 19, 2021 at 11:35 AM wrote: > All; > > I feel like I've dealt with this issue before, but I can't find any > records of it. > > I've been swapping out the compute nodes in my cluster for newer and > better hardware. We also decided to abandon CentOS. All the differences > mean that we haven't been able to do live migrations. I now have 2 servers > with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like > to get live migration working again. > > I configured passwordless ssh access between the servers for the nova > users to get cold migration working. I have also configured passwordless > ssh for the root users in accordance with [1]. 
> > Thank you, > > Dominic L. Hilsbos, MBA > Vice President - Information Technology > Perform Air International Inc. > DHilsbos at PerformAir.com > www.PerformAir.com > > 1: > https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations > > -- Mohammed Naser VEXXHOST, Inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Sat Nov 20 01:45:29 2021 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Fri, 19 Nov 2021 20:45:29 -0500 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> Message-ID: Which version of Openstack are you running? It seems to try to connect over qemu with auth over tcp. Without ssh? Is the cold migration working now? On Fri, Nov 19, 2021 at 8:35 PM Mohammed Naser wrote: > Just a heads up even if you get things working you?re not going to be able > to live migrate from centos to ubuntu and vice versa since there?s going to > be things like apparmor and SELinux issues > > On Fri, Nov 19, 2021 at 11:35 AM wrote: > >> All; >> >> I feel like I've dealt with this issue before, but I can't find any >> records of it. >> >> I've been swapping out the compute nodes in my cluster for newer and >> better hardware. We also decided to abandon CentOS. All the differences >> mean that we haven't been able to do live migrations. I now have 2 servers >> with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like >> to get live migration working again. >> >> I configured passwordless ssh access between the servers for the nova >> users to get cold migration working. I have also configured passwordless >> ssh for the root users in accordance with [1]. 
>> >> When I try to do a live migration, the origin server generates this >> error, in the nova-compute log: >> 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] >> [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: >> operation failed: Failed to connect to remote libvirt URI >> qemu+tcp:///system: authentication failed: authentication >> failed: libvirt.libvirtError: operation failed: Failed to connect to remote >> libvirt URI qemu+tcp:///system: authentication failed: >> authentication failed >> >> At one point, I came across a tutorial on configuring live-migration for >> libvirt, which included a bunch of user configuration. I don't remember >> having to do that before, but is that what I need to be looking for? >> >> Thank you, >> >> Dominic L. Hilsbos, MBA >> Vice President - Information Technology >> Perform Air International Inc. >> DHilsbos at PerformAir.com >> www.PerformAir.com >> >> 1: >> https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations >> >> -- > Mohammed Naser > VEXXHOST, Inc. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Sat Nov 20 10:05:16 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Sat, 20 Nov 2021 12:05:16 +0200 Subject: [neutron][nova] [kolla] vif plugged timeout Message-ID: Hi, Has anyone seen issue which I am currently facing ? When launching heat stack ( but it's same if I launch several of instances ) vif plugged in timeouts an I don't know why, sometimes it is OK ..sometimes is failing. Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes it's 100 and more seconds, it seems there is some race condition but I can't find out where the problem is. But on the end every instance is spawned ok (retry mechanism worked). Another finding is that it has to do something with security group, if noop driver is used ..everything is working good. 
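For reference, the nova options involved in this timeout look like the following (defaults shown; option names from memory, so double-check them for Wallaby; raising the timeout only hides the race, it does not fix it):

```ini
# /etc/nova/nova.conf on the compute nodes -- defaults shown
[DEFAULT]
# how long nova waits for neutron's network-vif-plugged event
vif_plugging_timeout = 300
# whether a timeout fails the boot (which triggers the retry mentioned above)
vif_plugging_is_fatal = True
```
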
Firewall security setup is openvswitch . Test env is wallaby. I will attach some logs when I will be near PC .. Thank you, Michal Arbet (Kevko) -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Sat Nov 20 12:20:19 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Sat, 20 Nov 2021 14:20:19 +0200 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to some high number ..problem dissapear ... But it's only workaround D?a so 20. 11. 2021, 12:05 Michal Arbet nap?sal(a): > Hi, > > Has anyone seen issue which I am currently facing ? > > When launching heat stack ( but it's same if I launch several of instances > ) vif plugged in timeouts an I don't know why, sometimes it is OK > ..sometimes is failing. > > Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes > it's 100 and more seconds, it seems there is some race condition but I > can't find out where the problem is. But on the end every instance is > spawned ok (retry mechanism worked). > > Another finding is that it has to do something with security group, if > noop driver is used ..everything is working good. > > Firewall security setup is openvswitch . > > Test env is wallaby. > > I will attach some logs when I will be near PC .. > > Thank you, > Michal Arbet (Kevko) > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hello at dincercelik.com Sun Nov 21 06:51:17 2021 From: hello at dincercelik.com (Dincer Celik) Date: Sun, 21 Nov 2021 09:51:17 +0300 Subject: [kolla] Proposing Michal Arbet as Kolla core In-Reply-To: References: Message-ID: <139F5BE7-4083-4364-959F-A850C12FCCB1@dincercelik.com> +1 > On 16 Nov 2021, at 15:49, Micha? Nasiadka wrote: > > Hi, > > I would like to propose adding Michal Arbet (kevko) to the kolla-core and kolla-ansible-core groups. 
> Michal did some great work on ProxySQL, is a consistent maintainer of Debian related images and has provided some helpful > reviews. > > Cores - please reply +1/-1 before the end of Friday 26th November. > > Thanks, > Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: From kira034 at 163.com Sun Nov 21 09:48:30 2021 From: kira034 at 163.com (Hongbin Lu) Date: Sun, 21 Nov 2021 17:48:30 +0800 (CST) Subject: [neutron] Bug deputy report (Nov 15 - 21) Message-ID: <2e49efc.f7b.17d41e5ad46.Coremail.kira034@163.com> Hi, I am the bug deputy this week. Please find the report below: High * https://bugs.launchpad.net/neutron/+bug/1951010 Restarting Neutron floods Nova with segment aggregates calls * https://bugs.launchpad.net/neutron/+bug/1951225 [OVN] Agent can't be found in functional test sometimes Medium * https://bugs.launchpad.net/neutron/+bug/1951149 [OVN] If "chassis" register is deleted, "chassis_private" can have 0 "chassis" associated * https://bugs.launchpad.net/neutron/+bug/1951559 [OVN] Router ports gateway_mtu option should not always be set * https://bugs.launchpad.net/neutron/+bug/1951564 snat random-fully supported with iptables 1.6.0 Low * https://bugs.launchpad.net/neutron/+bug/1951272 [OVN] OVS to OVN migration should be stopped if OVS agent firewall is "iptables_hybrid" * https://bugs.launchpad.net/neutron/+bug/1951429 Neutron API responses should not contain tracebacks * https://bugs.launchpad.net/neutron/+bug/1951569 Undecided (might need further traiging from domain experts) * https://bugs.launchpad.net/neutron/+bug/1951074 [OVN] default setting leak nameserver config from the host to instances * https://bugs.launchpad.net/neutron/+bug/1951493 OVS DPDK restart results in that tap interfaces in network namespaces can't be opened * https://bugs.launchpad.net/neutron/+bug/1951517 Segmentation ID should be lower or equal to 4095 Best regards, Hongbin -------------- next part -------------- An HTML attachment was 
scrubbed... URL: From wodel.youchi at gmail.com Sun Nov 21 13:27:02 2021 From: wodel.youchi at gmail.com (wodel youchi) Date: Sun, 21 Nov 2021 14:27:02 +0100 Subject: [kolla-ansible][wallaby][magnum] calico works, flannel does not work Message-ID: Hi, I don't have any experience in kubernetes yet; my goal at present is to test that the auto-scaling of a kubernetes cluster works properly using magnum. I followed some tutorials to get the test done. I tried to configure the test but had problems with magnum-metrics-server not being able to provide metrics; after digging and reading a lot of things on the web, I found out that the kubernetes master couldn't connect to the metrics server (curl -k https://ip_metrics_server didn't work). So I created another template, this time using calico as a network provider, and this time it worked. How can I find out why flannel does not work? Where should I look? Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Sun Nov 21 23:51:28 2021 From: mthode at mthode.org (Matthew Thode) Date: Sun, 21 Nov 2021 17:51:28 -0600 Subject: [requirements][cinder][manila] pyparsing update needs handling Message-ID: <20211121235128.a6jlampynlotp4yy@mthode.org> https://review.opendev.org/818614 For cinder it looks like operatorPrecedence is gone now. For manila it looks like the same thing. -- Matthew Thode From michal.arbet at ultimum.io Mon Nov 22 08:02:35 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Mon, 22 Nov 2021 10:02:35 +0200 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: Hi, It seems it's the same issue as this one on Launchpad: https://bugs.launchpad.net/nova/+bug/1944779 Thanks, Kevko Dňa so 20. 11. 
2021, 12:05 Michal Arbet > nap?sal(a): > >> Hi, >> >> Has anyone seen issue which I am currently facing ? >> >> When launching heat stack ( but it's same if I launch several of >> instances ) vif plugged in timeouts an I don't know why, sometimes it is OK >> ..sometimes is failing. >> >> Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes >> it's 100 and more seconds, it seems there is some race condition but I >> can't find out where the problem is. But on the end every instance is >> spawned ok (retry mechanism worked). >> >> Another finding is that it has to do something with security group, if >> noop driver is used ..everything is working good. >> >> Firewall security setup is openvswitch . >> >> Test env is wallaby. >> >> I will attach some logs when I will be near PC .. >> >> Thank you, >> Michal Arbet (Kevko) >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Mon Nov 22 08:54:06 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Mon, 22 Nov 2021 09:54:06 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> Message-ID: Hello: I think the last idea Ronelled presented (a skiplist) could be feasible in Neutron. Of course, this list could grow indefinitely, but we can always keep an eye on it. There could be another issue with Neutron tempest tests when using the "advance" image. Despite the recent improvements done recently, we are frequently having problems with the RAM size of the testing VMs. We would like to have 20% more RAM, if possible. I wish we had the ability to pre-run some checks in specific HW (tempest plugin or grenade tests). Slawek commented the different number of backends we need to provide support and testing. 
However I think we can remove the Linux Bridge tempest plugin from the "gate" list (it is already tested in the "check" list). Tempest plugin tests are expensive in time and prone to errors. This paragraph falls under the shoulders of the Neutron team. We can also identify those long running tests that usually fail (those that take more than 1000 seconds). A test that takes around 15 mins to run, will probably fail. We need to find those tests, investigate the slowest parts of those tests and try to improve/optimize/remove them. Thank you all for your comments and proposals. That will help a lot to improve the Neutron CI stability. Regards. On Fri, Nov 19, 2021 at 12:53 AM Ronelle Landy wrote: > > > On Wed, Nov 17, 2021 at 5:22 AM Balazs Gibizer > wrote: > >> >> >> On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski >> wrote: >> > Hi, >> > >> > Recently I spent some time to check how many rechecks we need in >> > Neutron to >> > get patch merged and I compared it to some other OpenStack projects >> > (see [1] >> > for details). >> > TL;DR - results aren't good for us and I think we really need to do >> > something >> > with that. >> >> I really like the idea of collecting such stats. Thank you for doing >> it. I can even imagine to make a public dashboard somewhere with this >> information as it is a good indication about the health of our projects >> / testing. >> >> > >> > Of course "easiest" thing to say is that we should fix issues which >> > we are >> > hitting in the CI to make jobs more stable. But it's not that easy. >> > We are >> > struggling with those jobs for very long time. We have CI related >> > meeting >> > every week and we are fixing what we can there. 
>> > Unfortunately there is still bunch of issues which we can't fix so >> > far because >> > they are intermittent and hard to reproduce locally or in some cases >> > the >> > issues aren't realy related to the Neutron or there are new bugs >> > which we need >> > to investigate and fix :) >> >> >> I have couple of suggestion based on my experience working with CI in >> nova. >> > > We've struggled with unstable tests in TripleO as well. Here are some > things we tried and implemented: > > 1. Created job dependencies so we only ran check tests once we knew we had > the resources we needed (example we had pulled containers successfully) > > 2. Moved some testing to third party where we have easier control of the > environment (note that third party cannot stop a change merging) > > 3. Used dependency pipelines to pre-qualify some dependencies ahead of > letting them run wild on our check jobs > > 4. Requested testproject runs of changes in a less busy environment before > running a full set of tests in a public zuul > > 5. Used a skiplist to keep track of tech debt and skip known failures that > we could temporarily ignore to keep CI moving along if we're waiting on an > external fix. > > > >> >> 1) we try to open bug reports for intermittent gate failures too and >> keep them tagged in a list [1] so when a job fail it is easy to check >> if the bug is known. >> >> 2) I offer my help here now that if you see something in neutron runs >> that feels non neutron specific then ping me with it. Maybe we are >> struggling with the same problem too. >> >> 3) there was informal discussion before about a possibility to re-run >> only some jobs with a recheck instead for re-running the whole set. I >> don't know if this is feasible with Zuul and I think this only treat >> the symptom not the root case. But still this could be a direction if >> all else fails. >> >> Cheers, >> gibi >> >> > So this is never ending battle for us. 
The problem is that we have >> > to test >> > various backends, drivers, etc. so as a result we have many jobs >> > running on >> > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs >> > in check >> > and 14 jobs in gate queue. >> > >> > In the past we made a lot of improvements, like e.g. we improved >> > irrelevant >> > files lists for jobs to run less jobs on some of the patches, >> > together with QA >> > team we did "integrated-networking" template to run only Neutron and >> > Nova >> > related scenario tests in the Neutron queues, we removed and >> > consolidated some >> > of the jobs (there is still one patch in progress for that but it >> > should just >> > remove around 2 jobs from the check queue). All of that are good >> > improvements >> > but still not enough to make our CI really stable :/ >> > >> > Because of all of that, I would like to ask community about any other >> > ideas >> > how we can improve that. If You have any ideas, please send it in >> > this email >> > thread or reach out to me directly on irc. >> > We want to discuss about them in the next video CI meeting which will >> > be on >> > November 30th. If You would have any idea and would like to join that >> > discussion, You are more than welcome in that meeting of course :) >> > >> > [1] >> > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ >> > 025759.html >> >> >> [1] >> >> https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_last_updated&start=0 >> >> >> > >> > -- >> > Slawek Kaplonski >> > Principal Software Engineer >> > Red Hat >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jake.yip at ardc.edu.au Mon Nov 22 09:01:28 2021 From: jake.yip at ardc.edu.au (Jake Yip) Date: Mon, 22 Nov 2021 20:01:28 +1100 Subject: [kolla-ansible][wallaby][magnum] calico works, flannel does not work In-Reply-To: References: Message-ID: <89b9fdf7-dcd9-87f1-fbac-114776c0cf25@ardc.edu.au> Hi, It could be due to https://github.com/flannel-io/flannel/issues/1155. One way to find out is to restart all the flannel pods after the cluster has been created, kubectl -n kube-system delete pod -l app=flannel Hope this helps. Regards, Jake On 22/11/21 12:27 am, wodel youchi wrote: > Hi, > I don't have any experience in kubernetes yet, my goal in the present is > to test that the auto-sacele of a kubernetes cluster is working properly > using magnum. > > I followed some tutorials to get the test done. I tried to configure the > test but I had problems with magnum-metrics-server not being able to > provide metrics, after digging and reading a lot of things on the web, I > found out that kubernetes master couldn't connect to the metrics server > (curl -k https://ip_metrics_server didn't work) > So I created another template but this time using calico as a network > provider, and this time it worked. > > How can I find why flannel does not work? where should I look? > > Regards. From thierry at openstack.org Mon Nov 22 11:08:09 2021 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 22 Nov 2021 12:08:09 +0100 Subject: [largescale-sig] Next meeting: Nov 24th, 15utc Message-ID: Hi everyone, The Large Scale SIG will be meeting this Wednesday in #openstack-operators on OFTC IRC, at 15UTC. You can doublecheck how that time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20211124T15 We will be discussing our upcoming OpenInfra Live! episode on Dec 9. 
Feel free to add topics to our agenda at: https://etherpad.openstack.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From DHilsbos at performair.com Mon Nov 22 15:52:59 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Mon, 22 Nov 2021 15:52:59 +0000 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> Message-ID: <0670B960225633449A24709C291A525251D4FF1B@COM03.performair.local> Mohammed; Yep, I'm aware. I have 3 Nova servers, 1 runs CentOS, 2 run Ubuntu. I'm only trying to live migrate between the 2 Ubuntu servers. Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From: Mohammed Naser [mailto:mnaser at vexxhost.com] Sent: Friday, November 19, 2021 6:33 PM To: Dominic Hilsbos Cc: openstack-discuss at lists.openstack.org Subject: Re: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Just a heads up even if you get things working you're not going to be able to live migrate from centos to ubuntu and vice versa since there's going to be things like apparmor and SELinux issues... On Fri, Nov 19, 2021 at 11:35 AM wrote: All; I feel like I've dealt with this issue before, but I can't find any records of it. I've been swapping out the compute nodes in my cluster for newer and better hardware. We also decided to abandon CentOS. All the differences mean that we haven't been able to do live migrations. I now have 2 servers with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like to get live migration working again. I configured passwordless ssh access between the servers for the nova users to get cold migration working. I have also configured passwordless ssh for the root users in accordance with [1]. 
When I try to do a live migration, the origin server generates this error, in the nova-compute log: 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed At one point, I came across a tutorial on configuring live-migration for libvirt, which included a bunch of user configuration.? I don't remember having to do that before, but is that what I need to be looking for? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com 1: https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations -- Mohammed Naser VEXXHOST, Inc. From DHilsbos at performair.com Mon Nov 22 16:09:23 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Mon, 22 Nov 2021 16:09:23 +0000 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> Message-ID: <0670B960225633449A24709C291A525251D4FF98@COM03.performair.local> Laurent; We're running Victoria. Here are specific package versions: Ubuntu: 20.10 nova-compute: 22.2.1-0ubuntu1 (both) nova-compute-kvm: 22.2.1-0ubuntu1 (both) qemu-kvm: 5.0-5unbuntu9.9 (both) libvirt-daemon: 6.6.0-1ubuntu3.5 (both) As I said, this has come up for me before, but I can't find records of how it was addressed. I don't remember an issue of authentication from before, however. 
From before, I do remember that after the ssh connection to setup the new host, qemu/kvm on the old host makes a connection to qemu/kvm on the new host, in order to coordinate the transfer of memory contents, and other dynamic elements. Yes, I can cold migrate between all 3 servers, which makes this a non-critical issue. While I have a CentOS Nova host, I'm not going to attempt to get live-migration working between the Ubuntu Servers Changing the configuration of libvirt from system sockets to native listeners got me past a connection refused error (it appears that lbvirt also connects from one server to another?) Thank you, Dominic L. Hilsbos, MBA Vice President ? Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From: Laurent Dumont [mailto:laurentfdumont at gmail.com] Sent: Friday, November 19, 2021 6:45 PM To: Mohammed Naser Cc: Dominic Hilsbos; openstack-discuss Subject: Re: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Which version of Openstack are you running? It seems to try to connect over qemu with auth over tcp. Without ssh? Is the cold migration working now? On Fri, Nov 19, 2021 at 8:35 PM Mohammed Naser wrote: Just a heads up even if you get things working you?re not going to be able to live migrate from centos to ubuntu and vice versa since there?s going to be things like apparmor and SELinux issues? On Fri, Nov 19, 2021 at 11:35 AM wrote: All; I feel like I've dealt with this issue before, but I can't find any records of it. I've been swapping out the compute nodes in my cluster for newer and better hardware.? We also decided to abandon CentOS.? All the differences mean that we haven't been able to do live migrations.? I now have 2 servers with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like to get live migration working again. I configured passwordless ssh access between the servers for the nova users to get cold migration working.? 
I have also configured passwordless ssh for the root users in accordance with [1]. When I try to do a live migration, the origin server generates this error, in the nova-compute log: 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed At one point, I came across a tutorial on configuring live-migration for libvirt, which included a bunch of user configuration.? I don't remember having to do that before, but is that what I need to be looking for? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com 1: https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations -- Mohammed Naser VEXXHOST, Inc. From jake.yip at unimelb.edu.au Mon Nov 22 09:24:26 2021 From: jake.yip at unimelb.edu.au (Jake Yip) Date: Mon, 22 Nov 2021 20:24:26 +1100 Subject: Migration from Midonet to OVN Message-ID: <4d91ad35-5fda-3164-4d9a-a624cb8faf4c@unimelb.edu.au> Hi all, We are planning a migration from Midonet to OVN. The general idea is to: - pause neutron - do a `neutron-ovn-db-sync-util` - change networks/ports to geneve/ovs - hard reboot the instances We are wondering if anyone has done a migration like this before, and will like to share their experiences. Any input will be greatly appreciated. 
Regards, Jake -- Jake Yip DevOps Engineer, ARDC Nectar Research Cloud From mathur.hitesh at gmail.com Mon Nov 22 10:00:50 2021 From: mathur.hitesh at gmail.com (Hitesh Mathur) Date: Mon, 22 Nov 2021 12:00:50 +0200 Subject: Cinder NFS Encryption Message-ID: Hi, I am not able to find whether Cinder supports NFS data-in-transit encryption or not. Can you please provide the information on this and how to use it? -- Regards Hitesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From damien.rannou at ovhcloud.com Mon Nov 22 16:11:22 2021 From: damien.rannou at ovhcloud.com (Damien Rannou) Date: Mon, 22 Nov 2021 16:11:22 +0000 Subject: [neutron] default QOS on L3 gateway Message-ID: Hello We are currently playing with QOS on L3 agent, mostly for SNAT, but it can apply also on FIP. Everything is working properly, but I'm wondering if there is a way to define a "default" QOS that would be applied on Router creation, but also when the user is setting "no_qos" on his router. On a public cloud environment, we cannot let the customers without any QOS limitation. Thanks! Damien From DHilsbos at performair.com Mon Nov 22 17:01:55 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Mon, 22 Nov 2021 17:01:55 +0000 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: <0670B960225633449A24709C291A525251D4FF98@COM03.performair.local> References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> <0670B960225633449A24709C291A525251D4FF98@COM03.performair.local> Message-ID: <0670B960225633449A24709C291A525251D50128@COM03.performair.local> Laurent; Your message ended up pointing me in the right direction. I started asking myself why libvirtd came from Ubuntu configured incorrectly for live migrations. The obvious answer is: it didn't. That suggested that I had configured something incorrectly. 
That realization, together with the discussion from [1], led me to set libvirt.live_migration_scheme="ssh" in nova.conf. After restarting nova-compute, I can now live migrate instances between Ubuntu servers. Thank you for your assistance. Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com [1] https://bugzilla.redhat.com/show_bug.cgi?id=1254307 -----Original Message----- From: DHilsbos at performair.com [mailto:DHilsbos at performair.com] Sent: Monday, November 22, 2021 9:09 AM To: laurentfdumont at gmail.com; mnaser at vexxhost.com Cc: openstack-discuss at lists.openstack.org Subject: RE: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Laurent; We're running Victoria. Here are specific package versions: Ubuntu: 20.10 nova-compute: 22.2.1-0ubuntu1 (both) nova-compute-kvm: 22.2.1-0ubuntu1 (both) qemu-kvm: 5.0-5unbuntu9.9 (both) libvirt-daemon: 6.6.0-1ubuntu3.5 (both) As I said, this has come up for me before, but I can't find records of how it was addressed. I don't remember an issue of authentication from before, however. 
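[Editor's note] For reference, the fix Dominic describes corresponds to a nova.conf fragment along these lines (a sketch; the option lives in nova's [libvirt] section — verify the exact name against the Victoria configuration reference):

```ini
[libvirt]
# Tunnel live migrations over ssh instead of the default tcp scheme,
# so no separate libvirtd TCP listener or SASL auth setup is needed.
live_migration_scheme = ssh
```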
DHilsbos at PerformAir.com www.PerformAir.com From: Laurent Dumont [mailto:laurentfdumont at gmail.com] Sent: Friday, November 19, 2021 6:45 PM To: Mohammed Naser Cc: Dominic Hilsbos; openstack-discuss Subject: Re: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Which version of Openstack are you running? It seems to try to connect over qemu with auth over tcp. Without ssh? Is the cold migration working now? On Fri, Nov 19, 2021 at 8:35 PM Mohammed Naser wrote: Just a heads up even if you get things working you?re not going to be able to live migrate from centos to ubuntu and vice versa since there?s going to be things like apparmor and SELinux issues? On Fri, Nov 19, 2021 at 11:35 AM wrote: All; I feel like I've dealt with this issue before, but I can't find any records of it. I've been swapping out the compute nodes in my cluster for newer and better hardware.? We also decided to abandon CentOS.? All the differences mean that we haven't been able to do live migrations.? I now have 2 servers with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like to get live migration working again. I configured passwordless ssh access between the servers for the nova users to get cold migration working.? I have also configured passwordless ssh for the root users in accordance with [1]. When I try to do a live migration, the origin server generates this error, in the nova-compute log: 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed At one point, I came across a tutorial on configuring live-migration for libvirt, which included a bunch of user configuration.? 
I don't remember having to do that before, but is that what I need to be looking for? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com 1: https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations -- Mohammed Naser VEXXHOST, Inc. From katonalala at gmail.com Mon Nov 22 17:45:26 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 22 Nov 2021 18:45:26 +0100 Subject: [neutron] default QOS on L3 gateway In-Reply-To: References: Message-ID: Hi, There is an RFE to inherit network QoS: https://bugs.launchpad.net/neutron/+bug/1950454 The patch series: https://review.opendev.org/q/topic:%22bug%252F1950454%22+(status:open%20OR%20status:merged) Hope this covers your usecase. Lajos Katona (lajoskatona) Damien Rannou ezt ?rta (id?pont: 2021. nov. 22., H, 17:23): > Hello > We are currently playing with QOS on L3 agent, mostly for SNAT, but it can > apply also on FIP. > Everything is working properly, but I?m wondering if there is a way to > define a ? default ? QOS that would be applied on Router creation, > but also when the user is setting ? no_qos ? on his router. > > On a public cloud environnement, we cannot let the customers without any > QOS limitation. > > Thanks ! > Damien -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From katonalala at gmail.com Mon Nov 22 17:52:15 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 22 Nov 2021 18:52:15 +0100 Subject: Migration from Midonet to OVN In-Reply-To: <4d91ad35-5fda-3164-4d9a-a624cb8faf4c@unimelb.edu.au> References: <4d91ad35-5fda-3164-4d9a-a624cb8faf4c@unimelb.edu.au> Message-ID: Hi For tripleo there's a documentation, worth read it to have a general view what tripleo does to migrate from OVS to OVN: https://docs.openstack.org/neutron/latest/ovn/migration.html There are playbooks for it in Neutron repo, that will be helpful as well I hope: https://opendev.org/openstack/neutron/src/branch/master/tools/ovn_migration Regards Lajos Katona (lajoskatona) Jake Yip ezt ?rta (id?pont: 2021. nov. 22., H, 17:23): > Hi all, > > We are planning a migration from Midonet to OVN. The general idea is to: > > - pause neutron > - do a `neutron-ovn-db-sync-util` > - change networks/ports to geneve/ovs > - hard reboot the instances > > We are wondering if anyone has done a migration like this before, and > will like to share their experiences. > > Any input will be greatly appreciated. > > Regards, > Jake > > -- > Jake Yip > DevOps Engineer, ARDC Nectar Research Cloud > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Mon Nov 22 20:21:52 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Mon, 22 Nov 2021 15:21:52 -0500 Subject: [cinder] reminder: this week's meeting in video+IRC Message-ID: Quick reminder that this week's Cinder team meeting on Wednesday 24 November, being the final meeting of the month, will be held in both videoconference and IRC at the regularly scheduled time of 1400 UTC. These are the video meeting rules we've agreed to: * Everyone will keep IRC open during the meeting. * We'll take notes in IRC to leave a record similar to what we have for our regular IRC meetings. * Some people are more comfortable communicating in written English. 
So at any point, any attendee may request that the discussion of the current topic be conducted entirely in IRC. * The meeting will be recorded. connection info: https://bluejeans.com/3228528973 meeting agenda: https://etherpad.opendev.org/p/cinder-yoga-meetings cheers, brian From gouthampravi at gmail.com Mon Nov 22 20:28:14 2021 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Mon, 22 Nov 2021 12:28:14 -0800 Subject: [manila] No IRC meeting on 25th Nov Message-ID: Hello Zorillas, In lieu of holidays this week, we'll be skipping the weekly meeting earlier scheduled at 1500 UTC on 25th Nov 2021. Please reach out here or on OFTC's #openstack-manila should you have any matters that need to be addressed. Thanks, Goutham From sombrafam at gmail.com Mon Nov 22 20:46:55 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Mon, 22 Nov 2021 17:46:55 -0300 Subject: [neutron] Neutron OVN+QoS Support Message-ID: Hi folks, I have a question related to the Neutron supportability of OVN+QoS. I have checked the config reference for both Victoria and Xena[1] [2] and they are shown as supported (bw limit, eggress/ingress), but I tried to set up an env with OVN+QoS but the rules are not being effective (VMs still download at maximum speed). I double-checked the configuration in the neutron API and it brings the QoS settings[3] [4] [5] , and the versions[6] [7] I'm using should support it. What makes me more confused is that there's a document[8] [9] with a gap analysis of the OVN vs OVS QoS functionality and the document *is* being updated over the releases, but it still shows that QoS is not supported in OVN. So, is there something I'm missing? 
Erlon _______________ [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html [3] QoS Config: https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 [4] neutron.conf: https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 [5] ml2_conf.ini: https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd [6] neutron-api-0 versions: https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 [7] nova-compute-0 versions: https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 [8] Gaps from ML2/OVS-OVN Xena: https://docs.openstack.org/neutron/xena/ovn/gaps.html [9] Gaps from ML2/OVS-OVN Victoria: https://docs.openstack.org/neutron/victoria/ovn/gaps.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From rsafrono at redhat.com Mon Nov 22 21:45:35 2021 From: rsafrono at redhat.com (Roman Safronov) Date: Mon, 22 Nov 2021 23:45:35 +0200 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Erlon, I have a couple of questions that probably will help to understand the issue better. Have you applied the QoS rules on a port, network or floating ip? Have you applied the QoS rules before starting the VM (before it's port is active) or after? Thanks On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz wrote: > Hi folks, > > I have a question related to the Neutron supportability of OVN+QoS. I have > checked the config reference for both > Victoria and Xena[1] > [2] > and they > are shown as supported (bw limit, eggress/ingress), but I tried to set up > an env > with OVN+QoS but the rules are not being effective (VMs still download at > maximum speed). I double-checked > the configuration in the neutron API and it brings the QoS settings[3] > [4] > [5] > , and > the versions[6] > [7] > I'm > using should support it. 
> > What makes me more confused is that there's a document[8] > [9] > with a gap > analysis of the OVN vs OVS QoS functionality > and the document *is* being updated over the releases, but it still shows > that QoS is not supported in OVN. > > So, is there something I'm missing? > > Erlon > _______________ > [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html > [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html > [3] QoS Config: > https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 > [4] neutron.conf: > https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 > [5] ml2_conf.ini: > https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd > [6] neutron-api-0 versions: > https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 > [7] nova-compute-0 versions: > https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 > [8] Gaps from ML2/OVS-OVN Xena: > https://docs.openstack.org/neutron/xena/ovn/gaps.html > [9] Gaps from ML2/OVS-OVN Victoria: > https://docs.openstack.org/neutron/victoria/ovn/gaps.html > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ces.eduardo98 at gmail.com Mon Nov 22 22:03:10 2021 From: ces.eduardo98 at gmail.com (Carlos Silva) Date: Mon, 22 Nov 2021 19:03:10 -0300 Subject: [manila] First Yoga cycle bug squash - Nov 29th - 3rd Dec Message-ID: Greetings Zorillas and interested stackers! As mentioned in the previous weekly meetings, we will soon be meeting for the first bugsquash of the Yoga release! The event will be held from Nov 29th to 3rd December, 2021, providing an extended contribution window. We will start the event with a call on the first day (Nov 29th). There will be three calls (one of them using our Manila upstream meeting time slot). All of the three calls of the week will be held in a Jitsi room [1]. 
Nov 29th - 1500 to 1540 UTC - Event opening and bug assignments
Dec 2nd - 1500 to 1600 UTC - Collab review session (no regular Manila meeting on this day)
Dec 3rd - 1500 to 1540 UTC - Status update and event wrap up
A list of selected bugs will be shared here [2]. Please feel free to add any additional bugs you would like to address during the event. [1] https://meetpad.opendev.org/ManilaYogaM1Bugsquash [2] https://ethercalc.openstack.org/wvb2oa23rxbb Thank you in advance! Hope to see you there :) carloss -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Mon Nov 22 22:05:47 2021 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Mon, 22 Nov 2021 17:05:47 -0500 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: How high did you have to raise it? If it does appear after X amount of time, then the VIF plug is not lost? On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet wrote: > + if I raise vif_plugged_timeout (hope I remember it correctly) in nova to > some high number, the problem disappears... But it's only a workaround > > On Sat 20 Nov 2021 at 12:05, Michal Arbet > wrote: >> Hi, >> >> Has anyone seen the issue which I am currently facing? >> >> When launching a heat stack (but it's the same if I launch several >> instances) vif plugging times out and I don't know why; sometimes it is OK, >> sometimes it fails. >> >> Sometimes neutron reports vif plugged in < 10 sec (test env), sometimes >> it's 100 or more seconds; it seems there is some race condition but I >> can't find out where the problem is. But in the end every instance is >> spawned OK (the retry mechanism worked). >> >> Another finding is that it has something to do with the security group: if >> the noop driver is used, everything works fine. >> >> The firewall security setup is openvswitch. >> >> Test env is wallaby. >> >> I will attach some logs when I will be near a PC .. 
>> >> Thank you, >> Michal Arbet (Kevko) >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Mon Nov 22 22:20:33 2021 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Mon, 22 Nov 2021 14:20:33 -0800 Subject: [requirements][cinder][manila] pyparsing update needs handling In-Reply-To: <20211121235128.a6jlampynlotp4yy@mthode.org> References: <20211121235128.a6jlampynlotp4yy@mthode.org> Message-ID: On Sun, Nov 21, 2021 at 4:02 PM Matthew Thode wrote: > > https://review.opendev.org/818614 > > For cinder it looks like operatorPrecedence is gone now. > > For manila it looks like the same thing. Thanks for pointing it out. Fixing it with https://review.opendev.org/c/openstack/manila/+/818829 and https://review.opendev.org/c/openstack/cinder/+/818834 > > -- > Matthew Thode > From gmann at ghanshyammann.com Tue Nov 23 01:11:08 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 22 Nov 2021 19:11:08 -0600 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <17d39ff7f1e.db45f09a985367.5739424682253609439@ghanshyammann.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> <4359750.LvFx2qVVIh@p1> <5fea46fa-f304-48fa-a171-ce1114700379@www.fastmail.com> <17d39ff7f1e.db45f09a985367.5739424682253609439@ghanshyammann.com> Message-ID: <17d4a58ba7d.e83d82681106387.2917303785697015051@ghanshyammann.com> ---- On Fri, 19 Nov 2021 14:59:45 -0600 Ghanshyam Mann wrote ---- > ---- On Thu, 18 Nov 2021 08:40:38 -0600 Clark Boylan wrote ---- > > On Thu, Nov 18, 2021, at 3:11 AM, Pavlo Shchelokovskyy wrote: > > > Hi Clark, > > > > > > Why is the retirement of openstack/neutron-lbaas being a problem? 
> > > > > > The repo is there and accessible under the same URL, it has > > > (potentially working) stable/pike and stable/queens branches, and was > > > not retired at the time of Pike or Queens, so IMO it is a valid request > > > for testing configuration in the same branches of other projects, > > > openstack/heat in this case. > > > > > > Maybe we should leave some minimal zuul configs in retired projects for > > > zuul to find them? > > > > The reason for this is one of the steps for project retirement is to remove the repo from zuul [0]. If the upstream for the project has retired the project I think it is reasonable to remove it from our systems like Zuul. The project isn't retired on a per branch basis. > > > > I do not think the Zuul operators should be responsible for keeping retired projects alive in the CI system if their maintainers have stopped maintaining them. Instead we should remove them from the CI system. > > I agree, in case of deprecation we take care of stable branch and in retirement case, it means it is gone completely so usage of retired repo has to be completely > cleanup from everywhere (master or stable). I am adding this to the TC meeting agenda so that we do not forget it. Also, created the below etherpad to track the progress, please add the progress and patch links under your project: - https://etherpad.opendev.org/p/zuul-config-error-openstack -gmann > > -gmann > > > > > [0] https://opendev.org/openstack/project-config/commit/3832c8fafc4d4e03c306c21f37b2d39bd7c5bd2b > > > > > > From gmann at ghanshyammann.com Tue Nov 23 01:12:20 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 22 Nov 2021 19:12:20 -0600 Subject: [all][tc] Technical Committee next weekly meeting on Nov 25th at 1500 UTC Message-ID: <17d4a59d276.fd9316fb1106394.2758893334791698593@ghanshyammann.com> Hello Everyone, Technical Committee's next weekly meeting is scheduled for Nov 25th at 1500 UTC. 
If you would like to add topics for discussion, please add them to the below wiki page by Wednesday, Nov 24th, at 2100 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting -gmann From gagehugo at gmail.com Tue Nov 23 04:55:46 2021 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 22 Nov 2021 22:55:46 -0600 Subject: [openstack-helm] No Meeting Tomorrow Message-ID: Hey team, Since there are no agenda items [0] for the IRC meeting tomorrow, the meeting is cancelled. Our next meeting will be November 30th. [0] https://etherpad.opendev.org/p/openstack-helm-weekly-meeting -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.vedel at univ-grenoble-alpes.fr Tue Nov 23 07:57:51 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Tue, 23 Nov 2021 08:57:51 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: <30680C7F-7E55-4098-8438-F3CEE48C4A90@univ-grenoble-alpes.fr> Ignazio, Radoslaw, thanks to you, I made some modifications and my environment seems to work better (the images are placed on the iiscsi bay on which the volumes are stored). I installed the cache for glance. It works, well I think it does. My question is: between the different formats (qcow2, raw or other), which is the most efficient if - we create a volume then an instance from the volume - we create an instance from the image - we create an instance without volume - we create a snapshot then an instance from the snapshot Franck > > >> Le 19 nov. 2021 ? 14:50, Ignazio Cassano > a ?crit : >> >> Franck, this help you a lot. >> Thanks Radoslaw >> Ignazio >> >> Il giorno ven 19 nov 2021 alle ore 12:03 Rados?aw Piliszek > ha scritto: >> If one sets glance_file_datadir_volume to non-default, then glance-api >> gets deployed on all hosts. 
>> >> -yoctozepto >> >> On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano > wrote: >> > >> > Hello Franck, glance is not deployed on all nodes at default. >> > I got the same problem >> > In my case I have 3 controllers. >> > I created an nfs share on a storage server where to store images. >> > Before deploying glance, I create /var/lib/glance/images on the 3 controllers and I mount the nfs share. >> > This is my fstab on the 3 controllers: >> > >> > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime >> > >> > In my globals.yml I have: >> > glance_file_datadir_volume: "/var/lib/glance" >> > glance_backend_file: "yes" >> > >> > This means images are on /var/lib/glance and since it is a nfs share all my 3 controlles can share images. >> > Then you must deploy. >> > To be sure the glance container is started on all controllers, since I have 3 controllers, I deployed 3 times changing the order in the inventory. >> > First time: >> > [control] >> > A >> > B >> > C >> > >> > Second time: >> > [control] >> > B >> > C >> > A >> > >> > Third time: >> > [control] >> > C >> > B >> > A >> > >> > Or you can deploy glance 3 times using -t glance and -l >> > >> > As far as the instance stopped, I got I bug with a version of kolla. >> > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 >> > Now is corrected and with kolla 12.2.0 it works. >> > Ignazio >> > >> > >> > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL > ha scritto: -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ignaziocassano at gmail.com Tue Nov 23 08:14:49 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 23 Nov 2021 09:14:49 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: <30680C7F-7E55-4098-8438-F3CEE48C4A90@univ-grenoble-alpes.fr> References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> <30680C7F-7E55-4098-8438-F3CEE48C4A90@univ-grenoble-alpes.fr> Message-ID: Franck, If the cache works fine, I think the glance image format could be qcow2. The volume is created in raw format, but the download phase is executed only the first time you create a volume from a new image. With this setup I can create 20-30 instances in one shot and it takes a few minutes to complete. I always use general-purpose small images and complete the instance configuration (package installation and so on) with heat or ansible. Ignazio On Tue 23 Nov 2021 at 08:57, Franck VEDEL < franck.vedel at univ-grenoble-alpes.fr> wrote: > Ignazio, Radoslaw, > > thanks to you, I made some modifications and my environment seems to work > better (the images are placed on the iSCSI bay on which the volumes are > stored). > I installed the cache for glance. It works, well I think it does. > > My question is: between the different formats (qcow2, raw or other), which > is the most efficient if > - we create a volume then an instance from the volume > - we create an instance from the image > - we create an instance without volume > - we create a snapshot then an instance from the snapshot > > Franck > > > > > On 19 Nov 2021 at 14:50, Ignazio Cassano > wrote: > > Franck, this will help you a lot. > Thanks Radoslaw > Ignazio > > On Fri 19 Nov 2021 at 12:03, Radoslaw Piliszek < > radoslaw.piliszek at gmail.com> wrote: > >> If one sets glance_file_datadir_volume to non-default, then glance-api >> gets deployed on all hosts. 
>> >> -yoctozepto >> >> On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano >> wrote: >> > >> > Hello Franck, glance is not deployed on all nodes at default. >> > I got the same problem >> > In my case I have 3 controllers. >> > I created an nfs share on a storage server where to store images. >> > Before deploying glance, I create /var/lib/glance/images on the 3 >> controllers and I mount the nfs share. >> > This is my fstab on the 3 controllers: >> > >> > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs >> rw,user=glance,soft,intr,noatime,nodiratime >> > >> > In my globals.yml I have: >> > glance_file_datadir_volume: "/var/lib/glance" >> > glance_backend_file: "yes" >> > >> > This means images are on /var/lib/glance and since it is a nfs share >> all my 3 controlles can share images. >> > Then you must deploy. >> > To be sure the glance container is started on all controllers, since I >> have 3 controllers, I deployed 3 times changing the order in the inventory. >> > First time: >> > [control] >> > A >> > B >> > C >> > >> > Second time: >> > [control] >> > B >> > C >> > A >> > >> > Third time: >> > [control] >> > C >> > B >> > A >> > >> > Or you can deploy glance 3 times using -t glance and -l >> > >> > As far as the instance stopped, I got I bug with a version of kolla. >> > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 >> > Now is corrected and with kolla 12.2.0 it works. >> > Ignazio >> > >> > >> > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL < >> franck.vedel at univ-grenoble-alpes.fr> ha scritto: >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
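On the qcow2-versus-raw question in the glance thread above: a qcow2 image keeps the image store small, but Cinder must download and convert it to raw the first time a volume is created from it, whereas a raw image skips that conversion at the cost of storage space. A rough sketch of inspecting and pre-converting an image with qemu-img (the filenames and image name are placeholders):

```shell
# Check the current on-disk format of the image file
qemu-img info image.qcow2

# Convert to raw ahead of time if volume-backed boots dominate
qemu-img convert -f qcow2 -O raw image.qcow2 image.raw
openstack image create --disk-format raw --container-format bare \
    --file image.raw my-raw-image
```

Whether the conversion cost matters depends on how often new images are booted; with the image cache discussed above, it is paid only once per image.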
URL: From nsitlani03 at gmail.com Tue Nov 23 09:29:08 2021 From: nsitlani03 at gmail.com (Namrata Sitlani) Date: Tue, 23 Nov 2021 14:59:08 +0530 Subject: [magnum] [victoria] [fedora-coreos] Message-ID: Hello All, We run release Victoria(11.1.1) and we deployed Kubernetes version 1.18.20 on Magnum with Fedora CoreOS version 33 and recently we ran into issues with Cinder CSI plugin. Multiple master nodes break the CSI plugin, but if I use a single master everything works fine. In my understanding it could be related to Fedora CoreOS version. It is not clear which versions are supported. Can somebody give me information on which version is supported? Thanks, Namrata Sitlani -------------- next part -------------- An HTML attachment was scrubbed... URL: From sombrafam at gmail.com Tue Nov 23 10:38:38 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Tue, 23 Nov 2021 07:38:38 -0300 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Roman, Forgot to add that detail, since I run the same routine in a non-ovn deployment and it worked. But this is how I did it: openstack network qos policy list openstack network qos policy create bw-limiter openstack network qos rule create --type bandwidth-limit --max-kbps 512 --max-burst-kbits 512 --egress bw-limiter openstack network qos rule create --type bandwidth-limit --max-kbps 512 --max-burst-kbits 512 --ingress bw-limiter openstack network set --qos-policy bw-limiter ext_net I didn't set it in the port though, which is something I should do. I'll set it in the port too for testing but I think the above should work regardless. Erlon Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov escreveu: > Hi Erlon, > > I have a couple of questions that probably will help to understand the > issue better. > Have you applied the QoS rules on a port, network or floating ip? > Have you applied the QoS rules before starting the VM (before it's port is > active) or after? 
> > Thanks > > > On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz wrote: > >> Hi folks, >> >> I have a question related to the Neutron supportability of OVN+QoS. I >> have checked the config reference for both >> Victoria and Xena[1] >> [2] >> and they >> are shown as supported (bw limit, eggress/ingress), but I tried to set up >> an env >> with OVN+QoS but the rules are not being effective (VMs still download at >> maximum speed). I double-checked >> the configuration in the neutron API and it brings the QoS settings[3] >> [4] >> [5] >> , >> and the versions[6] >> [7] >> I'm >> using should support it. >> >> What makes me more confused is that there's a document[8] >> [9] >> with a gap >> analysis of the OVN vs OVS QoS functionality >> and the document *is* being updated over the releases, but it still shows >> that QoS is not supported in OVN. >> >> So, is there something I'm missing? >> >> Erlon >> _______________ >> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html >> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >> [3] QoS Config: >> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >> [4] neutron.conf: >> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >> [5] ml2_conf.ini: >> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >> [6] neutron-api-0 versions: >> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >> [7] nova-compute-0 versions: >> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >> [8] Gaps from ML2/OVS-OVN Xena: >> https://docs.openstack.org/neutron/xena/ovn/gaps.html >> [9] Gaps from ML2/OVS-OVN Victoria: >> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
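The command sequence posted above attaches the bw-limiter policy to the external network. A minimal port-level variant, which sidesteps the network-inheritance gaps discussed elsewhere in the thread, could look like this (the port ID is a placeholder for the VM port to be limited):

```shell
# Create the policy and its bandwidth-limit rules (512 kbps each way)
openstack network qos policy create bw-limiter
openstack network qos rule create --type bandwidth-limit \
    --max-kbps 512 --max-burst-kbits 512 --egress bw-limiter
openstack network qos rule create --type bandwidth-limit \
    --max-kbps 512 --max-burst-kbits 512 --ingress bw-limiter

# Attach the policy directly to a single VM port
openstack port set --qos-policy bw-limiter <port-id>
```

These are the documented neutron QoS CLI calls; only the placeholder port ID is invented here.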
URL: From rsafrono at redhat.com Tue Nov 23 11:12:03 2021 From: rsafrono at redhat.com (Roman Safronov) Date: Tue, 23 Nov 2021 13:12:03 +0200 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Erlon, There was a bug with setting QoS on a network but it had been fixed long ago. https://bugs.launchpad.net/neutron/+bug/1851362 or https://bugzilla.redhat.com/show_bug.cgi?id=1934096 At least in our downstream CI we do not observe such issues with QoS+OVN. >From the commands I see that you apply the QoS rule on the external network, right? On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz wrote: > Hi Roman, > > Forgot to add that detail, since I run the same routine in a non-ovn > deployment and it worked. But this is how I did it: > > openstack network qos policy list > openstack network qos policy create bw-limiter > openstack network qos rule create --type bandwidth-limit --max-kbps 512 > --max-burst-kbits 512 --egress bw-limiter > openstack network qos rule create --type bandwidth-limit --max-kbps 512 > --max-burst-kbits 512 --ingress bw-limiter > openstack network set --qos-policy bw-limiter ext_net > > I didn't set it in the port though, which is something I should do. I'll > set it in the port too for testing but I think the above should > work regardless. > > Erlon > > > Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov > escreveu: > >> Hi Erlon, >> >> I have a couple of questions that probably will help to understand the >> issue better. >> Have you applied the QoS rules on a port, network or floating ip? >> Have you applied the QoS rules before starting the VM (before it's port >> is active) or after? >> >> Thanks >> >> >> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz wrote: >> >>> Hi folks, >>> >>> I have a question related to the Neutron supportability of OVN+QoS. 
I >>> have checked the config reference for both >>> Victoria and Xena[1] >>> [2] >>> and >>> they are shown as supported (bw limit, eggress/ingress), but I tried to set >>> up an env >>> with OVN+QoS but the rules are not being effective (VMs still download >>> at maximum speed). I double-checked >>> the configuration in the neutron API and it brings the QoS settings[3] >>> [4] >>> [5] >>> , >>> and the versions[6] >>> [7] >>> I'm >>> using should support it. >>> >>> What makes me more confused is that there's a document[8] >>> [9] >>> with a gap >>> analysis of the OVN vs OVS QoS functionality >>> and the document *is* being updated over the releases, but it still >>> shows that QoS is not supported in OVN. >>> >>> So, is there something I'm missing? >>> >>> Erlon >>> _______________ >>> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>> [3] QoS Config: >>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>> [4] neutron.conf: >>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>> [5] ml2_conf.ini: >>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>> [6] neutron-api-0 versions: >>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>> [7] nova-compute-0 versions: >>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>> [8] Gaps from ML2/OVS-OVN Xena: >>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>> [9] Gaps from ML2/OVS-OVN Victoria: >>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>> >>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
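When a policy seems to be ignored on an ML2/OVN deployment like this, one backend-side check is whether Neutron actually wrote the rules into the OVN northbound database. A rough sketch, assuming shell access to a node running ovn-nbctl (the database socket or remote options vary per deployment):

```shell
# Rows in the NB QoS table are created by Neutron for each bound rule;
# an empty result after applying a policy means the rules never reached OVN
ovn-nbctl list QoS
```

If the table is empty, the problem lies in the Neutron OVN driver or the policy binding, not in datapath enforcement.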
URL: From lokendrarathour at gmail.com Tue Nov 23 11:45:38 2021 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Tue, 23 Nov 2021 17:15:38 +0530 Subject: [Triple 0] Undercloud deployment Getting failed Message-ID: Hi Team getting strange error when installing triple O Train on Centos 8.4 '--volume', '/var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro', '--volume', '/var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro', '--volume', '/dev/log:/dev/log:rw', '--rm', '--log-driver', 'k8s-file', '--log-opt', 'path=/var/log/containers/stdouts/container-puppet-zaqar.log', '--security-opt', 'label=disable', '--volume', '/usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro', '--entrypoint', '/var/lib/container-puppet/container-puppet.sh', '--net', 'host', '--volume', '/etc/hosts:/etc/hosts:ro', '--volume', '/var/lib/container-puppet/container-puppet.sh:/var/lib/container-puppet/container-puppet.sh:ro', 'quay.io/tripleotraincentos8/centos-binary-zaqar-wsgi:current-tripleo'] run failed after Error: container_linux.go:370: starting container process caused: error adding seccomp filter rule for syscall bdflush: permission denied: OCI permission denied", " attempt(s): 3", "2021-11-23 11:00:38,384 WARNING: 58791 -- Retrying running container: zaqar", "2021-11-23 11:00:38,384 ERROR: 58791 -- Failed running container for zaqar", "2021-11-23 11:00:38,385 INFO: 58791 -- Finished processing puppet configs for zaqar", "2021-11-23 11:00:38,385 ERROR: 58782 -- ERROR configuring crond", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring glance_api", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring heat_api", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring heat", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic_api", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic_inspector", 
"2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring neutron", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring iscsid", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring keystone", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring memcached", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring mistral", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring mysql", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring nova", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring rabbitmq", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring placement", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring swift", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring swift_ringbuilder", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring zaqar" ], "stderr_lines": [], "_ansible_no_log": false, "attempts": 15 } ] ] Not cleaning working directory /home/stack/tripleo-heat-installer-templates Not cleaning ansible directory /home/stack/undercloud-ansible-mie5k51_ Install artifact is located at /home/stack/undercloud-install-20211123110040.tar.bzip2 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Deployment Failed! This issue is recurring multiple times please advise. -- ~ Lokendra -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Tue Nov 23 11:47:37 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 23 Nov 2021 12:47:37 +0100 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hello Erlon: We really need to review the gaps document, at least for Xena. As Roman said, we have been testing QoS in OVN successfully. The current status of QoS in OVN is (at least for Xena): - Fixed ports (VM ports): support for BW limit rules (egress/ingress) and DSCP (only egress). Neutron supports port network QoS inheritance (same as in your example). This is not for OVN but for any backend. 
- FIPs: support for BW limit rules (egress/ingress). Still no network QoS inheritance (in progress). - GW IP: no support yet. Ping me in #openstack-neutron channel (ralonsoh) if you have more questions. Regards. On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov wrote: > Hi Erlon, > > There was a bug with setting QoS on a network but it had been fixed long > ago. > https://bugs.launchpad.net/neutron/+bug/1851362 or > https://bugzilla.redhat.com/show_bug.cgi?id=1934096 > At least in our downstream CI we do not observe such issues with QoS+OVN. > > From the commands I see that you apply the QoS rule on the external > network, right? > > On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz wrote: > >> Hi Roman, >> >> Forgot to add that detail, since I run the same routine in a non-ovn >> deployment and it worked. But this is how I did it: >> >> openstack network qos policy list >> openstack network qos policy create bw-limiter >> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >> --max-burst-kbits 512 --egress bw-limiter >> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >> --max-burst-kbits 512 --ingress bw-limiter >> openstack network set --qos-policy bw-limiter ext_net >> >> I didn't set it in the port though, which is something I should do. I'll >> set it in the port too for testing but I think the above should >> work regardless. >> >> Erlon >> >> >> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov >> escreveu: >> >>> Hi Erlon, >>> >>> I have a couple of questions that probably will help to understand the >>> issue better. >>> Have you applied the QoS rules on a port, network or floating ip? >>> Have you applied the QoS rules before starting the VM (before it's port >>> is active) or after? >>> >>> Thanks >>> >>> >>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz wrote: >>> >>>> Hi folks, >>>> >>>> I have a question related to the Neutron supportability of OVN+QoS. 
I >>>> have checked the config reference for both >>>> Victoria and Xena[1] >>>> [2] >>>> and >>>> they are shown as supported (bw limit, eggress/ingress), but I tried to set >>>> up an env >>>> with OVN+QoS but the rules are not being effective (VMs still download >>>> at maximum speed). I double-checked >>>> the configuration in the neutron API and it brings the QoS settings[3] >>>> [4] >>>> [5] >>>> , >>>> and the versions[6] >>>> [7] >>>> I'm >>>> using should support it. >>>> >>>> What makes me more confused is that there's a document[8] >>>> [9] >>>> with a gap >>>> analysis of the OVN vs OVS QoS functionality >>>> and the document *is* being updated over the releases, but it still >>>> shows that QoS is not supported in OVN. >>>> >>>> So, is there something I'm missing? >>>> >>>> Erlon >>>> _______________ >>>> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>>> [3] QoS Config: >>>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>>> [4] neutron.conf: >>>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>>> [5] ml2_conf.ini: >>>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>>> [6] neutron-api-0 versions: >>>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>>> [7] nova-compute-0 versions: >>>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>>> [8] Gaps from ML2/OVS-OVN Xena: >>>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>>> [9] Gaps from ML2/OVS-OVN Victoria: >>>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>>> >>>> >>>> >>> >>> > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sombrafam at gmail.com Tue Nov 23 13:01:41 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Tue, 23 Nov 2021 10:01:41 -0300 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Roman, Rodolfo, I tested setting the QoS policy to the port (internal) instead of the network (external), and it works! I did some more testing on the OVS vs OVN deployments and I can confirm the status you are saying. What I got was:

OVS:
  FIP:
    Setting on port: FAIL
    Setting on network: OK
  Private network:
    Setting on port: OK
    Setting on network: OK
  Router:
    Internal port: OK
    External port: OK

OVN:
  FIP:
    Setting on port: FAIL
    Setting on network: FAIL (I was trying this)
  Private network:
    Setting on port: OK
    Setting on network: OK
  Router:
    Internal port: FAIL
    External port: FAIL

Thanks a lot for your help!! Erlon On Tue 23 Nov 2021 at 08:47, Rodolfo Alonso Hernandez < ralonsoh at redhat.com> wrote: > Hello Erlon: > > We really need to review the gaps document, at least for Xena. > > As Roman said, we have been testing QoS in OVN successfully. > > The current status of QoS in OVN is (at least for Xena): > - Fixed ports (VM ports): support for BW limit rules (egress/ingress) and > DSCP (only egress). Neutron supports port network QoS inheritance (same as > in your example). This is not for OVN but for any backend. > - FIPs: support for BW limit rules (egress/ingress). Still no network QoS > inheritance (in progress). > - GW IP: no support yet. > > Ping me in #openstack-neutron channel (ralonsoh) if you have more > questions. > > Regards. > > > On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov > wrote: > >> Hi Erlon, >> >> There was a bug with setting QoS on a network but it had been fixed long >> ago. >> https://bugs.launchpad.net/neutron/+bug/1851362 or >> https://bugzilla.redhat.com/show_bug.cgi?id=1934096 >> At least in our downstream CI we do not observe such issues with QoS+OVN. 
>> >> From the commands I see that you apply the QoS rule on the external >> network, right? >> >> On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz wrote: >> >>> Hi Roman, >>> >>> Forgot to add that detail, since I run the same routine in a non-ovn >>> deployment and it worked. But this is how I did it: >>> >>> openstack network qos policy list >>> openstack network qos policy create bw-limiter >>> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >>> --max-burst-kbits 512 --egress bw-limiter >>> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >>> --max-burst-kbits 512 --ingress bw-limiter >>> openstack network set --qos-policy bw-limiter ext_net >>> >>> I didn't set it in the port though, which is something I should do. I'll >>> set it in the port too for testing but I think the above should >>> work regardless. >>> >>> Erlon >>> >>> >>> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov < >>> rsafrono at redhat.com> escreveu: >>> >>>> Hi Erlon, >>>> >>>> I have a couple of questions that probably will help to understand the >>>> issue better. >>>> Have you applied the QoS rules on a port, network or floating ip? >>>> Have you applied the QoS rules before starting the VM (before it's port >>>> is active) or after? >>>> >>>> Thanks >>>> >>>> >>>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz >>>> wrote: >>>> >>>>> Hi folks, >>>>> >>>>> I have a question related to the Neutron supportability of OVN+QoS. I >>>>> have checked the config reference for both >>>>> Victoria and Xena[1] >>>>> [2] >>>>> and >>>>> they are shown as supported (bw limit, eggress/ingress), but I tried to set >>>>> up an env >>>>> with OVN+QoS but the rules are not being effective (VMs still download >>>>> at maximum speed). I double-checked >>>>> the configuration in the neutron API and it brings the QoS settings[3] >>>>> >>>>> [4] >>>>> >>>>> [5] >>>>> , >>>>> and the versions[6] >>>>> >>>>> [7] >>>>> I'm >>>>> using should support it. 
>>>>>
>>>>> What makes me more confused is that there's a document[8][9] with a gap analysis of the OVN vs OVS QoS functionality, and the document *is* being updated over the releases, but it still shows that QoS is not supported in OVN.
>>>>>
>>>>> So, is there something I'm missing?
>>>>>
>>>>> Erlon
>>>>> _______________
>>>>> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html
>>>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html
>>>>> [3] QoS Config: https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82
>>>>> [4] neutron.conf: https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82
>>>>> [5] ml2_conf.ini: https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd
>>>>> [6] neutron-api-0 versions: https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173
>>>>> [7] nova-compute-0 versions: https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998
>>>>> [8] Gaps from ML2/OVS-OVN Xena: https://docs.openstack.org/neutron/xena/ovn/gaps.html
>>>>> [9] Gaps from ML2/OVS-OVN Victoria: https://docs.openstack.org/neutron/victoria/ovn/gaps.html

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From syedammad83 at gmail.com  Tue Nov 23 13:47:20 2021
From: syedammad83 at gmail.com (Ammad Syed)
Date: Tue, 23 Nov 2021 18:47:20 +0500
Subject: [neutron] Neutron OVN+QoS Support
In-Reply-To: 
References: 
Message-ID: 

Hi Erlon,

You can check the URL below for testing QoS on a FIP. I have tested it and it works fine.

https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst

Ammad

On Tue, Nov 23, 2021 at 6:06 PM Erlon Cruz wrote:

> Hi Roman, Rodolfo,
>
> I tested setting the QoS policy to the port (internal) instead of the network (external), and it works!
> I did some more testing on the OVS vs OVN deployments and I can confirm the status you are saying.

-- 
Regards,

Syed Ammad Ali
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sombrafam at gmail.com  Tue Nov 23 13:51:15 2021
From: sombrafam at gmail.com (Erlon Cruz)
Date: Tue, 23 Nov 2021 10:51:15 -0300
Subject: Cinder NFS Encryption
In-Reply-To: 
References: 
Message-ID: 

Hi Hitesh,

Have you checked this[1]? As far as I remember, when you do data encryption on Cinder using volume encryption[2] and attach the volume over the network, the data is transferred encrypted and Nova has to share the keys.
Erlon
_____________
[1] Data Encryption: https://docs.openstack.org/security-guide/tenant-data/data-encryption.html
[2] Cinder Volume Encryption: https://docs.openstack.org/cinder/pike/configuration/block-storage/volume-encryption.html

Em seg., 22 de nov. de 2021 às 13:16, Hitesh Mathur escreveu:

> Hi,
>
> I am not able to find whether Cinder supports NFS data-in-transit encryption or not. Can you please provide information on this and how to use it?
>
> --
> Regards
> Hitesh

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sombrafam at gmail.com  Tue Nov 23 13:55:58 2021
From: sombrafam at gmail.com (Erlon Cruz)
Date: Tue, 23 Nov 2021 10:55:58 -0300
Subject: [neutron] Neutron OVN+QoS Support
In-Reply-To: 
References: 
Message-ID: 

Hi Ammad,

What OpenStack version did you test? I have just performed the FIP test on Xena and it didn't work for me. See the results I posted.

Erlon

Em ter., 23 de nov. de 2021 às 10:47, Ammad Syed escreveu:

> Hi Erlon,
>
> You can check the URL below for testing QoS on a FIP. I have tested it and it works fine.
>
> https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst
>
> Ammad

From syedammad83 at gmail.com  Tue Nov 23 14:38:12 2021
From: syedammad83 at gmail.com (Ammad Syed)
Date: Tue, 23 Nov 2021 19:38:12 +0500
Subject: [neutron] Neutron OVN+QoS Support
In-Reply-To: 
References: 
Message-ID: 

Hi Erlon,

I have tested on Xena and it works fine. See if you have the qos-fip extension loaded in Neutron:

# openstack extension list | grep -i qos-fip
| Floating IP QoS | qos-fip | The floating IP Quality of Service extension |

Ammad

On Tue, Nov 23, 2021 at 6:56 PM Erlon Cruz wrote:

> Hi Ammad,
>
> What OpenStack version did you test? I have just performed the FIP test on Xena and it didn't work for me. See the results I posted.
>
> Erlon
>
> Em ter., 23 de nov. de 2021 às 10:47, Ammad Syed escreveu:
>
>> Hi Erlon,
>>
>> You can check below url for testing qos on FIP. I have tested it and it works fine.
>>
>> https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst
>>
>> Ammad

-- 
Regards,

Syed Ammad Ali
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gael.therond at bitswalk.com  Tue Nov 23 14:58:04 2021
From: gael.therond at bitswalk.com (=?UTF-8?Q?Ga=C3=ABl_THEROND?=)
Date: Tue, 23 Nov 2021 14:58:04 +0000
Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin
Message-ID: 

Hi everyone!

Today I faced a weird situation with one of our cloud platforms using the Victoria release.

When trying to get a summary of project rates, whether through Horizon or the CLI, using the admin user of the platform, we got the following error message:

https://paste.opendev.org/show/bIgG6owrN9B2F3O7iqYG/

From my understanding of the default CloudKitty policies, this error seems a bit odd, as the admin user profile actually matches the default rules, at least as defined in:

https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/base.py
and
https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/v1/report.py

Unless I misunderstood something (please correct me if I'm wrong), it's supposed to at least be OK with the matching.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rafaelweingartner at gmail.com  Tue Nov 23 15:06:48 2021
From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=)
Date: Tue, 23 Nov 2021 12:06:48 -0300
Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin
In-Reply-To: 
References: 
Message-ID: 

Can you check this one?
https://review.opendev.org/c/openstack/cloudkitty/+/785132

On Tue, Nov 23, 2021 at 12:01 PM Gaël THEROND wrote:

> Hi everyone!
>
> Today I faced a weird situation with one of our cloud platforms using the Victoria release.
-- 
Rafael Weingärtner
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rafaelweingartner at gmail.com  Tue Nov 23 15:08:58 2021
From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=)
Date: Tue, 23 Nov 2021 12:08:58 -0300
Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin
In-Reply-To: 
References: 
Message-ID: 

I guess that the rule "context_is_admin" might have some weird definition in your version. Can you check it?

On Tue, Nov 23, 2021 at 12:06 PM Rafael Weingärtner <rafaelweingartner at gmail.com> wrote:

> Can you check this one?
> https://review.opendev.org/c/openstack/cloudkitty/+/785132

-- 
Rafael Weingärtner
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gael.therond at bitswalk.com  Tue Nov 23 15:15:40 2021
From: gael.therond at bitswalk.com (=?UTF-8?Q?Ga=C3=ABl_THEROND?=)
Date: Tue, 23 Nov 2021 15:15:40 +0000
Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin
In-Reply-To: 
References: 
Message-ID: 

aaaaah nice catch! I'll check that out as I use CentOS packages; it may actually just be that!

Thanks a lot!

Le mar. 23 nov. 2021 à 15:09, Rafael Weingärtner <rafaelweingartner at gmail.com> a écrit :

> I guess that the rule "context_is_admin" might have some weird definition in your version. Can you check it?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gael.therond at bitswalk.com  Tue Nov 23 15:28:51 2021
From: gael.therond at bitswalk.com (=?UTF-8?Q?Ga=C3=ABl_THEROND?=)
Date: Tue, 23 Nov 2021 15:28:51 +0000
Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin
In-Reply-To: 
References: 
Message-ID: 

ah ah! Was exactly that indeed!

So, the CentOS cloudkitty-common package is not using the latest patch fixing the issue ->
http://mirror.centos.org/centos/8/cloud/x86_64/openstack-victoria/Packages/o/openstack-cloudkitty-common-13.0.0-1.el8.noarch.rpm

Thanks a lot for the hint! Will patch it downstream while waiting for the COS patch.

Le mar. 23 nov. 2021 à 15:15, Gaël THEROND a écrit :

> aaaaah nice catch! I'll check that out as I use CentOS packages; it may actually just be that!

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sombrafam at gmail.com  Tue Nov 23 17:41:52 2021
From: sombrafam at gmail.com (Erlon Cruz)
Date: Tue, 23 Nov 2021 14:41:52 -0300
Subject: [neutron] Neutron OVN+QoS Support
In-Reply-To: 
References: 
Message-ID: 

Hmm, my OVN deployment doesn't show the extension, and OVS brings it by default, though it's not listed in the l3_agent.ini file. Where do you set that for OVN? The OVS deployment has the l3_agent.ini, but OVN does not have an L3 agent.

Erlon

Em ter., 23 de nov. de 2021 às 11:38, Ammad Syed escreveu:

> Hi Erlon,
>
> I have tested on Xena and it works fine.
> See if you have the qos-fip extension loaded in Neutron:
>
> # openstack extension list | grep -i qos-fip
> | Floating IP QoS | qos-fip | The floating IP Quality of Service extension |
>
> Ammad

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From damien.rannou at ovhcloud.com  Tue Nov 23 10:53:53 2021
From: damien.rannou at ovhcloud.com (Damien Rannou)
Date: Tue, 23 Nov 2021 10:53:53 +0000
Subject: [neutron] default QOS on L3 gateway
In-Reply-To: 
References: 
Message-ID: 

Yes, exactly.
Just one point that I'm not sure about: what happens if the client asks for a router update with the --no-qos option? Will it remove the QoS completely or just re-apply the default value?

Damien

Le 22 nov. 2021 à 18:45, Lajos Katona a écrit :

> Hi,
> There is an RFE to inherit network QoS:
> https://bugs.launchpad.net/neutron/+bug/1950454
> The patch series:
> https://review.opendev.org/q/topic:%22bug%252F1950454%22+(status:open%20OR%20status:merged)
> Hope this covers your use case.
> Lajos Katona (lajoskatona)
>
> Damien Rannou ezt írta (időpont: 2021.
nov. 22., H, 17:23): Hello We are currently playing with QOS on the L3 agent, mostly for SNAT, but it can also apply to FIPs. Everything is working properly, but I'm wondering if there is a way to define a "default" QOS that would be applied on router creation, but also when the user sets "no_qos" on his router. On a public cloud environment, we cannot leave the customers without any QOS limitation. Thanks! Damien -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.vedel at univ-grenoble-alpes.fr Tue Nov 23 19:28:12 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Tue, 23 Nov 2021 20:28:12 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> <30680C7F-7E55-4098-8438-F3CEE48C4A90@univ-grenoble-alpes.fr> Message-ID: Thanks again. I'll do some speed tests. In my case, I need ready-to-use images (debian, centos, ubuntu, pfsense, kali, windows 10, windows 2016), sometimes big images. This is why I am trying to find out what is the best solution with the use of an iscsi bay. Ah .... if I could change that and use disks and ceph ... Franck > Le 23 nov. 2021 à 09:14, Ignazio Cassano a écrit : > > Franck, if the cache works fine, I think the glance image format could be qcow2. The volume is created in raw format, but > the download phase is executed only the first time you create a volume from a new image. > With this setup I can create 20-30 instances in a shot and it takes a few minutes to complete. > I always use general-purpose small images and complete the instance configuration (package installation and so on) with heat or ansible.
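[Editor's note: the caching Ignazio describes matches Cinder's image-volume cache, which keeps one cached volume per image on the backend so the qcow2 download/conversion cost is paid only for the first volume created from a new image. A minimal sketch of enabling it in cinder.conf — the backend section name and the size/count limits below are hypothetical examples, not values from this thread:]

```ini
# cinder.conf (sketch only). "iscsi-1" is a hypothetical backend section
# name; tune the limits to the capacity of the iSCSI bay.
[DEFAULT]
# The cache owns its cached volumes under an internal tenant;
# both IDs must be set for the cache to activate.
cinder_internal_tenant_project_id = <project-uuid>
cinder_internal_tenant_user_id = <user-uuid>

[iscsi-1]
image_volume_cache_enabled = True
image_volume_cache_max_size_gb = 200
image_volume_cache_max_count = 50
```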
> Ignazio > > > Il giorno mar 23 nov 2021 alle ore 08:57 Franck VEDEL > ha scritto: > Ignazio, Radoslaw, > > thanks to you, I made some modifications and my environment seems to work better (the images are placed on the iscsi bay on which the volumes are stored). > I installed the cache for glance. It works, well I think it does. > > My question is: between the different formats (qcow2, raw or other), which is the most efficient if > - we create a volume then an instance from the volume > - we create an instance from the image > - we create an instance without volume > - we create a snapshot then an instance from the snapshot > > Franck > > >> >> >>> Le 19 nov. 2021 à 14:50, Ignazio Cassano > a écrit : >>> >>> Franck, this helps you a lot. >>> Thanks Radosław >>> Ignazio >>> >>> Il giorno ven 19 nov 2021 alle ore 12:03 Radosław Piliszek > ha scritto: >>> If one sets glance_file_datadir_volume to non-default, then glance-api >>> gets deployed on all hosts. >>> >>> -yoctozepto >>> >>> On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano > wrote: >>> > >>> > Hello Franck, glance is not deployed on all nodes by default. >>> > I got the same problem. >>> > In my case I have 3 controllers. >>> > I created an nfs share on a storage server where to store images. >>> > Before deploying glance, I create /var/lib/glance/images on the 3 controllers and I mount the nfs share. >>> > This is my fstab on the 3 controllers: >>> > >>> > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime >>> > >>> > In my globals.yml I have: >>> > glance_file_datadir_volume: "/var/lib/glance" >>> > glance_backend_file: "yes" >>> > >>> > This means images are on /var/lib/glance and since it is an nfs share all my 3 controllers can share images. >>> > Then you must deploy. >>> > To be sure the glance container is started on all controllers, since I have 3 controllers, I deployed 3 times changing the order in the inventory.
>>> > First time: >>> > [control] >>> > A >>> > B >>> > C >>> > >>> > Second time: >>> > [control] >>> > B >>> > C >>> > A >>> > >>> > Third time: >>> > [control] >>> > C >>> > B >>> > A >>> > >>> > Or you can deploy glance 3 times using -t glance and -l >>> > >>> > As far as the instance stopped, I got I bug with a version of kolla. >>> > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 >>> > Now is corrected and with kolla 12.2.0 it works. >>> > Ignazio >>> > >>> > >>> > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL > ha scritto: > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Tue Nov 23 21:15:15 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 23 Nov 2021 16:15:15 -0500 Subject: [cinder] spec opportunities and deadline Message-ID: <25f62afc-e7a6-6876-9808-33dce952438a@gmail.com> To anyone working on a cinder spec for Yoga: This is a reminder that all Cinder Specs for features to be implemented in Yoga must be approved by Friday 17 December 2021 (23:59 UTC). There are two upcoming opportunities to get feedback on your spec proposal and/or ask questions about your spec from the cinder team: 1. Tomorrow's cinder weekly meeting (Wednesday 24 November, 1400 UTC) is being held in videoconference; if you'd like a discussion, please put a topic on the weekly agenda (which also has connection details): https://etherpad.opendev.org/p/cinder-yoga-meetings 2. 
The cinder yoga R-17 virtual midcycle meeting is being held next week on Wednesday 1 December 1400-1600 UTC; if you'd like to discuss your proposal, please add it to the planning etherpad: https://etherpad.opendev.org/p/cinder-yoga-midcycles cheers, brian From tonyliu0592 at hotmail.com Tue Nov 23 22:51:52 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 23 Nov 2021 22:51:52 +0000 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron Message-ID: Hi, I see this problem from time to time. It's not consistently reproducible. ====================== 2021-11-23 22:16:28.532 7 INFO nova.compute.manager [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state the instance has a pending task (spawning). Skip. 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 3a4c320d64664d9cb6784b7ea52d618a 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for [('network-vif-plugged', '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for instance with vm_state building and task_state spawning.: eventlet.timeout.Timeout: 300 seconds ====================== The VIF/port is activated by OVN ovn-controller to ovn-sb. It seems that either Neutron didn't capture the update or didn't send the message back to nova-compute. Is there any known fix for this problem? Thanks! Tony From DHilsbos at performair.com Tue Nov 23 23:34:39 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Tue, 23 Nov 2021 23:34:39 +0000 Subject: [ops]RabbitMQ High Availability Message-ID: <0670B960225633449A24709C291A525251D511CD@COM03.performair.local> All; In the time I've been part of this mailing list, the subject of RabbitMQ high availability has come up several times, and each time specific recommendations for both Rabbit and Open Stack are provided.
I remember it being an A or B kind of recommendation (i.e. configure Rabbit like A1, and Open Stack like A2, OR configure Rabbit like B1, and Open Stack like B2). Unfortunately, I can't find the previous threads on this topic. Does anyone have this information, that they would care to share with me? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From tonyliu0592 at hotmail.com Tue Nov 23 23:34:52 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 23 Nov 2021 23:34:52 +0000 Subject: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed Message-ID: Hi, Is this known issue? Any filed bug or fix to it? ======================================= 2021-11-23 23:04:20.943 40 ERROR futurist.periodics [-] Failed to call periodic 'networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports' (it runs every 600.00 seconds): RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent call last): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield self._nested_txns_map[cur_thread_id] 2021-11-23 23:04:20.943 40 ERROR futurist.periodics KeyError: 140347734591640 2021-11-23 23:04:20.943 40 ERROR futurist.periodics 2021-11-23 23:04:20.943 40 ERROR futurist.periodics During handling of the above exception, another exception occurred: 2021-11-23 23:04:20.943 40 ERROR futurist.periodics 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent call last): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File 
"/usr/lib/python3.6/site-packages/futurist/periodics.py", line 290, in run 2021-11-23 23:04:20.943 40 ERROR futurist.periodics work() 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 64, in __call__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return self.callback(*self.args, **self.kwargs) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 178, in decorator 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return f(*args, **kwargs) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py", line 667, in check_for_mcast_flood_reports 2021-11-23 23:04:20.943 40 ERROR futurist.periodics for port in self._nb_idl.lsp_list().execute(check_error=True): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 42, in execute 2021-11-23 23:04:20.943 40 ERROR futurist.periodics t.add(self) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", line 183, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield t 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics del self._nested_txns_map[cur_thread_id] 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File 
"/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics self.result = self.commit() 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 62, in commit 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise result.ex 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 128, in run 2021-11-23 23:04:20.943 40 ERROR futurist.periodics txn.results.put(txn.do_commit()) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 115, in do_commit 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise RuntimeError(msg) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it 2021-11-23 23:04:20.943 40 ERROR futurist.periodics ======================================= Thanks! Tony From rsafrono at redhat.com Tue Nov 23 23:51:08 2021 From: rsafrono at redhat.com (Roman Safronov) Date: Wed, 24 Nov 2021 01:51:08 +0200 Subject: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed In-Reply-To: References: Message-ID: https://bugs.launchpad.net/neutron/+bug/1927077 On Wed, Nov 24, 2021 at 1:41 AM Tony Liu wrote: > Hi, > > Is this known issue? Any filed bug or fix to it? 
> ======================================= > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics [-] Failed to call > periodic > 'networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports' > (it runs every 600.00 seconds): RuntimeError: OVSDB Error: The transaction > failed because the IDL has been configured to require a database lock but > didn't get it yet or has already lost it > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent > call last): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in > transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield > self._nested_txns_map[cur_thread_id] > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics KeyError: > 140347734591640 > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics During handling of the > above exception, another exception occurred: > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent > call last): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 290, in run > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics work() > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 64, in > __call__ > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return > self.callback(*self.args, **self.kwargs) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 178, in > decorator > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return f(*args, > **kwargs) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py", > line 667, in 
check_for_mcast_flood_reports > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics for port in > self._nb_idl.lsp_list().execute(check_error=True): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", > line 42, in execute > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics t.add(self) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", > line 183, in transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield t > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in > transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics del > self._nested_txns_map[cur_thread_id] > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__ > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics self.result = > self.commit() > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", > line 62, in commit > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise result.ex > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", > line 128, in run > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > txn.results.put(txn.do_commit()) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > 
"/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", > line 115, in do_commit > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise > RuntimeError(msg) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics RuntimeError: OVSDB > Error: The transaction failed because the IDL has been configured to > require a database lock but didn't get it yet or has already lost it > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > ======================================= > > > Thanks! > Tony > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyliu0592 at hotmail.com Wed Nov 24 00:16:00 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Wed, 24 Nov 2021 00:16:00 +0000 Subject: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed In-Reply-To: References: Message-ID: Thank you roman for the prompt response! Tony ________________________________________ From: Roman Safronov Sent: November 23, 2021 03:51 PM To: Tony Liu Cc: openstack-discuss; openstack-dev at lists.openstack.org Subject: Re: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed https://bugs.launchpad.net/neutron/+bug/1927077 On Wed, Nov 24, 2021 at 1:41 AM Tony Liu > wrote: Hi, Is this known issue? Any filed bug or fix to it?
======================================= 2021-11-23 23:04:20.943 40 ERROR futurist.periodics [-] Failed to call periodic 'networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports' (it runs every 600.00 seconds): RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent call last): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield self._nested_txns_map[cur_thread_id] 2021-11-23 23:04:20.943 40 ERROR futurist.periodics KeyError: 140347734591640 2021-11-23 23:04:20.943 40 ERROR futurist.periodics 2021-11-23 23:04:20.943 40 ERROR futurist.periodics During handling of the above exception, another exception occurred: 2021-11-23 23:04:20.943 40 ERROR futurist.periodics 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent call last): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 290, in run 2021-11-23 23:04:20.943 40 ERROR futurist.periodics work() 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 64, in __call__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return self.callback(*self.args, **self.kwargs) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 178, in decorator 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return f(*args, **kwargs) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py", line 667, in check_for_mcast_flood_reports 2021-11-23 23:04:20.943 40 ERROR futurist.periodics for port in 
self._nb_idl.lsp_list().execute(check_error=True): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 42, in execute 2021-11-23 23:04:20.943 40 ERROR futurist.periodics t.add(self) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", line 183, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield t 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics del self._nested_txns_map[cur_thread_id] 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics self.result = self.commit() 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 62, in commit 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise result.ex 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 128, in run 2021-11-23 23:04:20.943 40 ERROR futurist.periodics txn.results.put(txn.do_commit()) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 115, in do_commit 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise RuntimeError(msg) 2021-11-23 23:04:20.943 40 
ERROR futurist.periodics RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it 2021-11-23 23:04:20.943 40 ERROR futurist.periodics ======================================= Thanks! Tony From tonyliu0592 at hotmail.com Wed Nov 24 00:21:27 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Wed, 24 Nov 2021 00:21:27 +0000 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: I hit the same problem, from time to time, not consistently. I am using OVN. Typically, it takes no more than a few seconds for neutron to confirm the port is up. The default timeout in my setup is 600s. Even though the ports show up in both OVN SB and NB, nova-compute still didn't get confirmation from neutron. Either neutron didn't pick it up or the message was lost and didn't get to nova-compute. Hoping someone could share more thoughts. Thanks! Tony ________________________________________ From: Laurent Dumont Sent: November 22, 2021 02:05 PM To: Michal Arbet Cc: openstack-discuss Subject: Re: [neutron][nova] [kolla] vif plugged timeout How high did you have to raise it? If it does appear after X amount of time, then the VIF plug is not lost? On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet > wrote: + if I raise vif_plugged_timeout (hope I remember it correctly) in nova to some high number, the problem disappears ... but it's only a workaround. Dňa so 20. 11. 2021, 12:05 Michal Arbet > napísal(a): Hi, Has anyone seen the issue which I am currently facing? When launching a heat stack (but it's the same if I launch several instances), vif plugging times out and I don't know why; sometimes it is OK, sometimes it fails. Sometimes neutron reports the vif plugged in < 10 sec (test env), sometimes it takes 100 or more seconds; it seems there is some race condition but I can't find out where the problem is. But in the end every instance is spawned ok (retry mechanism worked).
Another finding is that it has something to do with security groups: if the noop driver is used, everything works fine. The firewall driver is openvswitch. Test env is wallaby. I will attach some logs when I am near a PC.

Thank you,
Michal Arbet (Kevko)

From syedammad83 at gmail.com Wed Nov 24 05:47:39 2021
From: syedammad83 at gmail.com (Ammad Syed)
Date: Wed, 24 Nov 2021 10:47:39 +0500
Subject: [neutron] Neutron OVN+QoS Support
In-Reply-To: References: Message-ID:

For OVN QoS, you need to set the below in neutron.conf:

core_plugin = ml2
service_plugins = ovn-router, qos, segments, port_forwarding

and the below ones in ml2_conf.ini:

[ml2]
type_drivers = flat,geneve,vlan
tenant_network_types = geneve
mechanism_drivers = ovn
extension_drivers = port_security, qos

Ammad

On Tue, Nov 23, 2021 at 10:42 PM Erlon Cruz wrote:
> Hmm,
>
> My OVN deployment doesn't show the extension, and the OVS one brings it by
> default, though it's not listed in the l3_agent.ini file.
> Where do you set that for OVN? The OVS deployment has the l3_agent.ini,
> but OVN does not have an L3 agent.
>
> Erlon
>
> On Tue, Nov 23, 2021 at 11:38, Ammad Syed wrote:
>> Hi Erlon,
>>
>> I have tested on xena and it works fine. See if you have the qos-fip
>> extension loaded in neutron.
>>
>> # openstack extension list | grep -i qos-fip
>> | Floating IP QoS | qos-fip | The floating IP Quality of Service extension |
>> Ammad
>>
>> On Tue, Nov 23, 2021 at 6:56 PM Erlon Cruz wrote:
>>> Hi Ammad,
>>>
>>> What OpenStack version did you test? I have just performed the FIP
>>> test on Xena and it didn't work for me. See the results I posted.
>>>
>>> Erlon
>>>
>>> On Tue, Nov 23, 2021 at 10:47, Ammad Syed wrote:
>>>> Hi Erlon,
>>>>
>>>> You can check the below URL for testing QoS on FIPs. I have tested it and it
>>>> works fine.
>>>>
>>>> https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst
>>>>
>>>> Ammad
>>>> On Tue, Nov 23, 2021 at 6:06 PM Erlon Cruz wrote:
>>>>
>>>>> Hi Roman, Rodolfo,
>>>>>
>>>>> I tested setting the QoS policy on the port (internal) instead of the
>>>>> network (external), and it works! I did some more testing on
>>>>> the OVS vs OVN deployments and I can confirm the status you describe.
>>>>> What I got was:
>>>>>
>>>>> OVS:
>>>>>   FIP:
>>>>>     Setting on port: FAIL
>>>>>     Setting on network: OK
>>>>>   Private network:
>>>>>     Setting on port: OK
>>>>>     Setting on network: OK
>>>>>   Router:
>>>>>     Internal port: OK
>>>>>     External port: OK
>>>>>
>>>>> OVN:
>>>>>   FIP:
>>>>>     Setting on port: FAIL
>>>>>     Setting on network: FAIL (I was trying this)
>>>>>   Private network:
>>>>>     Setting on port: OK
>>>>>     Setting on network: OK
>>>>>   Router:
>>>>>     Internal port: FAIL
>>>>>     External port: FAIL
>>>>>
>>>>> Thanks a lot for your help!!
>>>>> Erlon
>>>>>
>>>>> On Tue, Nov 23, 2021 at 08:47, Rodolfo Alonso Hernandez <
>>>>> ralonsoh at redhat.com> wrote:
>>>>>
>>>>>> Hello Erlon:
>>>>>>
>>>>>> We really need to review the gaps document, at least for Xena.
>>>>>>
>>>>>> As Roman said, we have been testing QoS in OVN successfully.
>>>>>>
>>>>>> The current status of QoS in OVN is (at least for Xena):
>>>>>> - Fixed ports (VM ports): support for BW limit rules (egress/ingress)
>>>>>>   and DSCP (only egress). Neutron supports port network QoS inheritance
>>>>>>   (same as in your example). This is not for OVN but for any backend.
>>>>>> - FIPs: support for BW limit rules (egress/ingress). Still no network
>>>>>>   QoS inheritance (in progress).
>>>>>> - GW IP: no support yet.
>>>>>>
>>>>>> Ping me in the #openstack-neutron channel (ralonsoh) if you have more
>>>>>> questions.
>>>>>>
>>>>>> Regards.
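[Editor's note] Erlon's matrix above reports port-level attachment working for fixed ports on both backends. As a minimal sketch of that working path (the policy name is a placeholder, and the port UUID must be the VM port's, mirroring the rule commands posted later in this thread):

```shell
# Create a policy with egress/ingress bandwidth-limit rules
openstack network qos policy create bw-limiter
openstack network qos rule create --type bandwidth-limit \
    --max-kbps 512 --max-burst-kbits 512 --egress bw-limiter
openstack network qos rule create --type bandwidth-limit \
    --max-kbps 512 --max-burst-kbits 512 --ingress bw-limiter
# Attach the policy directly to the VM port instead of the network
openstack port set --qos-policy bw-limiter <port-uuid>
```

Per Rodolfo's status summary, attaching to the network also works for fixed ports via inheritance, but not yet for FIPs on OVN.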
>>>>>>
>>>>>> On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov wrote:
>>>>>>
>>>>>>> Hi Erlon,
>>>>>>>
>>>>>>> There was a bug with setting QoS on a network, but it was fixed
>>>>>>> long ago.
>>>>>>> https://bugs.launchpad.net/neutron/+bug/1851362 or
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1934096
>>>>>>> At least in our downstream CI we do not observe such issues with
>>>>>>> QoS+OVN.
>>>>>>>
>>>>>>> From the commands I see that you apply the QoS rule on the external
>>>>>>> network, right?
>>>>>>>
>>>>>>> On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> Forgot to add that detail, since I ran the same routine in a
>>>>>>>> non-OVN deployment and it worked. But this is how I did it:
>>>>>>>>
>>>>>>>> openstack network qos policy list
>>>>>>>> openstack network qos policy create bw-limiter
>>>>>>>> openstack network qos rule create --type bandwidth-limit --max-kbps
>>>>>>>> 512 --max-burst-kbits 512 --egress bw-limiter
>>>>>>>> openstack network qos rule create --type bandwidth-limit --max-kbps
>>>>>>>> 512 --max-burst-kbits 512 --ingress bw-limiter
>>>>>>>> openstack network set --qos-policy bw-limiter ext_net
>>>>>>>>
>>>>>>>> I didn't set it on the port though, which is something I should do.
>>>>>>>> I'll set it on the port too for testing, but I think the above should
>>>>>>>> work regardless.
>>>>>>>>
>>>>>>>> Erlon
>>>>>>>>
>>>>>>>> On Mon, Nov 22, 2021 at 18:45, Roman Safronov <
>>>>>>>> rsafrono at redhat.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Erlon,
>>>>>>>>>
>>>>>>>>> I have a couple of questions that will probably help to understand
>>>>>>>>> the issue better.
>>>>>>>>> Have you applied the QoS rules on a port, network or floating ip?
>>>>>>>>> Have you applied the QoS rules before starting the VM (before its
>>>>>>>>> port is active) or after?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi folks,
>>>>>>>>>>
>>>>>>>>>> I have a question related to Neutron's support for OVN+QoS. I have
>>>>>>>>>> checked the config reference for both Victoria[1] and Xena[2], and
>>>>>>>>>> they are shown as supported (bw limit, egress/ingress), but I tried
>>>>>>>>>> to set up an env with OVN+QoS and the rules are not effective (VMs
>>>>>>>>>> still download at maximum speed). I double-checked the configuration
>>>>>>>>>> in the neutron API and it shows the QoS settings[3][4][5], and the
>>>>>>>>>> versions[6][7] I'm using should support it.
>>>>>>>>>>
>>>>>>>>>> What makes me more confused is that there's a document[8][9] with
>>>>>>>>>> a gap analysis of the OVN vs OVS QoS functionality, and the document
>>>>>>>>>> *is* being updated over the releases, but it still shows that QoS is
>>>>>>>>>> not supported in OVN.
>>>>>>>>>>
>>>>>>>>>> So, is there something I'm missing?
>>>>>>>>>>
>>>>>>>>>> Erlon
>>>>>>>>>> _______________
>>>>>>>>>> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html
>>>>>>>>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html
>>>>>>>>>> [3] QoS Config: https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82
>>>>>>>>>> [4] neutron.conf: https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82
>>>>>>>>>> [5] ml2_conf.ini: https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd
>>>>>>>>>> [6] neutron-api-0 versions: https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173
>>>>>>>>>> [7] nova-compute-0 versions: https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998
>>>>>>>>>> [8] Gaps from ML2/OVS-OVN Xena: https://docs.openstack.org/neutron/xena/ovn/gaps.html
>>>>>>>>>> [9] Gaps from ML2/OVS-OVN Victoria: https://docs.openstack.org/neutron/victoria/ovn/gaps.html
>>>>
>>>> --
>>>> Regards,
>>>> Syed Ammad Ali
>>
>> --
>> Regards,
>> Syed Ammad Ali

--
Regards,
Syed Ammad Ali
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From akekane at redhat.com Wed Nov 24 05:57:55 2021
From: akekane at redhat.com (Abhishek Kekane)
Date: Wed, 24 Nov 2021 11:27:55 +0530
Subject: [Glance] No weekly meeting
Message-ID:

Hello Team,

As this is the holiday season and Thanksgiving is on our weekly meeting day, we are not meeting this week. The next meeting will be on 02 December.

Thanks & Best Regards,
Abhishek Kekane
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From manchandavishal143 at gmail.com Wed Nov 24 07:08:41 2021
From: manchandavishal143 at gmail.com (vishal manchanda)
Date: Wed, 24 Nov 2021 12:38:41 +0530
Subject: [horizon] Cancelling Today's weekly meeting
Message-ID:

Hello Team,

Since there are no agenda items [1] to discuss for today's Horizon weekly meeting, and today is a holiday for me (and maybe for others as well), let's cancel today's weekly meeting. The next weekly meeting will be on 01 December.

Thanks & Regards,
Vishal Manchanda

[1] https://etherpad.opendev.org/p/horizon-release-priorities
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wodel.youchi at gmail.com Wed Nov 24 08:25:01 2021
From: wodel.youchi at gmail.com (wodel youchi)
Date: Wed, 24 Nov 2021 09:25:01 +0100
Subject: [kolla-ansible][wallaby][magnum][Kubernetes] Cannot auto-scale workers
Message-ID:

Hi,

I have a new kolla-ansible deployment with wallaby. I have created a kubernetes cluster using calico (flannel didn't work for me). I configured an autoscale test to see if it works.
- pod autoscaling is working.
- worker node autoscaling is not working.
This is my deployment file: *cat php-apache.yaml*

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache-deployment
spec:
  selector:
    matchLabels:
      app: php-apache
  replicas: 2
  template:
    metadata:
      labels:
        app: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache-service
  labels:
    app: php-apache
spec:
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
  selector:
    app: php-apache
  type: LoadBalancer

This is my HPA file: *cat php-apache-hpa.yaml*

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
  namespace: default
  labels:
    service: php-apache-service
spec:
  minReplicas: 2
  maxReplicas: 30
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache-deployment
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 30  # in percent

This is my load program:

kubectl run -i --tty load-generator-1 --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://ip_load_balancer; done"

Here is the output of my kube cluster before the test:

[kube8 at cdndeployer ~]$ kubectl get pod
NAME                                     READY   STATUS    RESTARTS   AGE
php-apache-deployment-5b65bbc75c-95k6k   1/1     Running   0          24m
php-apache-deployment-5b65bbc75c-mv5h6   1/1     Running   0          24m

[kube8 at cdndeployer ~]$ kubectl get hpa
NAME             REFERENCE                          TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache-hpa   Deployment/php-apache-deployment   *0%/30%*   2         15        2          24m

[kube8 at cdndeployer ~]$ kubectl get svc
NAME                 TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)        AGE
kubernetes           ClusterIP      10.254.0.1                     443/TCP        13h
php-apache-service   LoadBalancer   10.254.3.54   *xx.xx.xx.213*   80:31763/TCP   25m

When I apply the load:
- pod autoscaling creates new pods, then some of them get stuck in the *Pending* state:

[kube8 at cdndeployer ~]$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
php-apache-hpa
Deployment/php-apache-deployment *155%/30%* 2 15 4 27m [kube8 at cdndeployer ~]$ kubectl get pod NAME READY STATUS RESTARTS AGE load-generator-1 1/1 Running 0 97s load-generator-2 1/1 Running 0 94s php-apache-deployment-5b65bbc75c-95k6k 1/1 Running 0 28m *php-apache-deployment-5b65bbc75c-cjkwk 0/1 Pending 0 33s* *php-apache-deployment-5b65bbc75c-cn5rt 0/1 Pending 0 33s* *php-apache-deployment-5b65bbc75c-cxctx 0/1 Pending 0 48s* php-apache-deployment-5b65bbc75c-fffnc 1/1 Running 0 64s php-apache-deployment-5b65bbc75c-hbfw8 0/1 Pending 0 33s php-apache-deployment-5b65bbc75c-l8496 1/1 Running 0 48s php-apache-deployment-5b65bbc75c-mv5h6 1/1 Running 0 28m php-apache-deployment-5b65bbc75c-qddrb 1/1 Running 0 48s php-apache-deployment-5b65bbc75c-dd5r5 0/1 Pending 0 48s php-apache-deployment-5b65bbc75c-tr65j 1/1 Running 0 64s 2 - The cluster is unable to create more pods/workers and I get this error message from the pending pods kubectl describe pod php-apache-deployment-5b65bbc75c-dd5r5 Name: php-apache-deployment-5b65bbc75c-dd5r5 Namespace: default Priority: 0 Node: Labels: app=php-apache pod-template-hash=5b65bbc75c Annotations: kubernetes.io/psp: magnum.privileged *Status: Pending* IP: IPs: Controlled By: ReplicaSet/php-apache-deployment-5b65bbc75c Containers: php-apache: Image: k8s.gcr.io/hpa-example Port: 80/TCP Host Port: 0/TCP Limits: cpu: 500m Requests: cpu: 200m Environment: Mounts: /var/run/secrets/kubernetes.io/serviceaccount from default-token-4fsgh (ro) Conditions: Type Status PodScheduled False Volumes: default-token-4fsgh: Type: Secret (a volume populated by a Secret) SecretName: default-token-4fsgh Optional: false QoS Class: Burstable Node-Selectors: Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s node.kubernetes.io/unreachable:NoExecute for 300s Events: Type Reason Age From Message ---- ------ ---- ---- ------- * Warning FailedScheduling 2m48s default-scheduler 0/4 nodes are available: 2 Insufficient cpu, 2 node(s) had taint 
{node-role.kubernetes.io/master : }, that the pod didn't tolerate.* * Warning FailedScheduling 2m48s default-scheduler 0/4 nodes are available: 2 Insufficient cpu, 2 node(s) had taint {node-role.kubernetes.io/master : }, that the pod didn't tolerate.* * Normal NotTriggerScaleUp 2m42s cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 in backoff after failed scale-u**p* I have this error message from the autoscaller pod *cluster-autoscaler*-f4bd5f674-b9692 : I1123 00:50:27.714801 1 node_instances_cache.go:168] Refresh cloud provider node instances cache finished, refresh took 12.709?s I1123 00:51:34.181145 1 scale_up.go:658] Scale-up: setting group default-worker size to 3 *W1123 00:51:34.381953 1 clusterstate.go:281] Disabling scale-up for node group default-worker until 2021-11-23 00:56:34.180840351 +0000 UTC m=+47174.376164120; errorClass=Other; errorCode=cloudProviderError* *E1123 00:51:34.382081 1 static_autoscaler.go:415] Failed to scale up: failed to increase node group size: could not check current nodegroup size: could not get cluster: Get https://dash.cdn.domaine.tld:9511/v1/clusters/b4a6b3eb-fcf3-416f-b740-11a083d4b896 : dial tcp: lookup dash.cdn.domaine. 
tld on 10.254.0.10:53 : no such host* W1123 00:51:44.392523 1 scale_up.go:383] Node group default-worker is not ready for scaleup - backoff W1123 00:51:54.410273 1 scale_up.go:383] Node group default-worker is not ready for scaleup - backoff W1123 00:52:04.422128 1 scale_up.go:383] Node group default-worker is not ready for scaleup - backoff W1123 00:52:14.434278 1 scale_up.go:383] Node group default-worker is not ready for scaleup - backoff W1123 00:52:24.442480 1 scale_up.go:383] Node group default-worker is not ready for scaleup - backoff I1123 00:52:27.715019 1 node_instances_cache.go:156] Start refreshing cloud provider node instances cache I did some tests on the DNS pod and : kubectl get svc -A NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default kubernetes ClusterIP 10.254.0.1 443/TCP 13h default php-apache-service LoadBalancer 10.254.3.54 xx.xx.xx.213 80:31763/TCP 19m kube-system dashboard-metrics-scraper ClusterIP 10.254.19.191 8000/TCP 13h *kube-system kube-dns ClusterIP 10.254.0.10 53/UDP,53/TCP,9153/TCP 13h* kube-system kubernetes-dashboard ClusterIP 10.254.132.17 443/TCP 13h kube-system magnum-metrics-server ClusterIP 10.254.235.147 443/TCP 13h I have noticed this behaviour about the horizon url, sometimes the dns pod responds sometimes it does not !!!!! [root at k8multiclustercalico-ve5t6uuoo245-master-0 ~]# *dig @10.254.0.10 dash.cdn. domaine.tld* ; <<>> DiG 9.11.28-RedHat-9.11.28-1.fc33 <<>> @10.254.0.10 dash.cdn. domaine.tld ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 5646 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;dash.cd n.domaine.tld. IN A ;; AUTHORITY SECTION: *cdn. domaine.tld. 30 IN SOA cdn. domaine.tld. root.cdn. domaine.tld. 
2021100900 604800 86400 2419200 604800* ;; Query time: 84 msec ;; SERVER: 10.254.0.10#53(10.254.0.10) ;; WHEN: Tue Nov 23 01:08:03 UTC 2021 ;; MSG SIZE rcvd: 12 2 secondes later [root at k8multiclustercalico-ve5t6uuoo245-master-0 ~]# *dig @10.254.0.10 dash.cdn. domaine.tld* ; <<>> DiG 9.11.28-RedHat-9.11.28-1.fc33 <<>> @10.254.0.10 dash.cdn. domaine.tld ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7653 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ;; QUESTION SECTION: ;dash.cdn. domaine.tld. IN A ;; ANSWER SECTION: *dash.cdn. domaine.tld. 30 IN A xx.xx.xx.129* ;; Query time: 2 msec ;; SERVER: 10.254.0.10#53(10.254.0.10) ;; WHEN: Tue Nov 23 01:08:21 UTC 2021 ;; MSG SIZE rcvd: 81 In the log of the dns pod I have this kubectl logs *kube-dns-autoscaler*-75859754fd-q8z4w -n kube-system *E1122 20:56:09.944449 1 autoscaler_server.go:120] Update failure: the server could not find the requested resource E1122 20:56:19.945294 1 autoscaler_server.go:120] Update failure: the server could not find the requested resource E1122 20:56:29.944245 1 autoscaler_server.go:120] Update failure: the server could not find the requested resource E1122 20:56:39.946346 1 autoscaler_server.go:120] Update failure: the server could not find the requested resource* *E1122 20:56:49.944693 1 autoscaler_server.go:120] Update failure: the server could not find the requested resource* I don't have experience on kubernetes yet, could someone help me debug this? Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Wed Nov 24 09:07:07 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Wed, 24 Nov 2021 10:07:07 +0100 Subject: [neutron] default QOS on L3 gateway In-Reply-To: References: Message-ID: Hello Damien: There is no default value for QoS. 
If the router GW port is in a network with a QoS policy, once this feature [1] is implemented, the GW port will inherit the QoS from this network. This is the same behaviour as with fixed ports: when a port has no QoS policy but the network has one, the network QoS will apply to this port. This feature [1] was requested by several customers expecting to have a uniform API between L2 and L3 QoS.

Regards.

[1] https://bugs.launchpad.net/neutron/+bug/1950454

On Tue, Nov 23, 2021 at 8:27 PM Damien Rannou wrote:
> Yes, exactly.
> Just one point that I'm not sure about: what happens if the client asks for a
> router update with the 'no-qos' option?
> Will it remove the QoS completely or just re-apply the default value?
>
> Damien
>
> On Nov 22, 2021 at 18:45, Lajos Katona wrote:
>
> Hi,
> There is an RFE to inherit network QoS:
> https://bugs.launchpad.net/neutron/+bug/1950454
> The patch series:
> https://review.opendev.org/q/topic:%22bug%252F1950454%22+(status:open%20OR%20status:merged)
>
> Hope this covers your usecase.
>
> Lajos Katona (lajoskatona)
>
> Damien Rannou wrote (on Mon, Nov 22, 2021, 17:23):
>
>> Hello
>> We are currently playing with QoS on the L3 agent, mostly for SNAT, but it
>> can also apply to FIPs.
>> Everything is working properly, but I'm wondering if there is a way to
>> define a "default" QoS that would be applied on router creation,
>> but also when the user sets "no_qos" on his router.
>>
>> In a public cloud environment, we cannot leave the customers without any
>> QoS limitation.
>>
>> Thanks!
>> Damien

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From elod.illes at est.tech Wed Nov 24 09:30:08 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Wed, 24 Nov 2021 10:30:08 +0100 Subject: [kolla][release][Release-job-failures] Release of openstack/kolla for ref refs/tags/10.4.0 failed In-Reply-To: References: Message-ID: <47b7b192-d1eb-9505-bd86-b3664c1d6a17@est.tech> Hi Kolla team, The last release of kolla (10.4.0 [1]) had release job failures (see below the attached, forwarded mail). The two job that failed are: - kolla-publish-centos8-source-dockerhub https://zuul.opendev.org/t/openstack/build/feb51c87f74c4dadb4984733d8b3793c : FAILURE in 2h 12m 37s - kolla-publish-centos8-binary-dockerhub https://zuul.opendev.org/t/openstack/build/ebce7219ff4b44368869bc1f3172d688 : FAILURE in 1h 46m 56s (non-voting) Could you please have a look at these? Thanks in advance, El?d (elodilles @ #openstack-release) [1] https://review.opendev.org/c/openstack/releases/+/818882 -------- Forwarded Message -------- Subject: [Release-job-failures] Release of openstack/kolla for ref refs/tags/10.4.0 failed Date: Tue, 23 Nov 2021 17:22:02 +0000 From: zuul at openstack.org Reply-To: openstack-discuss at lists.openstack.org To: release-job-failures at lists.openstack.org Build failed. 
- openstack-upload-github-mirror https://zuul.opendev.org/t/openstack/build/4834324fec464b309952930a2c221336 : SUCCESS in 1m 45s - release-openstack-python https://zuul.opendev.org/t/openstack/build/e3f27c5f3f9b4b289dc73570243507ff : SUCCESS in 3m 06s - announce-release https://zuul.opendev.org/t/openstack/build/1d759f048a1145af9b253b6e5d69b4f4 : SUCCESS in 4m 23s - propose-update-constraints https://zuul.opendev.org/t/openstack/build/3a600e6a4c494a829bfcd45c91931067 : SUCCESS in 3m 42s - kolla-publish-centos8-source-dockerhub https://zuul.opendev.org/t/openstack/build/feb51c87f74c4dadb4984733d8b3793c : FAILURE in 2h 12m 37s - kolla-publish-centos8-binary-dockerhub https://zuul.opendev.org/t/openstack/build/ebce7219ff4b44368869bc1f3172d688 : FAILURE in 1h 46m 56s (non-voting) - kolla-publish-debian-source-dockerhub https://zuul.opendev.org/t/openstack/build/e076910efb394683b98106d5648c2ab5 : SUCCESS in 1h 54m 47s (non-voting) - kolla-publish-debian-source-aarch64-dockerhub https://zuul.opendev.org/t/openstack/build/e201c6f797ea4248b933df8c53d65fa8 : SUCCESS in 2h 03m 51s (non-voting) - kolla-publish-debian-binary-dockerhub https://zuul.opendev.org/t/openstack/build/d0cf2b3aa48e49d8a5aa46719b84c926 : SUCCESS in 1h 18m 03s (non-voting) - kolla-publish-ubuntu-source-dockerhub https://zuul.opendev.org/t/openstack/build/17644441b45f4964b52a1e26e8ef3f19 : SUCCESS in 2h 01m 47s - kolla-publish-ubuntu-binary-dockerhub https://zuul.opendev.org/t/openstack/build/63d1810841da4d9c96d4755177761f4d : SUCCESS in 1h 34m 09s (non-voting) _______________________________________________ Release-job-failures mailing list Release-job-failures at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/release-job-failures -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From piotrmisiak1984 at gmail.com Wed Nov 24 10:03:52 2021 From: piotrmisiak1984 at gmail.com (Piotr Misiak) Date: Wed, 24 Nov 2021 11:03:52 +0100 Subject: [keystone][policy][ussuri] why I can create a domain Message-ID: <6c8d401e-1945-52df-ff69-8e92134a77b7@gmail.com> Hi, Maybe a stupid question but I'm really confused. In my Ussuri cloud Keystone has a following policy for create_domain action (this is a default policy from Keystone code): "identity:create_domain": "role:admin and system_scope:all" I have a user which has "admin" role assigned in project "admin" in domain "default" - AKA cloud admin. The user does not have any roles assigned on system scope. Could someone please explain why this user is able to create a domain in the cloud? Looking at the policy rule he shouldn't or maybe I'm reading it in a wrong way? Is there any "backward compatibility" casting "cloud admin" role to "system_scope:all"? Please help Thanks Piotr From bdobreli at redhat.com Wed Nov 24 10:05:24 2021 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 24 Nov 2021 11:05:24 +0100 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: <08c67a25-72b4-870d-bfb7-1aad318a1551@redhat.com> On 11/24/21 1:21 AM, Tony Liu wrote: > I hit the same problem, from time to time, not consistently. I am using OVN. > Typically, it takes no more than a few seconds for neutron to confirm the port is up. > The default timeout in my setup is 600s. Even the ports shows up in both OVN SB > and NB, nova-compute still didn't get confirmation from neutron. Either neutron > didn't pick it up or the message was lost and didn't get to nova-compute. > Hoping someone could share more thoughts. That also may be a super-set of the revert-resize with OVS hybrid plug issue described in [0]. Even though the problems described in the topic may have nothing to that particular case, but does look related to the external events framework. 
Issues like that make me think about some improvements to it.

[tl;dr] bring back up the idea of buffering events with a TTL

Like a new deferred RPC calls feature, maybe? That would execute a call after some trigger, like "send unplug and forget". That would make debugging harder, but it would cover the cases when an external service "forgot" (an event was lost, and similar cases) to notify Nova when it is done.

Adding a queue to store events that Nova did not have a receive handler set for might help as well. And have a TTL set on it, or more advanced reaping logic, for example based on tombstone events invalidating the queue contents by causal conditions. That would eliminate flaky expectations set around starting to wait for receiving events vs sending unexpected or belated events. Why flaky? Because in an async distributed system there is no "before" nor "after", so a service external to Nova will unlikely conform to any time-frame based contract for send-notify/wait-receive/real-completion-fact. And the fact that Nova can't tell what the network backend is (because [1] was not fully implemented) does not make things simpler.

As Sean noted in a private IRC conversation, with OVN the current implementation is not capable of fulfilling the contract that network-vif-plugged events are only sent after the interface is fully configured. So it sends events at bind time, once it has updated the logical port in the OVN DB but before the real configuration has happened. I believe that deferred RPC calls and/or queued events might improve such "cheating" by making the real post-completion processing a thing for any backend?

[0] https://bugs.launchpad.net/nova/+bug/1952003
[1] https://specs.openstack.org/openstack/neutron-specs/specs/train/port-binding-extended-information.html

> > Thanks!
> Tony > ________________________________________ > From: Laurent Dumont > Sent: November 22, 2021 02:05 PM > To: Michal Arbet > Cc: openstack-discuss > Subject: Re: [neutron][nova] [kolla] vif plugged timeout > > How high did you have to raise it? If it does appear after X amount of time, then the VIF plug is not lost? > > On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet > wrote: > + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to some high number ..problem dissapear ... But it's only workaround > > D?a so 20. 11. 2021, 12:05 Michal Arbet > nap?sal(a): > Hi, > > Has anyone seen issue which I am currently facing ? > > When launching heat stack ( but it's same if I launch several of instances ) vif plugged in timeouts an I don't know why, sometimes it is OK ..sometimes is failing. > > Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes it's 100 and more seconds, it seems there is some race condition but I can't find out where the problem is. But on the end every instance is spawned ok (retry mechanism worked). > > Another finding is that it has to do something with security group, if noop driver is used ..everything is working good. > > Firewall security setup is openvswitch . > > Test env is wallaby. > > I will attach some logs when I will be near PC .. 
> > Thank you,
> Michal Arbet (Kevko)

--
Best regards,
Bogdan Dobrelya,
Irc #bogdando

From bdobreli at redhat.com Wed Nov 24 10:31:25 2021
From: bdobreli at redhat.com (Bogdan Dobrelya)
Date: Wed, 24 Nov 2021 11:31:25 +0100
Subject: [ops]RabbitMQ High Availability
In-Reply-To: <0670B960225633449A24709C291A525251D511CD@COM03.performair.local> References: <0670B960225633449A24709C291A525251D511CD@COM03.performair.local> Message-ID: <4e37d1a7-ca17-b50f-ba6c-96229d85a75e@redhat.com>

On 11/24/21 12:34 AM, DHilsbos at performair.com wrote:
> All;
>
> In the time I've been part of this mailing list, the subject of RabbitMQ high availability has come up several times, and each time specific recommendations for both Rabbit and OpenStack are provided. I remember it being an A or B kind of recommendation (i.e. configure Rabbit like A1, and OpenStack like A2, OR configure Rabbit like B1, and OpenStack like B2).

There are no special recommendations for the rabbitmq setup for openstack, but perhaps a few, like listing the rabbit cluster nodes directly in the oslo messaging config settings instead of putting them behind haproxy or the like. Also, it seems that durable queues make very little sense for highly ephemeral RPC calls, just by design. I would also add that the raft quorum queues feature of rabbitmq >= 3.8 does not fit well into the oslo messaging design for RPC calls either. A debatable and highly opinionated thing is also configuring HA/mirrored queue policy params for queues used for RPC calls vs broadcast notifications.

And my biased, personal, humble recommendation is: use the upstream OCF RA [0][1] if configuring the rabbitmq cluster with pacemaker.

[0] https://www.rabbitmq.com/pacemaker.html#auto-pacemaker
[1] https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-server-ha

> > Unfortunately, I can't find the previous threads on this topic.
>
> Does anyone have this information, that they would care to share with me?
> > Thank you, > > Dominic L. Hilsbos, MBA > Vice President - Information Technology > Perform Air International Inc. > DHilsbos at PerformAir.com > www.PerformAir.com > > > -- Best regards, Bogdan Dobrelya, Irc #bogdando From mnasiadka at gmail.com Wed Nov 24 10:34:47 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Wed, 24 Nov 2021 11:34:47 +0100 Subject: [kolla] Transition kolla/kolla-ansible Rocky/Stein to EOL Message-ID: <13B802F8-6AEB-430E-AA5B-CE60726E88FD@gmail.com> Hello, We've kept stable/rocky and stable/stein branches of kolla and kolla-ansible around under extended maintenance for a while now. The coordinated Rocky release was 30th Aug, 2018 (Kolla on 22nd Oct, 2018) and Stein was 10th Apr, 2019 (Kolla on 17th Jul, 2019). The last time a change was backported to these branches has been a while: - kolla stable/rocky: Sep 4, 2020 stable/stein: Mar 19, 2021 - kolla-ansible stable/rocky: Sep 13, 2020 stable/stein: May 28,2021 There is no CI running for those projects periodically, last CI jobs have been run: - kolla: stable/rocky: Jul 8th 2019 stable/stein: Dec 10th 2020 - kolla-ansible stable/rocky: Sep 14th 2019 stable/stein: Dec 10th 2020 All changes to those branches have been abandoned long time (and nothing new proposed for at least a year): https://review.opendev.org/q/branch:stable/rocky+project:openstack/kolla https://review.opendev.org/q/branch:stable/stein+project:openstack/kolla https://review.opendev.org/q/branch:stable/rocky+project:openstack/kolla-ansible https://review.opendev.org/q/branch:stable/stein+project:openstack/kolla-ansible In addition to that - kolla-ansible CI jobs have incorrect Zuul configurations (see http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025797.html ) - and on one of the Kolla weekly meetings it was decided to mark those branches as EOL. So, I'd like to request that these branches be moved to end-of-life. 
Patch proposed to add EOL tags to both kolla and kolla-ansible: https://review.opendev.org/c/openstack/releases/+/818920 Best regards, Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: From jan.vondra at ultimum.io Wed Nov 24 13:30:00 2021 From: jan.vondra at ultimum.io (Jan Vondra) Date: Wed, 24 Nov 2021 14:30:00 +0100 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: <08c67a25-72b4-870d-bfb7-1aad318a1551@redhat.com> References: <08c67a25-72b4-870d-bfb7-1aad318a1551@redhat.com> Message-ID: Hi guys, I've been further investigating Michal's (OP) issue, since he is on his holiday, and I've found out that the issue is not really plugging the VIF but cleanup after previous port bindings. We are creating 6 servers with 2-4 vifs using heat template [0]. We were hitting some problems with placement so the stack sometimes failed to create and we had to delete the stack and recreate it. If we recreate it right after the deletion, the vif plugging timeout occurs. If we wait some time (approx. 10 minutes) the stack is created successfully. This brings me to believe that there is some issue with deferring the removal of security groups from unbound ports (somewhere around this part of code [1]) and it somehow affects the creation of new ports. However, I am unable to find any lock that could cause this behaviour. The only proof I have is that after the stack recreation scenario I have measured that the process_network_ports [2] function call could take up to 650 s (varies from 5 s to 651 s in our environment). Any idea what could be causing this? [0] https://pastebin.com/infvj4ai [1] https://github.com/openstack/neutron/blob/master/neutron/agent/firewall.py#L133 [2] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L2079 *Jan Vondra* *http://ultimum.io * st 24. 11. 
2021 at 11:08, Bogdan Dobrelya wrote: > On 11/24/21 1:21 AM, Tony Liu wrote: > > I hit the same problem, from time to time, not consistently. I am using > OVN. > > Typically, it takes no more than a few seconds for neutron to confirm > the port is up. > > The default timeout in my setup is 600s. Even when the port shows up in both > OVN SB > > and NB, nova-compute still didn't get confirmation from neutron. Either > neutron > > didn't pick it up or the message was lost and didn't get to nova-compute. > > Hoping someone could share more thoughts. > > That also may be a super-set of the revert-resize with OVS hybrid plug > issue described in [0]. Even though the problems described in this topic > may have nothing to do with that particular case, they do look related to the > external events framework. > > Issues like that make me think about some improvements to it. > > [tl;dr] bring back up the idea of buffering events with a TTL > > Like a new deferred RPC calls feature maybe? That would execute a call > after some trigger, like send unplug and forget. That would make > debugging harder, but cover the cases when an external service "forgot" > (an event was lost, and the like) to notify Nova when it is done. > > Adding a queue to store events that Nova did not have a receive handler > set for might help as well. And have a TTL set on it, or more advanced > reaping logic, for example based on tombstone events invalidating the > queue contents by causal conditions. That would eliminate flaky > expectations set around starting to wait for receiving events vs. sending > unexpected or belated events. Why flaky? Because in an async distributed > system there is no "before" nor "after", so a service external to Nova > is unlikely to conform to any time-frame-based contract for > send-notify/wait-receive/real-completion-fact. And the fact that Nova > can't tell what the network backend is (because [1] was not fully > implemented) does not make things simpler.
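The TTL-buffered event queue sketched in the paragraph above could look roughly like the following. This is a hypothetical illustration of the idea only, not actual Nova code; the class and method names are invented:

```python
import time


class BufferedEventQueue:
    """Sketch of the buffering idea: keep events that arrived before
    any waiter registered for them, and reap them after a TTL."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._events = {}  # (event_name, tag) -> arrival timestamp

    def receive(self, name, tag):
        # Buffer an unexpected event instead of discarding it.
        self._reap()
        self._events[(name, tag)] = self._clock()

    def pop(self, name, tag):
        # A late waiter can still consume a buffered event while it is
        # inside the TTL window; returns True if one was found.
        self._reap()
        return self._events.pop((name, tag), None) is not None

    def _reap(self):
        now = self._clock()
        for key in [k for k, t in self._events.items() if now - t > self._ttl]:
            del self._events[key]
```

With such a buffer, a network-vif-plugged event that arrives before Nova starts waiting would survive until the waiter shows up instead of being dropped, at the cost of the debuggability concerns raised in the thread.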
> > As Sean noted in a private irc conversation, with OVN the current > implementation is not capable of fullfilling the contract that > network-vif-plugged events are only sent after the interface is fully > configred. So it send events at bind time once it have updated the > logical port in the ovn db but before real configuration has happened. I > believe that deferred RPC calls and/or queued events might improve such > a "cheating" by making the real post-completion processing a thing for > any backend? > > [0] https://bugs.launchpad.net/nova/+bug/1952003 > > [1] > > https://specs.openstack.org/openstack/neutron-specs/specs/train/port-binding-extended-information.html > > > > > Thanks! > > Tony > > ________________________________________ > > From: Laurent Dumont > > Sent: November 22, 2021 02:05 PM > > To: Michal Arbet > > Cc: openstack-discuss > > Subject: Re: [neutron][nova] [kolla] vif plugged timeout > > > > How high did you have to raise it? If it does appear after X amount of > time, then the VIF plug is not lost? > > > > On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet > wrote: > > + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to > some high number ..problem dissapear ... But it's only workaround > > > > D?a so 20. 11. 2021, 12:05 Michal Arbet michal.arbet at ultimum.io>> nap?sal(a): > > Hi, > > > > Has anyone seen issue which I am currently facing ? > > > > When launching heat stack ( but it's same if I launch several of > instances ) vif plugged in timeouts an I don't know why, sometimes it is OK > ..sometimes is failing. > > > > Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes > it's 100 and more seconds, it seems there is some race condition but I > can't find out where the problem is. But on the end every instance is > spawned ok (retry mechanism worked). > > > > Another finding is that it has to do something with security group, if > noop driver is used ..everything is working good. 
> > > > Firewall security setup is openvswitch . > > > > Test env is wallaby. > > > > I will attach some logs when I will be near PC .. > > > > Thank you, > > Michal Arbet (Kevko) > > > > > > > > > > > > > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Nov 24 13:56:41 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 24 Nov 2021 13:56:41 +0000 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: On Wed, 2021-11-24 at 00:21 +0000, Tony Liu wrote: > I hit the same problem, from time to time, not consistently. I am using OVN. > Typically, it takes no more than a few seconds for neutron to confirm the port is up. > The default timeout in my setup is 600s. Even the ports shows up in both OVN SB > and NB, nova-compute still didn't get confirmation from neutron. Either neutron > didn't pick it up or the message was lost and didn't get to nova-compute. > Hoping someone could share more thoughts. There are some known bugs in this area. Basically every neutron backend behaves slightly differently with regard to how and when it sends the network-vif-plugged event, and this depends on many factors and changes from release to release. For example, I'm pretty sure that in the past ml2/ovs used to send network-vif-plugged events for ports that are administratively disabled; since nova/os-vif still plugs those into the ovs bridge we would expect them to be sent, however that apparently has changed at some point, leading to https://bugs.launchpad.net/nova/+bug/1951623 ml2/ovn never sends network-vif-plugged events when the port is actually plugged; it cheats and sends them when the port is bound, but the exact rules for that have also changed over the last few releases. nova has no way to discover this behavior from neutron, and we have to do our best to guess based on some attributes of the port.
For example, as noted below, the firewall driver used with ml2/ovs makes a difference: if you use iptables_hybrid we use the hybrid_plug mechanism, which means the vm tap device is added to a linux bridge which is then connected to ovs with a veth pair. For move operations like live migrate, the linux bridge and veth pair are created on the destination in pre-live-migrate and nova waits for the event. Since we can't detect which security group driver is used from the port, we have to guess based on whether hybrid_plug=true in the port binding profile. For iptables hybrid_plug is True; for the noop and openvswitch security group drivers hybrid_plug is set to false. https://review.opendev.org/c/openstack/nova/+/767368 attempted to account for the fact that network-vif-plugged would not be sent in the latter case in pre-live-migrate, since at that time the vm interface was only plugged into ovs by libvirt during the migration. https://review.opendev.org/c/openstack/nova/+/767368/1/nova/network/model.py#547 def get_live_migration_plug_time_events(self): """Returns a list of external events for any VIFs that have "plug-time" events during live migration. """ return [('network-vif-plugged', vif['id']) for vif in self if vif.has_live_migration_plug_time_event] https://review.opendev.org/c/openstack/nova/+/767368/1/nova/network/model.py#472 def has_live_migration_plug_time_event(self): """Returns whether this VIF's network-vif-plugged external event will be sent by Neutron at "plugtime" - in other words, as soon as neutron completes configuring the network backend. """ return self.is_hybrid_plug_enabled() What that code does is skip waiting for the network-vif-plugged event during live migration for all interfaces where hybrid_plug is false, which includes ml2/ovs with the noop or openvswitch security group driver, and ml2/ovn as it never sends them at the correct time.
It turns out that to fix https://bugs.launchpad.net/nova/+bug/1951623 we also should be skipping waiting if the admin state on the port is disabled, by adding and vif['active'] == 'active' to the list comprehension. The code should also have additional knowledge of the network backend to make the right decisions; however, the bound_drivers field introduced by https://specs.openstack.org/openstack/neutron-specs/specs/train/port-binding-extended-information.html was never actually implemented in neutron, so neutron does not currently tell nova whether it is ml2/ovs or ml2/ovn or ml2/odl - all of the above have vif_type OVS - so we can't re-enable waiting for the network-vif-plugged event when hybrid_plug is false and ml2/ovs is used, since while that would be correct for ml2/ovs it would break ml2/ovn, and we are forced to support the least capable network backend in any situation. Until this is fixed in nova and neutron, it's unlikely you will be able to address this in kolla in a meaningful way. Every time we skip waiting for a network-vif-plugged event in nova when there ideally would be one as part of a move operation, we introduce a race between the vm starting on the destination host and the network backend completing its configuration, so simply setting [DEFAULT]/vif_plugging_is_fatal=False or [compute]/live_migration_wait_for_vif_plug=false risks the vm not having networking when configured. https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.vif_plugging_is_fatal https://docs.openstack.org/nova/latest/configuration/config.html#compute.live_migration_wait_for_vif_plug They do provide ways for operators to work around some bugs, as will the recently added [workarounds]/wait_for_vif_plugged_event_during_hard_reboot option https://docs.openstack.org/nova/latest/configuration/config.html#workarounds.wait_for_vif_plugged_event_during_hard_reboot However, this should not be complexity that the operator has to understand and configure via kolla.
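For reference, the operator knobs Sean mentions live in nova.conf and would be set roughly like this. The values shown are illustrative only, and (as he warns) relaxing them trades the timeout for a possible race with an unconfigured network backend:

```ini
[DEFAULT]
# If false, a missing network-vif-plugged event logs a warning instead of
# failing the operation -- at the risk of a VM coming up with no networking.
vif_plugging_is_fatal = false
vif_plugging_timeout = 300

[compute]
# Skip waiting for vif plug events during live migration (same trade-off).
live_migration_wait_for_vif_plug = false

[workarounds]
# A newer option also covers hard reboot; see the third config-reference
# link above for its exact semantics and value format:
# wait_for_vif_plugged_event_during_hard_reboot = ...
```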
We should fix the contract between nova and neutron, including requiring out-of-tree network vendors like Cisco ACI or other core plugins to actually conform to the interface, but after 5 years of trying to get this fixed it still isn't, and we just have to play the whack-a-mole game every time someone reports another edge case. In this specific case I don't know why you are not getting the event, but for ml2/ovs both the l2 agent and the dhcp agent need to notify the neutron server that provisioning is complete, and apparently the port also now needs to be admin state active/up before the network-vif-plugged event is sent. In the case where it fails I would check the dhcp agent log, the l2 agent log and the neutron server log, and try to see whether one or both of the l2/dhcp agents failed to provision the port. I would guess it's the dhcp agent, given it works on the retry to the next host. regards sean > Thanks! > Tony > ________________________________________ > From: Laurent Dumont > Sent: November 22, 2021 02:05 PM > To: Michal Arbet > Cc: openstack-discuss > Subject: Re: [neutron][nova] [kolla] vif plugged timeout > > How high did you have to raise it? If it does appear after X amount of time, then the VIF plug is not lost? > > On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet > wrote: > + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to some high number ..problem dissapear ... But it's only workaround > > On Sat 20. 11. 2021 at 12:05, Michal Arbet > wrote: > Hi, > > Has anyone seen issue which I am currently facing ? > > When launching heat stack ( but it's same if I launch several of instances ) vif plugged in timeouts an I don't know why, sometimes it is OK ..sometimes is failing. > > Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes it's 100 and more seconds, it seems there is some race condition but I can't find out where the problem is. But on the end every instance is spawned ok (retry mechanism worked).
> > Another finding is that it has to do something with security group, if noop driver is used ..everything is working good. > > Firewall security setup is openvswitch . > > Test env is wallaby. > > I will attach some logs when I will be near PC .. > > Thank you, > Michal Arbet (Kevko) > > > > > > From smooney at redhat.com Wed Nov 24 14:53:21 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 24 Nov 2021 14:53:21 +0000 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: <08c67a25-72b4-870d-bfb7-1aad318a1551@redhat.com> References: <08c67a25-72b4-870d-bfb7-1aad318a1551@redhat.com> Message-ID: <9eb79ad2239e05f926880da40502fbe68f8e83d9.camel@redhat.com> On Wed, 2021-11-24 at 11:05 +0100, Bogdan Dobrelya wrote: > On 11/24/21 1:21 AM, Tony Liu wrote: > > I hit the same problem, from time to time, not consistently. I am using OVN. > > Typically, it takes no more than a few seconds for neutron to confirm the port is up. > > The default timeout in my setup is 600s. Even the ports shows up in both OVN SB > > and NB, nova-compute still didn't get confirmation from neutron. Either neutron > > didn't pick it up or the message was lost and didn't get to nova-compute. > > Hoping someone could share more thoughts. > > That also may be a super-set of the revert-resize with OVS hybrid plug > issue described in [0]. Even though the problems described in the topic > may have nothing to that particular case, but does look related to the > external events framework. > > Issues like that make me thinking about some improvements to it. > > [tl;dr] bring back up the idea of buffering events with a ttl > > Like a new deferred RPC calls feature maybe? That would execute a call > after some trigger, like send unplug and forget. That would make > debugging harder, but cover the cases when an external service "forgot" > (an event was lost and the like cases) to notify Nova when it is done. 
> > Adding a queue to store events that Nova did not have a recieve handler > set for might help as well. And have a TTL set on it, or a more advanced > reaping logic, for example based on tombstone events invalidating the > queue contents by causal conditions. That would eliminate flaky > expectations set around starting to wait for receiving events vs sending > unexpected or belated events. Why flaky? Because in an async distributed > system there is no "before" nor "after", so an external to Nova service > will unlikely conform to any time-frame based contract for > send-notify/wait-receive/real-completion-fact. And the fact that Nova > can't tell what the network backend is (because [1] was not fully > implemented) does not make things simpler. I honestly don't think this is a viable option. We have discussed it several times in nova in the past and keep coming to the same conclusion: either the events should be sent and waited for at the right times, or they lose their value. Buffering the events masks bad behavior in non-compliant network backends; it potentially exposes tenants and operators to security issues by breaking multi-tenancy https://bugs.launchpad.net/neutron/+bug/1734320 or network connectivity https://bugs.launchpad.net/nova/+bug/1815989. Neutron sometimes sends the events earlier than we expect, and sometimes it sends multiple network-vif-plugged events for effectively the same operation. We recently "fixed" the fact that the dhcp agent would send a network-vif-plugged event during live migration because the port was already configured and fully plugged on the source node while we were waiting for the event from the destination node's l2 agent. https://review.opendev.org/c/openstack/neutron/+/766277 However, that fix is config driven and nova cannot detect how it is set... I disagree that in a distributed system like nova there is no before or after.
We had a contract with neutron that several neutron ml2 plugins or out-of-tree core plugins did not comply with. When we add a vm interface to a network backend, we require neutron to notify us in a timely manner that the backend has processed the port and it is now safe to proceed. Several backends chose to violate that contract, including ovn, and as a result we have to try to make these broken backends work in nova when in fact we should not support them at all. The odl community went to great effort to implement a websocket callback mechanism so that odl could notify neutron when it had configured the port on the ovs bridge, and networking-odl then incorporated that into their ml2 driver https://opendev.org/openstack/networking-odl/src/branch/master/networking_odl/ml2/port_status_update.py#L92-L95 All of the in-tree plugins before ovn was merged in tree also implemented this protocol correctly, sending the event when the port provisioning on the network backend was complete. ovn, however, still sets the l2 provisioning as complete when the port status is set to up https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py#L1031-L1066 which gets called when the logical switch port is set to up https://github.com/openstack/neutron/blob/4e339776d90cf211396da5f95e29af65332dac61/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L421-L438 but that does not fully address the issue, since move operations like live migration are not properly supported.
https://review.opendev.org/c/openstack/neutron-specs/+/799198/6/specs/xena/ovn-auxiliary-port-bridge-live-migration.rst#120 should help, although I'm slightly dismayed to see that they will be using a new `port.vif_details` backend field to identify it as ovn instead of the previously agreed-on bound_drivers field https://specs.openstack.org/openstack/neutron-specs/specs/train/port-binding-extended-information.html If every ml2 driver and core plugin set this new backend field, it would be more or less the same as the bound_drivers field; however, I fear this will just get implemented for ovn since it's part of the ovn-specific spec, which will just create more tech debt, so I'm reluctant to suggest nova will use this info until it's done properly for all backends. > > As Sean noted in a private irc conversation, with OVN the current > implementation is not capable of fullfilling the contract that > network-vif-plugged events are only sent after the interface is fully > configred. So it send events at bind time once it have updated the > logical port in the ovn db but before real configuration has happened. I > believe that deferred RPC calls and/or queued events might improve such > a "cheating" by making the real post-completion processing a thing for > any backend? > > [0] https://bugs.launchpad.net/nova/+bug/1952003 > > [1] > https://specs.openstack.org/openstack/neutron-specs/specs/train/port-binding-extended-information.html > > > > > Thanks! > > Tony > > ________________________________________ > > From: Laurent Dumont > > Sent: November 22, 2021 02:05 PM > > To: Michal Arbet > > Cc: openstack-discuss > > Subject: Re: [neutron][nova] [kolla] vif plugged timeout > > > > How high did you have to raise it? If it does appear after X amount of time, then the VIF plug is not lost? > > > > On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet > wrote: > > + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to some high number ..problem dissapear ...
But it's only workaround > > On Sat 20. 11. 2021 at 12:05, Michal Arbet > wrote: > > Hi, > > > > Has anyone seen issue which I am currently facing ? > > > > When launching heat stack ( but it's same if I launch several of instances ) vif plugged in timeouts an I don't know why, sometimes it is OK ..sometimes is failing. > > > > Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes it's 100 and more seconds, it seems there is some race condition but I can't find out where the problem is. But on the end every instance is spawned ok (retry mechanism worked). > > > > Another finding is that it has to do something with security group, if noop driver is used ..everything is working good. > > > > Firewall security setup is openvswitch . > > > > Test env is wallaby. > > > > I will attach some logs when I will be near PC .. > > > > Thank you, > > Michal Arbet (Kevko) > > > > > > From senrique at redhat.com Wed Nov 24 14:54:32 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 24 Nov 2021 11:54:32 -0300 Subject: [cinder] Bug deputy report for week of 11-24-2021 In-Reply-To: References: Message-ID: This is a bug report from 11-17-2021 to 11-24-2021. Agenda: https://etherpad.opendev.org/p/cinder-bug-squad-meeting ----------------------------------------------------------------------------------------- Medium The next 3 bugs are related to multipath config everywhere : - https://bugs.launchpad.net/cinder/+bug/1951982 "On encrypted volume cloning rekeying doesn't use multipathing". Assigned to Gorka Eguileor. - https://bugs.launchpad.net/cinder/+bug/1951981 "Kaminario not using multipath for creating volume". Assigned to Gorka Eguileor. - https://bugs.launchpad.net/cinder/+bug/1951977 "Cinder backup - The multipath flag is not set on a create_backup operation". Assigned to Gorka Eguileor. Low - https://bugs.launchpad.net/cinder/+bug/1951250 "[Storwize] Fix multiple lsiogrp,lsvdisk calls in Retype". Unassigned.
Cheers, -- Sofía Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Wed Nov 24 14:55:36 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Wed, 24 Nov 2021 15:55:36 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: Message-ID: Hello Tony: Do you have the Neutron server logs? Do you see the "PortBindingChassisUpdateEvent"? When a port is bound, a Port_Bound (SB) event is issued and captured in the Neutron server. That will trigger the port binding process and the "vif-plugged" event. This OVN SB event should call "set_port_status_up" and that should write "OVN reports status up for port: %s" in the logs. Of course, this Neutron method will be called only if the logical switch port is UP. Regards. On Tue, Nov 23, 2021 at 11:59 PM Tony Liu wrote: > Hi, > > I see such problem from time to time. It's not consistently reproducible. > ====================== > 2021-11-23 22:16:28.532 7 INFO nova.compute.manager > [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: > 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state the instance > has a pending task (spawning). Skip. > 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver > [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 3a4c320d64664d9cb6784b7ea52d618a > 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: > 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for > [('network-vif-plugged', '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for > instance with vm_state building and task_state spawning.: > eventlet.timeout.Timeout: 300 seconds > ====================== > The VIF/port is activated by OVN ovn-controller to ovn-sb.
It seems that, > either Neutron didn't > capture the update or didn't send message back to nova-compute. Is there > any known fix for > this problem? > > > Thanks! > Tony > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Nov 24 16:26:41 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 24 Nov 2021 10:26:41 -0600 Subject: [all][sphinx] "rST localisation for language "foo" not found" error In-Reply-To: <5aac985038203ad3f0dc9a6ebe7bcd87690c9777.camel@redhat.com> References: <11d6c4dfa4d054eb111a167e452c64d09b64a237.camel@redhat.com> <5aac985038203ad3f0dc9a6ebe7bcd87690c9777.camel@redhat.com> Message-ID: <17d52c54a07.deaaed251244040.6117500024253433660@ghanshyammann.com> ---- On Wed, 28 Jul 2021 11:20:06 -0500 Stephen Finucane wrote ---- > On Wed, 2021-07-28 at 17:11 +0100, Stephen Finucane wrote: > > We've seen issues in some projects, whereby attempting to build translations as > > part of the 'openstack-tox-docs' zuul job [1] can result in the following > > warning: > > > > Warning, treated as error: > > ../doc/source/contributor/contributing.rst::rST localisation for language "id" not found. > > > > Based on [2], this issue appears to have been introduced by a recent version > > bump of docutils, which was in turn introduced by the introduction of Sphinx > > 4.x. > > > > We don't have a resolution for this bug yet but we'll update when we do. > > Hopefully this email will improve the "Google'ability" of this bug. > > I _think_ this is because we're only allowed to use languages that docutils > officially supports now, which presumably are those listed here [1]. We spotted > this bug initially for openstack-helm, which has translations for Indonesian > with a shortcode of 'id'. That's not in the list supported by docutils, hence > this issue. Will need more investigation from people familiar with this tooling > though.
It is happening in openstack/contributor-guide also, not sure if any resolution to this yet? -https://zuul.opendev.org/t/openstack/build/c01d12032a0449a28988e6daa45a7d87/log/job-output.txt#2019 -gmann > > Stephen > > [1] https://repo.or.cz/docutils.git/tree/HEAD:/docutils/docutils/languages > > > Stephen > > > > [1] https://opendev.org/openstack/openstackdocstheme/src/commit/08461c5311aa692088a27eb40a87965fd8515aba/bin/docstheme-build-translated.sh#L153 > > [2] https://lists.freebsd.org/pipermail/freebsd-ports/2021-April/120848.html > > > > > > From thierry at openstack.org Wed Nov 24 16:51:08 2021 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 24 Nov 2021 17:51:08 +0100 Subject: [largescale-sig] Next meeting: Nov 24th, 15utc In-Reply-To: References: Message-ID: <251f61c6-5a06-5fac-c6da-c03af2e4b402@openstack.org> We held our meeting today. We narrowed down the last details of our next "Large Scale OpenStack" episode on OpenInfra.Live, scheduled for Dec 9. We did not report any progress on the Large Scale Journey, our documentation of OpenStack operations at large scale. You can read the meeting logs at: https://meetings.opendev.org/meetings/large_scale_sig/2021/large_scale_sig.2021-11-24-15.00.html After the December 9 OpenInfra Live show, we will skip the meeting scheduled on Dec 22. Our next IRC meeting will be January 5, at 1500utc on #openstack-operators on OFTC. Regards, -- Thierry Carrez (ttx) From knikolla at bu.edu Wed Nov 24 17:26:01 2021 From: knikolla at bu.edu (Nikolla, Kristi) Date: Wed, 24 Nov 2021 17:26:01 +0000 Subject: [keystone][policy][ussuri] why I can create a domain In-Reply-To: <6c8d401e-1945-52df-ff69-8e92134a77b7@gmail.com> References: <6c8d401e-1945-52df-ff69-8e92134a77b7@gmail.com> Message-ID: Hi Piotr, That is likely due to the enforce_scope configuration option being set as False by default [0] We're not to a point yet where you can safely give someone the admin role on any project. [1][2] Kristi [0].
https://docs.openstack.org/oslo.policy/latest/configuration/index.html#oslo-policy [1]. https://governance.openstack.org/tc/goals/proposed/consistent-and-secure-rbac.html [2]. https://review.opendev.org/c/openstack/governance/+/815158 From: Piotr Misiak Date: Wednesday, November 24, 2021 at 05:04 To: openstack-discuss at lists.openstack.org Subject: [keystone][policy][ussuri] why I can create a domain Hi, Maybe a stupid question but I'm really confused. In my Ussuri cloud Keystone has a following policy for create_domain action (this is a default policy from Keystone code): "identity:create_domain": "role:admin and system_scope:all" I have a user which has "admin" role assigned in project "admin" in domain "default" - AKA cloud admin. The user does not have any roles assigned on system scope. Could someone please explain why this user is able to create a domain in the cloud? Looking at the policy rule he shouldn't or maybe I'm reading it in a wrong way? Is there any "backward compatibility" casting "cloud admin" role to "system_scope:all"? Please help Thanks Piotr -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Nov 24 23:08:38 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 24 Nov 2021 17:08:38 -0600 Subject: [all][tc] Yoga testing runtime update. Action needed if py3.9 job failing for your project In-Reply-To: <17d394d1377.da5099ba980738.5329502106648098442@ghanshyammann.com> References: <17d394d1377.da5099ba980738.5329502106648098442@ghanshyammann.com> Message-ID: <17d54354c99.d573cd831257985.1615577290034051097@ghanshyammann.com> ---- On Fri, 19 Nov 2021 11:44:52 -0600 Ghanshyam Mann wrote ---- > Hello Everyone. > > As discussed in TC PTG, we have updated the testing runtime for the > Yoga cycle [1]. 
> > Changes are: > * Add Debian 11 as the tested distro > * Change centos stream 8 -> centos stream 9 > * Bump lowest python version to test to 3.8 and highest to python 3.9 > ** This removes python 3.6 from testing. > > I pushed the job template update[2] which will make py3.9 unit test job > voting (which is non-voting currently). I do not see any projects failing > consistently on py3.9[3] but still, I will keep it as -W until early next week > (23rd Nov). If any project needs time to fix the failing py3.9 job, please do > it before 23rd Nov. Job template change is merged and py39 is voting job now. -gmann > > [1] https://governance.openstack.org/tc/reference/runtimes/yoga.html > [2] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/818609 > [3] https://zuul.openstack.org/builds?job_name=openstack-tox-py39&branch=master&result=FAILURE > > -gmann > > From laurentfdumont at gmail.com Wed Nov 24 23:43:17 2021 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Wed, 24 Nov 2021 18:43:17 -0500 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: <0670B960225633449A24709C291A525251D50128@COM03.performair.local> References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> <0670B960225633449A24709C291A525251D4FF98@COM03.performair.local> <0670B960225633449A24709C291A525251D50128@COM03.performair.local> Message-ID: Fantastic! Glad you were able to find a solution. On Mon, Nov 22, 2021 at 12:02 PM wrote: > Laurent; > > Your message ended up pointing me in the right direction. I started > asking myself why libvirtd came from Ubuntu configured incorrectly for live > migrations. The obvious answer is: it didn't. That suggested that I ad > configured something incorrectly. That realization, together with the > discussion from [1] led me to set libvirt.live_migration_scheme="ssh" in > nova.conf. After restarting nova-compute, I can now live migrate instances > between Ubuntu servers. 
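[For reference, the fix Dominic describes corresponds to a nova.conf fragment along these lines; a sketch only -- check the Nova configuring-migrations guide for your release before applying it:]

```ini
# nova.conf on each compute node -- the setting mentioned above.
# live_migration_scheme selects the URI scheme libvirt uses to reach the
# remote host during live migration; "ssh" tunnels the connection instead
# of using qemu+tcp, avoiding the TCP listener authentication that failed here.
[libvirt]
live_migration_scheme = ssh
```

[Restart nova-compute after changing it, as noted in the message above.]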
> > Thank you for your assistance. > > Thank you, > > Dominic L. Hilsbos, MBA > Vice President - Information Technology > Perform Air International Inc. > DHilsbos at PerformAir.com > www.PerformAir.com > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1254307 > > -----Original Message----- > From: DHilsbos at performair.com [mailto:DHilsbos at performair.com] > Sent: Monday, November 22, 2021 9:09 AM > To: laurentfdumont at gmail.com; mnaser at vexxhost.com > Cc: openstack-discuss at lists.openstack.org > Subject: RE: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed > > Laurent; > > We're running Victoria. Here are specific package versions: > Ubuntu: 20.10 > nova-compute: 22.2.1-0ubuntu1 (both) > nova-compute-kvm: 22.2.1-0ubuntu1 (both) > qemu-kvm: 5.0-5ubuntu9.9 (both) > libvirt-daemon: 6.6.0-1ubuntu3.5 (both) > > As I said, this has come up for me before, but I can't find records of how > it was addressed. I don't remember an issue of authentication from before, > however. From before, I do remember that after the ssh connection to setup > the new host, qemu/kvm on the old host makes a connection to qemu/kvm on > the new host, in order to coordinate the transfer of memory contents, and > other dynamic elements. > > Yes, I can cold migrate between all 3 servers, which makes this a > non-critical issue. > > While I have a CentOS Nova host, I'm not going to attempt to get > live-migration working between the Ubuntu Servers > > Changing the configuration of libvirt from system sockets to native > listeners got me past a connection refused error (it appears that libvirt > also connects from one server to another?) > > Thank you, > > Dominic L. Hilsbos, MBA > Vice President - Information Technology > Perform Air International Inc.
> DHilsbos at PerformAir.com > www.PerformAir.com > > From: Laurent Dumont [mailto:laurentfdumont at gmail.com] > Sent: Friday, November 19, 2021 6:45 PM > To: Mohammed Naser > Cc: Dominic Hilsbos; openstack-discuss > Subject: Re: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed > > Which version of Openstack are you running? > > It seems to try to connect over qemu with auth over tcp. Without ssh? > > Is the cold migration working now? > > On Fri, Nov 19, 2021 at 8:35 PM Mohammed Naser > wrote: > Just a heads up even if you get things working you're not going to be able > to live migrate from centos to ubuntu and vice versa since there's going to > be things like apparmor and SELinux issues > > On Fri, Nov 19, 2021 at 11:35 AM wrote: > All; > > I feel like I've dealt with this issue before, but I can't find any > records of it. > > I've been swapping out the compute nodes in my cluster for newer and > better hardware. We also decided to abandon CentOS. All the differences > mean that we haven't been able to do live migrations. I now have 2 servers > with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like > to get live migration working again. > > I configured passwordless ssh access between the servers for the nova > users to get cold migration working. I have also configured passwordless > ssh for the root users in accordance with [1].
> > When I try to do a live migration, the origin server generates this error, > in the nova-compute log: > 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] > [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: > operation failed: Failed to connect to remote libvirt URI > qemu+tcp:///system: authentication failed: authentication > failed: libvirt.libvirtError: operation failed: Failed to connect to remote > libvirt URI qemu+tcp:///system: authentication failed: > authentication failed > > At one point, I came across a tutorial on configuring live-migration for > libvirt, which included a bunch of user configuration. I don't remember > having to do that before, but is that what I need to be looking for? > > Thank you, > > Dominic L. Hilsbos, MBA > Vice President - Information Technology > Perform Air International Inc. > DHilsbos at PerformAir.com > www.PerformAir.com > > 1: > https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations > -- > Mohammed Naser > VEXXHOST, Inc. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Nov 24 23:45:09 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 24 Nov 2021 17:45:09 -0600 Subject: [all][tc] Technical Committee next weekly meeting on Nov 25th at 1500 UTC In-Reply-To: <17d4a59d276.fd9316fb1106394.2758893334791698593@ghanshyammann.com> References: <17d4a59d276.fd9316fb1106394.2758893334791698593@ghanshyammann.com> Message-ID: <17d5456b83e.106070ea81258416.8834639670841443781@ghanshyammann.com> Hello Everyone, Below is the agenda for Tomorrow's TC IRC meeting schedule at 1500 UTC. 
https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting == Agenda for tomorrow's TC meeting == * Roll call * Follow up on past action items * Gate health check ** Fixing Zuul config error in OpenStack *** https://etherpad.opendev.org/p/zuul-config-error-openstack * Updates on community-wide goal ** RBAC goal rework *** https://review.opendev.org/c/openstack/governance/+/815158 ** Proposed community goal for FIPS compatibility and compliance *** https://review.opendev.org/c/openstack/governance/+/816587 * Magnum project health * Adjutant need PTLs and maintainers ** http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html * Pain Point targeting ** https://etherpad.opendev.org/p/pain-point-elimination * Open Reviews ** https://review.opendev.org/q/projects:openstack/governance+is:open -gmann ---- On Mon, 22 Nov 2021 19:12:20 -0600 Ghanshyam Mann wrote ---- > Hello Everyone, > > Technical Committee's next weekly meeting is scheduled for Nov 25th at 1500 UTC. > > If you would like to add topics for discussion, please add them to the below wiki page by > Wednesday, Nov 24th, at 2100 UTC. > > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > -gmann > > From swogatpradhan22 at gmail.com Thu Nov 25 07:12:05 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Thu, 25 Nov 2021 12:42:05 +0530 Subject: [Openstack-victoria] [IRONIC] error in network interface | baremetal node validate In-Reply-To: References: Message-ID: Is there anything else that i can share to help understand the issue? with regards, swogat pradhan On Tue, Nov 16, 2021 at 3:43 PM Swogat Pradhan wrote: > Hi, > I am currently trying to setup openstack ironic using driver IPMI. > I followed the official docs of openstack for setting everything up. 
> > When i run openstack baremetal node validate $NODE_UUID, i am getting the > following error: > > * Unexpected exception, traceback saved into log by ironic conductor > service that is running on controller: 'ServiceTokenAuthWrapper' object has > no attribute '_discovery_cache' * > in the network interface in command output. > > When i check the ironic conductor logs i see the following messages: > > > > > Can anyone suggest a solution or a way forward. > > With regards > Swogat Pradhan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Nov 25 07:22:25 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 25 Nov 2021 08:22:25 +0100 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: <8897921.lOV4Wx5bFT@p1> Hi, Basically in ML2/OVS case it may be one of 2 reasons why port isn't provisioned properly quickly: - neutron-ovs-agent is somehow slow with provisioning it or - neutron-dhcp-agent is slow provisioning that port. To check which of those happens really, You can enable debug logs in You neutron-server and look there for logs like "Port xxx provisioning completed by entity L2/DHCP" (or something similar, I don't remember it now exactly). If it works much faster with noop firewall driver, then it seems that it is more likely to be on the neutron-ovs-agent's side. In such case couple of things to check: - are You using l2population (it's required with DVR for example), - are You using SG with rules which references "remote_group_id" (like default SG for each tenant does)? If so, can You try to remove from You SG such rules and use rules with CIDRs instead? We know that using SG with remote_group_id don't scale well and if You have many ports using same SG, it may slow down neutron-ovs-agent a lot. - do You maybe have any other errors in the neutron-ovs-agent logs? Like rpc message communication errors or something else? 
Such errors will trigger doing fullsync of all ports on the node so it may take long time to get to actually provisioning Your new port sometimes. - what exactly version of Neutron are You using there? On sobota, 20 listopada 2021 11:05:16 CET Michal Arbet wrote: > Hi, > > Has anyone seen issue which I am currently facing ? > > When launching heat stack ( but it's same if I launch several of instances > ) vif plugged in timeouts an I don't know why, sometimes it is OK > ..sometimes is failing. > > Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes > it's 100 and more seconds, it seems there is some race condition but I > can't find out where the problem is. But on the end every instance is > spawned ok (retry mechanism worked). > > Another finding is that it has to do something with security group, if noop > driver is used ..everything is working good. > > Firewall security setup is openvswitch . > > Test env is wallaby. > > I will attach some logs when I will be near PC .. > > Thank you, > Michal Arbet (Kevko) -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From lokendrarathour at gmail.com Thu Nov 25 07:59:12 2021 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Thu, 25 Nov 2021 13:29:12 +0530 Subject: [Triple 0] Undercloud deployment Getting failed In-Reply-To: References: Message-ID: Hi Team, Any update here please. 
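[Returning to the security-group advice in Slawek's message earlier in this digest: swapping a remote_group_id rule for a CIDR-based one can be sketched with the OpenStack CLI roughly as below. The group ID, rule ID, and CIDR are placeholders, not values from this thread:]

```shell
# Inspect the group's rules; the "Remote Security Group" column shows
# which ones reference another group via remote_group_id.
openstack security group rule list <SG_ID> --long

# Delete a rule that uses remote_group_id ...
openstack security group rule delete <RULE_ID>

# ... and recreate it against the subnet's CIDR instead (example values).
openstack security group rule create --ingress --protocol tcp \
    --dst-port 22 --remote-ip 10.0.0.0/24 <SG_ID>
```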
On Tue, Nov 23, 2021 at 5:15 PM Lokendra Rathour wrote: > Hi Team getting strange error when installing triple O Train on Centos 8.4 > > '--volume', > '/var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro', > '--volume', '/var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro', > '--volume', '/dev/log:/dev/log:rw', '--rm', '--log-driver', 'k8s-file', > '--log-opt', 'path=/var/log/containers/stdouts/container-puppet-zaqar.log', > '--security-opt', 'label=disable', '--volume', > '/usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro', > '--entrypoint', '/var/lib/container-puppet/container-puppet.sh', '--net', > 'host', '--volume', '/etc/hosts:/etc/hosts:ro', '--volume', > '/var/lib/container-puppet/container-puppet.sh:/var/lib/container-puppet/container-puppet.sh:ro', > 'quay.io/tripleotraincentos8/centos-binary-zaqar-wsgi:current-tripleo'] > run failed after Error: container_linux.go:370: starting container process > caused: error adding seccomp filter rule for syscall bdflush: permission > denied: OCI permission denied", > " attempt(s): 3", > "2021-11-23 11:00:38,384 WARNING: 58791 -- Retrying running container: > zaqar", > "2021-11-23 11:00:38,384 ERROR: 58791 -- Failed running container for > zaqar", > "2021-11-23 11:00:38,385 INFO: 58791 -- Finished processing puppet > configs for zaqar", > "2021-11-23 11:00:38,385 ERROR: 58782 -- ERROR configuring crond", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring glance_api", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring heat_api", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring heat", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic_api", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring > ironic_inspector", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring neutron", > "2021-11-23 11:00:38,386 ERROR: 58782 -- 
ERROR configuring iscsid", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring keystone", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring memcached", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring mistral", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring mysql", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring nova", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring rabbitmq", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring placement", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring swift", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring > swift_ringbuilder", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring zaqar" > ], > "stderr_lines": [], > "_ansible_no_log": false, > "attempts": 15 > } > ] > ] > Not cleaning working directory /home/stack/tripleo-heat-installer-templates > Not cleaning ansible directory /home/stack/undercloud-ansible-mie5k51_ > Install artifact is located at > /home/stack/undercloud-install-20211123110040.tar.bzip2 > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > Deployment Failed! > > This issue is recurring multiple times please advise. > > -- > ~ Lokendra > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From lokendrarathour at gmail.com Thu Nov 25 08:01:27 2021 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Thu, 25 Nov 2021 13:31:27 +0530 Subject: [TripleO] IPV6 Support Message-ID: Hi, I am trying to install the undercloud using the ipv6 address on Centos 8 using Train release. It is seen that the deployment of undercloud is getting failed with error as mentioned below. Same deployment is working in the normal ipv4 case. So Questions: 1. is IPV6 supported in TripleO , if yes then please suggest how. 
Error: nied", " attempt(s): 2", "2021-11-25 07:25:31,535 WARNING: 65355 -- Retrying running container: zaqar", "2021-11-25 07:25:34,887 ERROR: 65355 -- ['/usr/bin/podman', 'run', '--user', '0', '--name', 'container-puppet-zaqar', '--env', 'PUPPET_TAGS=file,file_line,concat,augeas,cron,zaqar_config', '--env', 'NAME=zaqar', '--env', 'HOSTNAME=undercloud', '--env', 'NO_ARCHIVE=', '--env', 'STEP=6', '--env', 'NET_HOST=true', '--env', 'DEBUG=False', '--volume', '/etc/localtime:/etc/localtime:ro', '--volume', '/tmp/tmp06nvxzzz:/etc/config.pp:ro', '--volume', '/etc/puppet/:/tmp/puppet-etc/:ro', '--volume', '/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume', '/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume', '/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume', '/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume', '/var/lib/config-data:/var/lib/config-data/:rw', '--volume', '/var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro', '--volume', '/var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro', '--volume', '/dev/log:/dev/log:rw', '--rm', '--log-driver', 'k8s-file', '--log-opt', 'path=/var/log/containers/stdouts/container-puppet-zaqar.log', '--security-opt', 'label=disable', '--volume', '/usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro', '--entrypoint', '/var/lib/container-puppet/container-puppet.sh', '--net', 'host', '--volume', '/etc/hosts:/etc/hosts:ro', '--volume', '/var/lib/container-puppet/container-puppet.sh:/var/lib/container-puppet/container-puppet.sh:ro', 'undercloud.ctlplane.localdomain:8787/tripleotraincentos8/centos-binary-zaqar-wsgi:current-tripleo'] run failed after Error: container_linux.go:370: starting container process caused: error adding seccomp filter rule for syscall bdflush: permission denied: OCI permission denied", " attempt(s): 3", "2021-11-25 07:25:34,888 WARNING: 65355 -- 
Retrying running container: zaqar", "2021-11-25 07:25:34,888 ERROR: 65355 -- Failed running container for zaqar", "2021-11-25 07:25:34,888 INFO: 65355 -- Finished processing puppet configs for zaqar", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring crond", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring glance_api", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring heat_api", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring heat", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring ironic_api", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring ironic", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring ironic_inspector", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring neutron", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring iscsid", "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring keystone", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring memcached", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring mistral", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring mysql", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring nova", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring rabbitmq", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring placement", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring swift", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring swift_ringbuilder", "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring zaqar" ], "stderr_lines": [], "_ansible_no_log": false, "attempts": 15 } ] ] Not cleaning working directory /home/stack/tripleo-heat-installer-templates Not cleaning ansible directory /home/stack/undercloud-ansible-mw4crw92 Install artifact is located at /home/stack/undercloud-install-20211125072536.tar.bzip2 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Deployment Failed! 
ERROR: Heat log files: /var/log/heat-launcher/undercloud_deploy-o_qf1b4w !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Deployment failed. Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1345, in _standalone_deploy raise exceptions.DeploymentError('Deployment failed') tripleoclient.exceptions.DeploymentError: Deployment failed During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1363, in _standalone_deploy raise exceptions.DeploymentError(six.text_type(e)) tripleoclient.exceptions.DeploymentError: Deployment failed During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/cliff/app.py", line 401, in run_subcommand result = cmd.run(parsed_args) File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run return_code = self.take_action(parsed_args) or 0 File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1451, in take_action if self._standalone_deploy(parsed_args) != 0: File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1400, in _standalone_deploy raise exceptions.DeploymentError('Deployment failed.') tripleoclient.exceptions.DeploymentError: Deployment failed. clean_up Deploy: Deployment failed. 
Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1345, in _standalone_deploy raise exceptions.DeploymentError('Deployment failed') tripleoclient.exceptions.DeploymentError: Deployment failed During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1363, in _standalone_deploy raise exceptions.DeploymentError(six.text_type(e)) tripleoclient.exceptions.DeploymentError: Deployment failed During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/osc_lib/shell.py", line 136, in run ret_val = super(OpenStackShell, self).run(argv) File "/usr/lib/python3.6/site-packages/cliff/app.py", line 281, in run result = self.run_subcommand(remainder) File "/usr/lib/python3.6/site-packages/osc_lib/shell.py", line 176, in run_subcommand ret_value = super(OpenStackShell, self).run_subcommand(argv) File "/usr/lib/python3.6/site-packages/cliff/app.py", line 401, in run_subcommand result = cmd.run(parsed_args) File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run return_code = self.take_action(parsed_args) or 0 File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1451, in take_action if self._standalone_deploy(parsed_args) != 0: File "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line 1400, in _standalone_deploy raise exceptions.DeploymentError('Deployment failed.') tripleoclient.exceptions.DeploymentError: Deployment failed. END return value: 1 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! An error has occured while deploying the Undercloud. See the previous output for details about what went wrong. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
Command '['sudo', '--preserve-env', 'openstack', 'tripleo', 'deploy', '--standalone', '--standalone-role', 'Undercloud', '--stack', 'undercloud', '--local-domain=localdomain', '--local-ip=abce:abce:abce::1/64', '--templates=/usr/share/openstack-tripleo-heat-templates/', '--networks-file=network_data_undercloud.yaml', '--heat-native', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/undercloud.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/use-dns-for-vips.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/podman.yaml', '-e', '/home/stack/containers-prepare-parameter.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/mistral.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/zaqar-swift-backend.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/tempest.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/ssl/no-tls-endpoints-public-ip.yaml', '--deployment-user', 'stack', '--output-dir=/home/stack', '-e', '/home/stack/tripleo-config-generated-env-files/undercloud_parameters.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/tripleo-validations.yaml', '--debug', '--log-file=install-undercloud.log', '-e', '/usr/share/openstack-tripleo-heat-templates/undercloud-stack-vstate-dropin.yaml']' returned non-zero exit status 1. 
Command '['sudo', '--preserve-env', 'openstack', 'tripleo', 'deploy', '--standalone', '--standalone-role', 'Undercloud', '--stack', 'undercloud', '--local-domain=localdomain', '--local-ip=abce:abce:abce::1/64', '--templates=/usr/share/openstack-tripleo-heat-templates/', '--networks-file=network_data_undercloud.yaml', '--heat-native', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/undercloud.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/use-dns-for-vips.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/podman.yaml', '-e', '/home/stack/containers-prepare-parameter.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/mistral.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/zaqar-swift-backend.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/services/tempest.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/ssl/no-tls-endpoints-public-ip.yaml', '--deployment-user', 'stack', '--output-dir=/home/stack', '-e', '/home/stack/tripleo-config-generated-env-files/undercloud_parameters.yaml', '-e', '/usr/share/openstack-tripleo-heat-templates/environments/tripleo-validations.yaml', '--debug', '--log-file=install-undercloud.log', '-e', '/usr/share/openstack-tripleo-heat-templates/undercloud-stack-vstate-dropin.yaml']' returned non-zero exit status 1. END return value: 1 [stack at undercloud ~]$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ramishra at redhat.com Thu Nov 25 08:47:35 2021 From: ramishra at redhat.com (Rabi Mishra) Date: Thu, 25 Nov 2021 14:17:35 +0530 Subject: [Triple 0] Undercloud deployment Getting failed In-Reply-To: References: Message-ID: On Tue, Nov 23, 2021 at 5:17 PM Lokendra Rathour wrote: > Hi Team getting strange error when installing triple O Train on Centos 8.4 > > '--volume', > '/var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro', > '--volume', '/var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro', > '--volume', '/dev/log:/dev/log:rw', '--rm', '--log-driver', 'k8s-file', > '--log-opt', 'path=/var/log/containers/stdouts/container-puppet-zaqar.log', > '--security-opt', 'label=disable', '--volume', > '/usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro', > '--entrypoint', '/var/lib/container-puppet/container-puppet.sh', '--net', > 'host', '--volume', '/etc/hosts:/etc/hosts:ro', '--volume', > '/var/lib/container-puppet/container-puppet.sh:/var/lib/container-puppet/container-puppet.sh:ro', > 'quay.io/tripleotraincentos8/centos-binary-zaqar-wsgi:current-tripleo'] > run failed after Error: container_linux.go:370: starting container process > caused: error adding seccomp filter rule for syscall bdflush: permission > denied: OCI permission denied", > That looks like an earlier podman issue[1]. We use runc[2] with train tripleo on centos8. Probably you don't have the right podman/runc version i.e podman 2.3.1+ and runc 1.0.0+. Also, (answering your question on another thread here) I don't see how it can only be reproduced with ipv6 deployments, unless your ipv4 and ipv6 environments are different. 
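[A quick way to check whether a host meets the versions mentioned above; a sketch assuming CentOS 8 with RPM packaging:]

```shell
# Print the installed container runtime versions; the reply above suggests
# podman 2.3.1+ and runc 1.0.0+ are needed for this seccomp rule to load.
podman --version
runc --version

# On CentOS 8 the packaged versions can also be checked via rpm.
rpm -q podman runc
```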
[1] https://github.com/containers/podman/issues/10735#issuecomment-869818154 [2] https://review.opendev.org/c/openstack/tripleo-ansible/+/811983 > " attempt(s): 3", > "2021-11-23 11:00:38,384 WARNING: 58791 -- Retrying running container: > zaqar", > "2021-11-23 11:00:38,384 ERROR: 58791 -- Failed running container for > zaqar", > "2021-11-23 11:00:38,385 INFO: 58791 -- Finished processing puppet > configs for zaqar", > "2021-11-23 11:00:38,385 ERROR: 58782 -- ERROR configuring crond", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring glance_api", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring heat_api", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring heat", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic_api", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring > ironic_inspector", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring neutron", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring iscsid", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring keystone", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring memcached", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring mistral", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring mysql", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring nova", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring rabbitmq", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring placement", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring swift", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring > swift_ringbuilder", > "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring zaqar" > ], > "stderr_lines": [], > "_ansible_no_log": false, > "attempts": 15 > } > ] > ] > Not cleaning working directory /home/stack/tripleo-heat-installer-templates > Not cleaning ansible directory 
/home/stack/undercloud-ansible-mie5k51_ > Install artifact is located at > /home/stack/undercloud-install-20211123110040.tar.bzip2 > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > Deployment Failed! > > This issue is recurring multiple times please advise. > > -- > ~ Lokendra > -- Regards, Rabi Mishra -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Thu Nov 25 09:41:14 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Thu, 25 Nov 2021 10:41:14 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: Message-ID: Hello Michal: I think this thread is related to OVN. In any case, I've analyzed your logs from the Neutron point of view. Those are basically the key points in your logs: [1]. The Neutron server receives the request to create and bind a new port. Both the DHCP agent and the OVS agent provision the port (8a472e87-4a7a-4ad4-9fbb-fd9785136611). Neutron server sends the "network-vif-plugged" event at 08:51:05.635 and receives the ACK from Nova at 08:51:05.704. 
Nova server creates the corresponding event for the instance on the compute0: 2021-11-25 08:51:05.692 23 INFO nova.api.openstack.compute.server_external_events [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating event network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for instance 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0 Nova compute agent receives this server event: 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 _process_instance_event /var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py:10205 Further triaging and log analysis should be done by Nova folks to understand why Nova compute didn't boot this VM. I fail to understand some parts. Regards. [1]https://paste.opendev.org/show/811273/ On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet wrote: > Hello, > > In attachment you can find logs for compute0 and controller0 (other > computes and controllers were turned off for this test). > No OVN used, this stack is based on OVS. > > Thank you, > Michal Arbet > Openstack Engineer > > Ultimum Technologies a.s. > Na Poříčí 1047/26, 11000 Praha 1 > Czech Republic > > +420 604 228 897 > michal.arbet at ultimum.io > *https://ultimum.io * > > LinkedIn | Twitter > | Facebook > > > > st 24. 11. 2021 v 15:59 odesílatel Rodolfo Alonso Hernandez < > ralonsoh at redhat.com> napsal: > >> Hello Tony: >> >> Do you have the Neutron server logs? Do you see the >> "PortBindingChassisUpdateEvent"? When a port is bound, a Port_Bound (SB) >> event is issued and captured in the Neutron server. That will trigger the >> port binding process and the "vif-plugged" event.
This OVN SB event should >> call "set_port_status_up" and that should write "OVN reports status up for >> port: %s" in the logs. >> >> Of course, this Neutron method will be called only if the logical switch >> port is UP. >> >> Regards. >> >> >> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu >> wrote: >> >>> Hi, >>> >>> I see such problem from time to time. It's not consistently >>> reproducible. >>> ====================== >>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager >>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state the instance >>> has a pending task (spawning). Skip. >>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver >>> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 3a4c320d64664d9cb6784b7ea52d618a >>> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for >>> [('network-vif-plugged', '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for >>> instance with vm_state building and task_state spawning.: >>> eventlet.timeout.Timeout: 300 seconds >>> ====================== >>> The VIF/port is activated by OVN ovn-controller to ovn-sb. It seems >>> that, either Neutron didn't >>> capture the update or didn't send message back to nova-compute. Is there >>> any known fix for >>> this problem? >>> >>> >>> Thanks! >>> Tony >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mnasiadka at gmail.com Thu Nov 25 09:45:51 2021 From: mnasiadka at gmail.com (Michał Nasiadka) Date: Thu, 25 Nov 2021 10:45:51 +0100 Subject: [kolla] Proposing Michal Arbet as Kolla core In-Reply-To: <139F5BE7-4083-4364-959F-A850C12FCCB1@dincercelik.com> References: <139F5BE7-4083-4364-959F-A850C12FCCB1@dincercelik.com> Message-ID: <4EE99442-A9F1-44F1-9004-52629E04A68B@gmail.com> Hello Koalas, As the time has passed - and I haven't seen negative votes - I've added Michal to: - kolla-ansible-core - kolla-core Welcome to the core team Michal! Best regards, Michal Nasiadka > On 21 Nov 2021, at 07:51, Dincer Celik wrote: > > +1 > >> On 16 Nov 2021, at 15:49, Michał Nasiadka > wrote: >> >> Hi, >> >> I would like to propose adding Michal Arbet (kevko) to the kolla-core and kolla-ansible-core groups. >> Michal did some great work on ProxySQL, is a consistent maintainer of Debian related images and has provided some helpful >> reviews. >> >> Cores - please reply +1/-1 before the end of Friday 26th November. >> >> Thanks, >> Michal > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Thu Nov 25 10:43:01 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Thu, 25 Nov 2021 12:43:01 +0200 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: Message-ID: Hi Rodolfo, Well, I sent the logs because I see the same log error in my test environment, and thought it would be useful for you. Michal On Thu, Nov 25, 2021 at 11:41, Rodolfo Alonso Hernandez wrote: > Hello Michal: > > I think this thread is related to OVN. In any case, I've analyzed your > logs from the Neutron point of view. Those are basically the key points in > your logs: [1]. The Neutron server receives the request to create and bind > a new port. Both the DHCP agent and the OVS agent provision the port > (8a472e87-4a7a-4ad4-9fbb-fd9785136611). 
> > Neutron server sends the "network-vif-plugged" event at 08:51:05.635 and > receives the ACK from Nova at 08:51:05.704. Nova server creates the > corresponding event for the instance on the compute0: > 2021-11-25 08:51:05.692 23 INFO > nova.api.openstack.compute.server_external_events > [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 > 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating event > network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for instance > 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0 > > Nova compute agent receives this server event: > 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager > [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 > 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: > 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event > network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 > _process_instance_event > /var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py:10205 > > Further triaging and log analysis should be done by Nova folks to > understand why Nova compute didn't boot this VM. I fail to understand some > parts. > > Regards. > > [1]https://paste.opendev.org/show/811273/ > > On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet > wrote: > >> Hello, >> >> In attachment you can find logs for compute0 and controller0 (other >> computes and controllers were turned off for this test). >> No OVN used, this stack is based on OVS. >> >> Thank you, >> Michal Arbet >> Openstack Engineer >> >> Ultimum Technologies a.s. >> Na Poříčí 1047/26, 11000 Praha 1 >> Czech Republic >> >> +420 604 228 897 >> michal.arbet at ultimum.io >> *https://ultimum.io * >> >> LinkedIn | >> Twitter | Facebook >> >> >> >> On Wed, Nov 24, 2021 at 15:59, Rodolfo Alonso Hernandez < >> ralonsoh at redhat.com> wrote: >> >>> Hello Tony: >>> >>> Do you have the Neutron server logs? Do you see the >>> "PortBindingChassisUpdateEvent"? 
When a port is bound, a Port_Bound (SB) >>> event is issued and captured in the Neutron server. That will trigger the >>> port binding process and the "vif-plugged" event. This OVN SB event should >>> call "set_port_status_up" and that should write "OVN reports status up for >>> port: %s" in the logs. >>> >>> Of course, this Neutron method will be called only if the logical switch >>> port is UP. >>> >>> Regards. >>> >>> >>> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu >>> wrote: >>> >>>> Hi, >>>> >>>> I see such problem from time to time. It's not consistently >>>> reproducible. >>>> ====================== >>>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager >>>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: >>>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state the instance >>>> has a pending task (spawning). Skip. 
URL: From iurygregory at gmail.com Thu Nov 25 10:52:13 2021 From: iurygregory at gmail.com (Iury Gregory) Date: Thu, 25 Nov 2021 11:52:13 +0100 Subject: [Openstack-victoria] [IRONIC] error in network interface | baremetal node validate In-Reply-To: References: Message-ID: Hi Swogat, This error comes from the keystoneauth. Since it complains about not having the attribute, maybe you have an older version running in your setup? On Thu, Nov 25, 2021 at 08:13, Swogat Pradhan < swogatpradhan22 at gmail.com> wrote: > Is there anything else that I can share to help understand the issue? > > with regards, > swogat pradhan > > On Tue, Nov 16, 2021 at 3:43 PM Swogat Pradhan > wrote: > >> Hi, >> I am currently trying to set up openstack ironic using driver IPMI. >> I followed the official docs of openstack for setting everything up. 
>> >> When I run openstack baremetal node validate $NODE_UUID, I am getting the >> following error: >> >> * Unexpected exception, traceback saved into log by ironic conductor >> service that is running on controller: 'ServiceTokenAuthWrapper' object has >> no attribute '_discovery_cache' * >> in the network interface in command output. >> >> When I check the ironic conductor logs I see the following messages: >> > >> > >> >> Can anyone suggest a solution or a way forward. >> >> With regards >> Swogat Pradhan >> > -- *Att[]'s Iury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the ironic-core and puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Nov 25 10:57:45 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 25 Nov 2021 11:57:45 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: Message-ID: <5291667.iZASKD2KPV@p1> Hi, On Thursday, 25 November 2021 10:41:14 CET Rodolfo Alonso Hernandez wrote: > Hello Michal: > > I think this thread is related to OVN. In any case, I've analyzed your logs > from the Neutron point of view. Those are basically the key points in your > logs: [1]. The Neutron server receives the request to create and bind a new > port. Both the DHCP agent and the OVS agent provision the port > (8a472e87-4a7a-4ad4-9fbb-fd9785136611). > > Neutron server sends the "network-vif-plugged" event at 08:51:05.635 and > receives the ACK from Nova at 08:51:05.704. 
Nova server creates the > corresponding event for the instance on the compute0: > 2021-11-25 08:51:05.692 23 INFO > nova.api.openstack.compute.server_external_events > [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 > 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating event > network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for instance > 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0 > > Nova compute agent receives this server event: > 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager > [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 > 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: > 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event > network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 > _process_instance_event > /var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py:10205 > > Further triaging and log analysis should be done by Nova folks to > understand why Nova compute didn't boot this VM. I fail to understand some > parts. Thx Rodolfo. I also took a look at those logs (for server 09eff2ce-356f-430f-ab30-5de58f58d698, which had port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0) but I can confirm what You found actually. Port was pretty quickly provisioned and it was reported to Nova. I don't know why Nova didn't unpause that vm. 
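Pulling that per-port timeline out of the logs can be scripted; a rough sketch, run here on shortened sample lines taken from this thread (against a real deployment you would grep the actual service log files instead):

```shell
# Extract the timestamp, level and logger for every log line mentioning a
# port, giving a quick view of how long each provisioning step took.
PORT=0761ff2f-ff6a-40a5-85b7-c9b40386cbe0

cat > /tmp/port-timeline.log <<'EOF'
2021-11-25 08:50:24.250 8 INFO os_vif [...] Successfully plugged vif id=0761ff2f-ff6a-40a5-85b7-c9b40386cbe0
2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks [...] Provisioning complete for port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2.
2021-11-25 08:51:00.818 30 INFO neutron.notifiers.nova [-] Nova event response for tag 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0, code 200
EOF

# Fields: $1 date, $2 time, $3 pid, $4 level, $5 logger name.
grep "$PORT" /tmp/port-timeline.log | awk '{print $2, $4, $5}'
```

Reading the first column top to bottom makes gaps like the ~34 s one between os_vif plugging the VIF and Neutron completing provisioning easy to spot.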
Port creation in Nova: 2021-11-25 08:50:24.250 8 INFO os_vif [req-6eab0f2a- d7f4-4251-96f4-177fc0c62212 647a328aef9b4a8b890f102d8f018576 49231caa42f34da8a9d10229b5c7a5d8 - default default] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:3d:cd:b9,bridge_name='br- int',has_traffic_filtering=True,id=0761ff2f-ff6a-40a5-85b7- c9b40386cbe0,network=Network(cb4e6be4-9bc7-4c34-95dd- dfc3c8e9d93c),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap0761ff2f- ff') Neutron finished provisioning of that port and sent notification to nova about 34 seconds later: 2021-11-25 08:51:00.403 30 DEBUG neutron.plugins.ml2.rpc [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Device 0761ff2f- ff6a-40a5-85b7-c9b40386cbe0 up at agent ovs-agent-compute0 update_device_up / var/lib/kolla/venv/lib/python3.9/site-packages/neutron/plugins/ml2/rpc.py:278 2021-11-25 08:51:00.503 29 DEBUG neutron.db.provisioning_blocks [req- a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Provisioning complete for port 67b32ddf-dddd-439e-92be-6a89df49b4af triggered by entity L2. provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ db/provisioning_blocks.py:139 2021-11-25 08:51:00.504 29 DEBUG neutron_lib.callbacks.manager [req- a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] for port, provisioning_complete _notify_loop /var/lib/kolla/venv/lib/python3.9/site- packages/neutron_lib/callbacks/manager.py:192 2021-11-25 08:51:00.505 30 DEBUG neutron.db.provisioning_blocks [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning for port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 completed by entity L2. 
provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ db/provisioning_blocks.py:133 2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning complete for port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2. provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ db/provisioning_blocks.py:139 2021-11-25 08:51:00.516 30 DEBUG neutron_lib.callbacks.manager [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] for port, provisioning_complete _notify_loop /var/lib/kolla/venv/lib/python3.9/site- packages/neutron_lib/callbacks/manager.py:192 2021-11-25 08:51:00.701 30 DEBUG neutron_lib.callbacks.manager [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler-7972195'] for port, before_update _notify_loop /var/lib/kolla/venv/lib/python3.9/site- packages/neutron_lib/callbacks/manager.py:192 2021-11-25 08:51:00.746 30 DEBUG neutron.notifiers.nova [-] Sending events: [{'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': 'network-vif- plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7- c9b40386cbe0'}] send_events /var/lib/kolla/venv/lib/python3.9/site-packages/ neutron/notifiers/nova.py:262 2021-11-25 08:51:00.762 30 DEBUG neutron.plugins.ml2.db [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] For port 0761ff2f- ff6a-40a5-85b7-c9b40386cbe0, host compute0, got binding levels [PortBindingLevel(driver='openvswitch',host='compute0',level=0,port_id=0761ff2f- ff6a-40a5-85b7-c9b40386cbe0,segment=NetworkSegment(ea3a27b6- c3fa-40e3-9937-71b652cba855),segment_id=ea3a27b6-c3fa-40e3-9937-71b652cba855)] get_binding_level_objs /var/lib/kolla/venv/lib/python3.9/site-packages/ neutron/plugins/ml2/db.py:75 2021-11-25 08:51:00.818 30 INFO 
neutron.notifiers.nova [-] Nova event response: {'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': 'network-vif-plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7-c9b40386cbe0', 'code': 200} > > Regards. > > [1]https://paste.opendev.org/show/811273/ > > On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet > > wrote: > > Hello, > > > > In attachment you can find logs for compute0 and controller0 (other > > computes and controllers were turned off for this test). > > No OVN used, this stack is based on OVS. > > > > Thank you, > > Michal Arbet > > Openstack Engineer > > > > Ultimum Technologies a.s. > > Na Poříčí 1047/26, 11000 Praha 1 > > Czech Republic > > > > +420 604 228 897 > > michal.arbet at ultimum.io > > *https://ultimum.io * > > > > LinkedIn | Twitter > > | Facebook > > > > > > > > On Wed, Nov 24, 2021 at 15:59, Rodolfo Alonso Hernandez < > > > > ralonsoh at redhat.com> wrote: > >> Hello Tony: > >> > >> Do you have the Neutron server logs? Do you see the > >> "PortBindingChassisUpdateEvent"? When a port is bound, a Port_Bound (SB) > >> event is issued and captured in the Neutron server. That will trigger the > >> port binding process and the "vif-plugged" event. This OVN SB event should > >> call "set_port_status_up" and that should write "OVN reports status up for > >> port: %s" in the logs. > >> > >> Of course, this Neutron method will be called only if the logical switch > >> port is UP. > >> > >> Regards. > >> > >> > >> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu > >> > >> wrote: > >>> Hi, > >>> > >>> I see such problem from time to time. It's not consistently > >>> reproducible. > >>> ====================== > >>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager > >>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: > >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state the instance > >>> has a pending task (spawning). Skip. 
> >>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver > >>> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 3a4c320d64664d9cb6784b7ea52d618a > >>> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: > >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for > >>> [('network-vif-plugged', '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for > >>> instance with vm_state building and task_state spawning.: > >>> eventlet.timeout.Timeout: 300 seconds > >>> ====================== > >>> The VIF/port is activated by OVN ovn-controller to ovn-sb. It seems > >>> that, either Neutron didn't > >>> capture the update or didn't send message back to nova-compute. Is there > >>> any known fix for > >>> this problem? > >>> > >>> > >>> Thanks! > >>> Tony -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From skaplons at redhat.com Thu Nov 25 10:57:45 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 25 Nov 2021 11:57:45 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: Message-ID: <5291667.iZASKD2KPV@p1> Hi, On czwartek, 25 listopada 2021 10:41:14 CET Rodolfo Alonso Hernandez wrote: > Hello Michal: > > I think this thread is related to OVN. In any case, I've analyzed your logs > from the Neutron point of view. Those are basically the key points in your > logs: [1]. The Neutron server receives the request to create and bind a new > port. Both the DHCP agent and the OVS agent provision the port > (8a472e87-4a7a-4ad4-9fbb-fd9785136611). > > Neutron server sends the "network-vif-plugged" event at 08:51:05.635 and > receives the ACK from Nova at 08:51:05.704. 
Nova server creates the > corresponding event for the instance on the compute0: > 2021-11-25 08:51:05.692 23 INFO > nova.api.openstack.compute.server_external_events > [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 > 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating event > network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for instance > 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0 > > Nova compute agent receives this server event: > 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager > [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 > 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: > 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event > network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 > _process_instance_event > /var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py: 10205 > > Further triagging and log analysis should be done by Nova folks to > understand why Nova compute didn't boot this VM. I fail to understand some > parts. Thx Rodolfo. I also took a look at those logs (for server 09eff2ce-356f-430f- ab30-5de58f58d698 which had port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 but I can confirm what You found actually. Port was pretty quickly provisioned and it was reported to Nova. I don't know why Nova didn't unpause that vm. 
Port creation in Nova:

2021-11-25 08:50:24.250 8 INFO os_vif [req-6eab0f2a-d7f4-4251-96f4-177fc0c62212 647a328aef9b4a8b890f102d8f018576 49231caa42f34da8a9d10229b5c7a5d8 - default default] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:3d:cd:b9,bridge_name='br-int',has_traffic_filtering=True,id=0761ff2f-ff6a-40a5-85b7-c9b40386cbe0,network=Network(cb4e6be4-9bc7-4c34-95dd-dfc3c8e9d93c),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap0761ff2f-ff')

Neutron finished provisioning of that port and sent notification to nova about 34 seconds later:

2021-11-25 08:51:00.403 30 DEBUG neutron.plugins.ml2.rpc [req-df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Device 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 up at agent ovs-agent-compute0 update_device_up /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/plugins/ml2/rpc.py:278

2021-11-25 08:51:00.503 29 DEBUG neutron.db.provisioning_blocks [req-a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Provisioning complete for port 67b32ddf-dddd-439e-92be-6a89df49b4af triggered by entity L2. provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/db/provisioning_blocks.py:139

2021-11-25 08:51:00.504 29 DEBUG neutron_lib.callbacks.manager [req-a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] for port, provisioning_complete _notify_loop /var/lib/kolla/venv/lib/python3.9/site-packages/neutron_lib/callbacks/manager.py:192

2021-11-25 08:51:00.505 30 DEBUG neutron.db.provisioning_blocks [req-df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning for port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 completed by entity L2. provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/db/provisioning_blocks.py:133

2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks [req-df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning complete for port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2. provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/db/provisioning_blocks.py:139

2021-11-25 08:51:00.516 30 DEBUG neutron_lib.callbacks.manager [req-df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] for port, provisioning_complete _notify_loop /var/lib/kolla/venv/lib/python3.9/site-packages/neutron_lib/callbacks/manager.py:192

2021-11-25 08:51:00.701 30 DEBUG neutron_lib.callbacks.manager [req-df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler-7972195'] for port, before_update _notify_loop /var/lib/kolla/venv/lib/python3.9/site-packages/neutron_lib/callbacks/manager.py:192

2021-11-25 08:51:00.746 30 DEBUG neutron.notifiers.nova [-] Sending events: [{'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': 'network-vif-plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7-c9b40386cbe0'}] send_events /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/notifiers/nova.py:262

2021-11-25 08:51:00.762 30 DEBUG neutron.plugins.ml2.db [req-df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] For port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0, host compute0, got binding levels [PortBindingLevel(driver='openvswitch',host='compute0',level=0,port_id=0761ff2f-ff6a-40a5-85b7-c9b40386cbe0,segment=NetworkSegment(ea3a27b6-c3fa-40e3-9937-71b652cba855),segment_id=ea3a27b6-c3fa-40e3-9937-71b652cba855)] get_binding_level_objs /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/plugins/ml2/db.py:75

2021-11-25 08:51:00.818 30 INFO neutron.notifiers.nova [-] Nova event response: {'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': 'network-vif-plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7-c9b40386cbe0', 'code': 200}

> > Regards. > > [1] https://paste.opendev.org/show/811273/ > > On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet > > wrote: > > Hello, > > > > In attachment you can find logs for compute0 and controller0 (other > > computes and controllers were turned off for this test). > > No OVN used, this stack is based on OVS. > > > > Thank you, > > Michal Arbet > > Openstack Engineer > > > > Ultimum Technologies a.s. > > Na Poříčí 1047/26, 11000 Praha 1 > > Czech Republic > > > > +420 604 228 897 > > michal.arbet at ultimum.io > > *https://ultimum.io * > > > > LinkedIn | Twitter > > | Facebook > > > > > > > > st 24. 11. 2021 v 15:59 odesílatel Rodolfo Alonso Hernandez < > > > > ralonsoh at redhat.com> napsal: > >> Hello Tony: > >> > >> Do you have the Neutron server logs? Do you see the > >> "PortBindingChassisUpdateEvent"? When a port is bound, a Port_Bound (SB) > >> event is issued and captured in the Neutron server. That will trigger the > >> port binding process and the "vif-plugged" event. This OVN SB event should > >> call "set_port_status_up" and that should write "OVN reports status up for > >> port: %s" in the logs. > >> > >> Of course, this Neutron method will be called only if the logical switch > >> port is UP. > >> > >> Regards. > >> > >> > >> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu > >> > >> wrote: > >>> Hi, > >>> > >>> I see such problem from time to time. It's not consistently > >>> reproducible. > >>> ====================== > >>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager > >>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: > >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state the instance > >>> has a pending task (spawning). Skip. 
> >>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver > >>> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 3a4c320d64664d9cb6784b7ea52d618a > >>> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: > >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for > >>> [('network-vif-plugged', '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for > >>> instance with vm_state building and task_state spawning.: > >>> eventlet.timeout.Timeout: 300 seconds > >>> ====================== > >>> The VIF/port is activated by OVN ovn-controller to ovn-sb. It seems > >>> that, either Neutron didn't > >>> capture the update or didn't send message back to nova-compute. Is there > >>> any known fix for > >>> this problem? > >>> > >>> > >>> Thanks! > >>> Tony -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From dtantsur at redhat.com Thu Nov 25 11:06:08 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Thu, 25 Nov 2021 12:06:08 +0100 Subject: [neutron] Average number of rechecks in Neutron compared to other projects In-Reply-To: <3257902.aeNJFYEL58@p1> References: <3257902.aeNJFYEL58@p1> Message-ID: Hi Slawek, A bit tangentially: did you use any sort of a script to collect the recheck numbers? I'd love to do the same kind of analysis for Ironic. Dmitry On Fri, Nov 12, 2021 at 6:21 PM Slawek Kaplonski wrote: > Hi neutrinos, > > As we discussed during the last PTG, I spent some time today to get > average > number of rechecks which we need to do in the last PS of the change before > it's merged. In theory that number should be close to 0 as patch should be > merged with first CI run when it's approved by reviewers :) > All data are only from the master branch. I didn't check the same for > stable > branches. 
> > File with graph and raw data in csv format are in the attachments. > > Basically my conclusion is that Neutron's CI is really bad in that. We have to > recheck many, many times before patches will be merged. > We really need to think about how to improve that in Neutron. > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Nov 25 14:19:49 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 25 Nov 2021 15:19:49 +0100 Subject: [neutron] Average number of rechecks in Neutron compared to other projects In-Reply-To: References: <3257902.aeNJFYEL58@p1> Message-ID: <10442762.EvYhyI6sBW@p1> Hi, On czwartek, 25 listopada 2021 12:06:08 CET Dmitry Tantsur wrote: > Hi Slawek, > > A bit tangentially: did you use any sort of a script to collect the recheck > numbers? I'd love to do the same kind of analysis for Ironic. I used the script at https://github.com/slawqo/tools/tree/master/rechecks > > Dmitry > > On Fri, Nov 12, 2021 at 6:21 PM Slawek Kaplonski > > wrote: > > Hi neutrinos, > > > > As we discussed during the last PTG, I spent some time today to get > > average > > number of rechecks which we need to do in the last PS of the change before > > it's merged. In theory that number should be close to 0 as patch should be > > merged with first CI run when it's approved by reviewers :) > > All data are only from the master branch. I didn't check the same for > > stable > > branches. > > > > File with graph and raw data in csv format are in the attachments. > > > > Basically my conclusion is that Neutron's CI is really bad in that. We > > have to > > recheck many, many times before patches will be merged. 
> > We really need to think about how to improve that in Neutron. > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From michal.arbet at ultimum.io Thu Nov 25 14:39:55 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Thu, 25 Nov 2021 15:39:55 +0100 Subject: [kolla] Proposing Michal Arbet as Kolla core In-Reply-To: <4EE99442-A9F1-44F1-9004-52629E04A68B@gmail.com> References: <139F5BE7-4083-4364-959F-A850C12FCCB1@dincercelik.com> <4EE99442-A9F1-44F1-9004-52629E04A68B@gmail.com> Message-ID: Hi, I apologize for the late response, but I'm on vacation and there is a terrible internet. I really appreciate that I was accepted as core for the kolla/kolla-ansible group, I will try to do my best for the kolla community. Thank you very much! Michal Arbet (kevko) Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook čt 25. 11. 2021 v 10:49 odesílatel Michał Nasiadka napsal: > Hello Koalas, > > As the time has passed - and I haven't seen negative votes - I've added > Michal to: > - kolla-ansible-core > - kolla-core > > Welcome to the core team Michal! > > Best regards, > Michal Nasiadka > > On 21 Nov 2021, at 07:51, Dincer Celik wrote: > > +1 > > On 16 Nov 2021, at 15:49, Michał Nasiadka wrote: > > Hi, > > I would like to propose adding Michal Arbet (kevko) to the kolla-core and kolla-ansible-core > groups. > Michal did some great work on ProxySQL, is a consistent maintainer > of Debian-related images and has provided some helpful > reviews. > > Cores - please reply +1/-1 before the end of Friday 26th November. 
> > Thanks, > Michal > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Thu Nov 25 17:07:03 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Thu, 25 Nov 2021 18:07:03 +0100 Subject: [release] Release countdown for week R-18, Nov 22-26 Message-ID: <1a9841e4-c4c4-f5eb-7001-0454a646c4e8@est.tech> Development Focus ---------------------------- We are now past the Yoga-1 milestone. Teams should currently be focused on feature development! General Information ---------------------------- Our next milestone in this development cycle will be Yoga-2, on 06 January, 2022. This milestone is when we freeze the list of deliverables that will be included in the Yoga final release, so if you plan to introduce new deliverables in this release, please propose a change to add an empty deliverable file in the deliverables/yoga directory of the openstack/releases repository. Now is also generally a good time to look at bugfixes that were introduced in the master branch that might make sense to be backported and released in a stable release. If you have any questions about the OpenStack release process, feel free to ask on this mailing-list or on the #openstack-release channel on IRC. Upcoming Deadlines & Dates ----------------------------------------- Yoga-2 Milestone: 06 January, 2022 Yoga final release: March 30, 2022 Előd Illés irc: elodilles -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Thu Nov 25 17:47:39 2021 From: aschultz at redhat.com (Alex Schultz) Date: Thu, 25 Nov 2021 10:47:39 -0700 Subject: [TripleO] IPV6 Support In-Reply-To: References: Message-ID: Yes, IPV6 is supported; however, the error you provided indicates problems starting containers. Make sure you pin to container-tools:3.0 to ensure you get the version we expect. 
dnf module -y disable container-tools:rhel8 ; dnf module -y enable container-tools:3.0 On Thu, Nov 25, 2021 at 1:02 AM Lokendra Rathour wrote: > Hi, > I am trying to install the undercloud using the ipv6 address on Centos 8 > using Train release. > It is seen that the deployment of undercloud is getting failed with error > as mentioned below. Same deployment is working in the normal ipv4 case. > So Questions: > > 1. is IPV6 supported in TripleO , if yes then please suggest how. > > > Error: > nied", > " attempt(s): 2", > "2021-11-25 07:25:31,535 WARNING: 65355 -- Retrying running container: > zaqar", > "2021-11-25 07:25:34,887 ERROR: 65355 -- ['/usr/bin/podman', 'run', > '--user', '0', '--name', 'container-puppet-zaqar', '--env', > 'PUPPET_TAGS=file,file_line,concat,augeas,cron,zaqar_config', '--env', > 'NAME=zaqar', '--env', 'HOSTNAME=undercloud', '--env', 'NO_ARCHIVE=', > '--env', 'STEP=6', '--env', 'NET_HOST=true', '--env', 'DEBUG=False', > '--volume', '/etc/localtime:/etc/localtime:ro', '--volume', > '/tmp/tmp06nvxzzz:/etc/config.pp:ro', '--volume', > '/etc/puppet/:/tmp/puppet-etc/:ro', '--volume', > '/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume', > '/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', > '--volume', > '/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', > '--volume', '/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume', > '/var/lib/config-data:/var/lib/config-data/:rw', '--volume', > '/var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro', > '--volume', '/var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro', > '--volume', '/dev/log:/dev/log:rw', '--rm', '--log-driver', 'k8s-file', > '--log-opt', 'path=/var/log/containers/stdouts/container-puppet-zaqar.log', > '--security-opt', 'label=disable', '--volume', > '/usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro', > '--entrypoint', 
'/var/lib/container-puppet/container-puppet.sh', '--net', > 'host', '--volume', '/etc/hosts:/etc/hosts:ro', '--volume', > '/var/lib/container-puppet/container-puppet.sh:/var/lib/container-puppet/container-puppet.sh:ro', > 'undercloud.ctlplane.localdomain:8787/tripleotraincentos8/centos-binary-zaqar-wsgi:current-tripleo'] > run failed after Error: container_linux.go:370: starting container process > caused: error adding seccomp filter rule for syscall bdflush: permission > denied: OCI permission denied", > " attempt(s): 3", > "2021-11-25 07:25:34,888 WARNING: 65355 -- Retrying running container: > zaqar", > "2021-11-25 07:25:34,888 ERROR: 65355 -- Failed running container for > zaqar", > "2021-11-25 07:25:34,888 INFO: 65355 -- Finished processing puppet > configs for zaqar", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring crond", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring glance_api", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring heat_api", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring heat", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring ironic_api", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring ironic", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring > ironic_inspector", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring neutron", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring iscsid", > "2021-11-25 07:25:34,889 ERROR: 65345 -- ERROR configuring keystone", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring memcached", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring mistral", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring mysql", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring nova", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring rabbitmq", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring placement", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring 
swift", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring > swift_ringbuilder", > "2021-11-25 07:25:34,890 ERROR: 65345 -- ERROR configuring zaqar" > ], > "stderr_lines": [], > "_ansible_no_log": false, > "attempts": 15 > } > ] > ] > Not cleaning working directory /home/stack/tripleo-heat-installer-templates > Not cleaning ansible directory /home/stack/undercloud-ansible-mw4crw92 > Install artifact is located at > /home/stack/undercloud-install-20211125072536.tar.bzip2 > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > Deployment Failed! > > ERROR: Heat log files: /var/log/heat-launcher/undercloud_deploy-o_qf1b4w > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > Deployment failed. > Traceback (most recent call last): > File > "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line > 1345, in _standalone_deploy > raise exceptions.DeploymentError('Deployment failed') > tripleoclient.exceptions.DeploymentError: Deployment failed > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File > "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line > 1363, in _standalone_deploy > raise exceptions.DeploymentError(six.text_type(e)) > tripleoclient.exceptions.DeploymentError: Deployment failed > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File "/usr/lib/python3.6/site-packages/cliff/app.py", line 401, in > run_subcommand > result = cmd.run(parsed_args) > File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in > run > return_code = self.take_action(parsed_args) or 0 > File > "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line > 1451, in take_action > if self._standalone_deploy(parsed_args) != 0: > File > "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line > 1400, in _standalone_deploy > raise 
exceptions.DeploymentError('Deployment failed.') > tripleoclient.exceptions.DeploymentError: Deployment failed. > clean_up Deploy: Deployment failed. > Traceback (most recent call last): > File > "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line > 1345, in _standalone_deploy > raise exceptions.DeploymentError('Deployment failed') > tripleoclient.exceptions.DeploymentError: Deployment failed > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File > "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line > 1363, in _standalone_deploy > raise exceptions.DeploymentError(six.text_type(e)) > tripleoclient.exceptions.DeploymentError: Deployment failed > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File "/usr/lib/python3.6/site-packages/osc_lib/shell.py", line 136, in > run > ret_val = super(OpenStackShell, self).run(argv) > File "/usr/lib/python3.6/site-packages/cliff/app.py", line 281, in run > result = self.run_subcommand(remainder) > File "/usr/lib/python3.6/site-packages/osc_lib/shell.py", line 176, in > run_subcommand > ret_value = super(OpenStackShell, self).run_subcommand(argv) > File "/usr/lib/python3.6/site-packages/cliff/app.py", line 401, in > run_subcommand > result = cmd.run(parsed_args) > File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in > run > return_code = self.take_action(parsed_args) or 0 > File > "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line > 1451, in take_action > if self._standalone_deploy(parsed_args) != 0: > File > "/usr/lib/python3.6/site-packages/tripleoclient/v1/tripleo_deploy.py", line > 1400, in _standalone_deploy > raise exceptions.DeploymentError('Deployment failed.') > tripleoclient.exceptions.DeploymentError: Deployment failed. > > END return value: 1 > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
> > An error has occured while deploying the Undercloud. > > See the previous output for details about what went wrong. > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > Command '['sudo', '--preserve-env', 'openstack', 'tripleo', 'deploy', > '--standalone', '--standalone-role', 'Undercloud', '--stack', 'undercloud', > '--local-domain=localdomain', '--local-ip=abce:abce:abce::1/64', > '--templates=/usr/share/openstack-tripleo-heat-templates/', > '--networks-file=network_data_undercloud.yaml', '--heat-native', '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/undercloud.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/use-dns-for-vips.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/podman.yaml', > '-e', '/home/stack/containers-prepare-parameter.yaml', '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/mistral.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/zaqar-swift-backend.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/tempest.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/ssl/no-tls-endpoints-public-ip.yaml', > '--deployment-user', 'stack', '--output-dir=/home/stack', '-e', > '/home/stack/tripleo-config-generated-env-files/undercloud_parameters.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/tripleo-validations.yaml', > '--debug', '--log-file=install-undercloud.log', '-e', > '/usr/share/openstack-tripleo-heat-templates/undercloud-stack-vstate-dropin.yaml']' > returned non-zero exit status 1. 
> Command '['sudo', '--preserve-env', 'openstack', 'tripleo', 'deploy', > '--standalone', '--standalone-role', 'Undercloud', '--stack', 'undercloud', > '--local-domain=localdomain', '--local-ip=abce:abce:abce::1/64', > '--templates=/usr/share/openstack-tripleo-heat-templates/', > '--networks-file=network_data_undercloud.yaml', '--heat-native', '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/undercloud.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/use-dns-for-vips.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/podman.yaml', > '-e', '/home/stack/containers-prepare-parameter.yaml', '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/mistral.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/zaqar-swift-backend.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/services/tempest.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/ssl/no-tls-endpoints-public-ip.yaml', > '--deployment-user', 'stack', '--output-dir=/home/stack', '-e', > '/home/stack/tripleo-config-generated-env-files/undercloud_parameters.yaml', > '-e', > '/usr/share/openstack-tripleo-heat-templates/environments/tripleo-validations.yaml', > '--debug', '--log-file=install-undercloud.log', '-e', > '/usr/share/openstack-tripleo-heat-templates/undercloud-stack-vstate-dropin.yaml']' > returned non-zero exit status 1. > END return value: 1 > [stack at undercloud ~]$ > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stephenfin at redhat.com Thu Nov 25 18:13:10 2021 From: stephenfin at redhat.com (Stephen Finucane) Date: Thu, 25 Nov 2021 18:13:10 +0000 Subject: python_requires >= 3.8 during Yoga Message-ID: gmann has been helpfully proposing patches to change the versions of Python we're testing against in Yoga. I've suggested that we might want to bump 'python_requires' in 'setup.cfg' to indicate that we no longer support any version of Python before 3.8 [1]. As gmann has noted, doing so would mean nova would no longer be installable on Python 3.6 or 3.7 and there has been a small bit of back and forth on the pros and cons of this. I'm wondering what other people's thoughts on this are. Is this something we should be doing? Should we do it for libraries too or just services? Do we ever want to do this? Thoughts, please! Stephen [1] https://review.opendev.org/c/openstack/nova/+/819194/comment/72ecf24f_2bd292c4/ From stephenfin at redhat.com Thu Nov 25 18:45:40 2021 From: stephenfin at redhat.com (Stephen Finucane) Date: Thu, 25 Nov 2021 18:45:40 +0000 Subject: [nova][powervm][zvm] Time to mark powervm and zvm drivers as deprecated for removal? Message-ID: I've had a PR open against the pypowervm library for over a fortnight now [1] with no activity. This has prompted me to go looking into the state of the powervm driver and what I've found isn't great. It doesn't seem there has been any feature development on the driver for many years, and the last change I can see that wasn't simply of a case of "I need to do this to get tests to pass" was over three years ago [2]. The CI also doesn't appear to have been touched in years. The situation for zvm isn't any better. The last functional change to that driver was nearly 2 years ago and I wasn't able to spot any CI running (though it's possible I just missed this). 
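Incidentally, "marking a driver as deprecated" mechanically amounts to very little: typically a warning logged when the driver is loaded, plus a release note. A minimal illustrative sketch of that pattern — the class name and message here are hypothetical stand-ins, not nova's actual code:

```python
import logging

LOG = logging.getLogger(__name__)


class PowerVMDriver:
    """Illustrative stub only -- stands in for a real virt driver class."""

    def __init__(self, virtapi=None):
        # Emit the deprecation warning when the compute service loads the
        # driver, so operators see it in their logs at start-up.
        LOG.warning('The powervm driver is deprecated and may be removed '
                    'in a future release.')
        self.virtapi = virtapi
```

The exact wording and timing of any such warning would of course be settled in review.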
This means the powervm and zvm drivers stand out from the other non-libvirt drivers in-tree, each of which is getting at least some activity (albeit out-of-tree for Hyper-V, from what I recall). It also begs the question: are these drivers something we want to keep around? If we do, how are we going to maintain them long term? If not, how aggressive should we be in removing that now-dead code? I'll open patches to mark both powervm and zvm as deprecated shortly so that, assuming nothing changes in the interim, we can look to remove it early in the Z cycle. Please respond here or on the reviews if you have concerns. Stephen [1] https://github.com/powervm/pypowervm/pull/17 [2] Change ID I89ad36f19672368a1f795e1f29c5af6368ccfeec From smooney at redhat.com Thu Nov 25 18:48:37 2021 From: smooney at redhat.com (Sean Mooney) Date: Thu, 25 Nov 2021 18:48:37 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: <248276d8ba4393f659132abb005932a5884d75f1.camel@redhat.com> On Thu, 2021-11-25 at 18:13 +0000, Stephen Finucane wrote: > gmann has been helpfully proposing patches to change the versions of Python > we're testing against in Yoga. I've suggested that we might want to bump > 'python_requires' in 'setup.cfg' to indicate that we no longer support any > version of Python before 3.8 [1]. As gmann has noted, doing so would mean nova > would no longer be installable on Python 3.6 or 3.7 and there has been a small > bit of back and forth on the pros and cons of this. I'm wondering what other > people's thoughts on this are. Is this something we should be doing? Should we > do it for libraries too or just services? Do we ever want to do this? Thoughts, > please! 
I was advocating for dropping support for Python below 3.8 this cycle, so I would be at least in favor of officially not supporting anything less than 3.8. As for whether we should add python_requires >= 3.8 to projects, I'm a little less certain. I think loss of 3.6 support is effectively going to happen through usage of 3.7- and 3.8-only features anyway. There are a number of features in Python 3.8 vs 3.7 and 3.6: https://docs.python.org/3/whatsnew/3.8.html https://docs.python.org/3/whatsnew/3.7.html There are also a number of things that are removed in 3.9 and 3.10: https://docs.python.org/3/whatsnew/3.7.html#deprecated-python-modules-functions-and-methods https://docs.python.org/3/whatsnew/3.7.html#api-and-feature-removals https://docs.python.org/3/whatsnew/3.9.html#removed-in-python-39 https://docs.python.org/3/whatsnew/3.10.html#removed I have not checked all of those to determine whether we will be impacted by those changes, but if we look at the end of life of the various Python releases, 3.6 is no longer supported for security updates from 23 Dec 2021 (https://endoflife.date/python), so I think we should at least increase to 3.7. I would prefer to update to 3.8, which is supported until 14 Oct 2024; that would mean we are testing with a Python that receives security support for the lifetime of the stable Yoga branch and beyond. > > Stephen > > [1] https://review.opendev.org/c/openstack/nova/+/819194/comment/72ecf24f_2bd292c4/ > > From katonalala at gmail.com Thu Nov 25 19:19:34 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 25 Nov 2021 20:19:34 +0100 Subject: [neutron] Drivers meeting - Friday 26.11.2021 - cancelled Message-ID: Hi Neutron Drivers! Due to the lack of agenda, let's cancel tomorrow's drivers meeting. See you at the meeting next week. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marcin.juszkiewicz at linaro.org Thu Nov 25 19:58:28 2021 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Thu, 25 Nov 2021 20:58:28 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > gmann has been helpfully proposing patches to change the versions of Python > we're testing against in Yoga. I've suggested that we might want to bump > 'python_requires' in 'setup.cfg' to indicate that we no longer support any > version of Python before 3.8 CentOS Stream 8 has Python 3.6 by default and the RDO team is doing the CS8 -> CS9 migration during the Yoga cycle. Can we postpone it to Z, when there will be no distribution with Py 3.6 to care about? From gmann at ghanshyammann.com Thu Nov 25 21:19:02 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 25 Nov 2021 15:19:02 -0600 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz wrote ---- > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > > gmann has been helpfully proposing patches to change the versions of Python > > we're testing against in Yoga. I've suggested that we might want to bump > > 'python_requires' in 'setup.cfg' to indicate that we no longer support any > > version of Python before 3.8 > > CentOS Stream 8 has Python 3.6 by default and RDO team is doing CS8 -> > CS9 migration during Yoga cycle. Can we postpone it to Z when there will > be no distribution with Py 3.6 to care about? Postponing to Z, you mean dropping the py3.6 tests or bumping it in 'setup.cfg' so that no one can install on py3.6? The first one we already did, and as per the Yoga testing runtime we are targeting centos9-stream[1] in Yoga itself. 
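For anyone not already familiar with the knob under discussion: 'python_requires' is a one-line declaration in each project's setup.cfg, along the lines of the fragment below (the surrounding metadata varies per project; this is a sketch, not any project's actual file):

```ini
[metadata]
# With this set, pip refuses to install the package on any
# interpreter older than 3.8.
python_requires = >=3.8
```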
For making 'python_requires' >=py3.8 in 'setup.cfg', I have no strong opinion on this but I prefer to be flexible here: 'yes OpenStack is installable on py3.6 but we do not test it anymore from Yoga onwards so no guarantee'. Our testing runtime's main goal is that we document the versions we are testing *at least*, which means it can work on lower or higher versions too but we just do not test them. Just for some background, we started adding 'python_requires' in 'setup.cfg' when we dropped py2.7 and wanted a hard stop for anyone to keep using py2.7 on OpenStack. But in the python3 world we do not have to use it for every min version bump as such. [1] https://governance.openstack.org/tc/reference/runtimes/yoga.html -gmann > > From ryan.bannon at gmail.com Thu Nov 25 22:50:10 2021 From: ryan.bannon at gmail.com (Ryan Bannon) Date: Thu, 25 Nov 2021 17:50:10 -0500 Subject: Project-scoped app creds - Best practice Message-ID: Hello all, Relatively new to OpenStack. To my understanding, application credentials are bound to users. Is there a way to bind them to Projects (I assume not) or, perhaps, Groups? My naive thought on a possible solution is that if a group has access to a Project, a "generic" user account that everybody has access to could be used for the application credentials. (The use case here is to not bind an app cred to an individual who might leave the organization, thus making the app cred secret lost.) Thanks, Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaelweingartner at gmail.com Thu Nov 25 23:20:28 2021 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Thu, 25 Nov 2021 20:20:28 -0300 Subject: Project-scoped app creds - Best practice In-Reply-To: References: Message-ID: Hello Ryan, We actually faced a similar situation and we extended Keystone to support the concept of Project-bound credentials, which means credentials that are owned by a project and not by a user. 
Therefore, the credentials are shared by all users of a project. The spec is the following: https://review.opendev.org/c/openstack/keystone-specs/+/766725 We have it already running in PROD for over 6 months now, and it is also integrated with RadosGW<>Keystone authentication. On Thu, Nov 25, 2021 at 7:53 PM Ryan Bannon wrote: > Hello all, > > Relatively new to OpenStack. > > To my understanding, application credentials are bound to users. Is there > a way to bind them to Projects (I assume not) or, perhaps, Groups? My naive > thought on a possible solution is that if a group has access to a Project, > a "generic" user account that everybody has access to could be used for the > application credentials. (The use case here is to not bind an app cred to > an individual who might leave the organization, thus making the app cred > secret lost.) > > Thanks, > > Ryan > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkajinam at redhat.com Fri Nov 26 02:29:44 2021 From: tkajinam at redhat.com (Takashi Kajinami) Date: Fri, 26 Nov 2021 11:29:44 +0900 Subject: [puppet] Propose retiring old stable branches (Stein, Rocky and Queens) In-Reply-To: <493FB665-DAA1-49D0-9BB5-D2CC9CD07D7C@binero.com> References: <493FB665-DAA1-49D0-9BB5-D2CC9CD07D7C@binero.com> Message-ID: Thanks, Tobias, for the feedback. Since we haven't heard any objections for almost one month, I'll start submitting patches to transition these branches to EOL. On Mon, Nov 1, 2021 at 5:01 AM Tobias Urdin wrote: > Hello, > > Thanks for proposing this, I think we should retire these branches > but let's see if we get any other feedback. > > Best regards > Tobias > > On 29 Oct 2021, at 06:29, Takashi Kajinami wrote: > > Hello, > > > In puppet repos we have plenty of stable branches still open, and the > oldest > is now stable/queens. 
> However recently we haven't seen many backports proposed to stein, rocky > and queens [1], so I'll propose retiring these three old branches now. > > Please let me know if anybody is interested in keeping any of these three. > > Note that currently CI jobs are broken in these three branches and it is > likely > we need to investigate the required additional pinning of dependent > packages. > If somebody is still interested in maintaining these old branches then > these jobs > should be fixed. > > [1] > https://review.opendev.org/q/(project:%255Eopenstack/puppet-.*)AND(NOT+project:openstack/puppet-tripleo)AND(NOT+project:openstack/puppet-pacemaker)AND((branch:stable/queens)OR(branch:stable/rocky)OR(branch:stable/stein)) > > Thank you, > Takashi Kajinami > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ueha.ayumu at fujitsu.com Fri Nov 26 02:54:09 2021 From: ueha.ayumu at fujitsu.com (ueha.ayumu at fujitsu.com) Date: Fri, 26 Nov 2021 02:54:09 +0000 Subject: [aodh][tacker] Support for pyparsing v 3.0.6 is required in python-aodhclient Message-ID: Hi aodh team, I'm Ueha from the tacker team. Tacker CI has been failing with the following error since a few days ago: ---from h-eng log--- Traceback (most recent call last): ... File "/usr/local/lib/python3.8/dist-packages/aodhclient/utils.py", line 43, in expr = pp.operatorPrecedence(condition, [ AttributeError: module 'pyparsing' has no attribute 'operatorPrecedence' -------------------- From my research, the version of pyparsing has been updated to v3.0.6 in global requirements [1], and operatorPrecedence, which had long been deprecated, is no longer available in pyparsing v3.0.6. A patch to fix this appears to have already been posted to python-aodhclient [2]. Could you consider merging the fixes to avoid errors in the CI? For your information, similar fixes have already been merged in cinder [3] and manila [4].
[1] https://review.opendev.org/c/openstack/requirements/+/818614 [2] https://review.opendev.org/c/openstack/python-aodhclient/+/819054 [3] https://review.opendev.org/c/openstack/cinder/+/818834 [4] https://review.opendev.org/c/openstack/manila/+/818829 Regards, Ueha -------------- next part -------------- An HTML attachment was scrubbed... URL: From swogatpradhan22 at gmail.com Fri Nov 26 05:37:30 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Fri, 26 Nov 2021 11:07:30 +0530 Subject: [Openstack-victoria] [IRONIC] error in network interface | baremetal node validate In-Reply-To: References: Message-ID: Hi Iury, i have added victoria repo and then installed the packages, all the packages installed in the server are from victoria build. With regards, Swogat pradhan On Thu, Nov 25, 2021 at 4:22 PM Iury Gregory wrote: > Hi Swogat, > > This error comes from the keystoneauth. Since it complains about not > having the attribute, maybe you have an older version running in your > setup? > > Em qui., 25 de nov. de 2021 ?s 08:13, Swogat Pradhan < > swogatpradhan22 at gmail.com> escreveu: > >> Is there anything else that i can share to help understand the issue? >> >> with regards, >> swogat pradhan >> >> On Tue, Nov 16, 2021 at 3:43 PM Swogat Pradhan >> wrote: >> >>> Hi, >>> I am currently trying to setup openstack ironic using driver IPMI. >>> I followed the official docs of openstack for setting everything up. >>> >>> When i run openstack baremetal node validate $NODE_UUID, i am getting >>> the following error: >>> >>> * Unexpected exception, traceback saved into log by ironic conductor >>> service that is running on controller: 'ServiceTokenAuthWrapper' object has >>> no attribute '_discovery_cache' * >>> in the network interface in command output. >>> >>> When i check the ironic conductor logs i see the following messages: >>> >> >>> > >>> >>> Can anyone suggest a solution or a way forward. 
>>> >>> With regards >>> Swogat Pradhan >>> >> > > -- > > > *Att[]'sIury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Fri Nov 26 09:04:46 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Fri, 26 Nov 2021 10:04:46 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <248276d8ba4393f659132abb005932a5884d75f1.camel@redhat.com> References: <248276d8ba4393f659132abb005932a5884d75f1.camel@redhat.com> Message-ID: On Thu, Nov 25 2021 at 06:48:37 PM +0000, Sean Mooney wrote: > On Thu, 2021-11-25 at 18:13 +0000, Stephen Finucane wrote: >> gmann has been helpfully proposing patches to change the versions >> of Python >> we're testing against in Yoga. I've suggested that we might want to >> bump >> 'python_requires' in 'setup.cfg' to indicate that we no longer >> support any >> version of Python before 3.8 [1]. As gmann has noted, doing so >> would mean nova >> would no longer be installable on Python 3.6 or 3.7 and there has >> been a small >> bit of back and forth on the pros and cons of this. I'm wondering >> what other >> people's thoughts on this are. Is this something we should be >> doing? Should we >> do it for libraries too or just services? Do we ever want to do >> this? Thoughts, >> please! > > so i was advocating for dropping supprot for python below 3.8 this > cycle so i woudl be at least > in favor of offilly not supprot less then 3.8 > > in terms of if wew shoudl add python_requries >= 3.8 to project im a > littel less certin > > i think loss of 3.6 support is effectigly goign to happen via uages > of 3.7 and 3.8 only feature anyway. 
> > there are a number of feature in python 3.8 vs 3.7 and 3.6 > https://docs.python.org/3/whatsnew/3.8.html > https://docs.python.org/3/whatsnew/3.7.html This is my view too. As soon as we turn off py36 testing somebody unintentionally can propose a patch that uses py38+ only language feature and the gate will not detect it and the reviewers might not detect it either. As soon as we land such a patch the module becomes unusable on system < py3.8 regardless of what we write in our setup.cfg. Cheers, gibi > > there are also a number of thing that are removed in 3.9 and 3.10 > https://docs.python.org/3/whatsnew/3.7.html#deprecated-python-modules-functions-and-methods > https://docs.python.org/3/whatsnew/3.7.html#api-and-feature-removals > > https://docs.python.org/3/whatsnew/3.9.html#removed-in-python-39 > https://docs.python.org/3/whatsnew/3.10.html#removed > > i have not check all of those to determin if we will be impacted by > those changes or not > but if we look at the end of life of the various python releases 3.6 > is nolonger supppred for security updates > from (23 Dec 2021) https://endoflife.date/python so i think we shoudl > at least increase to 3.7 > > i woudl prefer to update to 3.8 which would (14 Oct 2024) > > that would mean we are testing with a python that is security > supproted for the lifetime of the stable yoga branch and byond. > > > >> >> Stephen >> >> [1] >> https://review.opendev.org/c/openstack/nova/+/819194/comment/72ecf24f_2bd292c4/ >> >> > > From kamil.madac at slovenskoit.sk Fri Nov 26 09:49:17 2021 From: kamil.madac at slovenskoit.sk (=?iso-8859-2?Q?Kamil_Mad=E1=E8?=) Date: Fri, 26 Nov 2021 09:49:17 +0000 Subject: [neutron]Some floating IPs inaccessible after restart of L3 agent Message-ID: Hello Everyone, We have openstack Victoria deployed since the beginning of the year with kolla/ansible in docker containers. Everything was running OK, but few weeks ago we noticed issues with networking. 
Our installation uses Open vSwitch networking with DVR non-HA routers. Everything runs smoothly until we restart the L3 agent. After that, some floating IPs of VMs running on the node where the L3 agent is running become inaccessible. The workaround is to reassign the floating IP to the affected VM. Every restart affects the same floating IPs and VMs. No errors/exceptions found in the logs. I was able to find out that after the restart, the routes for those particular floating IPs are missing from the fip- namespace, which causes proxy ARP responses to stop working. After the floating IP address is reassigned, the routes are added by the L3 agent and the floating IP works again. It looks like some sort of race condition in the L3 agent, but I was not able to identify any possible existing bug. The L3 agent is at version 17.0.1.dev44. Is anyone aware of any existing bug which could explain such behavior, or does anyone have an idea how to solve the issue? Kamil Madáč Slovensko IT a.s. -------------- next part -------------- An HTML attachment was scrubbed... URL: From amoralej at redhat.com Fri Nov 26 09:54:26 2021 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Fri, 26 Nov 2021 10:54:26 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> Message-ID: On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann wrote: > ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < > marcin.juszkiewicz at linaro.org> wrote ---- > > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > > > gmann has been helpfully proposing patches to change the versions of > Python > > > we're testing against in Yoga.
I've suggested that we might want to > bump > > > 'python_requires' in 'setup.cfg' to indicate that we no longer > support any > > > version of Python before 3.8 > > > > CentOS Stream 8 has Python 3.6 by default and RDO team is doing CS8 -> > > CS9 migration during Yoga cycle. Can we postpone it to Z when there > will > > be no distribution with Py 3.6 to care about? > > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS version upgrades in the past providing support for both releases in an OpenStack version to ease the upgrade so I'd like to keep yoga working on py3.6 included in CS8 and CS9. > Postponing to Z, you mean dropping the py3.6 tests or bumping it in in > 'setup.cfg' > so that no one can install on py3.6 ? > > First one we already did and as per Yoga testing runtime we are targeting > centos9-stream[1] > in Yoga itself. > > For making 'python_requires' >=py3.8 in 'setup.cfg', I have no string > opinion on this but I prefer > to have flexible here that 'yes OpenStack is installable in py3.6 but we > do not test it anymore > from Yoga onwards so no guarantee'. Our testing runtime main goal is that > we document the version we > are testing *at least* which means it can work on lower or higher versions > too but we just do not test them. > > May it be possible to keep py3.6 jobs to make sure patches are not introducing py3.8-only features that would break deployment in CS8? Just for some background, we started adding 'python_requires' in > 'setup.cfg' when we dropped > py2.7 and wanted a hard stop for anyone keep using py2.7 on OpenStack. But > in python3 world > we do not have to use it for every min version bump as such. > > [1] https://governance.openstack.org/tc/reference/runtimes/yoga.html > > -gmann > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
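[Editor's sketch of the 'python_requires' behavior discussed in this thread. This is illustrative only, not OpenStack code: real pip evaluates full PEP 440 specifiers; this simplification compares numeric release segments only, which is enough to show why a '>=3.8' marker in setup.cfg makes a release uninstallable on the Python 3.6 that CentOS Stream 8 ships.]

```python
def satisfies(interpreter: str, minimum: str) -> bool:
    """Return True if interpreter meets a '>= minimum' constraint,
    comparing dotted release segments as integer tuples (a simplified
    stand-in for pip's PEP 440 specifier matching)."""
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(interpreter) >= parse(minimum)

# python_requires = >=3.8 would exclude 3.6 and 3.7 interpreters:
for interp in ("3.6.8", "3.7.12", "3.8.12", "3.9.7"):
    print(interp, satisfies(interp, "3.8"))
```

With this constraint in place, pip refuses to install the package at all on the older interpreters, rather than installing it and failing later at runtime.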
URL: From dtantsur at redhat.com Fri Nov 26 10:47:42 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 26 Nov 2021 11:47:42 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: Hi all, Note that this decision will force us to stop supporting Bifrost [1] on CentOS/RHEL completely, unless we find a workaround. While Python 3.8 and 3.9 can be installed, they lack critical modules like python3-dnf or python3-firewalld, which cannot be pip-installed (sigh). A similar problem in Metal3: we use python3-mod_wsgi, but I guess we can switch to something else in this case. Dmitry [1] An upstream installation service for Ironic based on Ansible On Thu, Nov 25, 2021 at 7:19 PM Stephen Finucane wrote: > gmann has been helpfully proposing patches to change the versions of Python > we're testing against in Yoga. I've suggested that we might want to bump > 'python_requires' in 'setup.cfg' to indicate that we no longer support any > version of Python before 3.8 [1]. As gmann has noted, doing so would mean > nova > would no longer be installable on Python 3.6 or 3.7 and there has been a > small > bit of back and forth on the pros and cons of this. I'm wondering what > other > people's thoughts on this are. Is this something we should be doing? > Should we > do it for libraries too or just services? Do we ever want to do this? > Thoughts, > please! > > Stephen > > [1] > https://review.opendev.org/c/openstack/nova/+/819194/comment/72ecf24f_2bd292c4/ > > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... 
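[Editor's sketch of gibi's point earlier in the thread: once py3.6 gating is dropped, a patch can silently start relying on 3.8-only syntax. The example below is illustrative only; it uses two features that first appeared in Python 3.8, so a module containing it fails with SyntaxError at import time on the Python 3.6 shipped by CentOS Stream 8.]

```python
import sys
assert sys.version_info >= (3, 8), "this example needs Python 3.8+"

# PEP 572 assignment expressions (:=) and the f-string '=' debugging
# form were both introduced in Python 3.8. On 3.6/3.7 this module
# would not even parse, regardless of what setup.cfg says.
releases = ["queens", "rocky", "stein", "yoga"]
if (count := len(releases)) > 3:
    print(f"{count=}")
```

Because the failure is a SyntaxError at import time, no amount of runtime guarding helps; only testing on the oldest supported interpreter catches it before merge.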
URL: From balazs.gibizer at est.tech Fri Nov 26 12:26:20 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Fri, 26 Nov 2021 13:26:20 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> Message-ID: On Fri, Nov 26 2021 at 10:54:26 AM +0100, Alfredo Moralejo Alonso wrote: > > > On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann > wrote: >> ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz >> wrote ---- >> > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: >> > > gmann has been helpfully proposing patches to change the >> versions of Python >> > > we're testing against in Yoga. I've suggested that we might >> want to bump >> > > 'python_requires' in 'setup.cfg' to indicate that we no longer >> support any >> > > version of Python before 3.8 >> > >> > CentOS Stream 8 has Python 3.6 by default and RDO team is doing >> CS8 -> >> > CS9 migration during Yoga cycle. Can we postpone it to Z when >> there will >> > be no distribution with Py 3.6 to care about? >> > > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 > and CentOS Stream 9 in Yoga. This is how we have managed previous > major CentOS version upgrades in the past providing support for both > releases in an OpenStack version to ease the upgrade so I'd like to > keep yoga working on py3.6 included in CS8 and CS9. > >> Postponing to Z, you mean dropping the py3.6 tests or bumping it in >> in 'setup.cfg' >> so that no one can install on py3.6 ? >> >> First one we already did and as per Yoga testing runtime we are >> targeting centos9-stream[1] >> in Yoga itself. >> >> For making 'python_requires' >=py3.8 in 'setup.cfg', I have no >> string opinion on this but I prefer >> to have flexible here that 'yes OpenStack is installable in py3.6 >> but we do not test it anymore >> from Yoga onwards so no guarantee'. 
Our testing runtime main goal >> is that we document the version we >> are testing *at least* which means it can work on lower or higher >> versions too but we just do not test them. >> > > May it be possible to keep py3.6 jobs to make sure patches are not > introducing py3.8-only features that would break deployment in CS8? I think we can keep py36 support iff we test with py36 on the gate. Which I'm not against personally but there was a TC decision to drop py36 testing. Cheers, gibi > >> Just for some background, we started adding 'python_requires' in >> 'setup.cfg' when we dropped >> py2.7 and wanted a hard stop for anyone keep using py2.7 on >> OpenStack. But in python3 world >> we do not have to use it for every min version bump as such. >> >> [1] https://governance.openstack.org/tc/reference/runtimes/yoga.html >> >> -gmann >> >> > >> > >> From balazs.gibizer at est.tech Fri Nov 26 12:28:22 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Fri, 26 Nov 2021 13:28:22 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: On Fri, Nov 26 2021 at 11:47:42 AM +0100, Dmitry Tantsur wrote: > Hi all, > > Note that this decision will force us to stop supporting Bifrost [1] > on CentOS/RHEL completely, unless we find a workaround. While Python > 3.8 and 3.9 can be installed, they lack critical modules like > python3-dnf or python3-firewalld, which cannot be pip-installed > (sigh). > > A similar problem in Metal3: we use python3-mod_wsgi, but I guess we > can switch to something else in this case. I'm not sure I got it. Don't OpenStack already supports py38 officially? Based on my understanding of the above it is not the case. Cheers, gibi > > Dmitry > > [1] An upstream installation service for Ironic based on Ansible > > On Thu, Nov 25, 2021 at 7:19 PM Stephen Finucane > wrote: >> gmann has been helpfully proposing patches to change the versions of >> Python >> we're testing against in Yoga. 
I've suggested that we might want to >> bump >> 'python_requires' in 'setup.cfg' to indicate that we no longer >> support any >> version of Python before 3.8 [1]. As gmann has noted, doing so >> would mean nova >> would no longer be installable on Python 3.6 or 3.7 and there has >> been a small >> bit of back and forth on the pros and cons of this. I'm wondering >> what other >> people's thoughts on this are. Is this something we should be >> doing? Should we >> do it for libraries too or just services? Do we ever want to do >> this? Thoughts, >> please! >> >> Stephen >> >> [1] >> https://review.opendev.org/c/openstack/nova/+/819194/comment/72ecf24f_2bd292c4/ >> >> > > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, > Michael O'Neill From lyarwood at redhat.com Fri Nov 26 12:29:42 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Fri, 26 Nov 2021 12:29:42 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> Message-ID: <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: > On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann > wrote: > > > ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < > > marcin.juszkiewicz at linaro.org> wrote ---- > > > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > > > > gmann has been helpfully proposing patches to change the > > > > versions of Python we're testing against in Yoga. I've > > > > suggested that we might want to bump 'python_requires' in > > > > 'setup.cfg' to indicate that we no longer support any version > > > > of Python before 3.8 > > > > > > CentOS Stream 8 has Python 3.6 by default and RDO team is doing > > > CS8 -> CS9 migration during Yoga cycle. 
Can we postpone it to Z > > > when there will be no distribution with Py 3.6 to care about? Stupid question that I should know the answer to but does RDO really support RPM based installations anymore? IOW couldn't we just workaround this by providing CS8 py38 based containers during the upgrade? > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and > CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS > version upgrades in the past providing support for both releases in an > OpenStack version to ease the upgrade so I'd like to keep yoga working on > py3.6 included in CS8 and CS9. If this was the plan why wasn't it made clear to the TC before they dropped CS8 from the Yoga runtimes? Would it even be possible for the TC to add CS8 and py36 back in to the Yoga runtimes? > > Postponing to Z, you mean dropping the py3.6 tests or bumping it in > > in 'setup.cfg' so that no one can install on py3.6 ? > > > > First one we already did and as per Yoga testing runtime we are > > targeting centos9-stream[1] in Yoga itself. > > > > For making 'python_requires' >=py3.8 in 'setup.cfg', I have no > > string opinion on this but I prefer to have flexible here that 'yes > > OpenStack is installable in py3.6 but we do not test it anymore from > > Yoga onwards so no guarantee'. Our testing runtime main goal is > > that we document the version we are testing *at least* which means > > it can work on lower or higher versions too but we just do not test > > them. > > > > May it be possible to keep py3.6 jobs to make sure patches are not > introducing py3.8-only features that would break deployment in CS8? We should keep CS8 and py36 as supported runtimes if we are keeping the jobs, otherwise this just sets super confusing. > > Just for some background, we started adding 'python_requires' in> > > 'setup.cfg' when we dropped py2.7 and wanted a hard stop for > > anyone keep using py2.7 on OpenStack. 
But in python3 world we do > > not have to use it for every min version bump as such. > > > > [1] https://governance.openstack.org/tc/reference/runtimes/yoga.html -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From dtantsur at redhat.com Fri Nov 26 13:29:53 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 26 Nov 2021 14:29:53 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: On Fri, Nov 26, 2021 at 1:28 PM Balazs Gibizer wrote: > > > On Fri, Nov 26 2021 at 11:47:42 AM +0100, Dmitry Tantsur > wrote: > > Hi all, > > > > Note that this decision will force us to stop supporting Bifrost [1] > > on CentOS/RHEL completely, unless we find a workaround. While Python > > 3.8 and 3.9 can be installed, they lack critical modules like > > python3-dnf or python3-firewalld, which cannot be pip-installed > > (sigh). > > > > A similar problem in Metal3: we use python3-mod_wsgi, but I guess we > > can switch to something else in this case. > > I'm not sure I got it. Don't OpenStack already supports py38 > officially? Based on my understanding of the above it is not the case. > Now I'm confused as well :) OpenStack supports 3.8 and 3.9, CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. Some Python projects may be okay with it, but Ansible requires things that cannot be installed unless provided by OS packages (or built from source). Examples include python3-dnf, python3-libselinux, python3-firewall and presumably python3-mod_wsgi. Dmitry > > Cheers, > gibi > > > > > Dmitry > > > > [1] An upstream installation service for Ironic based on Ansible > > > > On Thu, Nov 25, 2021 at 7:19 PM Stephen Finucane > > wrote: > >> gmann has been helpfully proposing patches to change the versions of > >> Python > >> we're testing against in Yoga. 
I've suggested that we might want to > >> bump > >> 'python_requires' in 'setup.cfg' to indicate that we no longer > >> support any > >> version of Python before 3.8 [1]. As gmann has noted, doing so > >> would mean nova > >> would no longer be installable on Python 3.6 or 3.7 and there has > >> been a small > >> bit of back and forth on the pros and cons of this. I'm wondering > >> what other > >> people's thoughts on this are. Is this something we should be > >> doing? Should we > >> do it for libraries too or just services? Do we ever want to do > >> this? Thoughts, > >> please! > >> > >> Stephen > >> > >> [1] > >> > https://review.opendev.org/c/openstack/nova/+/819194/comment/72ecf24f_2bd292c4/ > >> > >> > > > > > > -- > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > > Commercial register: Amtsgericht Muenchen, HRB 153243, > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, > > Michael O'Neill > > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Fri Nov 26 14:25:25 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 26 Nov 2021 15:25:25 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: On Fri, 26 Nov 2021 at 14:31, Dmitry Tantsur wrote: > > > > On Fri, Nov 26, 2021 at 1:28 PM Balazs Gibizer wrote: >> >> >> >> On Fri, Nov 26 2021 at 11:47:42 AM +0100, Dmitry Tantsur >> wrote: >> > Hi all, >> > >> > Note that this decision will force us to stop supporting Bifrost [1] >> > on CentOS/RHEL completely, unless we find a workaround. 
While Python >> > 3.8 and 3.9 can be installed, they lack critical modules like >> > python3-dnf or python3-firewalld, which cannot be pip-installed >> > (sigh). >> > >> > A similar problem in Metal3: we use python3-mod_wsgi, but I guess we >> > can switch to something else in this case. >> >> I'm not sure I got it. Don't OpenStack already supports py38 >> officially? Based on my understanding of the above it is not the case. > > > Now I'm confused as well :) > > OpenStack supports 3.8 and 3.9, CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. Some Python projects may be okay with it, but Ansible requires things that cannot be installed unless provided by OS packages (or built from source). Examples include python3-dnf, python3-libselinux, python3-firewall and presumably python3-mod_wsgi. The question is: is it hard for RDO to provide these deps? If not, it might be the easiest solution. If yes, we (TC) might want to revisit this decision for this cycle. -yoctozepto From fungi at yuggoth.org Fri Nov 26 14:30:19 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 26 Nov 2021 14:30:19 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: [...] > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. [...] Is this still true for CentOS Stream 9? The TC decision was to support that instead of CentOS Stream 8 in Yoga. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From smooney at redhat.com Fri Nov 26 15:17:13 2021 From: smooney at redhat.com (Sean Mooney) Date: Fri, 26 Nov 2021 15:17:13 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> Message-ID: <7abdf66256a617a5c8df4fe75573a8aa1ef604cb.camel@redhat.com> On Fri, 2021-11-26 at 14:30 +0000, Jeremy Stanley wrote: > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: > [...] > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. > [...] > > Is this still true for CentOS Stream 9? The TC decision was to > support that instead of CentOS Stream 8 in Yoga. centos 9 will ship 3.9 only at least as of now and rhel 9 will not support 3.6 or 3.8 gaing only 3.9 at least initally it might get 3.10 as a non default at some time From dtantsur at redhat.com Fri Nov 26 15:20:39 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 26 Nov 2021 16:20:39 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> Message-ID: On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley wrote: > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: > [...] > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. > [...] > > Is this still true for CentOS Stream 9? The TC decision was to > support that instead of CentOS Stream 8 in Yoga. > No. But Stream 9 is pretty much beta, so it's not a replacement for us (and we don't have nodes in nodepool with it even yet?). 
Dmitry > -- > Jeremy Stanley > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Nov 26 15:29:46 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 26 Nov 2021 09:29:46 -0600 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> Message-ID: <17d5cdde95a.b95752d01381326.4462262438602968824@ghanshyammann.com> ---- On Fri, 26 Nov 2021 09:20:39 -0600 Dmitry Tantsur wrote ---- > > > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley wrote: > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: > [...] > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. > [...] > > Is this still true for CentOS Stream 9? The TC decision was to > support that instead of CentOS Stream 8 in Yoga. > > No. But Stream 9 is pretty much beta, so it's not a replacement for us (and we don't have nodes in nodepool with it even yet?). I think here is the confusion. In TC, after checking with centos team impression was CentOS stream 9 is released and that is what we should update In OpenStack testing. 
And then only we updated the centos stream 8 -> 9 and dropped py3.6 testing - https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst -gmann > Dmitry > -- > Jeremy Stanley > > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill > From gmann at ghanshyammann.com Fri Nov 26 15:37:44 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 26 Nov 2021 09:37:44 -0600 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> Message-ID: <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> ---- On Fri, 26 Nov 2021 06:29:42 -0600 Lee Yarwood wrote ---- > On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: > > On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann > > wrote: > > > > > ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < > > > marcin.juszkiewicz at linaro.org> wrote ---- > > > > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > > > > > gmann has been helpfully proposing patches to change the > > > > > versions of Python we're testing against in Yoga. I've > > > > > suggested that we might want to bump 'python_requires' in > > > > > 'setup.cfg' to indicate that we no longer support any version > > > > > of Python before 3.8 > > > > > > > > CentOS Stream 8 has Python 3.6 by default and RDO team is doing > > > > CS8 -> CS9 migration during Yoga cycle. Can we postpone it to Z > > > > when there will be no distribution with Py 3.6 to care about? > > Stupid question that I should know the answer to but does RDO really > support RPM based installations anymore? 
IOW couldn't we just workaround > this by providing CS8 py38 based containers during the upgrade? > > > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and > > CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS > > version upgrades in the past providing support for both releases in an > > OpenStack version to ease the upgrade so I'd like to keep yoga working on > > py3.6 included in CS8 and CS9. > > If this was the plan why wasn't it made clear to the TC before they > dropped CS8 from the Yoga runtimes? Would it even be possible for the TC > to add CS8 and py36 back in to the Yoga runtimes? > > > > Postponing to Z, you mean dropping the py3.6 tests or bumping it in > > > in 'setup.cfg' so that no one can install on py3.6 ? > > > > > > First one we already did and as per Yoga testing runtime we are > > > targeting centos9-stream[1] in Yoga itself. > > > > > > For making 'python_requires' >=py3.8 in 'setup.cfg', I have no > > > string opinion on this but I prefer to have flexible here that 'yes > > > OpenStack is installable in py3.6 but we do not test it anymore from > > > Yoga onwards so no guarantee'. Our testing runtime main goal is > > > that we document the version we are testing *at least* which means > > > it can work on lower or higher versions too but we just do not test > > > them. > > > > > > > May it be possible to keep py3.6 jobs to make sure patches are not > > introducing py3.8-only features that would break deployment in CS8? > > We should keep CS8 and py36 as supported runtimes if we are keeping the > jobs, otherwise this just sets super confusing. Yeah, I think it create confusion as I can see in this ML thread so agree on keeping 'python_requires' also in sycn with what we test. Now question on going back to centos stream 8 support in Yoga, is it not centos stream 9 is stable released or is it experimental only? If stable then we can keep the latest available version which can be centos stream 9. 
Our project interface testing doc clearly states that we consider the 'latest LTS' for testing[1] whenever we are ready. I am not very strongly against reverting back to CentOS Stream 8, but we should not add two versions of the same distro in testing, which can be a lot if we consider the three distros below: - Latest Ubuntu LTS - Latest CentOS Stream Major - Latest Debian Stable [1] https://governance.openstack.org/tc/reference/project-testing-interface.html#linux-distributions > > > > Just for some background, we started adding 'python_requires' in> > > > 'setup.cfg' when we dropped py2.7 and wanted a hard stop for > > > anyone keep using py2.7 on OpenStack. But in python3 world we do > > > not have to use it for every min version bump as such. > > > > > > [1] https://governance.openstack.org/tc/reference/runtimes/yoga.html > > -- > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 > From dtantsur at redhat.com Fri Nov 26 15:40:52 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 26 Nov 2021 16:40:52 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <17d5cdde95a.b95752d01381326.4462262438602968824@ghanshyammann.com> References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> <17d5cdde95a.b95752d01381326.4462262438602968824@ghanshyammann.com> Message-ID: On Fri, Nov 26, 2021 at 4:35 PM Ghanshyam Mann wrote: > ---- On Fri, 26 Nov 2021 09:20:39 -0600 Dmitry Tantsur < > dtantsur at redhat.com> wrote ---- > > > > > > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley > wrote: > > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: > > [...] > > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. > > [...] > > > > Is this still true for CentOS Stream 9? The TC decision was to > > support that instead of CentOS Stream 8 in Yoga. > > > > No. But Stream 9 is pretty much beta, so it's not a replacement for us > (and we don't have nodes in nodepool with it even yet?). > > I think here is the confusion.
In TC, after checking with centos team > impression was CentOS stream 9 is released and that is > what we should update In OpenStack testing. And then only we updated the > centos stream 8 -> 9 and dropped py3.6 testing > > - > https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst > I think there is an enormous perception gap between the CentOS team and the rest of the world. Dmitry > > > > -gmann > > > Dmitry > > -- > > Jeremy Stanley > > > > > > -- > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > > Commercial register: Amtsgericht Muenchen, HRB 153243, > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > O'Neill > > > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Nov 26 15:53:55 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 26 Nov 2021 15:53:55 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> Message-ID: <20211126155354.nfzcslxho47kf3cg@yuggoth.org> On 2021-11-26 16:20:39 +0100 (+0100), Dmitry Tantsur wrote: > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley wrote: > > > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: > > [...] > > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. > > [...] > > > > Is this still true for CentOS Stream 9? The TC decision was to > > support that instead of CentOS Stream 8 in Yoga. > > > > No. But Stream 9 is pretty much beta, so it's not a replacement for us (and > we don't have nodes in nodepool with it even yet?). 
It was added to our nodepool over 3 weeks ago when https://review.opendev.org/816465 merged, and I see it in-use per https://zuul.opendev.org/t/openstack/nodes right now. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From lyarwood at redhat.com Fri Nov 26 16:05:15 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Fri, 26 Nov 2021 16:05:15 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> Message-ID: <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> On 26-11-21 09:37:44, Ghanshyam Mann wrote: > ---- On Fri, 26 Nov 2021 06:29:42 -0600 Lee Yarwood wrote ---- > > On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: > > > On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann > > > wrote: > > > > > > > ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < > > > > marcin.juszkiewicz at linaro.org> wrote ---- > > > > > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > > > > > > gmann has been helpfully proposing patches to change the > > > > > > versions of Python we're testing against in Yoga. I've > > > > > > suggested that we might want to bump 'python_requires' in > > > > > > 'setup.cfg' to indicate that we no longer support any version > > > > > > of Python before 3.8 > > > > > > > > > > CentOS Stream 8 has Python 3.6 by default and RDO team is doing > > > > > CS8 -> CS9 migration during Yoga cycle. Can we postpone it to Z > > > > > when there will be no distribution with Py 3.6 to care about? 
> > > > Stupid question that I should know the answer to but does RDO really > > support RPM based installations anymore? IOW couldn't we just workaround > > this by providing CS8 py38 based containers during the upgrade? > > > > > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and > > > CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS > > > version upgrades in the past providing support for both releases in an > > > OpenStack version to ease the upgrade so I'd like to keep yoga working on > > > py3.6 included in CS8 and CS9. > > > > If this was the plan why wasn't it made clear to the TC before they > > dropped CS8 from the Yoga runtimes? Would it even be possible for the TC > > to add CS8 and py36 back in to the Yoga runtimes? > > > > > > Postponing to Z, you mean dropping the py3.6 tests or bumping it in > > > > in 'setup.cfg' so that no one can install on py3.6 ? > > > > > > > > First one we already did and as per Yoga testing runtime we are > > > > targeting centos9-stream[1] in Yoga itself. > > > > > > > > For making 'python_requires' >=py3.8 in 'setup.cfg', I have no > > > > string opinion on this but I prefer to have flexible here that 'yes > > > > OpenStack is installable in py3.6 but we do not test it anymore from > > > > Yoga onwards so no guarantee'. Our testing runtime main goal is > > > > that we document the version we are testing *at least* which means > > > > it can work on lower or higher versions too but we just do not test > > > > them. > > > > > > > > > > May it be possible to keep py3.6 jobs to make sure patches are not > > > introducing py3.8-only features that would break deployment in CS8? > > > > We should keep CS8 and py36 as supported runtimes if we are keeping the > > jobs, otherwise this just sets super confusing. > > Yeah, I think it create confusion as I can see in this ML thread so > agree on keeping 'python_requires' also in sycn with what we test. Cool thanks! 
> Now question on going back to centos stream 8 support in Yoga, is it > not centos stream 9 is stable released or is it experimental only? If > stable then we can keep the latest available version which can be > centos stream 9. I honestly don't know and can't find any docs to point to. > Our project interface testing doc clearly stats 'latest LTS' to > consider for testing[1] whenever we are ready. I am not very strongly > against of reverting back to centos stream 8 but we should not add two > version of same distro in testing which can be a lot of we consider > below three distro How do we expect operators to upgrade between Xena where CentOS 8 stream is a supported runtime and Yoga where CentOS 9 stream is currently the equivalent supported runtime without supporting both for a single release? I appreciate it bloats the support matrix a little but the rest of the thread suggests we need to keep py36 around for now anyway. Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From gmann at ghanshyammann.com Fri Nov 26 16:18:16 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 26 Nov 2021 10:18:16 -0600 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> Message-ID: <17d5d0a4f3f.dc8bb48c1384810.8642013929754348916@ghanshyammann.com> ---- On Fri, 26 Nov 2021 10:05:15 -0600 Lee Yarwood wrote ---- > On 26-11-21 09:37:44, Ghanshyam Mann wrote: > > ---- On Fri, 26 Nov 2021 06:29:42 -0600 Lee Yarwood wrote ---- > > > On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: > > > > On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann > > > > wrote: > > > > > > > > > ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < > > > > > marcin.juszkiewicz at linaro.org> wrote ---- > > > > > > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > > > > > > > gmann has been helpfully proposing patches to change the > > > > > > > versions of Python we're testing against in Yoga. I've > > > > > > > suggested that we might want to bump 'python_requires' in > > > > > > > 'setup.cfg' to indicate that we no longer support any version > > > > > > > of Python before 3.8 > > > > > > > > > > > > CentOS Stream 8 has Python 3.6 by default and RDO team is doing > > > > > > CS8 -> CS9 migration during Yoga cycle. Can we postpone it to Z > > > > > > when there will be no distribution with Py 3.6 to care about? > > > > > > Stupid question that I should know the answer to but does RDO really > > > support RPM based installations anymore? 
IOW couldn't we just workaround > > > this by providing CS8 py38 based containers during the upgrade? > > > > > > > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and > > > > CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS > > > > version upgrades in the past providing support for both releases in an > > > > OpenStack version to ease the upgrade so I'd like to keep yoga working on > > > > py3.6 included in CS8 and CS9. > > > > > > If this was the plan why wasn't it made clear to the TC before they > > > dropped CS8 from the Yoga runtimes? Would it even be possible for the TC > > > to add CS8 and py36 back in to the Yoga runtimes? > > > > > > > > Postponing to Z, you mean dropping the py3.6 tests or bumping it in > > > > > in 'setup.cfg' so that no one can install on py3.6 ? > > > > > > > > > > First one we already did and as per Yoga testing runtime we are > > > > > targeting centos9-stream[1] in Yoga itself. > > > > > > > > > > For making 'python_requires' >=py3.8 in 'setup.cfg', I have no > > > > > string opinion on this but I prefer to have flexible here that 'yes > > > > > OpenStack is installable in py3.6 but we do not test it anymore from > > > > > Yoga onwards so no guarantee'. Our testing runtime main goal is > > > > > that we document the version we are testing *at least* which means > > > > > it can work on lower or higher versions too but we just do not test > > > > > them. > > > > > > > > > > > > > May it be possible to keep py3.6 jobs to make sure patches are not > > > > introducing py3.8-only features that would break deployment in CS8? > > > > > > We should keep CS8 and py36 as supported runtimes if we are keeping the > > > jobs, otherwise this just sets super confusing. > > > > Yeah, I think it create confusion as I can see in this ML thread so > > agree on keeping 'python_requires' also in sycn with what we test. > > Cool thanks! 
> > Now question on going back to centos stream 8 support in Yoga, is it > > not centos stream 9 is stable released or is it experimental only? If > > stable then we can keep the latest available version which can be > > centos stream 9. > > I honestly don't know and can't find any docs to point to. > > > Our project interface testing doc clearly stats 'latest LTS' to > > consider for testing[1] whenever we are ready. I am not very strongly > > against of reverting back to centos stream 8 but we should not add two > > version of same distro in testing which can be a lot of we consider > > below three distro > > How do we expect operators to upgrade between Xena where CentOS 8 stream > > is a supported runtime and Yoga where CentOS 9 stream is currently the > > equivalent supported runtime without supporting both for a single > > release? This is a really good question about the upgrade testing we do upstream, and I remember it came up and was discussed a lot during the py2.7 drop as well: how do we test the upgrade from py2.7 to py3? Can we do it in grenade? Our answer then was that we did not test it directly, but Stein and Train tested both versions, so there should not be any issue if you upgrade from there (one of the FAQs in my blog[1]). But on distro upgrade testing, as you know, we do not test that upstream, not even in grenade, where upgrades are done on the old node's distro only, not from the old distro version to the new distro version with new code. It is not that we do not want to test it, but if anyone from any distro would like to set up grenade for that and maintain it, we would be more than happy. In summary, yes, we cannot guarantee distro upgrade testing in OpenStack upstream testing due to resource bandwidth, but we welcome any help here. [1] https://superuser.openstack.org/articles/openstack-ussuri-is-python3-only-upgrade-impact/ -gmann > > I appreciate it bloats the support matrix a little but the rest of the > thread suggests we need to keep py36 around for now anyway.
> > Cheers, > > -- > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 > From dtantsur at redhat.com Fri Nov 26 16:18:51 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 26 Nov 2021 17:18:51 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211126155354.nfzcslxho47kf3cg@yuggoth.org> References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> <20211126155354.nfzcslxho47kf3cg@yuggoth.org> Message-ID: On Fri, Nov 26, 2021 at 4:59 PM Jeremy Stanley wrote: > On 2021-11-26 16:20:39 +0100 (+0100), Dmitry Tantsur wrote: > > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley > wrote: > > > > > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: > > > [...] > > > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. > > > [...] > > > > > > Is this still true for CentOS Stream 9? The TC decision was to > > > support that instead of CentOS Stream 8 in Yoga. > > > > > > > No. But Stream 9 is pretty much beta, so it's not a replacement for us > (and > > we don't have nodes in nodepool with it even yet?). > > It was added to our nodepool over 3 weeks ago when > https://review.opendev.org/816465 merged, and I see it in-use per > https://zuul.opendev.org/t/openstack/nodes right now. > Maybe I misunderstand something, but I cannot find any stream-9 nodes in use now, and https://review.opendev.org/c/openstack/bifrost/+/819058/ failed with "The nodeset "centos-9-stream" was not found" just 2 days ago. Dmitry > -- > Jeremy Stanley > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fungi at yuggoth.org Fri Nov 26 16:20:13 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 26 Nov 2021 16:20:13 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> Message-ID: <20211126162013.fpwlss5h7rz2xzar@yuggoth.org> On 2021-11-26 16:05:15 +0000 (+0000), Lee Yarwood wrote: [...] > How do we expect operators to upgrade between Xena where CentOS 8 stream > is a supported runtime and Yoga where CentOS 9 stream is currently the > equivalent supported runtime without supporting both for a single > release? > > I appreciate it bloats the support matrix a little but the rest of the > thread suggests we need to keep py36 around for now anyway. Somehow we manage to get by with testing only one Ubuntu version per OpenStack release. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Fri Nov 26 16:36:00 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 26 Nov 2021 16:36:00 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> <20211126155354.nfzcslxho47kf3cg@yuggoth.org> Message-ID: <20211126163600.d3ndwz7k6qivlxqr@yuggoth.org> On 2021-11-26 17:18:51 +0100 (+0100), Dmitry Tantsur wrote: [...] > Maybe I misunderstand something, but I cannot find any stream-9 > nodes in use now, Ahh, yeah whether they appear in the nodes list will depend on if any builds are in progress for them. 
Today there's not a lot going on, so I likely just got lucky when I looked earlier. > and https://review.opendev.org/c/openstack/bifrost/+/819058/ > failed with "The nodeset "centos-9-stream" was not found" just 2 > days ago. https://review.opendev.org/793462 is an example of a change which was passing builds on centos-9-stream nodes weeks ago (albeit non-voting). There is currently no nodeset named centos-9-stream, but there is a node label. Adding the nodeset definition soon would probably be good to avoid further confusion. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dtantsur at redhat.com Fri Nov 26 16:45:30 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 26 Nov 2021 17:45:30 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211126162013.fpwlss5h7rz2xzar@yuggoth.org> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <20211126162013.fpwlss5h7rz2xzar@yuggoth.org> Message-ID: On Fri, Nov 26, 2021 at 5:26 PM Jeremy Stanley wrote: > On 2021-11-26 16:05:15 +0000 (+0000), Lee Yarwood wrote: > [...] > > How do we expect operators to upgrade between Xena where CentOS 8 stream > > is a supported runtime and Yoga where CentOS 9 stream is currently the > > equivalent supported runtime without supporting both for a single > > release? > > > > I appreciate it bloats the support matrix a little but the rest of the > > thread suggests we need to keep py36 around for now anyway. > > Somehow we manage to get by with testing only one Ubuntu version per > OpenStack release. > I'm quite sure there is always an overlap, otherwise grenade would not work. 
Dmitry > -- > Jeremy Stanley > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Nov 26 17:04:07 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 26 Nov 2021 17:04:07 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <20211126162013.fpwlss5h7rz2xzar@yuggoth.org> Message-ID: <20211126170407.zjcr34drutafycpp@yuggoth.org> On 2021-11-26 17:45:30 +0100 (+0100), Dmitry Tantsur wrote: > On Fri, Nov 26, 2021 at 5:26 PM Jeremy Stanley wrote: [...] > > Somehow we manage to get by with testing only one Ubuntu version > > per OpenStack release. > > I'm quite sure there is always an overlap, otherwise grenade would > not work. Good point, I hadn't considered grenade. While we don't expressly mention the platform used for upgrade tests as part of the PTI, you are correct that we deploy release N-1 on the default node type for release N-1, then upgrade it in-place to release N and run a battery of tests against it. Technically we do test the new release on the old platform (at least for projects which tests their changes or are otherwise exercised by grenade), even though the PTI says we can drop support for the Python version on that platform. If someone wants to work on running grenade on centos-8-stream nodes for yoga, we could ensure that's still a viable upgrade path the same way. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From amoralej at redhat.com Fri Nov 26 17:14:10 2021 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Fri, 26 Nov 2021 18:14:10 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> <17d5cdde95a.b95752d01381326.4462262438602968824@ghanshyammann.com> Message-ID: On Fri, Nov 26, 2021 at 4:44 PM Dmitry Tantsur wrote: > > > On Fri, Nov 26, 2021 at 4:35 PM Ghanshyam Mann > wrote: > >> ---- On Fri, 26 Nov 2021 09:20:39 -0600 Dmitry Tantsur < >> dtantsur at redhat.com> wrote ---- >> > >> > >> > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley >> wrote: >> > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: >> > [...] >> > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. >> > [...] >> > >> > Is this still true for CentOS Stream 9? The TC decision was to >> > support that instead of CentOS Stream 8 in Yoga. >> > >> > No. But Stream 9 is pretty much beta, so it's not a replacement for us >> (and we don't have nodes in nodepool with it even yet?). >> >> I think here is the confusion. In TC, after checking with centos team >> impression was CentOS stream 9 is released and that is >> what we should update In OpenStack testing. And then only we updated the >> centos stream 8 -> 9 and dropped py3.6 testing >> >> - >> https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst >> > > I think there is an enormous perception gap between the CentOS team and > the rest of the world. > > So, CentOS Stream 9 was released, in the official mirrors and usable since some weeks ago[1][2]. We shouldn't consider it beta or something like that. As mentioned, support for diskimage-builder has been introduced for CS9 and there are nodepool nodes ready for it. 
From RDO, we are providing RPMs for master branch content on CentOS Stream 9 [3], and we have been doing some tests. In fact, we have recently merged new jobs in puppet-openstack[4]. [1] https://www.centos.org/centos-stream/ [2] https://cloud.centos.org/centos/9-stream/x86_64/images/ [3] https://trunk.rdoproject.org/centos9-master/report.html [4] https://review.opendev.org/c/openstack/puppet-openstack-integration/+/793462 Alfredo > Dmitry > > >> >> >> >> -gmann >> >> > Dmitry >> > -- >> > Jeremy Stanley >> > >> > >> > -- >> > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >> > Commercial register: Amtsgericht Muenchen, HRB 153243, >> > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, >> Michael O'Neill >> > >> >> > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > O'Neill > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryan.bannon at gmail.com Fri Nov 26 17:17:58 2021 From: ryan.bannon at gmail.com (Ryan Bannon) Date: Fri, 26 Nov 2021 12:17:58 -0500 Subject: Project-scoped app creds - Best practice In-Reply-To: References: Message-ID: Hi all, Rafael, thanks for the notes! That's a great initiative. Although it looks like it has stalled in the review phase...? (I'm new to interpreting the development workflow for OpenStack.) To all: does anybody else have input on how they solved this issue? Tx, Ryan On Thu, Nov 25, 2021 at 6:21 PM Rafael Weingärtner < rafaelweingartner at gmail.com> wrote: > Hello Ryan, > We actually faced a similar situation and we extended Keystone to support > the concept of Project bound credentials, which means, credentials that are > owned by a project and not by a user. Therefore, the credentials are shared > by all users of a project.
> > The spec is the following: > https://review.opendev.org/c/openstack/keystone-specs/+/766725 > > We have it already running in PROD for over 6 months now, and it is also > integrated with RadosGW<>Keystone authentication. > > On Thu, Nov 25, 2021 at 7:53 PM Ryan Bannon wrote: > >> Hello all, >> >> Relatively new to OpenStack. >> >> To my understanding, application credentials are bound to users. Is there >> a way to bind them to Projects (I assume not) or, perhaps, Groups? My naive >> thought on a possible solution is that if a group has access to a Project, >> a "generic" user account that everybody has access to could be used for the >> application credentials. (The use case here is to not bind an app cred to >> an individual who might leave the organization, thus making the app cred >> secret lost.) >> >> Thanks, >> >> Ryan >> > > > -- > Rafael Weing?rtner > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Nov 26 17:24:59 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 26 Nov 2021 11:24:59 -0600 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <17d5d0a4f3f.dc8bb48c1384810.8642013929754348916@ghanshyammann.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <17d5d0a4f3f.dc8bb48c1384810.8642013929754348916@ghanshyammann.com> Message-ID: <17d5d476490.e30e73db1388640.4406614125429012751@ghanshyammann.com> ---- On Fri, 26 Nov 2021 10:18:16 -0600 Ghanshyam Mann wrote ---- > ---- On Fri, 26 Nov 2021 10:05:15 -0600 Lee Yarwood wrote ---- > > On 26-11-21 09:37:44, Ghanshyam Mann wrote: > > > ---- On Fri, 26 Nov 2021 06:29:42 -0600 Lee Yarwood wrote ---- > > > > On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: > > > > > On Thu, Nov 25, 
2021 at 10:23 PM Ghanshyam Mann > > > > > wrote: > > > > > > > > > > > ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < > > > > > > marcin.juszkiewicz at linaro.org> wrote ---- > > > > > > > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > > > > > > > > gmann has been helpfully proposing patches to change the > > > > > > > > versions of Python we're testing against in Yoga. I've > > > > > > > > suggested that we might want to bump 'python_requires' in > > > > > > > > 'setup.cfg' to indicate that we no longer support any version > > > > > > > > of Python before 3.8 > > > > > > > > > > > > > > CentOS Stream 8 has Python 3.6 by default and RDO team is doing > > > > > > > CS8 -> CS9 migration during Yoga cycle. Can we postpone it to Z > > > > > > > when there will be no distribution with Py 3.6 to care about? > > > > > > > > Stupid question that I should know the answer to but does RDO really > > > > support RPM based installations anymore? IOW couldn't we just workaround > > > > this by providing CS8 py38 based containers during the upgrade? > > > > > > > > > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and > > > > > CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS > > > > > version upgrades in the past providing support for both releases in an > > > > > OpenStack version to ease the upgrade so I'd like to keep yoga working on > > > > > py3.6 included in CS8 and CS9. > > > > > > > > If this was the plan why wasn't it made clear to the TC before they > > > > dropped CS8 from the Yoga runtimes? Would it even be possible for the TC > > > > to add CS8 and py36 back in to the Yoga runtimes? > > > > > > > > > > Postponing to Z, you mean dropping the py3.6 tests or bumping it in > > > > > > in 'setup.cfg' so that no one can install on py3.6 ? > > > > > > > > > > > > First one we already did and as per Yoga testing runtime we are > > > > > > targeting centos9-stream[1] in Yoga itself. 
> > > > > > > > > > > > For making 'python_requires' >=py3.8 in 'setup.cfg', I have no > > > > > > string opinion on this but I prefer to have flexible here that 'yes > > > > > > OpenStack is installable in py3.6 but we do not test it anymore from > > > > > > Yoga onwards so no guarantee'. Our testing runtime main goal is > > > > > > that we document the version we are testing *at least* which means > > > > > > it can work on lower or higher versions too but we just do not test > > > > > > them. > > > > > > > > > > > > > > > > May it be possible to keep py3.6 jobs to make sure patches are not > > > > > introducing py3.8-only features that would break deployment in CS8? > > > > > > > > We should keep CS8 and py36 as supported runtimes if we are keeping the > > > > jobs, otherwise this just sets super confusing. > > > > > > Yeah, I think it create confusion as I can see in this ML thread so > > > agree on keeping 'python_requires' also in sycn with what we test. > > > > Cool thanks! > > > > > Now question on going back to centos stream 8 support in Yoga, is it > > > not centos stream 9 is stable released or is it experimental only? If > > > stable then we can keep the latest available version which can be > > > centos stream 9. > > > > I honestly don't know and can't find any docs to point to. > > > > > Our project interface testing doc clearly stats 'latest LTS' to > > > consider for testing[1] whenever we are ready. I am not very strongly > > > against of reverting back to centos stream 8 but we should not add two > > > version of same distro in testing which can be a lot of we consider > > > below three distro > > > > How do we expect operators to upgrade between Xena where CentOS 8 stream > > is a supported runtime and Yoga where CentOS 9 stream is currently the > > equivalent supported runtime without supporting both for a single > > release? 
> > This is really good question on upgrade testing we do at upstream and I remember > it cameup and discussed a lot during py2.7 drop also that how we are testing the > upgrade from py2.7 to py3. Can we do in grenade? But that we answered as we did > not tested directly but stein and train tested both version so should not be any issue > if you upgrade from there (one of FAQ in my blog[1]). > > But on distro upgrade testing, as you know we do not test those in upstream neither > in grenade where upgrade are done on old node distro only not from old distro version to > new distro version with new code. It is not like we do not want to test but if anyone > from any distro would like to setup grenade for that and maintain then we are more happy. > In summary, yes we cannot guarantee distro upgrade testing from OpenStack upstream testing > due to resource bandwidth issue but we will welcome any help here. We discussed with amoralej in the TC IRC channel[1] whether or not to move the testing runtime back to CentOS Stream 8 and py36. As we do not test two versions of a distro in the same release upstream, amoralej agreed to keep CentOS Stream 9 if one has to be chosen, which is what our current testing runtime is. So there is no change in the direction of the current testing runtime and dropping py3.6, but there is the possibility of some trade-off here. If any py3.6-breaking changes happen, it is up to the project's goodwill, bandwidth, or flexibility whether to accept a fix or even add a py36 unit test job. As our testing runtime is the minimum set of things to test and does not put any maximum limit on testing, any project can extend its testing as per its bandwidth. In summary (this is what we agreed today in the TC channel, but as most of the folks are on leave today, I will keep it open until next week to see if there are any objections from the community and will conclude accordingly): * No change in the Yoga testing runtime; we move to cs9 and drop py36.
* We will not put a hard stop on cs8 support, and we can: ** Devstack keeps supporting cs8 in Yoga ** It can be negotiated with projects to add a py36 job or a fix if any py36-breaking changes are observed by RDO (or any distro interested in py36), but it depends on each project's decision and bandwidth. As a next step, we will explore how we can improve upgrade testing across distro versions and see what we can test to make upgrades easier. [1] https://meetings.opendev.org/irclogs/%23openstack-tc/%23openstack-tc.2021-11-26.log.html#t2021-11-26T16:04:59 -gmann > > > [1] https://superuser.openstack.org/articles/openstack-ussuri-is-python3-only-upgrade-impact/ > > -gmann > > > > > I appreciate it bloats the support matrix a little but the rest of the > > thread suggests we need to keep py36 around for now anyway. > > > > Cheers, > > > > -- > > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 > > > > From amoralej at redhat.com Fri Nov 26 17:38:00 2021 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Fri, 26 Nov 2021 18:38:00 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: On Fri, Nov 26, 2021 at 3:29 PM Radosław Piliszek < radoslaw.piliszek at gmail.com> wrote: > On Fri, 26 Nov 2021 at 14:31, Dmitry Tantsur wrote: > > > > > > > > On Fri, Nov 26, 2021 at 1:28 PM Balazs Gibizer > wrote: > >> > >> > >> > >> On Fri, Nov 26 2021 at 11:47:42 AM +0100, Dmitry Tantsur > >> wrote: > >> > Hi all, > >> > > >> > Note that this decision will force us to stop supporting Bifrost [1] > >> > on CentOS/RHEL completely, unless we find a workaround. While Python > >> > 3.8 and 3.9 can be installed, they lack critical modules like > >> > python3-dnf or python3-firewalld, which cannot be pip-installed > >> > (sigh). > >> > > >> > A similar problem in Metal3: we use python3-mod_wsgi, but I guess we > >> > can switch to something else in this case. > >> > >> I'm not sure I got it. 
Don't OpenStack already supports py38 > >> officially? Based on my understanding of the above it is not the case. > > > > > > Now I'm confused as well :) > > > > OpenStack supports 3.8 and 3.9, CentOS/RHEL ships 3.6 and a limited > version of 3.8 and 3.9. Some Python projects may be okay with it, but > Ansible requires things that cannot be installed unless provided by OS > packages (or built from source). Examples include python3-dnf, > python3-libselinux, python3-firewall and presumably python3-mod_wsgi. > > The question is: is it hard for RDO to provide these deps? > If not, it might be the easiest solution. > If yes, we (TC) might want to revisit this decision for this cycle. > > Yes, it's hard to provide those deps. The gap is big, in CS8 there are 281 python3- (py3.6) packages vs 36 python39 ones. Some of them are built with non-Python packages, such as libselinux, firewalld, libvirt, etc., and would require forking and fixing packages for co-installability. Also, there are no guarantees that all those versions of packages will work on py39 in CS8. Alfredo > > -yoctozepto > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adivya1.singh at gmail.com Fri Nov 26 17:46:40 2021 From: adivya1.singh at gmail.com (Adivya Singh) Date: Fri, 26 Nov 2021 23:16:40 +0530 Subject: Regarding Drop down of frequently used stacks in OpenStack Wallaby Version 23.1.2 Message-ID: Dear Team, I want a drop-down of frequently used stacks in my Orchestration --> Launch Stack parameter in the OpenStack version given above. Is there any solution for this? How should I go ahead with it? Regards Adivya Singh -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rafaelweingartner at gmail.com Fri Nov 26 17:50:11 2021 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Fri, 26 Nov 2021 14:50:11 -0300 Subject: Project-scoped app creds - Best practice In-Reply-To: References: Message-ID: Ryan, That happened with Keystone for some time (I have 3 or more specs that are stuck, even though they were discussed in-depth), the reviews were becoming quite massive (for me, for the specs I proposed) without getting anywhere, and maybe that is a consequence of PTLs/core reviewers being focused on somewhere else, or not having enough time to execute the paperwork of a new feature being proposed, evaluated, and accepted. I know that this is also my fault because I could have volunteered to become core review/PTL to try to add more work on Keystone, but I did not have enough time to dedicate to Keystone. This caused the specs to get stuck (and their patch). That is also why, for this latest spec I created, I did not publish the patch. For the other specs, I proposed the patch, they (the specs) were more or less accepted (we had a consensus), but never got merged, which caused the patches only to get conflicts, and not move forward, which made me a bit disappointed (after having fixed conflicts countless times). One of my new year's resolutions is to dedicate more time to Keystone next year. Or, at least, as much as I dedicate to CloudKitty to see if we can get these moving on. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From radoslaw.piliszek at gmail.com Fri Nov 26 17:56:50 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 26 Nov 2021 18:56:50 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: Message-ID: On Fri, 26 Nov 2021 at 18:38, Alfredo Moralejo Alonso wrote: > > > > On Fri, Nov 26, 2021 at 3:29 PM Rados?aw Piliszek wrote: >> >> On Fri, 26 Nov 2021 at 14:31, Dmitry Tantsur wrote: >> > >> > >> > >> > On Fri, Nov 26, 2021 at 1:28 PM Balazs Gibizer wrote: >> >> >> >> >> >> >> >> On Fri, Nov 26 2021 at 11:47:42 AM +0100, Dmitry Tantsur >> >> wrote: >> >> > Hi all, >> >> > >> >> > Note that this decision will force us to stop supporting Bifrost [1] >> >> > on CentOS/RHEL completely, unless we find a workaround. While Python >> >> > 3.8 and 3.9 can be installed, they lack critical modules like >> >> > python3-dnf or python3-firewalld, which cannot be pip-installed >> >> > (sigh). >> >> > >> >> > A similar problem in Metal3: we use python3-mod_wsgi, but I guess we >> >> > can switch to something else in this case. >> >> >> >> I'm not sure I got it. Don't OpenStack already supports py38 >> >> officially? Based on my understanding of the above it is not the case. >> > >> > >> > Now I'm confused as well :) >> > >> > OpenStack supports 3.8 and 3.9, CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. Some Python projects may be okay with it, but Ansible requires things that cannot be installed unless provided by OS packages (or built from source). Examples include python3-dnf, python3-libselinux, python3-firewall and presumably python3-mod_wsgi. >> >> The question is: is it hard for RDO to provide these deps? >> If not, it might be the easiest solution. >> If yes, we (TC) might want to revisit this decision for this cycle. >> > > Yes, it's hard to provide those deps. The gap is big, in CS8 there are 281 python3- (py3.6) packages vs 36 python39 ones. 
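For readers skimming the thread: the 'python_requires' setting being debated is a single metadata line in each project's setup.cfg, which pip turns into an install-time interpreter check. A hypothetical fragment (the project name and bound are illustrative only, not taken from any real project):

```ini
[metadata]
name = example-project

[options]
# ">=3.8" makes pip on a py3.6 interpreter (e.g. on CentOS Stream 8)
# refuse to install the package, even if the code might still run there.
# Leaving it at ">=3.6" keeps CS8 installable but untested from Yoga on.
python_requires = >=3.8
```

This is the trade-off at the heart of the thread: the value documents what is tested, but it also hard-blocks installation on older interpreters.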
> > Some of them are built with non-Python packages, such as libselinux, firewalld, libvirt, etc., and would require forking and fixing packages for co-installability. Also, there are no guarantees that all those versions of packages will work on py39 in CS8. Thanks, Alfredo, for shedding more light on it. I agree with the conclusions made by you and Ghanshyam (in the other branch of this very thread). -yoctozepto From michal.arbet at ultimum.io Thu Nov 25 09:12:57 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Thu, 25 Nov 2021 10:12:57 +0100 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: <8897921.lOV4Wx5bFT@p1> References: <8897921.lOV4Wx5bFT@p1> Message-ID: Hi, You can find logs from controller0 and compute0 in the attachment (other controllers and computes were turned off for this test). Thank you, Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook On Thu, 25 Nov 2021 at 8:22, Slawek Kaplonski wrote: > Hi, > > Basically in the ML2/OVS case it may be one of 2 reasons why a port isn't > provisioned properly quickly: > - neutron-ovs-agent is somehow slow with provisioning it, or > - neutron-dhcp-agent is slow provisioning that port. > > To check which of those is really happening, You can enable debug logs in Your > neutron-server and look there for logs like "Port xxx provisioning > completed > by entity L2/DHCP" (or something similar, I don't remember it now exactly). > > If it works much faster with the noop firewall driver, then it seems that it > is > more likely to be on the neutron-ovs-agent's side. > In such a case, a couple of things to check: > - are You using l2population (it's required with DVR for example), > - are You using SG with rules which reference "remote_group_id" (like > the default > SG for each tenant does)? 
If so, can You try to remove from You SG such > rules > and use rules with CIDRs instead? We know that using SG with > remote_group_id > don't scale well and if You have many ports using same SG, it may slow > down > neutron-ovs-agent a lot. > - do You maybe have any other errors in the neutron-ovs-agent logs? Like > rpc > message communication errors or something else? Such errors will trigger > doing > fullsync of all ports on the node so it may take long time to get to > actually > provisioning Your new port sometimes. > - what exactly version of Neutron are You using there? > > On sobota, 20 listopada 2021 11:05:16 CET Michal Arbet wrote: > > Hi, > > > > Has anyone seen issue which I am currently facing ? > > > > When launching heat stack ( but it's same if I launch several of > instances > > ) vif plugged in timeouts an I don't know why, sometimes it is OK > > ..sometimes is failing. > > > > Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes > > it's 100 and more seconds, it seems there is some race condition but I > > can't find out where the problem is. But on the end every instance is > > spawned ok (retry mechanism worked). > > > > Another finding is that it has to do something with security group, if > noop > > driver is used ..everything is working good. > > > > Firewall security setup is openvswitch . > > > > Test env is wallaby. > > > > I will attach some logs when I will be near PC .. > > > > Thank you, > > Michal Arbet (Kevko) > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: vif_plugged_timeout.tar.gz Type: application/gzip Size: 722402 bytes Desc: not available URL: From michal.arbet at ultimum.io Thu Nov 25 09:14:28 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Thu, 25 Nov 2021 10:14:28 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: Message-ID: Hello, In attachment you can find logs for compute0 and controller0 (other computes and controllers were turned off for this test). No OVN used, this stack is based on OVS. Thank you, Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Po???? 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook st 24. 11. 2021 v 15:59 odes?latel Rodolfo Alonso Hernandez < ralonsoh at redhat.com> napsal: > Hello Tony: > > Do you have the Neutron server logs? Do you see the > "PortBindingChassisUpdateEvent"? When a port is bound, a Port_Bound (SB) > event is issued and captured in the Neutron server. That will trigger the > port binding process and the "vif-plugged" event. This OVN SB event should > call "set_port_status_up" and that should write "OVN reports status up for > port: %s" in the logs. > > Of course, this Neutron method will be called only if the logical switch > port is UP. > > Regards. > > > On Tue, Nov 23, 2021 at 11:59 PM Tony Liu wrote: > >> Hi, >> >> I see such problem from time to time. It's not consistently reproduceable. >> ====================== >> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager >> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: >> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state the instance >> has a pending task (spawning). Skip. 
>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver >> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 3a4c320d64664d9cb6784b7ea52d618a >> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: >> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for >> [('network-vif-plugged', '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for >> instance with vm_state building and task_state spawning.: >> eventlet.timeout.Timeout: 300 seconds >> ====================== >> The VIF/port is activated by OVN ovn-controller to ovn-sb. It seems that, >> either Neutron didn't >> capture the update or didn't send message back to nova-compute. Is there >> any known fix for >> this problem? >> >> >> Thanks! >> Tony >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: vif_plugged_timeout.tar.gz Type: application/gzip Size: 722402 bytes Desc: not available URL: From oleg.bondarev at huawei.com Thu Nov 25 08:57:36 2021 From: oleg.bondarev at huawei.com (Oleg Bondarev) Date: Thu, 25 Nov 2021 08:57:36 +0000 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> Message-ID: Hello, A few thoughts from my side in scope of the brainstorm: 1) Recheck actual bugs ('recheck bug 123456') - not a new idea to better keep track of all failures - force a developer to investigate the reason of each CI failure and increase corresponding bug rating, or file a new bug (or go and fix this bug finally!) - this implies having some gate failure bugs dashboard with hottest bugs on top - simple 'recheck' could be forbidden, at least during a 'crisis management' window 2) Allow recheck TIMEOUT/POST_FAILURE jobs - while I agree that re-run particular jobs is evil, TIMEOUT/POST_FAILURE are not related to the patch in majority of cases - performance issues are usually caught by Rally jobs - of course core team should monitor if timeouts become a rule for some jobs 3) Ability to block rechecks in some cases, like known gate blocker - not everyone is always aware that gates are blocked with some issue - PTL (or any core team member) can turn off rechecks during that time (with a message from Zuul) - happens not often but still can save some CI resources Thanks, Oleg --- Advanced Software Technology Lab Huawei From: Rodolfo Alonso Hernandez [mailto:ralonsoh at redhat.com] Sent: Monday, November 22, 2021 11:54 AM To: Ronelle Landy Cc: Balazs Gibizer ; Slawek Kaplonski ; openstack-discuss ; Oleg Bondarev ; lajos.katona at ericsson.com; Bernard Cafarelli ; Miguel Lavalle Subject: Re: [neutron][CI] How to reduce number of rechecks - brainstorming Hello: I think the last idea Ronelle presented (a skiplist) could be feasible in Neutron. 
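To make the skiplist idea concrete: it amounts to keeping a list of known-flaky test patterns, each tied to a tracked bug, and filtering those tests out of the gate run until the bug is fixed. A toy sketch (the test names and bug numbers below are invented for illustration, not real Neutron tests):

```python
import re

# Hypothetical skiplist: regexes of known-flaky tests, each tied to a bug.
SKIPLIST = {
    r"test_router_.*_failover": "lp#1234567",
    r"test_trunk_subport_lifecycle": "lp#2345678",
}

def filter_tests(tests, skiplist):
    """Partition tests into (to_run, skipped-with-reason) using the skiplist."""
    to_run, skipped = [], []
    for test in tests:
        reason = next((bug for pattern, bug in skiplist.items()
                       if re.search(pattern, test)), None)
        (skipped if reason else to_run).append((test, reason))
    return [t for t, _ in to_run], dict(skipped)

tests = ["test_router_ha_failover", "test_port_security",
         "test_trunk_subport_lifecycle"]
run, skip = filter_tests(tests, SKIPLIST)
print(run)   # ['test_port_security']
print(skip)  # {'test_router_ha_failover': 'lp#1234567', 'test_trunk_subport_lifecycle': 'lp#2345678'}
```

Tempest's exclude-list support (a plain file of test-name regexes passed to `tempest run`) provides roughly this behavior out of the box.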
Of course, this list could grow indefinitely, but we can always keep an eye on it. There could be another issue with Neutron tempest tests when using the "advance" image. Despite the recent improvements done recently, we are frequently having problems with the RAM size of the testing VMs. We would like to have 20% more RAM, if possible. I wish we had the ability to pre-run some checks in specific HW (tempest plugin or grenade tests). Slawek commented the different number of backends we need to provide support and testing. However I think we can remove the Linux Bridge tempest plugin from the "gate" list (it is already tested in the "check" list). Tempest plugin tests are expensive in time and prone to errors. This paragraph falls under the shoulders of the Neutron team. We can also identify those long running tests that usually fail (those that take more than 1000 seconds). A test that takes around 15 mins to run, will probably fail. We need to find those tests, investigate the slowest parts of those tests and try to improve/optimize/remove them. Thank you all for your comments and proposals. That will help a lot to improve the Neutron CI stability. Regards. On Fri, Nov 19, 2021 at 12:53 AM Ronelle Landy > wrote: On Wed, Nov 17, 2021 at 5:22 AM Balazs Gibizer > wrote: On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski > wrote: > Hi, > > Recently I spent some time to check how many rechecks we need in > Neutron to > get patch merged and I compared it to some other OpenStack projects > (see [1] > for details). > TL;DR - results aren't good for us and I think we really need to do > something > with that. I really like the idea of collecting such stats. Thank you for doing it. I can even imagine to make a public dashboard somewhere with this information as it is a good indication about the health of our projects / testing. > > Of course "easiest" thing to say is that we should fix issues which > we are > hitting in the CI to make jobs more stable. 
But it's not that easy. > We are > struggling with those jobs for very long time. We have CI related > meeting > every week and we are fixing what we can there. > Unfortunately there is still bunch of issues which we can't fix so > far because > they are intermittent and hard to reproduce locally or in some cases > the > issues aren't realy related to the Neutron or there are new bugs > which we need > to investigate and fix :) I have couple of suggestion based on my experience working with CI in nova. We've struggled with unstable tests in TripleO as well. Here are some things we tried and implemented: 1. Created job dependencies so we only ran check tests once we knew we had the resources we needed (example we had pulled containers successfully) 2. Moved some testing to third party where we have easier control of the environment (note that third party cannot stop a change merging) 3. Used dependency pipelines to pre-qualify some dependencies ahead of letting them run wild on our check jobs 4. Requested testproject runs of changes in a less busy environment before running a full set of tests in a public zuul 5. Used a skiplist to keep track of tech debt and skip known failures that we could temporarily ignore to keep CI moving along if we're waiting on an external fix. 1) we try to open bug reports for intermittent gate failures too and keep them tagged in a list [1] so when a job fail it is easy to check if the bug is known. 2) I offer my help here now that if you see something in neutron runs that feels non neutron specific then ping me with it. Maybe we are struggling with the same problem too. 3) there was informal discussion before about a possibility to re-run only some jobs with a recheck instead for re-running the whole set. I don't know if this is feasible with Zuul and I think this only treat the symptom not the root case. But still this could be a direction if all else fails. Cheers, gibi > So this is never ending battle for us. 
The problem is that we have > to test > various backends, drivers, etc. so as a result we have many jobs > running on > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs > in check > and 14 jobs in gate queue. > > In the past we made a lot of improvements, like e.g. we improved > irrelevant > files lists for jobs to run less jobs on some of the patches, > together with QA > team we did "integrated-networking" template to run only Neutron and > Nova > related scenario tests in the Neutron queues, we removed and > consolidated some > of the jobs (there is still one patch in progress for that but it > should just > remove around 2 jobs from the check queue). All of that are good > improvements > but still not enough to make our CI really stable :/ > > Because of all of that, I would like to ask community about any other > ideas > how we can improve that. If You have any ideas, please send it in > this email > thread or reach out to me directly on irc. > We want to discuss about them in the next video CI meeting which will > be on > November 30th. If You would have any idea and would like to join that > discussion, You are more than welcome in that meeting of course :) > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ > 025759.html [1] https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_last_updated&start=0 > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From seenafallah at gmail.com Fri Nov 26 13:31:43 2021 From: seenafallah at gmail.com (Seena Fallah) Date: Fri, 26 Nov 2021 17:01:43 +0330 Subject: Nova Local Disk Storage Message-ID: Hi, Is there any support for creating instances on a whole separate disk, not an LVM? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dtantsur at redhat.com Fri Nov 26 18:48:27 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 26 Nov 2021 19:48:27 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> <17d5cdde95a.b95752d01381326.4462262438602968824@ghanshyammann.com> Message-ID: On Fri, Nov 26, 2021 at 6:14 PM Alfredo Moralejo Alonso wrote: > > > On Fri, Nov 26, 2021 at 4:44 PM Dmitry Tantsur > wrote: > >> >> >> On Fri, Nov 26, 2021 at 4:35 PM Ghanshyam Mann >> wrote: >> >>> ---- On Fri, 26 Nov 2021 09:20:39 -0600 Dmitry Tantsur < >>> dtantsur at redhat.com> wrote ---- >>> > >>> > >>> > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley >>> wrote: >>> > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: >>> > [...] >>> > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. >>> > [...] >>> > >>> > Is this still true for CentOS Stream 9? The TC decision was to >>> > support that instead of CentOS Stream 8 in Yoga. >>> > >>> > No. But Stream 9 is pretty much beta, so it's not a replacement for >>> us (and we don't have nodes in nodepool with it even yet?). >>> >>> I think here is the confusion. In TC, after checking with centos team >>> impression was CentOS stream 9 is released and that is >>> what we should update In OpenStack testing. And then only we updated the >>> centos stream 8 -> 9 and dropped py3.6 testing >>> >>> - >>> https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst >>> >> >> I think there is an enormous perception gap between the CentOS team and >> the rest of the world. >> >> > So, CentOS Stream 9 was released, in the official mirrors and usable since > some weeks ago[1][2]. We shouldn't consider it beta or something like that. > > As mentioned, support for diskimage-builder has been introduced for CS9 > and there are nodepool nodes ready for it. 
From RDO, we are providing RPMs > for master branch content on CentOS Stream 9 [3] and actually we have been > doing some tests. Actually, we have recently merged new jobs in > puppet-openstack[4]. > "It's usable since some weeks ago and we even added tests today" is not exactly reassuring :) The PTI uses wording "stable and LTS", which applies to Stream 9 no more than it applies to Fedora. In the end, what we test with Bifrost is what we will recommend people to deploy it in production on. I do believe people can and should deploy on Stream 8 despite all the FUD around it, but I cannot do it for Stream 9 until RHEL 9 is out. Dmitry > > [1] https://www.centos.org/centos-stream/ > > [2] https://cloud.centos.org/centos/9-stream/x86_64/images/ > [3] https://trunk.rdoproject.org/centos9-master/report.html > [4] > https://review.opendev.org/c/openstack/puppet-openstack-integration/+/793462 > > Alfredo > > > >> Dmitry >> >> >>> >>> >>> >>> -gmann >>> >>> > Dmitry >>> > -- >>> > Jeremy Stanley >>> > >>> > >>> > -- >>> > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >>> > Commercial register: Amtsgericht Muenchen, HRB 153243, >>> > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, >>> Michael O'Neill >>> > >>> >>> >> >> -- >> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >> Commercial register: Amtsgericht Muenchen, HRB 153243, >> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael >> O'Neill >> > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ryan.bannon at gmail.com Fri Nov 26 20:21:42 2021 From: ryan.bannon at gmail.com (Ryan Bannon) Date: Fri, 26 Nov 2021 15:21:42 -0500 Subject: Project-scoped app creds - Best practice In-Reply-To: References: Message-ID: I think you're being too hard on yourself :) On Fri, Nov 26, 2021 at 12:50 PM Rafael Weing?rtner < rafaelweingartner at gmail.com> wrote: > Ryan, > That happened with Keystone for some time (I have 3 or more specs that are > stuck, even though they were discussed in-depth), the reviews were becoming > quite massive (for me, for the specs I proposed) without getting anywhere, > and maybe that is a consequence of PTLs/core reviewers being focused on > somewhere else, or not having enough time to execute the paperwork of a new > feature being proposed, evaluated, and accepted. > > I know that this is also my fault because I could have volunteered to > become core review/PTL to try to add more work on Keystone, but I did not > have enough time to dedicate to Keystone. This caused the specs to get > stuck (and their patch). That is also why, for this latest spec I created, > I did not publish the patch. For the other specs, I proposed the patch, > they (the specs) were more or less accepted (we had a consensus), but never > got merged, which caused the patches only to get conflicts, and not move > forward, which made me a bit disappointed (after having fixed conflicts > countless times). > > One of my new year's resolutions is to dedicate more time to Keystone next > year. Or, at least, as much as I dedicate to CloudKitty to see if we can > get these moving on. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaelweingartner at gmail.com Fri Nov 26 20:24:07 2021 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Fri, 26 Nov 2021 17:24:07 -0300 Subject: Project-scoped app creds - Best practice In-Reply-To: References: Message-ID: Haha, thanks! 
If you want/need I can share the patch to have project bound credentials in Keystone. Just let me know. On Fri, Nov 26, 2021 at 5:21 PM Ryan Bannon wrote: > I think you're being too hard on yourself :) > > On Fri, Nov 26, 2021 at 12:50 PM Rafael Weing?rtner < > rafaelweingartner at gmail.com> wrote: > >> Ryan, >> That happened with Keystone for some time (I have 3 or more specs that >> are stuck, even though they were discussed in-depth), the reviews were >> becoming quite massive (for me, for the specs I proposed) without getting >> anywhere, and maybe that is a consequence of PTLs/core reviewers being >> focused on somewhere else, or not having enough time to execute the >> paperwork of a new feature being proposed, evaluated, and accepted. >> >> I know that this is also my fault because I could have volunteered to >> become core review/PTL to try to add more work on Keystone, but I did not >> have enough time to dedicate to Keystone. This caused the specs to get >> stuck (and their patch). That is also why, for this latest spec I created, >> I did not publish the patch. For the other specs, I proposed the patch, >> they (the specs) were more or less accepted (we had a consensus), but never >> got merged, which caused the patches only to get conflicts, and not move >> forward, which made me a bit disappointed (after having fixed conflicts >> countless times). >> >> One of my new year's resolutions is to dedicate more time to Keystone >> next year. Or, at least, as much as I dedicate to CloudKitty to see if we >> can get these moving on. >> > -- Rafael Weing?rtner -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mdemaced at redhat.com Fri Nov 26 22:42:24 2021 From: mdemaced at redhat.com (Maysa De Macedo Souza) Date: Fri, 26 Nov 2021 23:42:24 +0100 Subject: [kuryr] Propose to EOL stable branches (Stein, Rocky and Queens) for all the kuryr scope Message-ID: Hello, We have plenty of stable branches in Kuryr repositories still open with the oldest being stable/queens. We haven't seen many backports[0] proposed to Stein, Rocky and Queens, so I would like to propose retiring these branches. Let me know if anybody is interested in keeping any of those branches. [0] https://review.opendev.org/q/(project:%255Eopenstack/kuryr.*)AND((branch:stable/queens)OR(branch:stable/rocky)OR(branch:stable/stein)) Thank you, Maysa Macedo. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdemaced at redhat.com Fri Nov 26 22:50:39 2021 From: mdemaced at redhat.com (Maysa De Macedo Souza) Date: Fri, 26 Nov 2021 23:50:39 +0100 Subject: [kuryr] Propose to EOL stable branches (Stein, Rocky and Queens) for all the kuryr scope In-Reply-To: References: Message-ID: Hello again, We also have Pike as a stable branch[0], so that is the older one, which would also be EOL. [0] https://review.opendev.org/q/(project:%255Eopenstack/kuryr.*)AND((branch:stable/queens)OR(branch:stable/rocky)OR(branch:stable/stein)OR(branch:stable/pike)) Cheers, Maysa. On Fri, Nov 26, 2021 at 11:42 PM Maysa De Macedo Souza wrote: > Hello, > > We have plenty of stable branches in Kuryr repositories still open with > the oldest being stable/queens. > We haven't seen many backports[0] proposed to Stein, Rocky and Queens, so > I would like to propose retiring these branches. > > Let me know if anybody is interested in keeping any of those branches. > > [0] > https://review.opendev.org/q/(project:%255Eopenstack/kuryr.*)AND((branch:stable/queens)OR(branch:stable/rocky)OR(branch:stable/stein)) > > Thank you, > Maysa Macedo. 
> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Sat Nov 27 00:13:36 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 26 Nov 2021 18:13:36 -0600 Subject: [all][tc] What's happening in Technical Committee: summary 26th Nov, 21: Reading: 10 min Message-ID: <17d5ebd7ef3.bbd5fef41396468.4019647999400995329@ghanshyammann.com> Hello Everyone, Here is this week's summary of the Technical Committee activities. 1. TC Meetings: ============ * Most of the meeting discussions are summarized below (Completed or in-progress activities section). Meeting full logs are available @ https://meetings.opendev.org/meetings/tc/2021/tc.2021-11-25-15.00.log.html * Next week's meeting is a video call on 2nd Dec, Thursday 15:00 UTC; feel free to add topics to the agenda[1] by 1st Dec. 2. What we completed this week: ========================= * Add ProxySQL repository for OpenStack-Ansible[2] * Mark 'Technical Writing' SIG 'complete' as it is merged into TC [3] * Rename 'Extended Maintenance' SIG to the 'Stable Maintenance' SIG [4] * Retire training-labs repo [5] 3. Activities In progress: ================== TC Tracker for Yoga cycle ------------------------------ * This etherpad includes the Yoga cycle targets/working items for TC[6]. Open Reviews ----------------- * 8 open reviews for ongoing activities[7]. Yoga testing runtime ------------------------- We updated the Yoga testing runtime to move from CentOS stream 8 to CentOS stream 9 and so drop py3.6[8]. But while updating projects to drop py3.6, and discussing whether we should change 'python_requires' >= 3.8 in setup.cfg, another (but good) discussion started on the ML about whether we should continue testing py3.6 and support both CentOS stream 8 and 9 in Yoga, which is what the RDO community plan is[8]. After discussing on the ML and then with Alfredo Moralejo on the TC IRC channel, we have a potential way forward on this, which I have summarized on the ML thread[9]. 
I will keep that open to get feedback or any objection to the proposed plan, and accordingly we will conclude it early next week. I hope I have summarized the discussion here; if not, please read the ML thread for this. Fixing Zuul config error -------------------------- If your project is listed in the zuul config error list, please start planning to fix those[10] RBAC discussion: continuing from PTG ---------------------------------------------- After last week's meeting discussion, we are almost ready with this goal; if you would like to review the current direction as well as the schedule, please review it[11]. The next plan is to merge the proposed goal and then select it ASAP, as m-1 has already passed for the Yoga cycle. Community-wide goal updates ------------------------------------ * Two goals are proposed currently and under review. ** RBAC goal is pretty much in good shape now, feel free to review it[11] ** 'FIPS compatibility and compliance' [12]. Adjutant needs maintainers and a PTL ------------------------------------------- We are waiting to hear from Braden, Albert on permission to work in this project[13]. New project 'Skyline' proposal ------------------------------------ * TC is still waiting for the skyline team to respond on open queries[14]. TC tags analysis ------------------- * TC agreed to remove the framework and it is communicated in the ML[15]. Project updates ------------------- * Add ansible-collection-kolla repo to Kolla project[16] * Retire js-openstack-lib [17] 4. How to contact the TC: ==================== If you would like to discuss or give feedback to TC, you can reach out to us in multiple ways: 1. Email: you can send the email with tag [tc] on openstack-discuss ML[18]. 2. Weekly meeting: The Technical Committee conducts a weekly meeting every Thursday 15 UTC [19] 3. Ping us using 'tc-members' nickname on #openstack-tc IRC channel. 
[1] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [2] https://review.opendev.org/c/openstack/governance/+/817245 [3] https://review.opendev.org/c/openstack/governance-sigs/+/815868 [4] https://review.opendev.org/c/openstack/governance-sigs/+/817499 [5] https://review.opendev.org/c/openstack/governance/+/817511 [6] https://etherpad.opendev.org/p/tc-yoga-tracker [7] https://review.opendev.org/q/projects:openstack/governance+status:open [8] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/026001.html [9] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/026024.html [10] https://etherpad.opendev.org/p/zuul-config-error-openstack [11] https://review.opendev.org/c/openstack/governance/+/815158 [12] https://review.opendev.org/c/openstack/governance/+/816587 [13] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025786.html [14] https://review.opendev.org/c/openstack/governance/+/814037 [15] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025571.html [16] https://review.opendev.org/c/openstack/governance/+/819331 [17] https://review.opendev.org/c/openstack/governance/+/807163 [18] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss [19] http://eavesdrop.openstack.org/#Technical_Committee_Meeting -gmann From ildiko.vancsa at gmail.com Sat Nov 27 00:34:18 2021 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Fri, 26 Nov 2021 16:34:18 -0800 Subject: [neutron][networking][ipv6][dns][ddi] Upcoming OpenInfra Edge Computing Group sessions In-Reply-To: References: Message-ID: <5993A296-7D67-444B-8D2F-2544C42C984A@gmail.com> Hi, It is a friendly reminder that we have the Networking and DNS discussion coming up at the OpenInfra Edge Computing Group weekly call on Monday (November 29) at 6am PST / 1400 UTC. We have invited industry experts, Cricket Liu and Andrew Wertkin, to share their thoughts and experience in this area. 
But, we also need YOU to join to turn the session into a lively discussion and debate! We have another networking related session the following week as well: December 6th - Networking and IPv6 discussion with Ed Horley Beyond the above topics we will have presentations from groups and projects such as Smart Edge and the Industry IoT Consortium (Formerly Industrial Internet Consortium). For the full schedule please see the edge WG's wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group#Upcoming_Topics Our calls happen every Monday at 6am Pacific Time / 1400 UTC on Zoom. You can find all the meeting details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings Please let me know if you have any questions about the working group or any of the upcoming sessions. Thanks and Best Regards, Ildikó > On Nov 3, 2021, at 18:12, Ildiko Vancsa wrote: > > Hi, > > I'm reaching out to you to draw your attention to the amazing lineup of discussion topics for the OpenInfra Edge Computing Group weekly calls up until the end of this year with industry experts to present and participate in the discussions! > > I would like to invite and encourage you to join the working group sessions to discuss edge related challenges and solutions in the below areas and more! > > Some of the sessions to highlight will be continuing the discussions we started at the recent PTG: > * November 29th - Networking and DNS discussion with Cricket Liu and Andrew Wertkin > * December 6th - Networking and IPv6 discussion with Ed Horley > > Beyond the above topics we will have presentations from groups and projects such as Smart Edge and the Industry IoT Consortium (Formerly Industrial Internet Consortium). > > For the full schedule please see the edge WG's wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group#Upcoming_Topics > > Our calls happen every Monday at 6am Pacific Time / 1400 UTC on Zoom. 
You can find all the meeting details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings > > Please let me know if you have any questions about the working group or any of the upcoming sessions. > > Thanks and Best Regards, > Ildikó > > From beagles at redhat.com Sat Nov 27 14:29:54 2021 From: beagles at redhat.com (Brent Eagles) Date: Sat, 27 Nov 2021 10:59:54 -0330 Subject: [tripleo] puppet_config: immutable relationship between config_volume and config_image? Message-ID: Hi all, I've been working on https://review.opendev.org/c/openstack/tripleo-heat-templates/+/816895 and the current pep8 check error on the designate-api-container-puppet.yaml template appears to disallow using different container config images on the same config volume. I see where this makes sense, but because of the way the puppet_config sections seem to be structured, I wonder if it was intentional. The templates happen to work in this patch's case because the api specific config image in this template only includes some additional binaries for apache+wsgi necessary for the puppet. However, I can see where this might cause some really unfortunate issues if the config images for a given config_volume were to have different versions of puppet. While I will alter the templates so the designate config volume uses the same config image, it does beg the question whether having the puppet_config config_volume definitions duplicated across the templates is a good pattern. Would it be a better choice to adopt a pattern where the immutable parts of the config_volumes (e.g. config_volume name, config_image) can only be defined once per config volume? Perhaps in a separate file and have the mutable parts in their respective template definitions? Cheers, Brent -- Brent Eagles Principal Software Engineer Red Hat Inc. 
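For readers not familiar with tripleo-heat-templates, the puppet_config section Brent describes is a per-service mapping in each container template. A simplified sketch of its shape follows (key names follow the THT convention; the designate values here are illustrative, not copied from the patch under review):

```yaml
# Simplified sketch of a puppet_config block in a THT service template.
# The pep8 check discussed above requires every template that shares a
# config_volume (here "designate") to also declare the same config_image.
puppet_config:
  config_volume: designate
  puppet_tags: designate_config
  step_config:
    get_attr: [DesignateBase, role_data, step_config]
  config_image: {get_param: ContainerDesignateConfigImage}
```

Because config_volume and config_image are repeated per template like this, nothing in the schema itself enforces that two templates naming the same volume agree on the image — hence the lint check and the proposal above to define the immutable pair once per volume.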
From oleg.bondarev at huawei.com Mon Nov 29 07:22:42 2021 From: oleg.bondarev at huawei.com (Oleg Bondarev) Date: Mon, 29 Nov 2021 07:22:42 +0000 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <14494355.tv2OnDr8pf@p1> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <14494355.tv2OnDr8pf@p1> Message-ID: <0dd39f7fa19d4f058af6bc975784c242@huawei.com> Hello, A few thoughts from my side in scope of brainstorm: 1) Recheck actual bugs ("recheck bug 123456") - not a new idea to better keep track of all failures - force a developer to investigate the reason of each CI failure and increase corresponding bug rating, or file a new bug (or go and fix this bug finally!) - I think we should have some gate failure bugs dashboard with hottest bugs on top (maybe there is one that I'm not aware of) so everyone could go and check if his CI failure is known or new - simple "recheck" could be forbidden, at least during "crisis management" 
window 2) Allow recheck TIMEOUT/POST_FAILURE jobs - while I agree that re-run particular jobs is evil, TIMEOUT/POST_FAILURE are not related to the patch in majority of cases - performance issues are usually caught by Rally jobs - of course core team should monitor if timeouts become a rule for some jobs 3) Ability to block rechecks in some cases, like known gate blocker - not everyone is always aware that gates are blocked with some issue - PTL (or any core team member) can turn off rechecks during that time (with a message from Zuul) - happens not often but still can save some CI resources Thanks, Oleg --- Advanced Software Technology Lab Huawei -----Original Message----- From: Slawek Kaplonski [mailto:skaplons at redhat.com] Sent: Thursday, November 18, 2021 10:46 AM To: Clark Boylan Cc: openstack-discuss at lists.openstack.org Subject: Re: [neutron][CI] How to reduce number of rechecks - brainstorming Hi, Thx Clark for detailed explanation about that :) On środa, 17 listopada 2021 16:51:57 CET you wrote: > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: > > Snip. I want to respond to a specific suggestion: > > 3) there was informal discussion before about a possibility to > > re-run only some jobs with a recheck instead for re-running the > > whole set. I don't know if this is feasible with Zuul and I think > > this only treat the symptom not the root case. But still this could > > be a direction if all else fails. > > OpenStack has configured its check and gate queues with something we've called > "clean check". This refers to the requirement that before an OpenStack > project can be gated it must pass check tests first. This policy was > instituted because a number of these infrequent but problematic issues > were traced back to recheck spamming. Basically changes would show up > and were broken. They would fail some percentage of the time. They got > rechecked until > they finally merged and now their failure rate is added to the whole. 
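Clark's point that a flaky change, once rechecked until it merges, adds its failure rate "to the whole" can be made concrete with a quick back-of-the-envelope sketch (the percentages below are invented for illustration, not real gate data):

```python
# Back-of-the-envelope sketch of recheck spamming (illustrative numbers).

def expected_runs_until_pass(fail_rate: float) -> float:
    """Expected number of gate attempts before a change whose jobs fail
    with probability fail_rate finally gets a green run (geometric)."""
    return 1.0 / (1.0 - fail_rate)

# A change that fails 25% of the time still merges after ~1.33 attempts
# on average, so recheck spamming gets it in cheaply...
assert abs(expected_runs_until_pass(0.25) - 4 / 3) < 1e-9

# ...but once merged, its flakiness compounds with everything else.
# With two independent 25%-flaky changes in the tree, a gate run only
# passes if neither trips: 0.75 * 0.75 = 56.25%, a 43.75% failure rate.
gate_fail = 1 - (1 - 0.25) * (1 - 0.25)
print(f"combined gate failure rate: {gate_fail:.4f}")
```

The numbers are made up, but the shape is the point: each flaky change is cheap to recheck in, yet the failure rates multiply once merged — which is exactly what "clean check" tries to make harder.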
> This rule was introduced to make it more difficult to get this > flakiness into the gate. > > Locking in test results is in direct opposition to the existing policy > and goals. Locking results would make it far more trivial to land such > flakiness as you wouldn't need entire sets of jobs to pass before you could land. > Instead you could rerun individual jobs until each one passed and then > land the result. Potentially introducing significant flakiness with a > single merge. > > Locking results is also not really something that fits well with the > speculative gate queues that Zuul runs. Remember that Zuul constructs > a future git state and tests that in parallel. Currently the state for > OpenStack looks like:
>
>   A - Nova
>   ^
>   B - Glance
>   ^
>   C - Neutron
>   ^
>   D - Neutron
>   ^
>   F - Neutron
>
> The B glance change is tested as if the A Nova change has already > merged and so on down the queue. If we want to keep these speculative > states we can't really have humans manually verify a failure can be ignored and retry it. > Because we'd be enqueuing job builds at different stages of > speculative state. Each job build would be testing a different version of the software. > > What we could do is implement a retry limit for failing jobs. Zuul > could rerun > failing jobs X times before giving up and reporting failure (this > would require updates to Zuul). The problem with this approach is > without some oversight it becomes very easy to land changes that make > things worse. As a side note Zuul does do retries, but only for > detected network errors or when a pre-run playbook fails. The > assumption is that network failures are due to the dangers of the > Internet, and that pre-run playbooks are small, self contained, > unlikely to fail, and when they do fail the failure should be independent of what is being tested. > > Where does that leave us? > > I think it is worth considering the original goals of "clean check". 
> We know that rechecking/rerunning only makes these problems worse in the long term. > They represent technical debt. One of the reasons we run these tests > is to show us when our software is broken. In the case of flaky > results we are exposing this technical debt where it impacts the > functionality of our software. The longer we avoid fixing these issues > the worse it gets, and this > is true even with "clean check". I agree with You on that and I would really like to find a better/other solution for the Neutron problem than rechecking only broken jobs, as I'm pretty sure that this would make things much worse quickly. > > Do we as developers find value in knowing the software needs attention before > it gets released to users? Do the users find value in running reliable > software? In the past we have asserted that "yes, there is value in > this", and have invested in tracking, investigating, and fixing these > problems even if they happen infrequently. But that does require > investment, and active maintenance. > > Clark -- Slawek Kaplonski Principal Software Engineer Red Hat From ralonsoh at redhat.com Mon Nov 29 08:15:59 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Mon, 29 Nov 2021 09:15:59 +0100 Subject: [neutron] Bug deputy, 22Nov - 28Nov Message-ID: Hello Neutrinos: This is the bug report from last week. Critical: https://bugs.launchpad.net/neutron/+bug/1952004: "ovn: Functional test may fail if northd hasn't initialized DBs yet". Assigned. Patch: https://review.opendev.org/c/openstack/neutron/+/819032 https://bugs.launchpad.net/neutron/+bug/1952023: "Neutron functional tests don't properly clean up ovn-northd". Assigned. Patch: https://review.opendev.org/c/openstack/neutron/+/819049 https://bugs.launchpad.net/neutron/+bug/1952066: "Scenario test test_mac_learning_vms_on_same_network fails intermittently in the ovn job". Assigned. 
https://bugs.launchpad.net/neutron/+bug/1952357: "Functional tests job in the ovn-octavia-provider is broken". Assigned. Patch: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/819377 https://bugs.launchpad.net/neutron/+bug/1952393: "[OVN] neutron-tempest-plugin-scenario-ovn broken with 'ovn-northd did not start'". Assigned. Patch: https://review.opendev.org/c/openstack/devstack/+/819402 https://bugs.launchpad.net/neutron/+bug/1952508: "[OVN] 'TestAgentMonitor.test_agent_change_controller' failing randomly". Assigned. Patch: https://review.opendev.org/c/openstack/neutron/+/819502 High: https://bugs.launchpad.net/neutron/+bug/1952567: "[ml2][ovs] ports tag are missing and flood on those". Assigned. Patch: https://review.opendev.org/c/openstack/neutron/+/819567 https://bugs.launchpad.net/neutron/+bug/1952550: "[OVN] neutron_ovn_metadata_agent retrying on UUID errors infinitely". *Unassigned* . Medium: https://bugs.launchpad.net/neutron/+bug/1951816: "[OVN] setting a IPv6 address in dns_servers is broken". *Unassigned*. https://bugs.launchpad.net/neutron/+bug/1951872: "OVN: Missing reverse DNS for instances". *Unassigned*. https://bugs.launchpad.net/neutron/+bug/1952055: "native firewall driver - conntrack marks too much traffic as invalid". *Unassigned*. Low: https://bugs.launchpad.net/neutron/+bug/1951569: "[L3] L3 agent extension should always inherit from 'L3AgentExtension'". Assigned. Patch: https://review.opendev.org/c/openstack/neutron/+/818540 https://bugs.launchpad.net/neutron/+bug/1952409: "SQLAlchemy 'ConnectionEvents.before_execute' signature has changed in 1.4". Assigned. Patch: https://review.opendev.org/c/openstack/neutron/+/819420 Undecided/Opinion/Invalid: https://bugs.launchpad.net/neutron/+bug/1951720: "Virtual interface creation failed". More information is needed to debug this issue. https://bugs.launchpad.net/neutron/+bug/1952249: "[OVN] Created VM name overrides predefined port dns-name". The extension works as expected. Regards. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From josephine.seifert at secustack.com Mon Nov 29 08:49:36 2021 From: josephine.seifert at secustack.com (Josephine Seifert) Date: Mon, 29 Nov 2021 09:49:36 +0100 Subject: [Image Encryption] No meeting today Message-ID: Hi, due to a conflicting meeting today I will not be able to hold the popup team meeting. Next meeting will be next week. greetings Josphine (Luzi) From lyarwood at redhat.com Mon Nov 29 09:10:21 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Mon, 29 Nov 2021 09:10:21 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211126170407.zjcr34drutafycpp@yuggoth.org> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <20211126162013.fpwlss5h7rz2xzar@yuggoth.org> <20211126170407.zjcr34drutafycpp@yuggoth.org> Message-ID: <20211129091021.43gvliceua7ygdwz@lyarwood-laptop.usersys.redhat.com> On 26-11-21 17:04:07, Jeremy Stanley wrote: > On 2021-11-26 17:45:30 +0100 (+0100), Dmitry Tantsur wrote: > > On Fri, Nov 26, 2021 at 5:26 PM Jeremy Stanley wrote: > [...] > > > Somehow we manage to get by with testing only one Ubuntu version > > > per OpenStack release. > > > > I'm quite sure there is always an overlap, otherwise grenade would > > not work. > > Good point, I hadn't considered grenade. While we don't expressly > mention the platform used for upgrade tests as part of the PTI, you > are correct that we deploy release N-1 on the default node type for > release N-1, then upgrade it in-place to release N and run a battery > of tests against it. 
Technically we do test the new release on the > old platform (at least for projects which tests their changes or are > otherwise exercised by grenade), even though the PTI says we can > drop support for the Python version on that platform. This is the reason I think we need an overlap release for distro LTS versions before we drop anything. Totally agree that we don't support the underlying distro upgrade but we clearly do support in place OpenStack upgrades and need the latest code to run on the older release. > If someone wants to work on running grenade on centos-8-stream nodes > for yoga, we could ensure that's still a viable upgrade path the > same way. Yup happy to look into this today. -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From lyarwood at redhat.com Mon Nov 29 09:17:18 2021 From: lyarwood at redhat.com (Lee Yarwood) Date: Mon, 29 Nov 2021 09:17:18 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <17d5d476490.e30e73db1388640.4406614125429012751@ghanshyammann.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <17d5d0a4f3f.dc8bb48c1384810.8642013929754348916@ghanshyammann.com> <17d5d476490.e30e73db1388640.4406614125429012751@ghanshyammann.com> Message-ID: <20211129091718.fdgdwmowtp3jxf6u@lyarwood-laptop.usersys.redhat.com> On 26-11-21 11:24:59, Ghanshyam Mann wrote: > ---- On Fri, 26 Nov 2021 10:18:16 -0600 Ghanshyam Mann wrote ---- > > ---- On Fri, 26 Nov 2021 10:05:15 -0600 Lee Yarwood wrote ---- > > > On 26-11-21 09:37:44, Ghanshyam Mann wrote: > > > > ---- On Fri, 26 Nov 2021 06:29:42 -0600 Lee Yarwood wrote 
---- > > > > > On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: > > > > > > On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann > > > > > > wrote: > > > > > > > > > > > > > ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < > > > > > > > marcin.juszkiewicz at linaro.org> wrote ---- > > > > > > > > W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: > > > > > > > > > gmann has been helpfully proposing patches to change the > > > > > > > > > versions of Python we're testing against in Yoga. I've > > > > > > > > > suggested that we might want to bump 'python_requires' in > > > > > > > > > 'setup.cfg' to indicate that we no longer support any version > > > > > > > > > of Python before 3.8 > > > > > > > > > > > > > > > > CentOS Stream 8 has Python 3.6 by default and RDO team is doing > > > > > > > > CS8 -> CS9 migration during Yoga cycle. Can we postpone it to Z > > > > > > > > when there will be no distribution with Py 3.6 to care about? > > > > > > > > > > Stupid question that I should know the answer to but does RDO really > > > > > support RPM based installations anymore? IOW couldn't we just workaround > > > > > this by providing CS8 py38 based containers during the upgrade? > > > > > > > > > > > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and > > > > > > CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS > > > > > > version upgrades in the past providing support for both releases in an > > > > > > OpenStack version to ease the upgrade so I'd like to keep yoga working on > > > > > > py3.6 included in CS8 and CS9. > > > > > > > > > > If this was the plan why wasn't it made clear to the TC before they > > > > > dropped CS8 from the Yoga runtimes? Would it even be possible for the TC > > > > > to add CS8 and py36 back in to the Yoga runtimes? > > > > > > > > > > > > Postponing to Z, you mean dropping the py3.6 tests or bumping it in > > > > > > > in 'setup.cfg' so that no one can install on py3.6 ? 
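For context, the setup.cfg change being debated in this thread is a single line; a generic sketch (not any particular project's file) looks like this:

```ini
# setup.cfg (sketch) -- the python_requires bump under discussion.
# With this set, pip on a Python 3.6 host (e.g. CentOS Stream 8)
# refuses to install the package instead of failing at runtime.
[options]
python_requires = >=3.8
```

This is why the thread treats the setting as stronger than merely dropping py3.6 test jobs: jobs only change what is verified, while python_requires actively blocks installation on older interpreters.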
> > > > > > > > > > > > > > First one we already did and as per Yoga testing runtime we are > > > > > > > targeting centos9-stream[1] in Yoga itself. > > > > > > > > > > > > > > For making 'python_requires' >=py3.8 in 'setup.cfg', I have no > > > > > > > strong opinion on this but I prefer to be flexible here, that 'yes, > > > > > > > OpenStack is installable on py3.6 but we do not test it anymore from > > > > > > > Yoga onwards so no guarantee'. Our testing runtime's main goal is > > > > > > > that we document the version we are testing *at least*, which means > > > > > > > it can work on lower or higher versions too but we just do not test > > > > > > > them. > > > > > > > > > > > > > > > > > > > May it be possible to keep py3.6 jobs to make sure patches are not > > > > > > introducing py3.8-only features that would break deployment in CS8? > > > > > > > > > > We should keep CS8 and py36 as supported runtimes if we are keeping the > > > > > jobs, otherwise this just gets super confusing. > > > > > > > > Yeah, I think it creates confusion as I can see in this ML thread, so > > > > I agree on keeping 'python_requires' in sync with what we test. > > > > > > Cool thanks! > > > > > > > Now a question on going back to centos stream 8 support in Yoga: is > > > > centos stream 9 a stable release or is it experimental only? If > > > > stable then we can keep the latest available version, which can be > > > > centos stream 9. > > > > > > I honestly don't know and can't find any docs to point to. > > > > > > > Our project interface testing doc clearly states 'latest LTS' to > > > > consider for testing[1] whenever we are ready. 
I am not very strongly > > > > against reverting back to centos stream 8, but we should not add two > > > > versions of the same distro in testing, which can be a lot if we consider > > > > the below three distros > > > How do we expect operators to upgrade between Xena where CentOS 8 stream > > > is a supported runtime and Yoga where CentOS 9 stream is currently the > > > equivalent supported runtime without supporting both for a single > > > release? > > This is a really good question on the upgrade testing we do at upstream, and I remember > > it came up and was discussed a lot during the py2.7 drop also: how we are testing the > > upgrade from py2.7 to py3. Can we do it in grenade? But that we answered as we did > > not test it directly, but stein and train tested both versions so there should not be any issue > > if you upgrade from there (one of the FAQ in my blog[1]). > > > > But on distro upgrade testing, as you know we do not test those in upstream, neither > > in grenade, where upgrades are done on the old node distro only, not from the old distro version to > > the new distro version with new code. It is not like we do not want to test, but if anyone > > from any distro would like to set up grenade for that and maintain it then we are more than happy. > > In summary, yes we cannot guarantee distro upgrade testing from OpenStack upstream testing > > due to resource bandwidth issues but we will welcome any help here. > > We discussed with amoralej about moving the testing runtime to CentOS > stream 8 and py36 or not in the TC IRC channel[1]. > > As we at upstream do not test two versions of a distro in the same release, > amoralej agreed to keep CentOS stream 9 if one is to be chosen, which is what our > current testing runtime is. So no change in the direction of the current > testing runtime and dropping py3.6, but there is the possibility of > some trade-off here. If any py3.6 breaking changes are happening then > it is up to the project's goodwill, bandwidth, or flexibility about > accepting the fix or not, or even adding a py36 unit test job. 
As our > testing runtime is the minimum things to test and it does not put any > max limit of testing, any project can extend their testing as per > their bandwidth. > > In summary: > > (This is what we agreed today in TC channel but as most of the folks > are on leave today, I will keep it open until next week so see if any > objections from the community and will conclude it accordingly) > > * No change in Yoga testing runtime and we move to cs9 and drop py36. > * We will not put hard stop on cs8 support and we can: > ** Devstack keep supporting cs8 in Yoga > ** It can be negotiated with project to add py36 job or fix if any > py36 breaking changes are observed by RDO (or any distro interested in > py36) but it depends on the project decision and bandwidth. > > As next, how we can improve the upgrade testing from distro versions > is something we will explore next and see what all we can test to make > upgrade easier. I'm against this, as I said in my setup.cfg >= py38 review for openstack/nova [1] we either list and support runtimes or don't. If RDO and others need CentOS 8 Stream support for a release then lets include it and py36 still for Yoga and make things explicit. As I've said elsewhere I think the TC really need to adjust their thinking on this topic and allow for one OpenStack release where both the old and new LTS distro release are supported. Ensuring we allow people to actually upgrade in place and later handle the distro upgrade itself. Cheers, Lee [1] https://review.opendev.org/c/openstack/nova/+/819415 -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From michal.arbet at ultimum.io Mon Nov 29 09:20:00 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Mon, 29 Nov 2021 10:20:00 +0100 Subject: [neutron]Some floating IPs inaccessible after restart of L3 agent In-Reply-To: References: Message-ID: Hi Kamil, I've just read the email on my phone quickly, and I remember that I've fixed something similar in Debian Victoria packages. Maybe it's your issue, but I can't check right now. Could you check it? It's fixed in newer versions of neutron. https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1927868 Thanks, Michal Arbet (kevko) On Fri, 26 Nov 2021 at 10:53, Kamil Madáč wrote: > Hello Everyone, > > We have openstack Victoria deployed since the beginning of the year with > kolla/ansible in docker containers. Everything was running OK, but a few > weeks ago we noticed issues with networking. Our installation uses > Openvswitch networking with DVR non HA routers. > > Everything is running smoothly until we restart the L3 agent. After that, some > floating ips of VMs running on the node where the L3 agent is running become > inaccessible. The workaround is to reassign the floating IP to the affected VM. Every > restart affects the same floating IPs and VMs. > > No errors/exceptions found in logs. > > I was able to find out that after restart there are missing routes for > those particular floating IPs in the fip- namespace, which causes that proxy > arp responses are not working. After the floating IP address is reassigned, > routes are added by the L3 agent and the floating IP is working again. > > Looks like some sort of race condition in the L3 agent, but I was not able to > identify any possible existing bug. > > L3 agent is in version 17.0.1.dev44. > > Is anyone aware of any existing bug which could explain such behavior, or > does anyone have an idea how to solve the issue? > > Kamil Madáč 
> *Slovensko IT a.s.* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kamil.madac at slovenskoit.sk Mon Nov 29 09:34:56 2021 From: kamil.madac at slovenskoit.sk (Kamil Madáč) Date: Mon, 29 Nov 2021 09:34:56 +0000 Subject: [neutron]Some floating IPs inaccessible after restart of L3 agent In-Reply-To: References: Message-ID: Hi Michal, Thanks for responding and for the suggestion. During the weekend I upgraded the neutron l3 agent to the most recent victoria version of the kolla container (17.2.2.dev56) and it seems it helped -> no disappearing routes in the fip namespace anymore after restart. I found the change set which fixes a race condition in the l3 agent, https://review.opendev.org/c/openstack/neutron/+/803576, from September this year, and I think that could be the one which fixes it. ________________________________ From: Michal Arbet Sent: Monday, November 29, 2021 10:20 AM To: Kamil Madáč Cc: openstack-discuss Subject: Re: [neutron]Some floating IPs inaccessible after restart of L3 agent Hi Kamil, I've just read the email on my phone quickly, and I remember that I've fixed something similar in Debian Victoria packages. Maybe it's your issue, but I can't check right now. Could you check it? It's fixed in newer versions of neutron. https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1927868 Thanks, Michal Arbet (kevko) On Fri, 26 Nov 2021 at 10:53, Kamil Madáč > wrote: Hello Everyone, We have openstack Victoria deployed since the beginning of the year with kolla/ansible in docker containers. Everything was running OK, but a few weeks ago we noticed issues with networking. Our installation uses Openvswitch networking with DVR non HA routers. Everything is running smoothly until we restart the L3 agent. After that, some floating ips of VMs running on the node where the L3 agent is running become inaccessible. The workaround is to reassign the floating IP to the affected VM. Every restart affects the same floating IPs and VMs. 
No errors/exceptions found in logs. I was able to find out that after the restart there are missing routes for those particular floating IPs in the fip- namespace, which causes proxy arp responses to stop working. After the floating IP address is reassigned, routes are added by the L3 agent and the floating IP works again. It looks like some sort of race condition in the L3 agent, but I was not able to identify any possible existing bug. The L3 agent is at version 17.0.1.dev44. Is anyone aware of any existing bug which could explain such behavior, or does anyone have an idea how to solve the issue? Kamil Madáč Slovensko IT a.s. -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Mon Nov 29 10:11:10 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 29 Nov 2021 11:11:10 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <0dd39f7fa19d4f058af6bc975784c242@huawei.com> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <14494355.tv2OnDr8pf@p1> <0dd39f7fa19d4f058af6bc975784c242@huawei.com> Message-ID: Hi, I am not sure what the current status of Elasticsearch is, but we should start using elastic-recheck again, keep the bug definitions up-to-date, and dedicate time to keeping it alive. From the zuul status page at least it seems it has fresh data: http://status.openstack.org/elastic-recheck/data/integrated_gate.html It could help reviewers to see feedback from elastic-recheck on whether the issue in a given patch is an already known bug. https://docs.openstack.org/infra/elastic-recheck/readme.html regards Lajos Katona (lajoskatona) Oleg Bondarev ezt írta (időpont: 2021. nov. 29., H, 8:35): > Hello, > > A few thoughts from my side in the scope of the brainstorm: > > 1) Recheck actual bugs ("recheck bug 123456")
> - not a new idea to better keep track of all failures > - force a developer to investigate the reason for each CI failure and > increase the corresponding bug's rating, or file a new bug (or go and fix this > bug finally!) > - I think we should have some gate-failure bugs dashboard with the > hottest bugs on top (maybe there is one that I'm not aware of) so everyone > could go and check if their CI failure is known or new > - a simple "recheck" could be forbidden, at least during a "crisis > management" window > > 2) Allow rechecking TIMEOUT/POST_FAILURE jobs > - while I agree that re-running particular jobs is evil, > TIMEOUT/POST_FAILURE are not related to the patch in the majority of cases > - performance issues are usually caught by Rally jobs > - of course the core team should monitor whether timeouts become a rule for > some jobs > > 3) Ability to block rechecks in some cases, like a known gate blocker > - not everyone is always aware that gates are blocked by some issue > - the PTL (or any core team member) can turn off rechecks during that > time (with a message from Zuul) > - happens not often but can still save some CI resources > > Thanks, > Oleg > --- > Advanced Software Technology Lab > Huawei > > -----Original Message----- > From: Slawek Kaplonski [mailto:skaplons at redhat.com] > Sent: Thursday, November 18, 2021 10:46 AM > To: Clark Boylan > Cc: openstack-discuss at lists.openstack.org > Subject: Re: [neutron][CI] How to reduce number of rechecks - brainstorming > > Hi, > > Thx Clark for the detailed explanation about that :) > > On środa, 17 listopada 2021 16:51:57 CET you wrote: > > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: > > > > Snip. I want to respond to a specific suggestion: > > > 3) there was informal discussion before about a possibility to > > > re-run only some jobs with a recheck instead of re-running the > > > whole set. I don't know if this is feasible with Zuul and I think > > > this only treats the symptom, not the root cause.
But still this could > > > be a direction if all else fails. > > > > OpenStack has configured its check and gate queues with something > > we've > called > > "clean check". This refers to the requirement that before an OpenStack > > project can be gated it must pass check tests first. This policy was > > instituted because a number of these infrequent but problematic issues > > were traced back to recheck spamming. Basically changes would show up > > and were broken. They would fail some percentage of the time. They got > > rechecked > until > > they finally merged and now their failure rate is added to the whole. > > This rule was introduced to make it more difficult to get this > > flakyness into the gate. > > > > Locking in test results is in direct opposition to the existing policy > > and goals. Locking results would make it far more trivial to land such > > flakyness as you wouldn't need entire sets of jobs to pass before you > could land. > > Instead you could rerun individual jobs until each one passed and then > > land the result. Potentially introducing significant flakyness with a > > single merge. > > > > Locking results is also not really something that fits well with the > > speculative gate queues that Zuul runs. Remember that Zuul constructs > > a future git state and tests that in parallel. Currently the state for > > OpenStack looks like: > > > > A - Nova > > ^ > > B - Glance > > ^ > > C - Neutron > > ^ > > D - Neutron > > ^ > > F - Neutron > > > > The B glance change is tested as if the A Nova change has already > > merged and so on down the queue. If we want to keep these speculative > > states we can't really have humans manually verify a failure can be > ignored and retry it. > > Because we'd be enqueuing job builds at different stages of > > speculative state. Each job build would be testing a different version > of the software. > > > > What we could do is implement a retry limit for failing jobs. 
Zuul > > could > rerun > > failing jobs X times before giving up and reporting failure (this > > would require updates to Zuul). The problem with this approach is > > without some oversight it becomes very easy to land changes that make > > things worse. As a side note Zuul does do retries, but only for > > detected network errors or when a pre-run playbook fails. The > > assumption is that network failures are due to the dangers of the > > Internet, and that pre-run playbooks are small, self contained, > > unlikely to fail, and when they do fail the failure should be > independent of what is being tested. > > > > Where does that leave us? > > > > I think it is worth considering the original goals of "clean check". > > We know that rechecking/rerunning only makes these problems worse in the > long term. > > They represent technical debt. One of the reasons we run these tests > > is to show us when our software is broken. In the case of flaky > > results we are exposing this technical debt where it impacts the > > functionality of our software. The longer we avoid fixing these issues > > the worse it gets, and > this > > is true even with "clean check". > > I agree with You on that and I would really like to find better/other > solution for the Neutron problem than rechecking only broken jobs as I'm > pretty sure that this would make things much worst quickly. > > > > > Do we as developers find value in knowing the software needs attention > before > > it gets released to users? Do the users find value in running reliable > > software? In the past we have asserted that "yes, there is value in > > this", and have invested in tracking, investigating, and fixing these > > problems even if they happen infrequently. But that does require > > investment, and active maintenance. > > > > Clark > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mnasiadka at gmail.com Mon Nov 29 10:14:56 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Mon, 29 Nov 2021 11:14:56 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <20211129091718.fdgdwmowtp3jxf6u@lyarwood-laptop.usersys.redhat.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <17d5d0a4f3f.dc8bb48c1384810.8642013929754348916@ghanshyammann.com> <17d5d476490.e30e73db1388640.4406614125429012751@ghanshyammann.com> <20211129091718.fdgdwmowtp3jxf6u@lyarwood-laptop.usersys.redhat.com> Message-ID: Hello, I'm strongly against dropping py36 support now, unless we're going to find a solution that works on CentOS Stream 8. RHEL 9 is not out, and probably will not be for months - how do we expect users to use Yoga on production deployments (where they use CentOS Linux/equivalents today)? Dropping the runtime testing and supporting devstack - and negotiating on a per-project basis whether to support py36 or not - is not a solution. Either Yoga supports py36 as a transition point/release to py38 - or not. In Kolla - we also did not anticipate (and don't think it's a good idea) having to support CentOS Stream 9 in the Yoga release. With the current decision - we are either forced into supporting CentOS Stream 9 (with no alternatives like Rocky Linux/Alma Linux in place - because RHEL 9 is not out) - or dropping CentOS support completely. If we pursue CS9 - we also need to support migration from CS8 to CS9, and that's also a considerable amount of work - which is unplanned.
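For readers following along: the knob being debated is the python_requires marker read from each project's setup.cfg. A minimal illustration — the project name and values here are hypothetical, not taken from any real repository; with pbr the key is typically spelled python-requires under [metadata], while plain setuptools uses python_requires under [options]:

```ini
[metadata]
name = example-project
summary = Example OpenStack-style project
# Exported as Requires-Python metadata; pip refuses to install the
# package on interpreters older than 3.8 once this is set.
python-requires = >=3.8
```

Once such a change lands, pip on a Python 3.6 interpreter (e.g. the default on CentOS Stream 8) aborts before installing, which is why bumping it amounts to a hard drop of py36 support rather than just a removal of test jobs.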
Best regards, Michal > On 29 Nov 2021, at 10:17, Lee Yarwood wrote: > > On 26-11-21 11:24:59, Ghanshyam Mann wrote: >> ---- On Fri, 26 Nov 2021 10:18:16 -0600 Ghanshyam Mann wrote ---- >>> ---- On Fri, 26 Nov 2021 10:05:15 -0600 Lee Yarwood wrote ---- >>>> On 26-11-21 09:37:44, Ghanshyam Mann wrote: >>>>> ---- On Fri, 26 Nov 2021 06:29:42 -0600 Lee Yarwood wrote ---- >>>>>> On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: >>>>>>> On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann >>>>>>> wrote: >>>>>>> >>>>>>>> ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < >>>>>>>> marcin.juszkiewicz at linaro.org> wrote ---- >>>>>>>>> W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: >>>>>>>>>> gmann has been helpfully proposing patches to change the >>>>>>>>>> versions of Python we're testing against in Yoga. I've >>>>>>>>>> suggested that we might want to bump 'python_requires' in >>>>>>>>>> 'setup.cfg' to indicate that we no longer support any version >>>>>>>>>> of Python before 3.8 >>>>>>>>> >>>>>>>>> CentOS Stream 8 has Python 3.6 by default and RDO team is doing >>>>>>>>> CS8 -> CS9 migration during Yoga cycle. Can we postpone it to Z >>>>>>>>> when there will be no distribution with Py 3.6 to care about? >>>>>> >>>>>> Stupid question that I should know the answer to but does RDO really >>>>>> support RPM based installations anymore? IOW couldn't we just workaround >>>>>> this by providing CS8 py38 based containers during the upgrade? >>>>>> >>>>>>> As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and >>>>>>> CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS >>>>>>> version upgrades in the past providing support for both releases in an >>>>>>> OpenStack version to ease the upgrade so I'd like to keep yoga working on >>>>>>> py3.6 included in CS8 and CS9. >>>>>> >>>>>> If this was the plan why wasn't it made clear to the TC before they >>>>>> dropped CS8 from the Yoga runtimes? 
Would it even be possible for the TC >>>>>> to add CS8 and py36 back in to the Yoga runtimes? >>>>>> >>>>>>>> Postponing to Z, you mean dropping the py3.6 tests or bumping it in >>>>>>>> in 'setup.cfg' so that no one can install on py3.6 ? >>>>>>>> >>>>>>>> First one we already did and as per Yoga testing runtime we are >>>>>>>> targeting centos9-stream[1] in Yoga itself. >>>>>>>> >>>>>>>> For making 'python_requires' >=py3.8 in 'setup.cfg', I have no >>>>>>>> string opinion on this but I prefer to have flexible here that 'yes >>>>>>>> OpenStack is installable in py3.6 but we do not test it anymore from >>>>>>>> Yoga onwards so no guarantee'. Our testing runtime main goal is >>>>>>>> that we document the version we are testing *at least* which means >>>>>>>> it can work on lower or higher versions too but we just do not test >>>>>>>> them. >>>>>>>> >>>>>>> >>>>>>> May it be possible to keep py3.6 jobs to make sure patches are not >>>>>>> introducing py3.8-only features that would break deployment in CS8? >>>>>> >>>>>> We should keep CS8 and py36 as supported runtimes if we are keeping the >>>>>> jobs, otherwise this just sets super confusing. >>>>> >>>>> Yeah, I think it create confusion as I can see in this ML thread so >>>>> agree on keeping 'python_requires' also in sycn with what we test. >>>> >>>> Cool thanks! >>>> >>>>> Now question on going back to centos stream 8 support in Yoga, is it >>>>> not centos stream 9 is stable released or is it experimental only? If >>>>> stable then we can keep the latest available version which can be >>>>> centos stream 9. >>>> >>>> I honestly don't know and can't find any docs to point to. >>>> >>>>> Our project interface testing doc clearly stats 'latest LTS' to >>>>> consider for testing[1] whenever we are ready. 
I am not very strongly >>>>> against of reverting back to centos stream 8 but we should not add two >>>>> version of same distro in testing which can be a lot of we consider >>>>> below three distro >>>> >>>> How do we expect operators to upgrade between Xena where CentOS 8 stream >>>> is a supported runtime and Yoga where CentOS 9 stream is currently the >>>> equivalent supported runtime without supporting both for a single >>>> release? >>> >>> This is really good question on upgrade testing we do at upstream and I remember >>> it cameup and discussed a lot during py2.7 drop also that how we are testing the >>> upgrade from py2.7 to py3. Can we do in grenade? But that we answered as we did >>> not tested directly but stein and train tested both version so should not be any issue >>> if you upgrade from there (one of FAQ in my blog[1]). >>> >>> But on distro upgrade testing, as you know we do not test those in upstream neither >>> in grenade where upgrade are done on old node distro only not from old distro version to >>> new distro version with new code. It is not like we do not want to test but if anyone >>> from any distro would like to setup grenade for that and maintain then we are more happy. >>> In summary, yes we cannot guarantee distro upgrade testing from OpenStack upstream testing >>> due to resource bandwidth issue but we will welcome any help here. >> >> We discussed with amoralej about moving the testing runtime to CentOS >> stream 8 and py36 or not in TC IRC channel[1]. >> >> As we at upstream do not test distro two versions in same release, >> amoralej agreed to keep CentOS stream 9 if one to choose which is our >> current testing runtime is. So no change in the direction of current >> testing runtime and dropping the py3.6 but there is possibility of >> some trade off here. If any py3.6 breaking changes are happening then >> it is up to projects goodness, bandwidth, or flexibility about >> accepting the fix or not or even add a py36 unit test job. 
As our >> testing runtime is the minimum things to test and it does not put any >> max limit of testing, any project can extend their testing as per >> their bandwidth. >> >> In summary: >> >> (This is what we agreed today in TC channel but as most of the folks >> are on leave today, I will keep it open until next week so see if any >> objections from the community and will conclude it accordingly) >> >> * No change in Yoga testing runtime and we move to cs9 and drop py36. >> * We will not put hard stop on cs8 support and we can: >> ** Devstack keep supporting cs8 in Yoga >> ** It can be negotiated with project to add py36 job or fix if any >> py36 breaking changes are observed by RDO (or any distro interested in >> py36) but it depends on the project decision and bandwidth. >> >> As next, how we can improve the upgrade testing from distro versions >> is something we will explore next and see what all we can test to make >> upgrade easier. > > I'm against this, as I said in my setup.cfg >= py38 review for > openstack/nova [1] we either list and support runtimes or don't. If RDO > and others need CentOS 8 Stream support for a release then lets include > it and py36 still for Yoga and make things explicit. > > As I've said elsewhere I think the TC really need to adjust their > thinking on this topic and allow for one OpenStack release where both > the old and new LTS distro release are supported. Ensuring we allow > people to actually upgrade in place and later handle the distro upgrade > itself. > > Cheers, > > Lee > > [1] https://review.opendev.org/c/openstack/nova/+/819415 > > -- > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From michal.arbet at ultimum.io Mon Nov 29 10:47:47 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Mon, 29 Nov 2021 11:47:47 +0100 Subject: [neutron]Some floating IPs inaccessible after restart of L3 agent In-Reply-To: References: Message-ID: I am glad that it works for you now :) Michal Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook po 29. 11. 2021 v 10:35 odesílatel Kamil Madáč napsal: > Ahoj Michal, > > Thanks for responding and for the suggestion. During the weekend I upgraded the > neutron l3 agent to the most recent victoria version of the kolla container > (17.2.2.dev56) and it seems it helped -> no disappearing routes in the fip > namespace anymore after restart. > > I found a change set which fixes a race condition in the l3 agent, > https://review.opendev.org/c/openstack/neutron/+/803576, from September > this year, and I think that could be the one which fixes it. > > ------------------------------ > *From:* Michal Arbet > *Sent:* Monday, November 29, 2021 10:20 AM > *To:* Kamil Madáč > *Cc:* openstack-discuss > *Subject:* Re: [neutron]Some floating IPs inaccessible after restart of > L3 agent > > Ahoj Kamil, > > I've just read the email on my phone quickly, and I remember that I fixed > something similar in the Debian Victoria packages. Maybe it's your issue, but > I can't check right now. > > Could you check it? It's fixed in newer versions of neutron. > > https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1927868 > > Thanks, > Michal Arbet (kevko) > > Dňa pi 26. 11. 2021, 10:53 Kamil Madáč > napísal(a): > > Hello Everyone, > > We have had OpenStack Victoria deployed since the beginning of the year with > kolla/ansible in docker containers. Everything was running OK, but a few > weeks ago we noticed issues with networking. Our installation uses > Openvswitch networking with DVR non-HA routers.
> > Everything runs smoothly until we restart the L3 agent. After that, some > floating IPs of VMs running on the node where the L3 agent is running become > inaccessible. The workaround is to reassign the floating IP to the affected VM. Every > restart affects the same floating IPs and VMs. > > No errors/exceptions found in logs. > > I was able to find out that after the restart there are missing routes for > those particular floating IPs in the fip- namespace, which causes proxy > arp responses to stop working. After the floating IP address is reassigned, > routes are added by the L3 agent and the floating IP works again. > > It looks like some sort of race condition in the L3 agent, but I was not able to > identify any possible existing bug. > > The L3 agent is at version 17.0.1.dev44. > > Is anyone aware of any existing bug which could explain such behavior, or > does anyone have an idea how to solve the issue? > > Kamil Madáč > *Slovensko IT a.s.* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Nov 29 11:26:21 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 29 Nov 2021 11:26:21 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <20211126162013.fpwlss5h7rz2xzar@yuggoth.org> Message-ID: <75efd4f10601cc19a45c7d8600c3c2a59b88c322.camel@redhat.com> On Fri, 2021-11-26 at 17:45 +0100, Dmitry Tantsur wrote: > On Fri, Nov 26, 2021 at 5:26 PM Jeremy Stanley wrote: > > > On 2021-11-26 16:05:15 +0000 (+0000), Lee Yarwood wrote: > > [...]
> > > How do we expect operators to upgrade between Xena where CentOS 8 stream > > > is a supported runtime and Yoga where CentOS 9 stream is currently the > > > equivalent supported runtime without supporting both for a single > > > release? > > > > > > I appreciate it bloats the support matrix a little but the rest of the > > > thread suggests we need to keep py36 around for now anyway. > > > > Somehow we manage to get by with testing only one Ubuntu version per > > OpenStack release. > > > > I'm quite sure there is always an overlap, otherwise grenade would not work. in the case of centos the overlap is py39. centos 9 will have py39+, and there was non-voting support for py39 for a few releases, so at this point yoga, xena and even wallaby should be functional with py39. grenade for the most part does not test the underlying OS > > Dmitry > > > > -- > > Jeremy Stanley > > > > From dtantsur at redhat.com Mon Nov 29 11:28:47 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Mon, 29 Nov 2021 12:28:47 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <75efd4f10601cc19a45c7d8600c3c2a59b88c322.camel@redhat.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <20211126162013.fpwlss5h7rz2xzar@yuggoth.org> <75efd4f10601cc19a45c7d8600c3c2a59b88c322.camel@redhat.com> Message-ID: On Mon, Nov 29, 2021 at 12:26 PM Sean Mooney wrote: > On Fri, 2021-11-26 at 17:45 +0100, Dmitry Tantsur wrote: > > On Fri, Nov 26, 2021 at 5:26 PM Jeremy Stanley > wrote: > > > > > On 2021-11-26 16:05:15 +0000 (+0000), Lee Yarwood wrote: > > > [...]
> > > > How do we expect operators to upgrade between Xena where CentOS 8 > stream > > > > is a supported runtime and Yoga where CentOS 9 stream is currently > the > > > > equivalent supported runtime without supporting both for a single > > > > release? > > > > > > > > I appreciate it bloats the support matrix a little but the rest of > the > > > > thread suggests we need to keep py36 around for now anyway. > > > > > > Somehow we manage to get by with testing only one Ubuntu version per > > > OpenStack release. > > > > > > > I'm quite sure there is always an overlap, otherwise grenade would not > work. > in the case fo centso the overlap is py39. > centos 9 will have py39+ and there was nonvoting support for py 39 for a > few release > at this point so yoga,xena and even wallaby should be fucntional with py39. > I don't think grenade switches Python mid-upgrade. And as I explain somewhere else in this thread, the non-default Pythons in CentOS 8 are pretty rudimental and don't work for us (neither Bifrost nor Metal3). Dmitry > > grenade for the most point does nto test the underlying OS > > > > Dmitry > > > > > > > -- > > > Jeremy Stanley > > > > > > > > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From adivya1.singh at gmail.com Mon Nov 29 11:33:33 2021 From: adivya1.singh at gmail.com (Adivya Singh) Date: Mon, 29 Nov 2021 17:03:33 +0530 Subject: Regarding Drop down of frequently used stacks in OpenStack Wallaby Version 23.1.2 In-Reply-To: References: Message-ID: Dear Team, Any feedback on this? Regards Adivya Singh 959098094 On Fri, Nov 26, 2021 at 11:16 PM Adivya Singh wrote: > Dear Team, > > I want a drop-down of frequently used stacks in my orchestration --> Launch > Stack parameter in the openstack version given above. Is there any solution for > this, and how should I go ahead with it? > > Regards > Adivya Singh > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Nov 29 11:38:17 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 29 Nov 2021 11:38:17 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> <17d5cdde95a.b95752d01381326.4462262438602968824@ghanshyammann.com> Message-ID: On Fri, 2021-11-26 at 19:48 +0100, Dmitry Tantsur wrote: > On Fri, Nov 26, 2021 at 6:14 PM Alfredo Moralejo Alonso > wrote: > > > > > > > On Fri, Nov 26, 2021 at 4:44 PM Dmitry Tantsur > > wrote: > > > > > > > > > > > On Fri, Nov 26, 2021 at 4:35 PM Ghanshyam Mann > > > wrote: > > > > > > > ---- On Fri, 26 Nov 2021 09:20:39 -0600 Dmitry Tantsur < > > > > dtantsur at redhat.com> wrote ---- > > > > > > > > > > > > > > > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley > > > > wrote: > > > > > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: > > > > > [...] > > > > > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. > > > > > [...] > > > > > > > > > > Is this still true for CentOS Stream 9? The TC decision was to > > > > > support that instead of CentOS Stream 8 in Yoga. > > > > > > > > > > No. But Stream 9 is pretty much beta, so it's not a replacement for > > > > > us (and we don't have nodes in nodepool with it even yet?).
> > > > > > > > I think here is the confusion. In TC, after checking with centos team > > > > impression was CentOS stream 9 is released and that is > > > > what we should update In OpenStack testing. And then only we updated the > > > > centos stream 8 -> 9 and dropped py3.6 testing > > > > > > > > - > > > > https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst > > > > > > > > > > I think there is an enormous perception gap between the CentOS team and > > > the rest of the world. > > > > > > > > So, CentOS Stream 9 was released, in the official mirrors and usable since > > some weeks ago[1][2]. We shouldn't consider it beta or something like that. > > > > As mentioned, support for diskimage-builder has been introduced for CS9 > > and there are nodepool nodes ready for it. From RDO, we are providing RPMs > > for master branch content on CentOS Stream 9 [3] and actually we have been > > doing some tests. Actually, we have recently merged new jobs in > > puppet-openstack[4]. > > > > "It's usable since some weeks ago and we even added tests today" is not > exactly reassuring :) The PTI uses wording "stable and LTS", which applies > to Stream 9 no more than it applies to Fedora. thats not quite true. yes centos 9 is a roling release but it is more stable then fedroa since packages landing in centos 9 stream have been stablised via fedroa already and any argurments in this regard would also apply to centos 8 stream. The only reason that centos 8 stream would be more stable then 9 stream is due to less frequent updates as focus moves to 9 stream. 9 stream is effectivly a preview of what whill be rhel 9. > > In the end, what we test with Bifrost is what we will recommend people to > deploy it in production on. I do believe people can and should deploy on > Stream 8 despite all the FUD around it, but I cannot do it for Stream 9 > until RHEL 9 is out. why just because rhel 9.0 is relased does not mean centos 9 is sudennly more stable. 
Now that centos 9 has been release there shoudl be no more package removalas form centos/rhel so it should have stableised in terms of the minium package set and over time we woudl expect more pacakges to be added. yes centos 8 stream will be supported until the EOL or rhel 8 so people can continue to deploy it. rhel 9 will be released next year, perhaps not before yoga is released but if you are deploying RDO you will not be useing RHEL anyway you will be using centos so stream 9 is the better plathform to use if you plan to continue to upgrade the deploy ment over the next few year as it allow you to avoid the costly OS upgrade when moving to the next openstack relesase. > > Dmitry > > > > > > [1] https://www.centos.org/centos-stream/ > > > > [2] https://cloud.centos.org/centos/9-stream/x86_64/images/ > > [3] https://trunk.rdoproject.org/centos9-master/report.html > > [4] > > https://review.opendev.org/c/openstack/puppet-openstack-integration/+/793462 > > > > Alfredo > > > > > > > > > Dmitry > > > > > > > > > > > > > > > > > > > > > > -gmann > > > > > > > > > Dmitry > > > > > -- > > > > > Jeremy Stanley > > > > > > > > > > > > > > > -- > > > > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > > > > > Commercial register: Amtsgericht Muenchen, HRB 153243, > > > > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, > > > > Michael O'Neill > > > > > > > > > > > > > > > > > > > -- > > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > > > Commercial register: Amtsgericht Muenchen, HRB 153243, > > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > > > O'Neill > > > > > > From dtantsur at redhat.com Mon Nov 29 11:46:52 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Mon, 29 Nov 2021 12:46:52 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> <17d5cdde95a.b95752d01381326.4462262438602968824@ghanshyammann.com> 
Message-ID: On Mon, Nov 29, 2021 at 12:38 PM Sean Mooney wrote: > On Fri, 2021-11-26 at 19:48 +0100, Dmitry Tantsur wrote: > > On Fri, Nov 26, 2021 at 6:14 PM Alfredo Moralejo Alonso < > amoralej at redhat.com> > > wrote: > > > > > > > > > > > On Fri, Nov 26, 2021 at 4:44 PM Dmitry Tantsur > > > wrote: > > > > > > > > > > > > > > > On Fri, Nov 26, 2021 at 4:35 PM Ghanshyam Mann < > gmann at ghanshyammann.com> > > > > wrote: > > > > > > > > > ---- On Fri, 26 Nov 2021 09:20:39 -0600 Dmitry Tantsur < > > > > > dtantsur at redhat.com> wrote ---- > > > > > > > > > > > > > > > > > > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley < > fungi at yuggoth.org> > > > > > wrote: > > > > > > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: > > > > > > [...] > > > > > > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. > > > > > > [...] > > > > > > > > > > > > Is this still true for CentOS Stream 9? The TC decision was to > > > > > > support that instead of CentOS Stream 8 in Yoga. > > > > > > > > > > > > No. But Stream 9 is pretty much beta, so it's not a replacement > for > > > > > us (and we don't have nodes in nodepool with it even yet?). > > > > > > > > > > I think here is the confusion. In TC, after checking with centos > team > > > > > impression was CentOS stream 9 is released and that is > > > > > what we should update In OpenStack testing. And then only we > updated the > > > > > centos stream 8 -> 9 and dropped py3.6 testing > > > > > > > > > > - > > > > > > https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst > > > > > > > > > > > > > I think there is an enormous perception gap between the CentOS team > and > > > > the rest of the world. > > > > > > > > > > > So, CentOS Stream 9 was released, in the official mirrors and usable > since > > > some weeks ago[1][2]. We shouldn't consider it beta or something like > that. 
> > > > > > As mentioned, support for diskimage-builder has been introduced for CS9 > > > and there are nodepool nodes ready for it. From RDO, we are providing > RPMs > > > for master branch content on CentOS Stream 9 [3] and actually we have > been > > > doing some tests. Actually, we have recently merged new jobs in > > > puppet-openstack[4]. > > > > > > > "It's usable since some weeks ago and we even added tests today" is not > > exactly reassuring :) The PTI uses wording "stable and LTS", which > applies > > to Stream 9 no more than it applies to Fedora. > thats not quite true. > yes centos 9 is a roling release but it is more stable then fedroa since > packages landing in centos 9 stream have been > stablised via fedroa already and any argurments in this regard would also > apply to centos 8 stream. > Stream 8 is part of an already released and maintained RHEL, hence I give it a certain benefit of doubt. More on this below. > > The only reason that centos 8 stream would be more stable then 9 stream is > due to less frequent updates as focus moves to 9 stream. > 9 stream is effectivly a preview of what whill be rhel 9. > > > > In the end, what we test with Bifrost is what we will recommend people to > > deploy it in production on. I do believe people can and should deploy on > > Stream 8 despite all the FUD around it, but I cannot do it for Stream 9 > > until RHEL 9 is out. > > why just because rhel 9.0 is relased does not mean centos 9 is sudennly > more stable. > Okay, you convinced me, I won't recommend Stream 9 at all :) Kidding aside, we know that each major branch of RHEL offers a certain degree of compatibility. It's not expected that 8.N+1 breaks a lot of stuff from 8.N, hence it's not expected that Stream between them will break anything (modulo bugs) either. I have no idea what and how gets into Stream 9 now, nor will I risk recommending it for production. 
Dmitry > Now that centos 9 has been release there shoudl be no more package > removalas form centos/rhel so > it should have stableised in terms of the minium package set and over time > we woudl expect more pacakges to be added. > yes centos 8 stream will be supported until the EOL or rhel 8 so people > can continue to deploy it. > rhel 9 will be released next year, perhaps not before yoga is released but > if you are deploying RDO you will not be useing > RHEL anyway you will be using centos so stream 9 is the better plathform > to use if you plan to continue to upgrade the deploy > ment over the next few year as it allow you to avoid the costly OS upgrade > when moving to the next openstack relesase. > > > > > Dmitry > > > > > > > > > > [1] https://www.centos.org/centos-stream/ > > > > > > [2] https://cloud.centos.org/centos/9-stream/x86_64/images/ > > > [3] https://trunk.rdoproject.org/centos9-master/report.html > > > [4] > > > > https://review.opendev.org/c/openstack/puppet-openstack-integration/+/793462 > > > > > > Alfredo > > > > > > > > > > > > > Dmitry > > > > > > > > > > > > > > > > > > < > https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst > > > > > > > > > > > > -gmann > > > > > > > > > > > Dmitry > > > > > > -- > > > > > > Jeremy Stanley > > > > > > > > > > > > > > > > > > -- > > > > > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: > Grasbrunn, > > > > > > Commercial register: Amtsgericht Muenchen, HRB 153243, > > > > > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, > > > > > Michael O'Neill > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > > > > Commercial register: Amtsgericht Muenchen, HRB 153243, > > > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, > Michael > > > > O'Neill > > > > > > > > > > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial 
register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From openstack at a.spamming.party Mon Nov 29 12:21:52 2021
From: openstack at a.spamming.party (Jean-Philippe Evrard)
Date: Mon, 29 Nov 2021 13:21:52 +0100
Subject: [all][tc] Relmgt team position on release cadence
In-Reply-To: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org>
References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org>
Message-ID: <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com>

Hello Thierry (and all others),

First, thanks for the recap.

On Fri, Nov 5, 2021, at 15:26, Thierry Carrez wrote:
> The (long) document below reflects the current position of the release
> management team on a popular question: should the OpenStack release
> cadence be changed? Please note that we only address the release
> management / stable branch management facet of the problem. There are
> other dimensions to take into account (governance, feature deprecation,
> supported distros...) to get a complete view of the debate.

I think it's time to have a conversation with all the parties to progress forward, taking more than one dimension into account. It would be sad if we couldn't progress all together.

> The main pressure to release more often is to make features available to
> users faster. Developers get a faster feedback loop, hardware vendors
> ensure software is compatible with their latest products, and users get
> exciting new features. "Release early, release often" is a best practice
> in our industry -- we should generally aim at releasing as often as
> possible.

My view is that we are in a place where OpenStack projects are very well tested together nowadays. This test coverage reduces the need for "coordinated releases" with larger testing... to the point that some operators are (al)ready to consume the master branch.
So, for those in need of the latest features early (in a long term fashion), there are two choices, regardless of release & branching cycle: Stay on that rolling forward branch (master), or manage your own fork of the code. That choice wasn't really possible in the early days of openstack without taking larger risks. > But that is counterbalanced by pressure to release less often. From a > development perspective, each release cycle comes with some process > overhead. On the integrators side, a new release means packaging and > validation work. On the users side, it means pressure to upgrade. To > justify that cost, there needs to be enough user-visible benefit (like > new features) in a given release. Very good summary. > For the last 10 years for OpenStack, that balance has been around six > months. Six months let us accumulate enough new development that it was > worth upgrading to / integrating the new version, while giving enough > time to actually do the work. It also aligned well with Foundation > events cadence, allowing to synchronize in-person developer meetings > date with start of cycles. I think we're hitting something here. > The major recent change affecting this trade-off is that the pace of new > development in OpenStack slowed down. The rhythm of changes was divided > by 3 between 2015 and 2021, reflecting that OpenStack is now a mature > and stable solution, where accessing the latest features is no longer a > major driver. That reduces some of the pressure for releasing more > often. At the same time, we have more users every day, with larger and > larger deployments, and keeping those clusters constantly up to date is > an operational challenge. That increases the pressure to release less > often. In essence, OpenStack is becoming much more like a LTS > distribution than a web browser -- something users like moving slow. > Over the past years, project teams also increasingly decoupled > individual components from the "coordinated release". 
> More and more
> components opted for an independent or intermediary-released model,
> where they can put out releases in the middle of a cycle, making new
> features available to their users. This increasingly opens up the
> possibility of a longer "coordinated release" which would still allow
> development teams to follow "release early, release often" best
> practices. All that recent evolution means it is (again) time to
> reconsider if the 6-month cadence is what serves our community best, and
> in particular if a longer release cadence would not suit us better.

Again, thanks to the increase in testability (projects tested together), it could be time for us to step away from the whole model of coordinated releases, which is IMO part of this problem. I feel it's okay for a project to release when it is ready / has something to release. What's holding us back from doing that? Again, if we stop pushing this "artificial" release model, we'll stop the branching efforts that cause overhead work. It's not bringing any value to the ecosystem anymore.

> While releasing less often would definitely reduce the load on the
> release management team, most of the team work being automated, we do
> not think it should be a major factor in motivating the decision. We
> should not adjust the cadence too often though, as there is a one-time
> cost in switching our processes. In terms of impact, we expect that a
> switch to a longer cycle will encourage more project teams to adopt a
> "with-intermediary" release model (rather than the traditional "with-rc"
> single release per cycle), which may lead to abandoning the latter,
> hence simplifying our processes. Longer cycles might also discourage
> people to commit to PTL or release liaison work. We'd probably need to
> manage expectations there, and encourage more frequent switches (or
> create alternate models).

I feel it's okay to reduce the cadence of a 'coordinated release' to a year, from a consumer perspective.
However, I think it's not the right path forward _without other changes_ (see my comment above, and the reduction in the number of branches). If the release work hasn't changed since I was still on the team, a longer cycle means more patches to review inside a single release. Of course, the lower activity in OpenStack has a good counter-balancing effect here. I just believe it's better to release _more often_ than not. But _branching_ should be reduced as much as possible; that's the costly part (from what I have seen, tell me if I am wrong). I don't see any value in making longer releases for the sake of it. I don't see the reason to multiply the number of branches and upgrade paths to maintain.

> If the decision is made to switch to a longer cycle, the release
> management team recommends to switch to one year directly. That would
> avoid changing it again anytime soon, and synchronizing on a calendar
> year is much simpler to follow and communicate. We also recommend
> announcing the change well in advance. We currently have an opportunity
> of making the switch when we reach the end of the release naming
> alphabet, which would also greatly simplify the communications around
> the change.

Wouldn't it be easier to reduce branching altogether, branching only when necessary, and let projects branch when they need to? If we define strict rules for branching (and limit the annoying bits for the consumers), it will increase the quality of the ecosystem IMO. It will also be easier to manage from a packager perspective. Next to that, indeed, a "coordinated release" once a year sounds like a good idea for our users ("I am using OpenStack edition 2021").

> Finally, it is worth mentioning the impact on the stable branch work.
> Releasing less often would likely impact the number of stable branches
> that we keep on maintaining, so that we do not go too much in the past
> (and hit unmaintained distributions or long-gone dependencies).
> We
> currently maintain releases for 18 months before they switch to extended
> maintenance, which results in between 3 and 4 releases being maintained
> at the same time. We'd recommend switching to maintaining one-year
> releases for 24 months, which would result in between 2 and 3 releases
> being maintained at the same time. Such a change would lead to longer
> maintenance for our users while reducing backporting work for our
> developers.

With people churn, the work will be even harder to maintain. I think, however, that this only delays the problem: we are not _fixing_ the base need. Managing upgrades across 2-3 releases of a complete OpenStack stack of projects would be an increased effort for maintainers, just done less frequently. For maintainers, it makes more sense to phase work organically, based on project needs. If you are thinking of distros, having to manage all the work when a release is out requires far more coordination than if things were released over time.

My experience at SUSE was that the branching model is even debatable: it was more work, and after all, we were taking the code we wanted and putting our patches on top if those didn't make it upstream / weren't backported on time for x reasons (valid or not ;)). So basically, for me, the stable branches have very little value nowadays from the community perspective (it would be good enough if everybody were fixing master, IMO). I am not sure I am the only one seeing it that way. I still feel it's worth documenting.

From the "refstack" (or whatever it's called now) perspective, an 'OpenStack Powered Platform xx' is still possible with this model. We need to define a yearly baseline of the versions of the software we expect, the APIs that that software exposes, and the testing around them. No need for branching, "release often" still works, projects are autonomous / owners of their destiny, and we keep the coordination.
Sorry for the long post for only my $0.02 ;) Regards, Jean-Philippe Evrard (evrardjp) From fungi at yuggoth.org Mon Nov 29 13:09:58 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 29 Nov 2021 13:09:58 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com> Message-ID: <20211129130957.3poysdeqsufkp3pg@yuggoth.org> On 2021-11-29 13:21:52 +0100 (+0100), Jean-Philippe Evrard wrote: [...] > My experience at SUSE was that the branching model is even > debatable: It was more work, and after all, we were taking the > code we wanted, and put our patches on top if those didn't make > upstream/weren't backported on time for x reasons (valid or not > ;)). So basically, for me, the stable branches have very little > value nowdays from the community perspective (it would be good > enough if everybody is fixing master, IMO). [...] The primary reason stable branches exist is to make it easier for us to test and publish backports of critical patches to older versions of the software, rather than expecting our downstream consumers to do that work themselves. If you're saying distribution package maintainers are going to do it anyway and ignore our published backports, then dropping the branching model may make sense, but I've seen evidence to suggest that at least some distros do consume our backports directly. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Mon Nov 29 13:41:27 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 29 Nov 2021 13:41:27 +0000 Subject: [heat][horizon] Regarding Drop down of frequently used stacks in OpenStack Wallaby Version 23.1.2 In-Reply-To: References: Message-ID: <20211129134127.gau2mgiyy2lq2x6c@yuggoth.org> On 2021-11-29 17:03:33 +0530 (+0530), Adivya Singh wrote: > Any feedback on this [...] Typically, a lack of response indicates either that nobody who read your message knows an answer, or that you didn't provide nearly enough context for anyone to even know what your question is. In this case, at least for me, I have no idea what you're trying to ask. A drop-down where (in the Horizon WebUI perhaps), and what do you mean by "stack" in this context? My best guess is that you want to boot multiple servers at the same time in a specific configuration. Normally this would be done by the Heat orchestration subsystem, so putting together those assumptions it seems like your question might be specific to the heat-dashboard plugin for Horizon. Based on the above, I recommend re-asking your question in more precise terms, and including the [heat] and [horizon] topic tags like I've done in the subject of this reply. They're already included in this mailing list's description, but I'll repeat for convenience: please see https://docs.openstack.org/project-team-guide/open-community.html#mailing-lists for guidance on how to apply topic tags in subject lines, and https://wiki.openstack.org/wiki/MailingListEtiquette for general netiquette advice. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at a.spamming.party Mon Nov 29 13:43:29 2021 From: openstack at a.spamming.party (Jean-Philippe Evrard) Date: Mon, 29 Nov 2021 14:43:29 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <20211129130957.3poysdeqsufkp3pg@yuggoth.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com> <20211129130957.3poysdeqsufkp3pg@yuggoth.org> Message-ID: On Mon, Nov 29, 2021, at 14:09, Jeremy Stanley wrote: > The primary reason stable branches exist is to make it easier for us > to test and publish backports of critical patches to older versions > of the software, rather than expecting our downstream consumers to > do that work themselves. If you're saying distribution package > maintainers are going to do it anyway and ignore our published > backports, then dropping the branching model may make sense, but > I've seen evidence to suggest that at least some distros do consume > our backports directly. Don't get me wrong, SUSE is consuming those backports, and (at least was) contributing to them. And yes, I doubt that RH/SUSE/Canonical are simply consuming those packages without ever adding their patches on a case by case basis. So yes, those distros are already doing part of their work downstream (and/or upstream). And for a valid reason: it's part of their job :) Doesn't mean we, as a whole community, still need to cut the work for every single consumer. If we are stretched thin, we need to define priorities. I believe our aggressive policy in terms of branching is hurting the rest of the ecosystem, that's why I needed to say things out loud. I meant the less we branch, the less we backport, the less painful upgrades we have to deal with. It depends on our definition of _when to branch_ of course. Your example of a "critical patch" might be a good reason to branch. 
We are maybe in a place where this can be on a case by case basis, or that we should improve that definition? Regards, JP From smooney at redhat.com Mon Nov 29 14:02:04 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 29 Nov 2021 14:02:04 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <20211129130957.3poysdeqsufkp3pg@yuggoth.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com> <20211129130957.3poysdeqsufkp3pg@yuggoth.org> Message-ID: <5010f8d3cf5461afa4656c9c607ea73a5be326df.camel@redhat.com> On Mon, 2021-11-29 at 13:09 +0000, Jeremy Stanley wrote: > On 2021-11-29 13:21:52 +0100 (+0100), Jean-Philippe Evrard wrote: > [...] > > My experience at SUSE was that the branching model is even > > debatable: It was more work, and after all, we were taking the > > code we wanted, and put our patches on top if those didn't make > > upstream/weren't backported on time for x reasons (valid or not > > ;)). So basically, for me, the stable branches have very little > > value nowdays from the community perspective (it would be good > > enough if everybody is fixing master, IMO). > [...] > > The primary reason stable branches exist is to make it easier for us > to test and publish backports of critical patches to older versions > of the software, rather than expecting our downstream consumers to > do that work themselves. If you're saying distribution package > maintainers are going to do it anyway and ignore our published > backports, then dropping the branching model may make sense, but > I've seen evidence to suggest that at least some distros do consume > our backports directly. just speaking form personal experince backporting patches upstream and downstream for redhat osp. 
I have much, much higher confidence in backporting patches downstream by first backporting them upstream via the stable branches, due to the significantly better upstream CI that runs before the patch is merged. Most of our downstream CI happens after the code is merged, as part of a unified build/compose that is then tested by our QE, often several weeks after it merged, before the release of our next downstream .z release. We have some per-patch CI, but it's really minimal in comparison to the test coverage we have upstream on stable branches, and it's also more work to do a downstream-only backport or pre-emptive downstream backport anyway.

Since we skip releases downstream, if I want to do a downstream-only backport (e.g. because it's a feature), I have to backport across 3+ releases to Train in one go, which is way harder than resolving conflicts per release. If I'm doing both an upstream backport and a pre-emptive downstream backport (to not have to wait for upstream to merge), it's also kind of a pain: if I need to make revisions upstream, we will get a merge conflict the next time our downstream branch is rebased. So basically, if I can, I will always do an upstream-only backport and wait for the change to be synced downstream via an import.

To me the stable branches provide great value -- even more value if we allowed feature backports, as that would eliminate the need for us to carry those downstream. If we could backport features, we could almost avoid downstream branches entirely for everything other than perhaps CVE fixes or other very rare cases. Even in their current state, however, I strongly think our stable branches add value, and they are a compelling aspect of our community. Not all open source projects maintain upstream stable branches as we do, and in such cases you are often forced to choose between running the package from your distro or the project directly to get the fixes and features you need.
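[Editor's note: the upstream-first cherry-pick flow described above can be sketched with a throwaway repository. All names here (repo, svc.txt, stable/xena) are invented for the demo; the real OpenStack workflow proposes the cherry-pick to Gerrit with `git review stable/<series>` rather than merging locally.]

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.email demo@example.com
git config user.name "Demo"
echo "v1" > svc.txt
git add svc.txt
git commit -qm "Initial release"
git branch stable/xena                 # cut the stable branch at the release point
echo "fix" >> svc.txt
git commit -qam "Fix critical bug"     # the fix lands on the development branch first
fix_sha=$(git rev-parse HEAD)
git checkout -q stable/xena
git cherry-pick -x "$fix_sha"          # -x appends "(cherry picked from commit ...)"
git log -1 --format=%B                 # commit message now records the original SHA
```

The `-x` bookkeeping is what lets reviewers and downstream importers trace a stable-branch commit back to the master commit it came from.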
While we do eventually stop importing from upstream into our downstream packages, OSP 13 z15, which released in March this year, was a full import from stable/queens, and the OSP 16.2.2 release next year will also be a full import from stable/train, once our cherry-pick-only release 16.2.1 is out the door to customers. While we do carry patches downstream which must be rebased on top of upstream every time we import, upstream stable adds a lot of value, and we mitigate the overhead of downstream patches by applying a very strict feature backport policy, which basically amounts to no api, db, rpc, or versioned object changes. You would be surprised how many features can still be backported with those restrictions, but it avoids the upgrade and interoperability impact of most feature backports.

tl;dr: please let's keep the upstream stable branches for as long as people are willing to maintain them.

From ildiko.vancsa at gmail.com Mon Nov 29 14:06:58 2021
From: ildiko.vancsa at gmail.com (Ildiko Vancsa)
Date: Mon, 29 Nov 2021 06:06:58 -0800
Subject: [neutron][networking][ipv6][dns][ddi] Upcoming OpenInfra Edge Computing Group sessions
In-Reply-To: <5993A296-7D67-444B-8D2F-2544C42C984A@gmail.com>
References: <5993A296-7D67-444B-8D2F-2544C42C984A@gmail.com>
Message-ID: <7F4DD655-C819-4FD6-B022-EEA3FFE33A8B@gmail.com>

Hi,

It is a friendly reminder that the DNS discussion is now on!! If you are interested in participating please join here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings

Thanks and Best Regards,
Ildikó

> On Nov 26, 2021, at 16:34, Ildiko Vancsa wrote:
> 
> Hi,
> 
> It is a friendly reminder that we have the Networking and DNS discussion coming up at the OpenInfra Edge Computing Group weekly call on Monday (November 29) at 6am PST / 1400 UTC.
> 
> We have invited industry experts, Cricket Liu and Andrew Wertkin, to share their thoughts and experience in this area. But, we also need YOU to join to turn the session into a lively discussion and debate!
> 
> We have another networking related session the following week as well: December 6th - Networking and IPv6 discussion with Ed Horley
> 
> Beyond the above topics we will have presentations from groups and projects such as Smart Edge and the Industry IoT Consortium (Formerly Industrial Internet Consortium).
> 
> For the full schedule please see the edge WG's wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group#Upcoming_Topics
> 
> Our calls happen every Monday at 6am Pacific Time / 1400 UTC on Zoom. You can find all the meeting details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings
> 
> Please let me know if you have any questions about the working group or any of the upcoming sessions.
> 
> Thanks and Best Regards,
> Ildikó
> 
> 
> 
>> On Nov 3, 2021, at 18:12, Ildiko Vancsa wrote:
>> 
>> Hi,
>> 
>> I'm reaching out to you to draw your attention to the amazing lineup of discussion topics for the OpenInfra Edge Computing Group weekly calls up until the end of this year with industry experts to present and participate in the discussions!
>> 
>> I would like to invite and encourage you to join the working group sessions to discuss edge related challenges and solutions in the below areas and more!
>> 
>> Some of the sessions to highlight will be continuing the discussions we started at the recent PTG:
>> * November 29th - Networking and DNS discussion with Cricket Liu and Andrew Wertkin
>> * December 6th - Networking and IPv6 discussion with Ed Horley
>> 
>> Beyond the above topics we will have presentations from groups and projects such as Smart Edge and the Industry IoT Consortium (Formerly Industrial Internet Consortium).
>> 
>> For the full schedule please see the edge WG's wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group#Upcoming_Topics
>> 
>> Our calls happen every Monday at 6am Pacific Time / 1400 UTC on Zoom.
>> You can find all the meeting details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings
>> 
>> Please let me know if you have any questions about the working group or any of the upcoming sessions.
>> 
>> Thanks and Best Regards,
>> Ildikó
>> 
>> 
> 

From smooney at redhat.com Mon Nov 29 14:22:31 2021
From: smooney at redhat.com (Sean Mooney)
Date: Mon, 29 Nov 2021 14:22:31 +0000
Subject: [all][tc] Relmgt team position on release cadence
In-Reply-To: 
References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org>
 <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com>
 <20211129130957.3poysdeqsufkp3pg@yuggoth.org>
Message-ID: 

On Mon, 2021-11-29 at 14:43 +0100, Jean-Philippe Evrard wrote:
> On Mon, Nov 29, 2021, at 14:09, Jeremy Stanley wrote:
> > The primary reason stable branches exist is to make it easier for us
> > to test and publish backports of critical patches to older versions
> > of the software, rather than expecting our downstream consumers to
> > do that work themselves. If you're saying distribution package
> > maintainers are going to do it anyway and ignore our published
> > backports, then dropping the branching model may make sense, but
> > I've seen evidence to suggest that at least some distros do consume
> > our backports directly.
> 
> Don't get me wrong, SUSE is consuming those backports, and (at least was) contributing to them.
> And yes, I doubt that RH/SUSE/Canonical are simply consuming those packages without ever adding their patches on a case by case basis. So yes, those distros are already doing part of their work downstream (and/or upstream). And for a valid reason: it's part of their job :)
> 
> Doesn't mean we, as a whole community, still need to cut the work for every single consumer.
> If we are stretched thin, we need to define priorities.
> 
> I believe our aggressive policy in terms of branching is hurting the rest of the ecosystem, that's why I needed to say things out loud.
> I meant the less we branch, the less we backport, the less painful upgrades we have to deal with. It depends on our definition of _when to branch_ of course. Your example of a "critical patch" might be a good reason to branch. We are maybe in a place where this can be on a case by case basis, or that we should improve that definition?

I actually would not consider our branching aggressive; I actually think our cadence is much longer than the rest of the industry or ecosystem. Many of the projects we consume release on a monthly or quarterly basis, with some providing an API/ABI-breaking release once a year. E.g. DPDK only allows ABI breaks in the Q4 release each year, I believe, and the kernel selects an LTS branch to maintain every year, based on the last release of the year, but has a ~6 week release schedule.

I strongly believe it would be healthier for us to release more often than we do. That does not mean I think we should break version compatibility for our distributed projects more often. If we released every 6 weeks, or once a quarter, but limited the end-user impact of that so that users could mix/match releases for up to a year or two, that would be better for our consumers, downstream distributions, and developers: developers would have to backport less, downstream could ship/rebase on any of the intermediary releases without breaking compatibility to get features, and consumers could stay on the stable release and only upgrade once a year, or opt for one of the point releases in a year for new features.

Honestly, I can't see how releasing less often will have any effect other than slowing the delivery of features and bug fixes to customers. I don't think it will help distributions reduce their maintenance, since we will just spend more time backporting features due to the increased wait time that some of our customers will not want to accept.
We still receive feature backport requests for our OSP 16 product, based on Train (and sometimes for OSP 10 and 13, based on Newton/Queens respectively). If I tell our large telco customer "OK, we are too far into Yoga to complete that this cycle, so the next upstream release we can target this for is Z, which will release in Q4 2022, and it will take a year for us to productize that release, so you can expect the feature to be completed in 2023", they will respond with "Well, we need it in 2022, so what's the ETA for a backport?". If we go from a 6-month upstream cycle to a 12-month one, that conversation changes from "we can deliver in 9 months" to "18 months upstream, plus packaging". Sure, with a 12-month cycle there is more likelihood we could fit it into the current cycle, but it is also much, much more painful if we miss a release. There is only a small subset of features that we can backport downstream without breaking interoperability, so we cannot assume we can fall back on that. Stable branches don't necessarily help with that, but not having a very long release cadence does.
> > Regards, > JP > From amoralej at redhat.com Mon Nov 29 14:58:47 2021 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Mon, 29 Nov 2021 15:58:47 +0100 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <20211126143019.2o7rycs44ycnkgez@yuggoth.org> <17d5cdde95a.b95752d01381326.4462262438602968824@ghanshyammann.com> Message-ID: On Mon, Nov 29, 2021 at 12:50 PM Dmitry Tantsur wrote: > > > On Mon, Nov 29, 2021 at 12:38 PM Sean Mooney wrote: > >> On Fri, 2021-11-26 at 19:48 +0100, Dmitry Tantsur wrote: >> > On Fri, Nov 26, 2021 at 6:14 PM Alfredo Moralejo Alonso < >> amoralej at redhat.com> >> > wrote: >> > >> > > >> > > >> > > On Fri, Nov 26, 2021 at 4:44 PM Dmitry Tantsur >> > > wrote: >> > > >> > > > >> > > > >> > > > On Fri, Nov 26, 2021 at 4:35 PM Ghanshyam Mann < >> gmann at ghanshyammann.com> >> > > > wrote: >> > > > >> > > > > ---- On Fri, 26 Nov 2021 09:20:39 -0600 Dmitry Tantsur < >> > > > > dtantsur at redhat.com> wrote ---- >> > > > > > >> > > > > > >> > > > > > On Fri, Nov 26, 2021 at 3:35 PM Jeremy Stanley < >> fungi at yuggoth.org> >> > > > > wrote: >> > > > > > On 2021-11-26 14:29:53 +0100 (+0100), Dmitry Tantsur wrote: >> > > > > > [...] >> > > > > > > CentOS/RHEL ships 3.6 and a limited version of 3.8 and 3.9. >> > > > > > [...] >> > > > > > >> > > > > > Is this still true for CentOS Stream 9? The TC decision was to >> > > > > > support that instead of CentOS Stream 8 in Yoga. >> > > > > > >> > > > > > No. But Stream 9 is pretty much beta, so it's not a >> replacement for >> > > > > us (and we don't have nodes in nodepool with it even yet?). >> > > > > >> > > > > I think here is the confusion. In TC, after checking with centos >> team >> > > > > impression was CentOS stream 9 is released and that is >> > > > > what we should update In OpenStack testing. 
And then only we >> updated the >> > > > > centos stream 8 -> 9 and dropped py3.6 testing >> > > > > >> > > > > - >> > > > > >> https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst >> > > > > >> > > > >> > > > I think there is an enormous perception gap between the CentOS team >> and >> > > > the rest of the world. >> > > > >> > > > >> > > So, CentOS Stream 9 was released, in the official mirrors and usable >> since >> > > some weeks ago[1][2]. We shouldn't consider it beta or something like >> that. >> > > >> > > As mentioned, support for diskimage-builder has been introduced for >> CS9 >> > > and there are nodepool nodes ready for it. From RDO, we are providing >> RPMs >> > > for master branch content on CentOS Stream 9 [3] and actually we have >> been >> > > doing some tests. Actually, we have recently merged new jobs in >> > > puppet-openstack[4]. >> > > >> > >> > "It's usable since some weeks ago and we even added tests today" is not >> > exactly reassuring :) The PTI uses wording "stable and LTS", which >> applies >> > to Stream 9 no more than it applies to Fedora. >> thats not quite true. >> yes centos 9 is a roling release but it is more stable then fedroa since >> packages landing in centos 9 stream have been >> stablised via fedroa already and any argurments in this regard would also >> apply to centos 8 stream. >> > > That's correct. Also, the rolling release nature of CentOS Stream applies to a single major release where Red Hat is committed to provide a set of compatibility rules, upgrade support (ar packaging levels), abi and api compatibility etc... That's not the case for Fedora where is rolling update between major releases including major version rebases, etc... > Stream 8 is part of an already released and maintained RHEL, hence I give > it a certain benefit of doubt. More on this below. 
> > >> >> The only reason that centos 8 stream would be more stable then 9 stream >> is due to less frequent updates as focus moves to 9 stream. >> 9 stream is effectivly a preview of what whill be rhel 9. >> > >> > In the end, what we test with Bifrost is what we will recommend people >> to >> > deploy it in production on. I do believe people can and should deploy on >> > Stream 8 despite all the FUD around it, but I cannot do it for Stream 9 >> > until RHEL 9 is out. >> >> why just because rhel 9.0 is relased does not mean centos 9 is sudennly >> more stable. >> > > Okay, you convinced me, I won't recommend Stream 9 at all :) > > Kidding aside, we know that each major branch of RHEL offers a certain > degree of compatibility. It's not expected that 8.N+1 breaks a lot of stuff > from 8.N, hence it's not expected that Stream between them will break > anything (modulo bugs) either. I have no idea what and how gets into Stream > 9 now, nor will I risk recommending it for production. > > I don't understand the doubt here, could you elaborate? The same compatibility rules applied to RHEL/CentOS Stream 8 are applied to CS/RHEL9. What is in CS9 will end up in RHEL9 at some later point of time. However, instead of pushing updates in minor releases are applied more frequently. Said this, the concept of "recommended for production" is vague, depends on how organizations work and how they want to manage risks, and each one of us may have our own perception, of course. Alfredo Dmitry > > >> Now that centos 9 has been release there shoudl be no more package >> removalas form centos/rhel so >> it should have stableised in terms of the minium package set and over >> time we woudl expect more pacakges to be added. >> yes centos 8 stream will be supported until the EOL or rhel 8 so people >> can continue to deploy it. 
>> rhel 9 will be released next year, perhaps not before yoga is released >> but if you are deploying RDO you will not be useing >> RHEL anyway you will be using centos so stream 9 is the better plathform >> to use if you plan to continue to upgrade the deploy >> ment over the next few year as it allow you to avoid the costly OS >> upgrade when moving to the next openstack relesase. >> > >> > >> > Dmitry >> > >> > >> > > >> > > [1] https://www.centos.org/centos-stream/ >> > > >> > > [2] https://cloud.centos.org/centos/9-stream/x86_64/images/ >> > > [3] https://trunk.rdoproject.org/centos9-master/report.html >> > > [4] >> > > >> https://review.opendev.org/c/openstack/puppet-openstack-integration/+/793462 >> > > >> > > Alfredo >> > > >> > > >> > > >> > > > Dmitry >> > > > >> > > > >> > > > > >> > > > > < >> https://review.opendev.org/c/openstack/governance/+/815851/3..6/reference/runtimes/yoga.rst >> > >> > > > > >> > > > > -gmann >> > > > > >> > > > > > Dmitry >> > > > > > -- >> > > > > > Jeremy Stanley >> > > > > > >> > > > > > >> > > > > > -- >> > > > > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: >> Grasbrunn, >> > > > > > Commercial register: Amtsgericht Muenchen, HRB 153243, >> > > > > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, >> > > > > Michael O'Neill >> > > > > > >> > > > > >> > > > > >> > > > >> > > > -- >> > > > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >> > > > Commercial register: Amtsgericht Muenchen, HRB 153243, >> > > > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, >> Michael >> > > > O'Neill >> > > > >> > > >> > >> >> > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > O'Neill > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aschultz at redhat.com Mon Nov 29 15:00:30 2021 From: aschultz at redhat.com (Alex Schultz) Date: Mon, 29 Nov 2021 08:00:30 -0700 Subject: [tripleo] puppet_config: immutable relationship between config_volume and config_image? In-Reply-To: References: Message-ID: On Sat, Nov 27, 2021 at 7:32 AM Brent Eagles wrote: > > Hi all, > > I've been working on > https://review.opendev.org/c/openstack/tripleo-heat-templates/+/816895 and the > current pep8 check error on the designate-api-container-puppet.yaml template > appears to disallow using different container config images on the same config > volume. > > I see where this makes sense but because the way the puppet_config sections > seem to be structured I wonder if it was intentional. The templates happen to > work in this patch's case because the api specific config image in this > template only includes some additional binaries for apache+wsgi necessary for > the puppet. However, I can see where this might cause some really unfortunate > issues if the config images for a given config_volume were to have different > versions of puppet. > In looking at the code the config_volume name is used to generate the container name which would be problematic if you switch the config_volume+config_image pairing because it'll cause conflicts. IIRC config_volume is useful when you have multiple services that basically use the same base configuration container to generate their configs (e.g. neutron + neutron plugins). If you are switching to use the api container for the configuration generation then config_volume should likely be designate_api and that could be shared with the api contianer. > While I will alter the templates so the designate config volume use the > same config image, it does beg the question if having the puppet_config > config_volume definitions duplicated across the templates is a good > pattern. Would it be a better choice to adopt a pattern where the > immutable parts of the config_volumes (e.g. 
config_volume name, > config_image) can only be defined once per config volume? Perhaps in a > separate file and have the mutable parts in their respective template > definitions? > They aren't really duplicated. They are specific to the puppet config for a given service and are specific to the service and it's configuration generation. For example if you look at the various heat containers there are in fact 3 different pairs (heat,ContainerHeatConfigImage), (heat_api,ContainerHeatApiConfigImage), (heat_api_cfn,COntainerHeatApiCfnConfigImage). > Cheer, > > Brent > > -- > Brent Eagles > Principal Software Engineer > Red Hat Inc. > > From tobias.urdin at binero.com Mon Nov 29 10:52:07 2021 From: tobias.urdin at binero.com (Tobias Urdin) Date: Mon, 29 Nov 2021 10:52:07 +0000 Subject: python_requires >= 3.8 during Yoga In-Reply-To: References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <17d5d0a4f3f.dc8bb48c1384810.8642013929754348916@ghanshyammann.com> <17d5d476490.e30e73db1388640.4406614125429012751@ghanshyammann.com> <20211129091718.fdgdwmowtp3jxf6u@lyarwood-laptop.usersys.redhat.com> Message-ID: <1319EEE3-AB75-495A-8EEF-E2985A4FBB10@binero.com> Hello, I agree with previous statement from Michal. The upgrade path in for example RDO has been very smooth previously with upgrading to new OpenStack release then switching out the distro version afterwards because they support both during a transition. Sure they will do that now as well, but if any project decides to break py36 there will be more work for the RDO team and more arguments to not revert changes because py38 would be the 'real' supported runtime in OpenStack testing. The transition period is required to not break the upgrade path. Best regards Tobias On 29 Nov 2021, at 11:14, Michał
Nasiadka > wrote: Hello, I'm strongly against dropping py36 support now, unless we're going to find a solution that works on CentOS Stream 8. RHEL 9 is not out, and probably will not be in months - how do we expect users to use Yoga on production deployments (where they use CentOS Linux/equivalents today)? Dropping the runtime testing and supporting devstack - and negotiating on a per project basis to support py36 or not - is not a solution. Either Yoga supports py36 as a transition point/release to py38 - or not. In Kolla - we also did not anticipate (and don't think it's a good idea) to support CentOS Stream 9 in Yoga release. With the current decision - we are either forced with supporting CentOS Stream 9 (with no alternatives like Rocky Linux/Alma Linux in place - because RHEL 9 is not out) - or dropping CentOS support completely. If we pursue CS9 - we also need to support migration from CS8 to CS9 and that's also a considerable amount of work - which is unplanned. Best regards, Michal On 29 Nov 2021, at 10:17, Lee Yarwood > wrote: On 26-11-21 11:24:59, Ghanshyam Mann wrote: ---- On Fri, 26 Nov 2021 10:18:16 -0600 Ghanshyam Mann > wrote ---- ---- On Fri, 26 Nov 2021 10:05:15 -0600 Lee Yarwood > wrote ---- On 26-11-21 09:37:44, Ghanshyam Mann wrote: ---- On Fri, 26 Nov 2021 06:29:42 -0600 Lee Yarwood > wrote ---- On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann > wrote: ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < marcin.juszkiewicz at linaro.org> wrote ---- W dniu 25.11.2021 o 19:13, Stephen Finucane pisze: gmann has been helpfully proposing patches to change the versions of Python we're testing against in Yoga. I've suggested that we might want to bump 'python_requires' in 'setup.cfg' to indicate that we no longer support any version of Python before 3.8 CentOS Stream 8 has Python 3.6 by default and RDO team is doing CS8 -> CS9 migration during Yoga cycle. Can we postpone it to Z when there will be no distribution with Py 3.6 to care about?
Can we postpone it to Z when there will be no distribution with Py 3.6 to care about? Stupid question that I should know the answer to but does RDO really support RPM based installations anymore? IOW couldn't we just workaround this by providing CS8 py38 based containers during the upgrade? As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS version upgrades in the past providing support for both releases in an OpenStack version to ease the upgrade so I'd like to keep yoga working on py3.6 included in CS8 and CS9. If this was the plan why wasn't it made clear to the TC before they dropped CS8 from the Yoga runtimes? Would it even be possible for the TC to add CS8 and py36 back in to the Yoga runtimes? Postponing to Z, you mean dropping the py3.6 tests or bumping it in in 'setup.cfg' so that no one can install on py3.6 ? First one we already did and as per Yoga testing runtime we are targeting centos9-stream[1] in Yoga itself. For making 'python_requires' >=py3.8 in 'setup.cfg', I have no string opinion on this but I prefer to have flexible here that 'yes OpenStack is installable in py3.6 but we do not test it anymore from Yoga onwards so no guarantee'. Our testing runtime main goal is that we document the version we are testing *at least* which means it can work on lower or higher versions too but we just do not test them. May it be possible to keep py3.6 jobs to make sure patches are not introducing py3.8-only features that would break deployment in CS8? We should keep CS8 and py36 as supported runtimes if we are keeping the jobs, otherwise this just sets super confusing. Yeah, I think it create confusion as I can see in this ML thread so agree on keeping 'python_requires' also in sycn with what we test. Cool thanks! Now question on going back to centos stream 8 support in Yoga, is it not centos stream 9 is stable released or is it experimental only? 
If stable then we can keep the latest available version which can be centos stream 9. I honestly don't know and can't find any docs to point to. Our project interface testing doc clearly stats 'latest LTS' to consider for testing[1] whenever we are ready. I am not very strongly against of reverting back to centos stream 8 but we should not add two version of same distro in testing which can be a lot of we consider below three distro How do we expect operators to upgrade between Xena where CentOS 8 stream is a supported runtime and Yoga where CentOS 9 stream is currently the equivalent supported runtime without supporting both for a single release? This is really good question on upgrade testing we do at upstream and I remember it cameup and discussed a lot during py2.7 drop also that how we are testing the upgrade from py2.7 to py3. Can we do in grenade? But that we answered as we did not tested directly but stein and train tested both version so should not be any issue if you upgrade from there (one of FAQ in my blog[1]). But on distro upgrade testing, as you know we do not test those in upstream neither in grenade where upgrade are done on old node distro only not from old distro version to new distro version with new code. It is not like we do not want to test but if anyone from any distro would like to setup grenade for that and maintain then we are more happy. In summary, yes we cannot guarantee distro upgrade testing from OpenStack upstream testing due to resource bandwidth issue but we will welcome any help here. We discussed with amoralej about moving the testing runtime to CentOS stream 8 and py36 or not in TC IRC channel[1]. As we at upstream do not test distro two versions in same release, amoralej agreed to keep CentOS stream 9 if one to choose which is our current testing runtime is. So no change in the direction of current testing runtime and dropping the py3.6 but there is possibility of some trade off here. 
If any py3.6 breaking changes are happening then it is up to projects goodness, bandwidth, or flexibility about accepting the fix or not or even add a py36 unit test job. As our testing runtime is the minimum things to test and it does not put any max limit of testing, any project can extend their testing as per their bandwidth. In summary: (This is what we agreed today in TC channel but as most of the folks are on leave today, I will keep it open until next week so see if any objections from the community and will conclude it accordingly) * No change in Yoga testing runtime and we move to cs9 and drop py36. * We will not put hard stop on cs8 support and we can: ** Devstack keep supporting cs8 in Yoga ** It can be negotiated with project to add py36 job or fix if any py36 breaking changes are observed by RDO (or any distro interested in py36) but it depends on the project decision and bandwidth. As next, how we can improve the upgrade testing from distro versions is something we will explore next and see what all we can test to make upgrade easier. I'm against this, as I said in my setup.cfg >= py38 review for openstack/nova [1] we either list and support runtimes or don't. If RDO and others need CentOS 8 Stream support for a release then lets include it and py36 still for Yoga and make things explicit. As I've said elsewhere I think the TC really need to adjust their thinking on this topic and allow for one OpenStack release where both the old and new LTS distro release are supported. Ensuring we allow people to actually upgrade in place and later handle the distro upgrade itself. Cheers, Lee [1] https://review.opendev.org/c/openstack/nova/+/819415 -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- An HTML attachment was scrubbed... 
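The `python_requires` bump being debated in this thread is enforced at install time: pip compares the running interpreter against the specifier from `setup.cfg` and refuses the install when it falls below the floor. A minimal sketch of that comparison (illustrative only — pip's real check uses the `packaging` library's `SpecifierSet`; this just mimics a `>=3.8` floor):

```shell
# Illustrative sketch of a python_requires = >=3.8 floor check.
# This mimics the version comparison only; it is NOT pip's actual code.
allowed() {  # usage: allowed MAJOR MINOR -> "ok" or "refused"
    if [ "$1" -gt 3 ] || { [ "$1" -eq 3 ] && [ "$2" -ge 8 ]; }; then
        echo ok
    else
        echo refused
    fi
}
allowed 3 6   # -> refused (the CentOS Stream 8 default interpreter)
allowed 3 8   # -> ok
allowed 3 9   # -> ok
```

This is why the distinction matters in the thread: dropping the py36 test jobs only removes the guarantee, while raising `python_requires` makes installation on a 3.6 interpreter fail outright.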
URL: From faisalsheikh073 at gmail.com Mon Nov 29 12:37:29 2021 From: faisalsheikh073 at gmail.com (Faisal Sheikh) Date: Mon, 29 Nov 2021 17:37:29 +0500 Subject: [wallaby][ovn][Open vSwitch] HA for OVN DB servers using pacemaker Message-ID: Hi, I am using Openstack Wallaby release with OVN on Ubuntu 20.04. My environment consists of 2 compute nodes and 1 controller node. ovs-vswitchd (Open vSwitch) 2.15.0 Ubuntu Kernel Version: 5.4.0-88-generic compute node1 172.16.30.1 compute node2 172.16.30.3 controller/Network node/Primary ovs-db IP 172.16.30.46 backup ovs-db 172.16.30.47 I added another node for backup ovs-db server (172.16.30.47 ) and installed openvswitch-ovn and networking-ovn packages on it for backup ovs-db server and using Pacemaker to manage the ovn-northd(Open vSwitch) service between primary and backup ovs-db server active/passive HA mode. Pacemaker cluster is successful and after creating a pacemaker cluster, I am using the following commands to create one active and one backup server for OVN databases. $ pcs resource create ovndb_servers ocf:ovn:ovndb-servers \ master_ip=172.16.30.46 \ ovn_ctl= \ op monitor interval="10s" \ op monitor role=Master interval="15s" But i am getting below error: Error: Agent 'ocf:ovn:ovndb-servers' is not installed or does not provide valid metadata: Metadata query for ocf:ovn:ovndb-servers failed: Input/output error, use --force to override I would really appreciate any input in this regard. Best regards, Faisal Sheikh -------------- next part -------------- An HTML attachment was scrubbed... 
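Pacemaker resolves `ocf:ovn:ovndb-servers` to an executable at `$OCF_ROOT/resource.d/ovn/ovndb-servers`, so the "not installed or does not provide valid metadata" error above usually means that file is missing (no installed package ships it) or its `meta-data` action fails. A hedged way to narrow it down, assuming the conventional OCF layout (paths and package names vary by distro):

```shell
# Hedged check for an OCF resource agent such as ocf:ovn:ovndb-servers.
# Assumes the conventional OCF_ROOT layout; adjust paths for your distro.
check_ra() {
    ra="$1"
    if [ ! -x "$ra" ]; then
        echo "missing"        # no package installed ships this agent
    elif OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}" "$ra" meta-data >/dev/null 2>&1; then
        echo "ok"             # pcs should accept it without --force
    else
        echo "bad-metadata"   # agent present but its meta-data action fails
    fi
}
check_ra /usr/lib/ocf/resource.d/ovn/ovndb-servers
```

If the result is "missing", find which package is expected to ship the agent on your distribution (for example with `dpkg -S ovndb-servers` or the equivalent) and install it on every cluster node before re-running `pcs resource create`.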
URL: From dms at danplanet.com Mon Nov 29 15:38:07 2021 From: dms at danplanet.com (Dan Smith) Date: Mon, 29 Nov 2021 07:38:07 -0800 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: <9eb79ad2239e05f926880da40502fbe68f8e83d9.camel@redhat.com> (Sean Mooney's message of "Wed, 24 Nov 2021 14:53:21 +0000") References: <08c67a25-72b4-870d-bfb7-1aad318a1551@redhat.com> <9eb79ad2239e05f926880da40502fbe68f8e83d9.camel@redhat.com> Message-ID: >> Adding a queue to store events that Nova did not have a recieve handler >> set for might help as well. And have a TTL set on it, or a more advanced >> reaping logic, for example based on tombstone events invalidating the >> queue contents by causal conditions. That would eliminate flaky >> expectations set around starting to wait for receiving events vs sending >> unexpected or belated events. Why flaky? Because in an async distributed >> system there is no "before" nor "after", so an external to Nova service >> will unlikely conform to any time-frame based contract for >> send-notify/wait-receive/real-completion-fact. And the fact that Nova >> can't tell what the network backend is (because [1] was not fully >> implemented) does not make things simpler. > > i honestly dont think this is a viable option we have discussed it > several times in nova in the past and keep coming to the same > conclution either the event shoudl be sent and waited for at that > right times or they loose there value. Yep, agree with Sean here. I definitely don't think that making the system less deterministic is going to make things more reliable. This needs to be a contract between two services that we can depend on. Nova and Neutron both serve the purpose of abstracting the lower-level elements into a service. Making the different technologies they support behave similarly is the reason they exist. 
--Dan From michal.arbet at ultimum.io Mon Nov 29 16:14:03 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Mon, 29 Nov 2021 17:14:03 +0100 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: <08c67a25-72b4-870d-bfb7-1aad318a1551@redhat.com> Message-ID: Hello, Have you already considered what Jan Vondra sent to this discussion ? I am just making sure that this was read.
Thanks, Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook st 24. 11. 2021 v 14:30 odesílatel Jan Vondra napsal: > Hi guys, > I've been further investigating Michal's (OP) issue, since he is on his > holiday, and I've found out that the issue is not really plugging the VIF > but cleanup after previous port bindings. > > We are creating 6 servers with 2-4 vifs using heat template [0]. We were > hitting some problems with placement so the stack sometimes failed to > create and we had to delete the stack and recreate it. > If we recreate it right after the deletion, the vif plugging timeout > occurs. If we wait some time (approx. 10 minutes) the stack is created > successfully. > > This brings me to believe that there is some issue with deferring the > removal of security groups from unbound ports (somewhere around this part > of code [1]) and it somehow affects the creation of new ports. However, I > am unable to find any lock that could cause this behaviour. > > The only proof I have is that after the stack recreation scenario I have > measured that the process_network_ports [2] function call could take up to > 650 s (varies from 5 s to 651 s in our environment). > > Any idea what could be causing this? > > [0] https://pastebin.com/infvj4ai > [1] > https://github.com/openstack/neutron/blob/master/neutron/agent/firewall.py#L133 > [2] > https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L2079 > > *Jan Vondra* > *http://ultimum.io * > > > st 24. 11. 2021 v 11:08 odesílatel Bogdan Dobrelya > napsal: >> On 11/24/21 1:21 AM, Tony Liu wrote: >> > I hit the same problem, from time to time, not consistently. I am using >> OVN. >> > Typically, it takes no more than a few seconds for neutron to confirm >> the port is up.
>> > The default timeout in my setup is 600s. Even the ports shows up in >> both OVN SB >> > and NB, nova-compute still didn't get confirmation from neutron. Either >> neutron >> > didn't pick it up or the message was lost and didn't get to >> nova-compute. >> > Hoping someone could share more thoughts. >> >> That also may be a super-set of the revert-resize with OVS hybrid plug >> issue described in [0]. Even though the problems described in the topic >> may have nothing to that particular case, but does look related to the >> external events framework. >> >> Issues like that make me thinking about some improvements to it. >> >> [tl;dr] bring back up the idea of buffering events with a ttl >> >> Like a new deferred RPC calls feature maybe? That would execute a call >> after some trigger, like send unplug and forget. That would make >> debugging harder, but cover the cases when an external service "forgot" >> (an event was lost and the like cases) to notify Nova when it is done. >> >> Adding a queue to store events that Nova did not have a recieve handler >> set for might help as well. And have a TTL set on it, or a more advanced >> reaping logic, for example based on tombstone events invalidating the >> queue contents by causal conditions. That would eliminate flaky >> expectations set around starting to wait for receiving events vs sending >> unexpected or belated events. Why flaky? Because in an async distributed >> system there is no "before" nor "after", so an external to Nova service >> will unlikely conform to any time-frame based contract for >> send-notify/wait-receive/real-completion-fact. And the fact that Nova >> can't tell what the network backend is (because [1] was not fully >> implemented) does not make things simpler. >> >> As Sean noted in a private irc conversation, with OVN the current >> implementation is not capable of fullfilling the contract that >> network-vif-plugged events are only sent after the interface is fully >> configred. 
So it send events at bind time once it have updated the >> logical port in the ovn db but before real configuration has happened. I >> believe that deferred RPC calls and/or queued events might improve such >> a "cheating" by making the real post-completion processing a thing for >> any backend? >> >> [0] https://bugs.launchpad.net/nova/+bug/1952003 >> >> [1] >> >> https://specs.openstack.org/openstack/neutron-specs/specs/train/port-binding-extended-information.html >> >> > >> > Thanks! >> > Tony >> > ________________________________________ >> > From: Laurent Dumont >> > Sent: November 22, 2021 02:05 PM >> > To: Michal Arbet >> > Cc: openstack-discuss >> > Subject: Re: [neutron][nova] [kolla] vif plugged timeout >> > >> > How high did you have to raise it? If it does appear after X amount of >> time, then the VIF plug is not lost? >> > >> > On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet > > wrote: >> > + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova >> to some high number ..problem dissapear ... But it's only workaround >> > >> > Dňa so 20. 11. 2021, 12:05 Michal Arbet > > napísal(a): >> > Hi, >> > >> > Has anyone seen issue which I am currently facing ? >> > >> > When launching heat stack ( but it's same if I launch several of >> instances ) vif plugged in timeouts an I don't know why, sometimes it is OK >> ..sometimes is failing. >> > >> > Sometimes neutron reports vif plugged in < 10 sec ( test env ) >> sometimes it's 100 and more seconds, it seems there is some race condition >> but I can't find out where the problem is. But on the end every instance is >> spawned ok (retry mechanism worked). >> > >> > Another finding is that it has to do something with security group, if >> noop driver is used ..everything is working good. >> > >> > Firewall security setup is openvswitch . >> > >> > Test env is wallaby. >> > >> > I will attach some logs when I will be near PC ..
>> > >> > Thank you, >> > Michal Arbet (Kevko) >> > >> > >> > >> > >> > >> > >> >> >> -- >> Best regards, >> Bogdan Dobrelya, >> Irc #bogdando >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Mon Nov 29 16:24:07 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 29 Nov 2021 08:24:07 -0800 Subject: [ironic] End of year meeting schedule Message-ID: Greetings everyone, Today, during the Ironic team meeting, we reached consensus[0] that we would not hold meetings during the last half of December and for the first week of January. Much as we've done the past few years to give contributors time they can feel safe taking off without missing any meetings, coupled with end of year PTO usage and holidays. We will resume our regular meetings on January 10th, 2022. Dates of cancelled meetings: * December 20th, 2021 * December 27th, 2021 * January 3rd, 2022 Hopefully everyone will get plenty of rest, and be ready to review plenty of code when they get back! Thanks, -Julia [0]: https://meetings.opendev.org/meetings/ironic/2021/ironic.2021-11-29-15.00.log.html From beagles at redhat.com Mon Nov 29 18:05:31 2021 From: beagles at redhat.com (Brent Eagles) Date: Mon, 29 Nov 2021 14:35:31 -0330 Subject: [tripleo] puppet_config: immutable relationship between config_volume and config_image? In-Reply-To: References: Message-ID: On Mon, Nov 29, 2021 at 08:00:30AM -0700, Alex Schultz wrote: > On Sat, Nov 27, 2021 at 7:32 AM Brent Eagles wrote: > > > > Hi all, > > > > I've been working on > > https://review.opendev.org/c/openstack/tripleo-heat-templates/+/816895 and the > > current pep8 check error on the designate-api-container-puppet.yaml template > > appears to disallow using different container config images on the same config > > volume. > > > > I see where this makes sense but because the way the puppet_config sections > > seem to be structured I wonder if it was intentional. 
The templates happen to > > work in this patch's case because the api specific config image in this > > template only includes some additional binaries for apache+wsgi necessary for > > the puppet. However, I can see where this might cause some really unfortunate > > issues if the config images for a given config_volume were to have different > > versions of puppet. > > > > In looking at the code the config_volume name is used to generate the > container name which would be problematic if you switch the > config_volume+config_image pairing because it'll cause conflicts. > IIRC config_volume is useful when you have multiple services that > basically use the same base configuration container to generate their > configs (e.g. neutron + neutron plugins). If you are switching to use > the api container for the configuration generation then config_volume > should likely be designate_api and that could be shared with the api > contianer. > > > While I will alter the templates so the designate config volume use the > > same config image, it does beg the question if having the puppet_config > > config_volume definitions duplicated across the templates is a good > > pattern. Would it be a better choice to adopt a pattern where the > > immutable parts of the config_volumes (e.g. config_volume name, > > config_image) can only be defined once per config volume? Perhaps in a > > separate file and have the mutable parts in their respective template > > definitions? > > > > They aren't really duplicated. They are specific to the puppet > config for a given service and are specific to the service and it's > configuration generation. For example if you look at the various heat > containers there are in fact 3 different pairs > (heat,ContainerHeatConfigImage), > (heat_api,ContainerHeatApiConfigImage), > (heat_api_cfn,COntainerHeatApiCfnConfigImage). Thanks for the clarification Alex! Cheers, Brent -- Brent Eagles Principal Software Engineer Red Hat Inc. 
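To make the pairing Alex describes concrete, each service template carries its own `puppet_config` block, along these lines (an illustrative sketch only, not an actual tripleo-heat-templates excerpt — the volume/image pair follows the `heat_api` example above, and the `HeatApiBase` resource name is made up for the sketch):

```yaml
puppet_config:
  # config_volume also names the generated config container, so a given
  # config_volume must always be paired with the same config_image:
  config_volume: heat_api
  config_image: {get_param: ContainerHeatApiConfigImage}
  step_config: {get_attr: [HeatApiBase, role_data, step_config]}
```

The pep8 check discussed above is then essentially asserting that no two templates declare the same `config_volume` with a different `config_image`.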
From gmann at ghanshyammann.com Mon Nov 29 18:24:49 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 29 Nov 2021 12:24:49 -0600 Subject: [all][tc] Technical Committee next weekly meeting on Dec 2nd at 1500 UTC Message-ID: <17d6cf1413a.11da1006039481.6604682679623549161@ghanshyammann.com> Hello Everyone, Technical Committee's next weekly meeting is scheduled for Dec 2nd at 1500 UTC. If you would like to add topics for discussion, please add them to the below wiki page by Wednesday, Dec 1st, at 2100 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting -gmann From arnaud.morin at gmail.com Mon Nov 29 19:03:46 2021 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Mon, 29 Nov 2021 19:03:46 +0000 Subject: [ops]RabbitMQ High Availability In-Reply-To: <4e37d1a7-ca17-b50f-ba6c-96229d85a75e@redhat.com> References: <0670B960225633449A24709C291A525251D511CD@COM03.performair.local> <4e37d1a7-ca17-b50f-ba6c-96229d85a75e@redhat.com> Message-ID: Hi, After a talk on this ml (which starts at [1]), we endup building a documentation with Large Scale group. The doc is accessible at [2]. Hope this will help. [1] http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016362.html [2] https://wiki.openstack.org/wiki/Large_Scale_Configuration_Rabbit On 24.11.21 - 11:31, Bogdan Dobrelya wrote: > On 11/24/21 12:34 AM, DHilsbos at performair.com wrote: > > All; > > > > In the time I've been part of this mailing list, the subject of RabbitMQ high availability has come up several times, and each time specific recommendations for both Rabbit and Open Stack are provided. I remember it being an A or B kind of recommendation (i.e. configure Rabbit like A1, and Open Stack like A2, OR configure Rabbit like B1, and Open Stack like B2). 
> > There are no special recommendations for rabbitmq setup for openstack, > but probably a few, like instead of putting it behind a haproxy, or the > like, list the rabbit cluster nodes in the oslo messaging config > settings directly. Also, it seems that durable queues make very > little sense for highly ephemeral RPC calls, just by design. I would > also add that the raft quorum queues feature of rabbitmq >=3.8 does > not fit well into the oslo messaging design for RPC calls either. > > A discussable and highly opinionated thing is also configuring > ha/mirror queue policy params for queues used for RPC calls vs > broadcast notifications. > > And my biased personal humble recommendation is: use the upstream OCF RA > [0][1], if configuring the rabbitmq cluster with pacemaker. > > [0] https://www.rabbitmq.com/pacemaker.html#auto-pacemaker > > [1] > https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-server-ha > > > > > Unfortunately, I can't find the previous threads on this topic. > > > > Does anyone have this information, that they would care to share with me? > > > > Thank you, > > > > Dominic L. Hilsbos, MBA > > Vice President - Information Technology > > Perform Air International Inc. > > DHilsbos at PerformAir.com > > www.PerformAir.com > > > > > > > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > From tonyliu0592 at hotmail.com Mon Nov 29 19:39:27 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Mon, 29 Nov 2021 19:39:27 +0000 Subject: [glance][image-import] local disk space required by image import and conversion Message-ID: Hi, The target is for users to import a .qcow2 image from a download site and have it automatically stored as a raw image. Reading through [1], this seems doable. I am not sure about the workflow. Does it download the whole qcow2 and convert it to raw locally, or download, convert and upload in a streaming way, which only requires a small space for caching?
The question is: how much local disk space is required for such import and conversion? [1] https://docs.openstack.org/glance/latest/admin/interoperable-image-import.html Thanks! Tony From dms at danplanet.com Mon Nov 29 19:52:02 2021 From: dms at danplanet.com (Dan Smith) Date: Mon, 29 Nov 2021 11:52:02 -0800 Subject: [glance][image-import] local disk space required by image import and conversion In-Reply-To: (Tony Liu's message of "Mon, 29 Nov 2021 19:39:27 +0000") References: Message-ID: > The target is for user to import .qcow2 image from download site and > automatically stored as raw image. Read through [1], this seems > doable. I am not sure the workflow. Does it download the whole qcow2 > and convert it to raw locally, or download, convert and upload in a > streaming way, which only require small space for caching? The > question is that, how much local disk space is required for such > import and conversion? Yep, local disk space is required and must be large enough to handle the whole conversion (i.e. the qcow2 plus the converted image). Keep in mind that multiple users could be doing the same, so the disk space requirement may be large. Also note that you either need to configure the local self-reference URL[1] per worker (if you are running a recent-enough release) or that storage needs to be shared amongst all the workers for proper operation. --Dan 1: https://docs.openstack.org/glance/latest/admin/interoperable-image-import.html#staging-directory-configuration From tonyliu0592 at hotmail.com Mon Nov 29 19:58:25 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Mon, 29 Nov 2021 19:58:25 +0000 Subject: [glance][image-import] local disk space required by image import and conversion In-Reply-To: References: Message-ID: Thank you Dan for the prompt response!
Tony ________________________________________ From: Dan Smith Sent: November 29, 2021 11:52 AM To: Tony Liu Cc: openstack-discuss at lists.openstack.org; openstack-dev at lists.openstack.org Subject: Re: [glance][image-import] local disk space required by image import and conversion > The target is for user to import .qcow2 image from download site and > automatically stored as raw image. Read through [1], this seems > doable. I am not sure the workflow. Does it download the whole qcow2 > and convert it to raw locally, or download, convert and upload in a > streaming way, which only require small space for caching? The > question is that, how much local disk space is required for such > import and conversion? Yep, local disk space is required and must be large enough to handle the whole conversion (i.e. the qcow2 plus the converted image). Keep in mind that multiple users could be doing the same, so the disk space requirement may be large. Also note that if you either need to configure the local self-reference URL[1] per worker (if you are running a recent-enough release) or that storage needs to be shared amongst all the workers for proper operation. --Dan 1: https://docs.openstack.org/glance/latest/admin/interoperable-image-import.html#staging-directory-configuration From srelf at ukcloud.com Mon Nov 29 20:37:45 2021 From: srelf at ukcloud.com (Steven Relf) Date: Mon, 29 Nov 2021 20:37:45 +0000 Subject: Nova Local Disk Storage (Seena Fallah) Message-ID: Hi Seena, Do you mean passing an entire disk to an instance, or moving where nova stores its ephemeral disks? (e.g. /var/lib/nova/instances?) Rgds Steve. The future has already arrived. It's just not evenly distributed yet - William Gibson Hi, Is there any support for creating instances on a whole separate disk, not an LVM? 
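[Editor's sketch] Steven's two readings of the question map to different nova options. A hedged sketch of both, with example paths and values that are assumptions for illustration rather than a recommendation:

```ini
# Hypothetical nova.conf fragment on a compute node (illustrative values).

[DEFAULT]
# Reading (a): relocate where nova keeps its ephemeral/instance disks,
# e.g. by mounting a dedicated disk at this path instead of the default
# /var/lib/nova/instances.
instances_path = /mnt/nova-instances

[libvirt]
# Reading (b): back ephemeral disks with flat/qcow2 files on that
# filesystem rather than LVM logical volumes (images_type = lvm).
images_type = qcow2
```

Either way the dedicated disk is consumed by nova itself; passing a whole physical disk through to a single instance would be a different discussion (e.g. PCI passthrough or a volume backend).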
From tonyliu0592 at hotmail.com Mon Nov 29 21:43:12 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Mon, 29 Nov 2021 21:43:12 +0000 Subject: [Neutron][OVN] AttributeError: 'NoneType' object has no attribute 'get_lrouter' Message-ID: Hi, It happens from time to time, not consistently, with the latest kolla/centos-binary-neutron-server:ussuri container image. From the bottom of the traceback, it seems self._ovn is None. Any clues how this can happen? I don't see other suspicious logging in Neutron and OVN. ===================== File "/usr/lib/python3.6/site-packages/neutron/services/ovn_l3/plugin.py", line 477, in get_router_availability_zones ERROR neutron.api.v2.resource lr = self._ovn.get_lrouter(router['id']) ERROR neutron.api.v2.resource AttributeError: 'NoneType' object has no attribute 'get_lrouter' ===================== Here is the full traceback. ===================== neutron.api.v2.resource [req-bf2230aa-0491-4caa-b5f7-169d37009672 b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 - default default] index failed: No details.: AttributeError: 'NoneType' object has no attribute 'get_lrouter' neutron.api.v2.resource Traceback (most recent call last): neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron/api/v2/resource.py", line 98, in resource neutron.api.v2.resource result = method(request=request, **args) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 139, in wrapped neutron.api.v2.resource setattr(e, '_RETRY_EXCEEDED', True) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ neutron.api.v2.resource self.force_reraise() neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise neutron.api.v2.resource six.reraise(self.type_, self.value, self.tb) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise neutron.api.v2.resource
raise value neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 135, in wrapped neutron.api.v2.resource return f(*args, **kwargs) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 154, in wrapper neutron.api.v2.resource ectxt.value = e.inner_exc neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ neutron.api.v2.resource self.force_reraise() neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise neutron.api.v2.resource six.reraise(self.type_, self.value, self.tb) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise neutron.api.v2.resource raise value neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 142, in wrapper neutron.api.v2.resource return f(*args, **kwargs) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 183, in wrapped neutron.api.v2.resource LOG.debug("Retry wrapper got retriable exception: %s", e) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ neutron.api.v2.resource self.force_reraise() neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise neutron.api.v2.resource six.reraise(self.type_, self.value, self.tb) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise neutron.api.v2.resource raise value neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 179, in wrapped neutron.api.v2.resource return f(*dup_args, **dup_kwargs) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron/api/v2/base.py", line 369, in index neutron.api.v2.resource return self._items(request, True, parent_id) neutron.api.v2.resource File 
"/usr/lib/python3.6/site-packages/neutron/api/v2/base.py", line 304, in _items neutron.api.v2.resource obj_list = obj_getter(request.context, **kwargs) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 233, in wrapped neutron.api.v2.resource return method(*args, **kwargs) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 139, in wrapped neutron.api.v2.resource setattr(e, '_RETRY_EXCEEDED', True) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ neutron.api.v2.resource self.force_reraise() neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise neutron.api.v2.resource six.reraise(self.type_, self.value, self.tb) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise neutron.api.v2.resource raise value neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 135, in wrapped neutron.api.v2.resource return f(*args, **kwargs) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 154, in wrapper neutron.api.v2.resource ectxt.value = e.inner_exc neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ neutron.api.v2.resource self.force_reraise() neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise neutron.api.v2.resource six.reraise(self.type_, self.value, self.tb) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise neutron.api.v2.resource raise value neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 142, in wrapper neutron.api.v2.resource return f(*args, **kwargs) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 183, in wrapped 
neutron.api.v2.resource LOG.debug("Retry wrapper got retriable exception: %s", e) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ neutron.api.v2.resource self.force_reraise() neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise neutron.api.v2.resource six.reraise(self.type_, self.value, self.tb) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise neutron.api.v2.resource raise value neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 179, in wrapped neutron.api.v2.resource return f(*dup_args, **dup_kwargs) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron/db/l3_db.py", line 547, in get_routers neutron.api.v2.resource page_reverse=page_reverse) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/model_query.py", line 317, in get_collection neutron.api.v2.resource for c in query neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/model_query.py", line 317, in neutron.api.v2.resource for c in query neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron/db/l3_db.py", line 221, in _make_router_dict neutron.api.v2.resource resource_extend.apply_funcs(l3_apidef.ROUTERS, res, router) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron_lib/db/resource_extend.py", line 84, in apply_funcs neutron.api.v2.resource resolved_func(response, db_object) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron/db/availability_zone/router.py", line 40, in _add_az_to_response neutron.api.v2.resource l3_plugin.get_router_availability_zones(router_db)) neutron.api.v2.resource File "/usr/lib/python3.6/site-packages/neutron/services/ovn_l3/plugin.py", line 477, in get_router_availability_zones neutron.api.v2.resource lr = self._ovn.get_lrouter(router['id']) 
neutron.api.v2.resource AttributeError: 'NoneType' object has no attribute 'get_lrouter' neutron.api.v2.resource neutron.wsgi [req-bf2230aa-0491-4caa-b5f7-169d37009672 b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 - default default] 10.91.1.7,10.250.20.26 "GET /v2.0/routers HTTP/1.1" status: 500 len: 344 time: 0.2660162 ===================== Thanks! Tony From gael.therond at bitswalk.com Mon Nov 29 23:47:12 2021 From: gael.therond at bitswalk.com (=?UTF-8?Q?Ga=C3=ABl_THEROND?=) Date: Tue, 30 Nov 2021 00:47:12 +0100 Subject: Windows imaging process Message-ID: Hi everyone, On one of our Openstack platforms, we maintain windows based workloads. We currently have the following build process: 1°/- We download the windows ISO image. 2°/- Use hyper-v to create a gold image that brings virtio drivers. 3°/- We upload this image on glance and publish it for our users. 4°/- Our users use packer to create their own windows custom image from our gold image. This workflow works pretty fine, it's simple enough and as Microsoft isn't releasing a new major Windows every two months it's pretty ok not having all the steps automated until now. However, I'm wondering if there is a way to automate my first two steps using Openstack? So far, from my early tests, I didn't manage to get all of the appropriate gears to work together, did I miss something? I need to create a VM that at least can use the Windows ISO plus a virtio ISO as a second cdrom device and an additional user-data payload that will instruct the Windows installer to go automatically through the installation steps plus load missing drivers from the virtio iso. This is perfectly working on native kvm as you can add multiple cdrom devices but I didn't find a way to replicate that on Openstack. Starting from an unaltered Microsoft-originated ISO image is a mandatory requirement for this project (because of security constraints that I can't have any impact on).
Help from anyone that already had to deal with such a situation would be very appreciated! Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.therond at bitswalk.com Tue Nov 30 00:06:11 2021 From: gael.therond at bitswalk.com (=?UTF-8?Q?Ga=C3=ABl_THEROND?=) Date: Tue, 30 Nov 2021 01:06:11 +0100 Subject: Windows imaging process In-Reply-To: References: Message-ID: Hi everyone, On one of our Openstack platforms, we maintain windows based workloads. We currently have the following build process: 1°/- We download the windows ISO image. 2°/- Use hyper-v to create a gold image that brings virtio drivers. 3°/- We upload this image on glance and publish it for our users. 4°/- Our users use packer to create their own windows custom image from our gold image. This workflow works pretty fine, it's simple enough and as Microsoft isn't releasing a new major Windows every two months it's pretty ok not having all the steps automated until now. However, I'm wondering if there is a way to automate my first two steps using Openstack? So far, from my early tests, I didn't manage to get all of the appropriate gears to work together, did I miss something? I need to create a VM that at least can use the Windows ISO plus a virtio ISO as a second cdrom device and an additional user-data payload that will instruct the Windows installer to go automatically through the installation steps plus load missing drivers from the virtio iso. This is perfectly working on native kvm as you can add multiple cdrom devices but I didn't find a way to replicate that on Openstack. Starting from an unaltered Microsoft-originated ISO image is a mandatory requirement for this project (because of security constraints that I can't have any impact on). Help from anyone that already had to deal with such a situation would be very appreciated! Thanks! -------------- next part -------------- An HTML attachment was scrubbed...
URL: From patryk.jakuszew at gmail.com Tue Nov 30 00:13:06 2021 From: patryk.jakuszew at gmail.com (Patryk Jakuszew) Date: Tue, 30 Nov 2021 01:13:06 +0100 Subject: Windows imaging process In-Reply-To: References: Message-ID: On Tue, 30 Nov 2021 at 01:11, Gaël THEROND wrote: > > Starting from an unaltered Microsoft-originated ISO image is a mandatory requirement for this project (because of security constraints that I can't have any impact on). > Does that mean you cannot alter the original image? There are methods for inserting the virtio drivers into the installation media, but that will require generating a new installer ISO out of the modified files. This article seems to be describing the procedure correctly: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA00e000000bt28CAA From tonyliu0592 at hotmail.com Tue Nov 30 02:46:30 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 30 Nov 2021 02:46:30 +0000 Subject: [neutron][nova-compute] Received unexpected event network-vif-plugged- Message-ID: Hi, With Ussuri, when launching a VM, I see this from nova-compute. ======== Received unexpected event network-vif-plugged-32b7af8a-ef81-4785-bd79-3a5619823638 ======== That caused the VM to be destroyed, rescheduled and relaunched. And that recreation worked fine. From the user's POV, it just takes a bit longer to launch the VM. What event does nova-compute expect from Neutron for vif-plug? Any hints on where I can look into it? More logs here.
======== 2021-11-29 18:28:10.064 7 INFO os_vif [req-2a89f605-6972-41dc-83f1-1ef6a444e486 b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 - default default] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:f1:94:a0,bridge_name='br-int',has_traffic_filtering=True,id=32b7af8a-ef81-4785-bd79-3a5619823638,network=Network(27ea27c6-3d1e-4d37-ad4a-de2ae1578f1e),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap32b7af8a-ef') 2021-11-29 18:28:10.515 7 INFO nova.compute.manager [req-01cbb5ac-d453-4484-9384-1a2173b532a9 - - - - -] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] VM Started (Lifecycle Event) 2021-11-29 18:28:10.544 7 INFO nova.compute.manager [req-01cbb5ac-d453-4484-9384 -1a2173b532a9 - - - - -] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] VM Paused (Lifecycle Event) 2021-11-29 18:28:10.604 7 INFO nova.compute.manager [req-01cbb5ac-d453-4484-9384 -1a2173b532a9 - - - - -] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] During sync_power_state the instance has a pending task (spawning). Skip. 2021-11-29 18:28:25.441 7 INFO nova.compute.manager [req-01cbb5ac-d453-4484-9384 -1a2173b532a9 - - - - -] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] VM Res umed (Lifecycle Event) 2021-11-29 18:28:25.445 7 INFO nova.virt.libvirt.driver [-] [instance: a8eb2a2a- 8c83-46b2-bfd4-fd3a7d00f2ce] Instance spawned successfully. 2021-11-29 18:28:25.445 7 INFO nova.compute.manager [req-2a89f605-6972-41dc-83f1-1ef6a444e486 b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 - default default] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] Took 18.75 seconds to spawn the instance on the hypervisor. 2021-11-29 18:28:25.502 7 INFO nova.compute.manager [req-01cbb5ac-d453-4484-9384-1a2173b532a9 - - - - -] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] During sync_power_state the instance has a pending task (spawning). Skip. 
2021-11-29 18:28:25.514 7 INFO nova.compute.manager [req-2a89f605-6972-41dc-83f1-1ef6a444e486 b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 - default default] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] Took 19.46 seconds to build instance. 2021-11-29 18:28:27.466 7 WARNING nova.compute.manager [req-82314965-fe71-4993-bbf6-f0aa9e708529 f0c1375066214e24ab7942d72d829097 3ec2940fc10d4e57bcf929d1fe678c79 - default default] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] Received unexpected event network-vif-plugged-32b7af8a-ef81-4785-bd79-3a5619823638 for instance with vm_state active and task_state None. 2021-11-29 18:28:55.829 7 INFO nova.compute.manager [req-e36e4a81-4755-4d7f-9ee4-d43d00629765 b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 - default default] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] Terminating instance ======== Thanks! Tony From skaplons at redhat.com Tue Nov 30 07:04:18 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 30 Nov 2021 08:04:18 +0100 Subject: [neutron] CI meeting 30.11.2021 - cancelled Message-ID: <12902433.uLZWGnKmhe@p1> Hi, Due to some personal reasons I will not be able to drive today's Neutron CI meeting. Let's cancel it this week. Remember, next week we will have CI meeting on Jitsi and we will discuss all ideas about how to improve stability of our CI. -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. 
URL: From balazs.gibizer at est.tech Tue Nov 30 10:16:00 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Tue, 30 Nov 2021 11:16:00 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: <5291667.iZASKD2KPV@p1> References: <5291667.iZASKD2KPV@p1> Message-ID: On Thu, Nov 25 2021 at 11:57:45 AM +0100, Slawek Kaplonski wrote: > Hi, > > On czwartek, 25 listopada 2021 10:41:14 CET Rodolfo Alonso Hernandez > wrote: >> Hello Michal: >> >> I think this thread is related to OVN. In any case, I've analyzed >> your logs >> from the Neutron point of view. Those are basically the key points >> in your >> logs: [1]. The Neutron server receives the request to create and >> bind a new >> port. Both the DHCP agent and the OVS agent provision the port >> (8a472e87-4a7a-4ad4-9fbb-fd9785136611). >> >> Neutron server sends the "network-vif-plugged" event at >> 08:51:05.635 and >> receives the ACK from Nova at 08:51:05.704. Nova server creates the >> corresponding event for the instance on the compute0: >> 2021-11-25 08:51:05.692 23 INFO >> nova.api.openstack.compute.server_external_events >> [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb >> 01a75e3a9a9148218916d3beafae2120 >> 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating event >> network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for >> instance >> 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0 >> >> Nova compute agent receives this server event: >> 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager >> [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb >> 01a75e3a9a9148218916d3beafae2120 >> 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: >> 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event >> network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 >> _process_instance_event >> >> /var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py: > 10205 >> >> Further triagging and log analysis should be done by Nova folks to 
>> understand why Nova compute didn't boot this VM. I fail to >> understand some >> parts. > > Thx Rodolfo. I also took a look at those logs (for server 09eff2ce-356f-430f- > ab30-5de58f58d698 which had port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 > but I > can confirm what You found actually. Port was pretty quickly > provisioned and > it was reported to Nova. I don't know why Nova didn't unpause that vm. I took a look at the logs (thanks for providing them!). Nova did not unpause the VM as there was more than one port to be plugged and from one, bf0f9dd2-4fd3-4028-a3dc-e2581ff081d0, there was no vif-plugged event received. See my grepping of the log to show this https://paste.opendev.org/show/811348/ Cheers, gibi > > Port creation in Nova:
> provisioning_complete > /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > db/provisioning_blocks.py:139 > 2021-11-25 08:51:00.504 29 DEBUG neutron_lib.callbacks.manager [req- > a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Notify callbacks > ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] > for port, > provisioning_complete _notify_loop > /var/lib/kolla/venv/lib/python3.9/site- > packages/neutron_lib/callbacks/manager.py:192 > 2021-11-25 08:51:00.505 30 DEBUG neutron.db.provisioning_blocks [req- > df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning for port > 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 completed by entity L2. > provisioning_complete > /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > db/provisioning_blocks.py:133 > 2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks [req- > df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning complete > for port > 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2. > provisioning_complete > /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > db/provisioning_blocks.py:139 > 2021-11-25 08:51:00.516 30 DEBUG neutron_lib.callbacks.manager [req- > df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks > ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] > for port, > provisioning_complete _notify_loop > /var/lib/kolla/venv/lib/python3.9/site- > packages/neutron_lib/callbacks/manager.py:192 > 2021-11-25 08:51:00.701 30 DEBUG neutron_lib.callbacks.manager [req- > df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks > ['neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler-7972195'] > for port, before_update _notify_loop > /var/lib/kolla/venv/lib/python3.9/site- > packages/neutron_lib/callbacks/manager.py:192 > 2021-11-25 08:51:00.746 30 DEBUG neutron.notifiers.nova [-] Sending > events: > [{'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': > 'network-vif- > 
plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7- > c9b40386cbe0'}] send_events > /var/lib/kolla/venv/lib/python3.9/site-packages/ > neutron/notifiers/nova.py:262 > 2021-11-25 08:51:00.762 30 DEBUG neutron.plugins.ml2.db [req- > df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] For port 0761ff2f- > ff6a-40a5-85b7-c9b40386cbe0, host compute0, got binding levels > [PortBindingLevel(driver='openvswitch',host='compute0',level=0,port_id=0761ff2f- > ff6a-40a5-85b7-c9b40386cbe0,segment=NetworkSegment(ea3a27b6- > c3fa-40e3-9937-71b652cba855),segment_id=ea3a27b6-c3fa-40e3-9937-71b652cba855)] > get_binding_level_objs > /var/lib/kolla/venv/lib/python3.9/site-packages/ > neutron/plugins/ml2/db.py:75 > 2021-11-25 08:51:00.818 30 INFO neutron.notifiers.nova [-] Nova event > response: {'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', > 'name': > 'network-vif-plugged', 'status': 'completed', 'tag': > '0761ff2f-ff6a-40a5-85b7- > c9b40386cbe0', 'code': 200} > >> >> Regards. >> >> [1]https://paste.opendev.org/show/811273/ >> >> On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet >> >> >> wrote: >> > Hello, >> > >> > In attachment you can find logs for compute0 and controller0 >> (other >> > computes and controllers were turned off for this test). >> > No OVN used, this stack is based on OVS. >> > >> > Thank you, >> > Michal Arbet >> > Openstack Engineer >> > >> > Ultimum Technologies a.s. >> > Na Poříčí 1047/26, 11000 Praha 1 >> > Czech Republic >> > >> > +420 604 228 897 >> > michal.arbet at ultimum.io >> > *https://ultimum.io * >> > >> > LinkedIn >> | Twitter >> > | Facebook >> > >> > >> > >> > st 24. 11. 2021 v 15:59 odesílatel Rodolfo Alonso Hernandez < >> > >> > ralonsoh at redhat.com> napsal: >> >> Hello Tony: >> >> >> >> Do you have the Neutron server logs? Do you see the >> >> "PortBindingChassisUpdateEvent"? When a port is bound, a >> Port_Bound (SB) >> >> event is issued and captured in the Neutron server.
That will >> trigger the >> >> port binding process and the "vif-plugged" event. This OVN SB >> event > should >> >> call "set_port_status_up" and that should write "OVN reports >> status up > for >> >> port: %s" in the logs. >> >> >> >> Of course, this Neutron method will be called only if the >> logical switch >> >> port is UP. >> >> >> >> Regards. >> >> >> >> >> >> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu >> >> >> >> >> wrote: >> >>> Hi, >> >>> >> >>> I see such problem from time to time. It's not consistently >> >>> reproduceable. >> >>> ====================== >> >>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager >> >>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: >> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state >> the > instance >> >>> has a pending task (spawning). Skip. >> >>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver >> >>> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 > 3a4c320d64664d9cb6784b7ea52d618a >> >>> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: >> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for >> >>> [('network-vif-plugged', >> '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for >> >>> instance with vm_state building and task_state spawning.: >> >>> eventlet.timeout.Timeout: 300 seconds >> >>> ====================== >> >>> The VIF/port is activated by OVN ovn-controller to ovn-sb. It >> seems >> >>> that, either Neutron didn't >> >>> capture the update or didn't send message back to nova-compute. >> Is there >> >>> any known fix for >> >>> this problem? >> >>> >> >>> >> >>> Thanks! 
>> >>> Tony > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat From balazs.gibizer at est.tech Tue Nov 30 13:10:32 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Tue, 30 Nov 2021 14:10:32 +0100 Subject: [neutron][nova-compute] Received unexpected event network-vif-plugged- In-Reply-To: References: Message-ID: On Tue, Nov 30 2021 at 02:46:30 AM +0000, Tony Liu wrote: > Hi, > > With Ussuri, when launch VM, I see this from nova-compute. > ======== > Received unexpected event > network-vif-plugged-32b7af8a-ef81-4785-bd79-3a5619823638 > ======== > > That caused VM to be destroyed, rescheduled and relaunched. And that > recreation worked fine. > From user POV, it just takes a bit longer to launch the VM. > What's the event nova-compute expects from Neutron for vif-plug? > Any hints where I can look into it? > > More logs here. > ======== > 2021-11-29 18:28:10.064 7 INFO os_vif > [req-2a89f605-6972-41dc-83f1-1ef6a444e486 > b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 - > default default] Successfully plugged vif > VIFOpenVSwitch(active=False,address=fa:16:3e:f1:94:a0,bridge_name='br-int',has_traffic_filtering=True,id=32b7af8a-ef81-4785-bd79-3a5619823638,network=Network(27ea27c6-3d1e-4d37-ad4a-de2ae1578f1e),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap32b7af8a-ef') > 2021-11-29 18:28:10.515 7 INFO nova.compute.manager > [req-01cbb5ac-d453-4484-9384-1a2173b532a9 - - - - -] [instance: > a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] VM Started (Lifecycle Event) > 2021-11-29 18:28:10.544 7 INFO nova.compute.manager > [req-01cbb5ac-d453-4484-9384 > -1a2173b532a9 - - - - -] [instance: > a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] VM Paused (Lifecycle Event) > 2021-11-29 18:28:10.604 7 INFO nova.compute.manager > [req-01cbb5ac-d453-4484-9384 > -1a2173b532a9 - - - - -] [instance: > a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] During > sync_power_state the instance has a pending task (spawning). Skip. 
> 2021-11-29 18:28:25.441 7 INFO nova.compute.manager > [req-01cbb5ac-d453-4484-9384 > -1a2173b532a9 - - - - -] [instance: > a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] VM Res > umed (Lifecycle Event) > 2021-11-29 18:28:25.445 7 INFO nova.virt.libvirt.driver [-] > [instance: a8eb2a2a- > 8c83-46b2-bfd4-fd3a7d00f2ce] Instance spawned successfully. > 2021-11-29 18:28:25.445 7 INFO nova.compute.manager > [req-2a89f605-6972-41dc-83f1-1ef6a444e486 > b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 > - default default] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] > Took 18.75 seconds to spawn the instance on the hypervisor. > 2021-11-29 18:28:25.502 7 INFO nova.compute.manager > [req-01cbb5ac-d453-4484-9384-1a2173b532a9 - - - - -] [instance: > a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] During > sync_power_state the instance has a pending task (spawning). Skip. > 2021-11-29 18:28:25.514 7 INFO nova.compute.manager > [req-2a89f605-6972-41dc-83f1-1ef6a444e486 > b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 > - default default] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] > Took 19.46 seconds to build instance. > 2021-11-29 18:28:27.466 7 WARNING nova.compute.manager > [req-82314965-fe71-4993-bbf6-f0aa9e708529 > f0c1375066214e24ab7942d72d829097 3ec2940fc10d4e57bcf929d1fe678c79 - > default default] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] > Received > unexpected event > network-vif-plugged-32b7af8a-ef81-4785-bd79-3a5619823638 for instance > with vm_state active and task_state None. > 2021-11-29 18:28:55.829 7 INFO nova.compute.manager > [req-e36e4a81-4755-4d7f-9ee4-d43d00629765 > b8de8d37a30147fd98681c22fc515874 b5938e7ebb0645e396b19e99bb7f9868 > - default default] [instance: a8eb2a2a-8c83-46b2-bfd4-fd3a7d00f2ce] > Terminating > instance > ======== Probably something is missing from the log as there are almost 30 seconds between the unexpected vif-plugged event and the termination of the instance. 
Could you turn on the debug logging and reproduce the issue? In general unexpected vif events are just logged and then ignored, so they should not cause any issue directly. Cheers, gibi > > > Thanks! > Tony > From balazs.gibizer at est.tech Tue Nov 30 13:28:26 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Tue, 30 Nov 2021 14:28:26 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: <5291667.iZASKD2KPV@p1> Message-ID: On Tue, Nov 30 2021 at 11:16:00 AM +0100, Balazs Gibizer wrote: > > > On Thu, Nov 25 2021 at 11:57:45 AM +0100, Slawek Kaplonski > > wrote: > >> Hi, > >> > >> On czwartek, 25 listopada 2021 10:41:14 CET Rodolfo Alonso Hernandez > >> wrote: > >>> Hello Michal: > >>> > >>> I think this thread is related to OVN. In any case, I've analyzed > >>> your logs > >>> from the Neutron point of view. Those are basically the key points > >>> in your > >>> logs: [1]. The Neutron server receives the request to create and > >>> bind a new > >>> port. Both the DHCP agent and the OVS agent provision the port > >>> (8a472e87-4a7a-4ad4-9fbb-fd9785136611). > >>> > >>> Neutron server sends the "network-vif-plugged" event at > >>> 08:51:05.635 and > >>> receives the ACK from Nova at 08:51:05.704.
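The event flow analyzed above (port provisioned by the agents, "network-vif-plugged" sent by neutron-server, ACK received from Nova) can be followed in any deployment by collecting every log line that mentions the port UUID from both services and ordering the hits by timestamp. A rough, self-contained sketch — the sample log lines and service names below are invented for illustration, not taken from the thread:

```python
# A rough sketch of tracing one port UUID through both services' logs in
# timestamp order. The sample log lines are invented; in a real deployment
# you would read the actual neutron-server and nova-compute log files.
NEUTRON_LOG = [
    "2021-11-25 08:51:05.635 30 DEBUG neutron.notifiers.nova [-] "
    "Sending events: network-vif-plugged 8a472e87-4a7a-4ad4-9fbb-fd9785136611",
]
NOVA_LOG = [
    "2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager [-] "
    "Processing event network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611",
]

def trace_port(port_id, logs_by_service):
    """Collect every line mentioning port_id, tagged with its source
    service, and return the hits sorted by the leading timestamp."""
    hits = []
    for service, lines in logs_by_service.items():
        for line in lines:
            if port_id in line:
                # the first 23 characters are "YYYY-MM-DD HH:MM:SS.mmm"
                hits.append((line[:23], service, line))
    return sorted(hits)

for stamp, service, _line in trace_port(
        "8a472e87-4a7a-4ad4-9fbb-fd9785136611",
        {"neutron-server": NEUTRON_LOG, "nova-compute": NOVA_LOG}):
    print(stamp, service)
```

Sorting on the ISO-style timestamp prefix is what makes the simple tuple sort interleave the two logs chronologically.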
Nova server creates the > >>> corresponding event for the instance on the compute0: > >>> 2021-11-25 08:51:05.692 23 INFO > >>> nova.api.openstack.compute.server_external_events > >>> [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb > >>> 01a75e3a9a9148218916d3beafae2120 > >>> 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating event > >>> network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for > >>> instance > >>> 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0 > >>> > >>> Nova compute agent receives this server event: > >>> 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager > >>> [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb > >>> 01a75e3a9a9148218916d3beafae2120 > >>> 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: > >>> 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event > >>> network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 > >>> _process_instance_event > >>> > >>> /var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py: > >> 10205 > >>> > >>> Further triaging and log analysis should be done by Nova folks to > >>> understand why Nova compute didn't boot this VM. I fail to > >>> understand some > >>> parts. > >> > >> Thx Rodolfo. I also took a look at those logs (for server > >> 09eff2ce-356f-430f- > >> ab30-5de58f58d698 which had port > >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0) but I > >> can confirm what You found actually. Port was pretty quickly > >> provisioned and > >> it was reported to Nova. I don't know why Nova didn't unpause that > >> vm. > > I took a look at the logs (thanks for providing it!). Nova did not > unpause the VM as there was more than one port to be plugged, and from > one, bf0f9dd2-4fd3-4028-a3dc-e2581ff081d0, there was no vif-plugged > event received.
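The finding above — Nova kept the VM paused because one of several expected network-vif-plugged events never arrived — follows from a wait-for-all-events pattern with a single timeout. A minimal sketch of that pattern (plain threading, not Nova's actual eventlet-based implementation; all class and port names below are illustrative):

```python
# Minimal sketch: compute waits for one network-vif-plugged event per
# expected port and times out if any never arrives, reporting which
# ports are still missing. NOT Nova's real code; names are invented.
import threading

class VifPlugWaiter:
    def __init__(self, expected_ports):
        self._missing = set(expected_ports)
        self._cond = threading.Condition()

    def event_received(self, port_id):
        with self._cond:
            if port_id in self._missing:
                self._missing.discard(port_id)
                self._cond.notify_all()
            else:
                # Mirrors the behaviour noted in the thread: unexpected
                # events are just logged and otherwise ignored.
                print("Received unexpected event network-vif-plugged-%s"
                      % port_id)

    def wait_all(self, timeout):
        with self._cond:
            done = self._cond.wait_for(lambda: not self._missing, timeout)
            if not done:
                raise TimeoutError(
                    "Timeout waiting for %s" % sorted(self._missing))

waiter = VifPlugWaiter({"port-a", "port-b"})
waiter.event_received("port-a")   # only one of the two events ever arrives
try:
    waiter.wait_all(timeout=0.1)
except TimeoutError as exc:
    print(exc)   # prints: Timeout waiting for ['port-b']
```

One timeout covers the whole set, so a single missing event is enough to fail the wait even when every other port was plugged promptly — which matches the symptom described in the logs.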
> > See my grepping of the log to show this > https://paste.opendev.org/show/811348/ as a side note I pushed a small patch to nova to log which event timed out https://review.opendev.org/c/openstack/nova/+/819817 > > Cheers, > gibi > >> Port creation in Nova: >> 2021-11-25 08:50:24.250 8 INFO os_vif [req-6eab0f2a- >> d7f4-4251-96f4-177fc0c62212 647a328aef9b4a8b890f102d8f018576 >> 49231caa42f34da8a9d10229b5c7a5d8 - default default] Successfully >> plugged vif >> VIFOpenVSwitch(active=False,address=fa:16:3e:3d:cd:b9,bridge_name='br- >> int',has_traffic_filtering=True,id=0761ff2f-ff6a-40a5-85b7- >> c9b40386cbe0,network=Network(cb4e6be4-9bc7-4c34-95dd- >> dfc3c8e9d93c),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap0761ff2f- >> ff') >> >> Neutron finished provisioning of that port and sent notification to >> nova about >> 34 seconds later: >> 2021-11-25 08:51:00.403 30 DEBUG neutron.plugins.ml2.rpc [req- >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Device 0761ff2f- >> ff6a-40a5-85b7-c9b40386cbe0 up at agent ovs-agent-compute0 >> update_device_up / >> var/lib/kolla/venv/lib/python3.9/site-packages/neutron/plugins/ml2/rpc.py:278 >> 2021-11-25 08:51:00.503 29 DEBUG neutron.db.provisioning_blocks [req- >> a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Provisioning >> complete for port >> 67b32ddf-dddd-439e-92be-6a89df49b4af triggered by entity L2.
>> provisioning_complete >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ >> db/provisioning_blocks.py:139 >> 2021-11-25 08:51:00.504 29 DEBUG neutron_lib.callbacks.manager [req- >> a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Notify callbacks >> ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] >> for port, >> provisioning_complete _notify_loop >> /var/lib/kolla/venv/lib/python3.9/site- >> packages/neutron_lib/callbacks/manager.py:192 >> 2021-11-25 08:51:00.505 30 DEBUG neutron.db.provisioning_blocks [req- >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning for port >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 completed by entity L2. >> provisioning_complete >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ >> db/provisioning_blocks.py:133 >> 2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks [req- >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning >> complete for port >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2. 
>> provisioning_complete >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ >> db/provisioning_blocks.py:139 >> 2021-11-25 08:51:00.516 30 DEBUG neutron_lib.callbacks.manager [req- >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks >> ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] >> for port, >> provisioning_complete _notify_loop >> /var/lib/kolla/venv/lib/python3.9/site- >> packages/neutron_lib/callbacks/manager.py:192 >> 2021-11-25 08:51:00.701 30 DEBUG neutron_lib.callbacks.manager [req- >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks >> ['neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler-7972195'] >> for port, before_update _notify_loop >> /var/lib/kolla/venv/lib/python3.9/site- >> packages/neutron_lib/callbacks/manager.py:192 >> 2021-11-25 08:51:00.746 30 DEBUG neutron.notifiers.nova [-] Sending >> events: >> [{'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': >> 'network-vif- >> plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7- >> c9b40386cbe0'}] send_events >> /var/lib/kolla/venv/lib/python3.9/site-packages/ >> neutron/notifiers/nova.py:262 >> 2021-11-25 08:51:00.762 30 DEBUG neutron.plugins.ml2.db [req- >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] For port 0761ff2f- >> ff6a-40a5-85b7-c9b40386cbe0, host compute0, got binding levels >> [PortBindingLevel(driver='openvswitch',host='compute0',level=0,port_id=0761ff2f- >> ff6a-40a5-85b7-c9b40386cbe0,segment=NetworkSegment(ea3a27b6- >> c3fa-40e3-9937-71b652cba855),segment_id=ea3a27b6-c3fa-40e3-9937-71b652cba855)] >> get_binding_level_objs >> /var/lib/kolla/venv/lib/python3.9/site-packages/ >> neutron/plugins/ml2/db.py:75 >> 2021-11-25 08:51:00.818 30 INFO neutron.notifiers.nova [-] Nova event >> response: {'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', >> 'name': >> 'network-vif-plugged', 'status': 'completed', 'tag': >> '0761ff2f-ff6a-40a5-85b7- >> c9b40386cbe0', 
'code': 200} >> >>> >>> Regards. >>> >>> [1]https://paste.opendev.org/show/811273/ >>> >>> On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet >>>  >>> >>> wrote: >>> > Hello, >>> > >>> > In attachment you can find logs for compute0 and controller0 >>> (other >>> > computes and controllers were turned off for this test). >>> > No OVN used, this stack is based on OVS. >>> > >>> > Thank you, >>> > Michal Arbet >>> > Openstack Engineer >>> > >>> > Ultimum Technologies a.s. >>> > Na Po???? 1047/26, 11000 Praha 1 >>> > Czech Republic >>> > >>> > +420 604 228 897 >>> > michal.arbet at ultimum.io >>> > *https://ultimum.io * >>> > >>> > LinkedIn >>> | Twitter >>> > | Facebook >>> > >>> > >>> > >>> > st 24. 11. 2021 v 15:59 odes?latel Rodolfo Alonso Hernandez < >>> > >>> > ralonsoh at redhat.com> napsal: >>> >> Hello Tony: >>> >> >>> >> Do you have the Neutron server logs? Do you see the >>> >> "PortBindingChassisUpdateEvent"? When a port is bound, a >>> Port_Bound (SB) >>> >> event is issued and captured in the Neutron server. That will >>> trigger the >>> >> port binding process and the "vif-plugged" event. This OVN SB >>> event >> should >>> >> call "set_port_status_up" and that should write "OVN reports >>> status up >> for >>> >> port: %s" in the logs. >>> >> >>> >> Of course, this Neutron method will be called only if the >>> logical switch >>> >> port is UP. >>> >> >>> >> Regards. >>> >> >>> >> >>> >> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu >>>  >>> >> >>> >> wrote: >>> >>> Hi, >>> >>> >>> >>> I see such problem from time to time. It's not consistently >>> >>> reproduceable. >>> >>> ====================== >>> >>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager >>> >>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: >>> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state >>> the >> instance >>> >>> has a pending task (spawning). Skip. 
>>> >>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver >>> >>> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 >> 3a4c320d64664d9cb6784b7ea52d618a >>> >>> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: >>> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for >>> >>> [('network-vif-plugged', >>> '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for >>> >>> instance with vm_state building and task_state spawning.: >>> >>> eventlet.timeout.Timeout: 300 seconds >>> >>> ====================== >>> >>> The VIF/port is activated by OVN ovn-controller to ovn-sb. It >>> seems >>> >>> that, either Neutron didn't >>> >>> capture the update or didn't send message back to >>> nova-compute. Is there >>> >>> any known fix for >>> >>> this problem? >>> >>> >>> >>> >>> >>> Thanks! >>> >>> Tony >> >> >> -- >> Slawek Kaplonski >> Principal Software Engineer >> Red Hat > > > From pkliczew at redhat.com Tue Nov 30 13:42:39 2021 From: pkliczew at redhat.com (Piotr Kliczewski) Date: Tue, 30 Nov 2021 14:42:39 +0100 Subject: [Openstack][FOSDEM][CFP] Virtualization & IaaS Devroom Message-ID: We are excited to announce that the call for proposals is now open for the Virtualization & IaaS devroom at the upcoming FOSDEM 2022, to be hosted virtually on February 5th 2022. This year will mark FOSDEM's 22nd anniversary as one of the longest-running free and open source software developer events, attracting thousands of developers and users from all over the world. Due to Covid-19, FOSDEM will be held virtually this year on February 5th & 6th, 2022. About the Devroom The Virtualization & IaaS devroom will feature session topics such as open source hypervisors and virtual machine managers such as Xen Project, KVM, bhyve, and VirtualBox, and Infrastructure-as-a-Service projects such as KubeVirt, Apache CloudStack, Foreman, OpenStack, oVirt, QEMU and OpenNebula.
This devroom will host presentations that focus on topics of shared interest, such as KVM; libvirt; shared storage; virtualized networking; cloud security; clustering and high availability; interfacing with multiple hypervisors; hyperconverged deployments; and scaling across hundreds or thousands of servers. Presentations in this devroom will be aimed at users or developers working on these platforms who are looking to collaborate and improve shared infrastructure or solve common problems. We seek topics that encourage dialog between projects and continued work post-FOSDEM. Important Dates Submission deadline: 20th of December Acceptance notifications: 25th of December Final schedule announcement: 31st of December Recorded presentations upload deadline: 15th of January Devroom: 6th February 2022 Submit Your Proposal All submissions must be made via the Pentabarf event planning site[1]. If you have not used Pentabarf before, you will need to create an account. If you submitted proposals for FOSDEM in previous years, you can use your existing account. After creating the account, select Create Event to start the submission process. Make sure to select Virtualization and IaaS devroom from the Track list. Please fill out all the required fields, and provide a meaningful abstract and description of your proposed session. Submission Guidelines We expect more proposals than we can possibly accept, so it is vitally important that you submit your proposal on or before the deadline. Late submissions are unlikely to be considered. All presentation slots are 30 minutes, with 20 minutes planned for presentations, and 10 minutes for Q&A. All presentations will need to be pre-recorded and put into our system at least a couple of weeks before the event. The presentations should be uploaded by 15th of January and made available under Creative Commons licenses. 
In the Submission notes field, please indicate that you agree that your presentation will be licensed under the CC-By-SA-4.0 or CC-By-4.0 license and that you agree to have your presentation recorded. For example: "If my presentation is accepted for FOSDEM, I hereby agree to license all recordings, slides, and other associated materials under the Creative Commons Attribution Share-Alike 4.0 International License. Sincerely, ." In the Submission notes field, please also confirm that if your talk is accepted, you will be able to attend the virtual FOSDEM event for the Q&A. We will not consider proposals from prospective speakers who are unsure whether they will be able to attend the FOSDEM virtual event. If you are experiencing problems with Pentabarf, the proposal submission interface, or have other questions, you can email our devroom mailing list[2] and we will try to help you. Code of Conduct Following the release of the updated code of conduct for FOSDEM, we'd like to remind all speakers and attendees that all of the presentations and discussions in our devroom are held under the guidelines set in the CoC and we expect attendees, speakers, and volunteers to follow the CoC at all times. If you submit a proposal and it is accepted, you will be required to confirm that you accept the FOSDEM CoC. If you have any questions about the CoC or wish to have one of the devroom organizers review your presentation slides or any other content for CoC compliance, please email us and we will do our best to assist you. Call for Volunteers We are also looking for volunteers to help run the devroom. We need assistance with helping speakers to record the presentation as well as helping with streaming and chat moderation for the devroom. Please contact devroom mailing list [2] for more information. Questions? If you have any questions about this devroom, please send your questions to our devroom mailing list. 
You can also subscribe to the list to receive updates about important dates, session announcements, and to connect with other attendees. See you all at FOSDEM! [1] https://penta.fosdem.org/submission/FOSDEM22 [2] iaas-virt-devroom at lists.fosdem.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From elfosardo at gmail.com Tue Nov 30 14:01:36 2021 From: elfosardo at gmail.com (Riccardo Pittau) Date: Tue, 30 Nov 2021 15:01:36 +0100 Subject: [ironic][meeting] Postpone meeting time until end of March 2022 Message-ID: Hello ironicers! Due to some overlapping time with other meetings that cause me to attend to 2 (or sometimes 3.....) meetings at the same time, I'd like to propose to move the time for the weekly ironic upstream meeting 1 hour later, so from the current 15:00-16:00 UTC to 16:00-17:00 UTC, until the next Daylight saving time change that should happen at the end of March 2022. Thank you! Ciao, Riccardo -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue Nov 30 14:10:19 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 30 Nov 2021 06:10:19 -0800 Subject: [ironic][meeting] Postpone meeting time until end of March 2022 In-Reply-To: References: Message-ID: I, personally, have no objections to pushing the meeting back by one hour until the end of March. For a while in the past, we actually were holding the meeting starting at 16:00 UTC. -Julia On Tue, Nov 30, 2021 at 6:05 AM Riccardo Pittau wrote: > > Hello ironicers! > > Due to some overlapping time with other meetings that cause me to attend to 2 (or sometimes 3.....) meetings at the same time, I'd like to propose to move the time for the weekly ironic upstream meeting 1 hour later, so from the current 15:00-16:00 UTC to 16:00-17:00 UTC, until the next Daylight saving time change that should happen at the end of March 2022. > > Thank you! 
> > Ciao, > Riccardo > > From michal.arbet at ultimum.io Tue Nov 30 14:19:55 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Tue, 30 Nov 2021 15:19:55 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: <5291667.iZASKD2KPV@p1> Message-ID: Hi, Thank you Balazs for the investigation. So the question is: why are the other VIF events received quickly while that last one is not? It looks like a bug, doesn't it? Sorry, I am not as good as you at reviewing code in depth. Btw, did you see the message from my colleague Jan Vondra? It's from the email chain [neutron][nova] [kolla] vif plugged timeout >>>>>>>>>>>>> Hi guys, I've been further investigating Michal's (OP) issue, since he is on his holiday, and I've found out that the issue is not really plugging the VIF but cleanup after previous port bindings. We are creating 6 servers with 2-4 vifs using heat template [0]. We were hitting some problems with placement so the stack sometimes failed to create and we had to delete the stack and recreate it. If we recreate it right after the deletion, the vif plugging timeout occurs. If we wait some time (approx. 10 minutes) the stack is created successfully. This brings me to believe that there is some issue with deferring the removal of security groups from unbound ports (somewhere around this part of code [1]) and it somehow affects the creation of new ports. However, I am unable to find any lock that could cause this behaviour. The only proof I have is that after the stack recreation scenario I have measured that the process_network_ports [2] function call could take up to 650 s (varies from 5 s to 651 s in our environment). Any idea what could be causing this?
[0] https://pastebin.com/infvj4ai [1] https://github.com/openstack/neutron/blob/master/neutron/agent/firewall.py#L133 [2] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L2079 Jan Vondra http://ultimum.io <<<<<<<<<<<<< Thank you, Michal Arbet (kevko) Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook út 30. 11. 2021 v 14:28 odesílatel Balazs Gibizer napsal: > > > On Tue, Nov 30 2021 at 11:16:00 AM +0100, Balazs Gibizer > wrote: > > > > > > On Thu, Nov 25 2021 at 11:57:45 AM +0100, Slawek Kaplonski > > wrote: > >> Hi, > >> > >> On czwartek, 25 listopada 2021 10:41:14 CET Rodolfo Alonso Hernandez > >> wrote: > >>> Hello Michal: > >>> > >>> I think this thread is related to OVN. In any case, I've analyzed > >>> your logs > >>> from the Neutron point of view. Those are basically the key points > >>> in your > >>> logs: [1]. The Neutron server receives the request to create and > >>> bind a new > >>> port. Both the DHCP agent and the OVS agent provision the port > >>> (8a472e87-4a7a-4ad4-9fbb-fd9785136611). > >>> > >>> Neutron server sends the "network-vif-plugged" event at > >>> 08:51:05.635 and > >>> receives the ACK from Nova at 08:51:05.704.
Nova server creates the > >>> corresponding event for the instance on the compute0: > >>> 2021-11-25 08:51:05.692 23 INFO > >>> nova.api.openstack.compute.server_external_events > >>> [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb > >>> 01a75e3a9a9148218916d3beafae2120 > >>> 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating event > >>> network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for > >>> instance > >>> 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0 > >>> > >>> Nova compute agent receives this server event: > >>> 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager > >>> [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb > >>> 01a75e3a9a9148218916d3beafae2120 > >>> 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: > >>> 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event > >>> network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 > >>> _process_instance_event > >>> > >>> > /var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py: > >> 10205 > >>> > >>> Further triagging and log analysis should be done by Nova folks to > >>> understand why Nova compute didn't boot this VM. I fail to > >>> understand some > >>> parts. > >> > >> Thx Rodolfo. I also took a look at those logs (for server > >> 09eff2ce-356f-430f- > >> ab30-5de58f58d698 which had port > >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 but I > >> can confirm what You found actually. Port was pretty quickly > >> provisioned and > >> it was reported to Nova. I don't know why Nova didn't unpause that > >> vm. > > > > I took a look at the logs (thanks for providing it!). Nova did not > > unpause the VM a there was more than one port to be plugged an from > > one, bf0f9dd2-4fd3-4028-a3dc-e2581ff081d0, there was no vif-plugged > > event received. 
> > > > See my grepping of the log to show this > > https://paste.opendev.org/show/811348/ > > > as a side note I pushed a small patch to nova to log the which event > was timed out https://review.opendev.org/c/openstack/nova/+/819817 > > > > > Cheers, > > gibi > > > >> > >> Port creation in Nova: > >> 2021-11-25 08:50:24.250 8 INFO os_vif [req-6eab0f2a- > >> d7f4-4251-96f4-177fc0c62212 647a328aef9b4a8b890f102d8f018576 > >> 49231caa42f34da8a9d10229b5c7a5d8 - default default] Successfully > >> plugged vif > >> VIFOpenVSwitch(active=False,address=fa:16:3e:3d:cd:b9,bridge_name='br- > >> int',has_traffic_filtering=True,id=0761ff2f-ff6a-40a5-85b7- > >> c9b40386cbe0,network=Network(cb4e6be4-9bc7-4c34-95dd- > >> > dfc3c8e9d93c),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap0761ff2f- > >> ff') > >> > >> Neutron finished provisioning of that port and sent notification to > >> nova about > >> 34 seconds later: > >> 2021-11-25 08:51:00.403 30 DEBUG neutron.plugins.ml2.rpc [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Device 0761ff2f- > >> ff6a-40a5-85b7-c9b40386cbe0 up at agent ovs-agent-compute0 > >> update_device_up / > >> > var/lib/kolla/venv/lib/python3.9/site-packages/neutron/plugins/ml2/rpc.py:278 > >> 2021-11-25 08:51:00.503 29 DEBUG neutron.db.provisioning_blocks [req- > >> a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Provisioning > >> complete for port > >> 67b32ddf-dddd-439e-92be-6a89df49b4af triggered by entity L2. 
> >> provisioning_complete > >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > >> db/provisioning_blocks.py:139 > >> 2021-11-25 08:51:00.504 29 DEBUG neutron_lib.callbacks.manager [req- > >> a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Notify callbacks > >> ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] > >> for port, > >> provisioning_complete _notify_loop > >> /var/lib/kolla/venv/lib/python3.9/site- > >> packages/neutron_lib/callbacks/manager.py:192 > >> 2021-11-25 08:51:00.505 30 DEBUG neutron.db.provisioning_blocks [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning for port > >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 completed by entity L2. > >> provisioning_complete > >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > >> db/provisioning_blocks.py:133 > >> 2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning > >> complete for port > >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2. 
> >> provisioning_complete > >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > >> db/provisioning_blocks.py:139 > >> 2021-11-25 08:51:00.516 30 DEBUG neutron_lib.callbacks.manager [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks > >> ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] > >> for port, > >> provisioning_complete _notify_loop > >> /var/lib/kolla/venv/lib/python3.9/site- > >> packages/neutron_lib/callbacks/manager.py:192 > >> 2021-11-25 08:51:00.701 30 DEBUG neutron_lib.callbacks.manager [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks > >> > ['neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler-7972195'] > >> for port, before_update _notify_loop > >> /var/lib/kolla/venv/lib/python3.9/site- > >> packages/neutron_lib/callbacks/manager.py:192 > >> 2021-11-25 08:51:00.746 30 DEBUG neutron.notifiers.nova [-] Sending > >> events: > >> [{'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': > >> 'network-vif- > >> plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7- > >> c9b40386cbe0'}] send_events > >> /var/lib/kolla/venv/lib/python3.9/site-packages/ > >> neutron/notifiers/nova.py:262 > >> 2021-11-25 08:51:00.762 30 DEBUG neutron.plugins.ml2.db [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] For port 0761ff2f- > >> ff6a-40a5-85b7-c9b40386cbe0, host compute0, got binding levels > >> > [PortBindingLevel(driver='openvswitch',host='compute0',level=0,port_id=0761ff2f- > >> ff6a-40a5-85b7-c9b40386cbe0,segment=NetworkSegment(ea3a27b6- > >> > c3fa-40e3-9937-71b652cba855),segment_id=ea3a27b6-c3fa-40e3-9937-71b652cba855)] > >> get_binding_level_objs > >> /var/lib/kolla/venv/lib/python3.9/site-packages/ > >> neutron/plugins/ml2/db.py:75 > >> 2021-11-25 08:51:00.818 30 INFO neutron.notifiers.nova [-] Nova event > >> response: {'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', > >> 'name': > >> 'network-vif-plugged', 
'status': 'completed', 'tag': > >> '0761ff2f-ff6a-40a5-85b7- > >> c9b40386cbe0', 'code': 200} > >> > >>> > >>> Regards. > >>> > >>> [1]https://paste.opendev.org/show/811273/ > >>> > >>> On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet > >>> > >>> > >>> wrote: > >>> > Hello, > >>> > > >>> > In attachment you can find logs for compute0 and controller0 > >>> (other > >>> > computes and controllers were turned off for this test). > >>> > No OVN used, this stack is based on OVS. > >>> > > >>> > Thank you, > >>> > Michal Arbet > >>> > Openstack Engineer > >>> > > >>> > Ultimum Technologies a.s. > >>> > Na Po???? 1047/26, 11000 Praha 1 > >>> > Czech Republic > >>> > > >>> > +420 604 228 897 > >>> > michal.arbet at ultimum.io > >>> > *https://ultimum.io * > >>> > > >>> > LinkedIn > >>> | Twitter > >>> > | Facebook > >>> > > >>> > > >>> > > >>> > st 24. 11. 2021 v 15:59 odes?latel Rodolfo Alonso Hernandez < > >>> > > >>> > ralonsoh at redhat.com> napsal: > >>> >> Hello Tony: > >>> >> > >>> >> Do you have the Neutron server logs? Do you see the > >>> >> "PortBindingChassisUpdateEvent"? When a port is bound, a > >>> Port_Bound (SB) > >>> >> event is issued and captured in the Neutron server. That will > >>> trigger the > >>> >> port binding process and the "vif-plugged" event. This OVN SB > >>> event > >> should > >>> >> call "set_port_status_up" and that should write "OVN reports > >>> status up > >> for > >>> >> port: %s" in the logs. > >>> >> > >>> >> Of course, this Neutron method will be called only if the > >>> logical switch > >>> >> port is UP. > >>> >> > >>> >> Regards. > >>> >> > >>> >> > >>> >> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu > >>> > >>> >> > >>> >> wrote: > >>> >>> Hi, > >>> >>> > >>> >>> I see such problem from time to time. It's not consistently > >>> >>> reproduceable. 
> >>> >>> ====================== > >>> >>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager > >>> >>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: > >>> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state > >>> the > >> instance > >>> >>> has a pending task (spawning). Skip. > >>> >>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver > >>> >>> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 > >> 3a4c320d64664d9cb6784b7ea52d618a > >>> >>> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: > >>> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for > >>> >>> [('network-vif-plugged', > >>> '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for > >>> >>> instance with vm_state building and task_state spawning.: > >>> >>> eventlet.timeout.Timeout: 300 seconds > >>> >>> ====================== > >>> >>> The VIF/port is activated by OVN ovn-controller to ovn-sb. It > >>> seems > >>> >>> that, either Neutron didn't > >>> >>> capture the update or didn't send message back to > >>> nova-compute. Is there > >>> >>> any known fix for > >>> >>> this problem? > >>> >>> > >>> >>> > >>> >>> Thanks! > >>> >>> Tony > >> > >> > >> -- > >> Slawek Kaplonski > >> Principal Software Engineer > >> Red Hat > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
I fail to > >>> understand some > >>> parts. > >> > >> Thx Rodolfo. I also took a look at those logs (for server > >> 09eff2ce-356f-430f- > >> ab30-5de58f58d698 which had port > >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 but I > >> can confirm what You found actually. Port was pretty quickly > >> provisioned and > >> it was reported to Nova. I don't know why Nova didn't unpause that > >> vm. > > > > I took a look at the logs (thanks for providing it!). Nova did not > > unpause the VM a there was more than one port to be plugged an from > > one, bf0f9dd2-4fd3-4028-a3dc-e2581ff081d0, there was no vif-plugged > > event received. > > > > See my grepping of the log to show this > > https://paste.opendev.org/show/811348/ > > > as a side note I pushed a small patch to nova to log the which event > was timed out https://review.opendev.org/c/openstack/nova/+/819817 > > > > > Cheers, > > gibi > > > >> > >> Port creation in Nova: > >> 2021-11-25 08:50:24.250 8 INFO os_vif [req-6eab0f2a- > >> d7f4-4251-96f4-177fc0c62212 647a328aef9b4a8b890f102d8f018576 > >> 49231caa42f34da8a9d10229b5c7a5d8 - default default] Successfully > >> plugged vif > >> VIFOpenVSwitch(active=False,address=fa:16:3e:3d:cd:b9,bridge_name='br- > >> int',has_traffic_filtering=True,id=0761ff2f-ff6a-40a5-85b7- > >> c9b40386cbe0,network=Network(cb4e6be4-9bc7-4c34-95dd- > >> > dfc3c8e9d93c),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap0761ff2f- > >> ff') > >> > >> Neutron finished provisioning of that port and sent notification to > >> nova about > >> 34 seconds later: > >> 2021-11-25 08:51:00.403 30 DEBUG neutron.plugins.ml2.rpc [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Device 0761ff2f- > >> ff6a-40a5-85b7-c9b40386cbe0 up at agent ovs-agent-compute0 > >> update_device_up / > >> > var/lib/kolla/venv/lib/python3.9/site-packages/neutron/plugins/ml2/rpc.py:278 > >> 2021-11-25 08:51:00.503 29 DEBUG neutron.db.provisioning_blocks [req- > >> 
a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Provisioning > >> complete for port > >> 67b32ddf-dddd-439e-92be-6a89df49b4af triggered by entity L2. > >> provisioning_complete > >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > >> db/provisioning_blocks.py:139 > >> 2021-11-25 08:51:00.504 29 DEBUG neutron_lib.callbacks.manager [req- > >> a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Notify callbacks > >> ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] > >> for port, > >> provisioning_complete _notify_loop > >> /var/lib/kolla/venv/lib/python3.9/site- > >> packages/neutron_lib/callbacks/manager.py:192 > >> 2021-11-25 08:51:00.505 30 DEBUG neutron.db.provisioning_blocks [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning for port > >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 completed by entity L2. > >> provisioning_complete > >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > >> db/provisioning_blocks.py:133 > >> 2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning > >> complete for port > >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2. 
> >> provisioning_complete > >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ > >> db/provisioning_blocks.py:139 > >> 2021-11-25 08:51:00.516 30 DEBUG neutron_lib.callbacks.manager [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks > >> ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] > >> for port, > >> provisioning_complete _notify_loop > >> /var/lib/kolla/venv/lib/python3.9/site- > >> packages/neutron_lib/callbacks/manager.py:192 > >> 2021-11-25 08:51:00.701 30 DEBUG neutron_lib.callbacks.manager [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks > >> > ['neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler-7972195'] > >> for port, before_update _notify_loop > >> /var/lib/kolla/venv/lib/python3.9/site- > >> packages/neutron_lib/callbacks/manager.py:192 > >> 2021-11-25 08:51:00.746 30 DEBUG neutron.notifiers.nova [-] Sending > >> events: > >> [{'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': > >> 'network-vif- > >> plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7- > >> c9b40386cbe0'}] send_events > >> /var/lib/kolla/venv/lib/python3.9/site-packages/ > >> neutron/notifiers/nova.py:262 > >> 2021-11-25 08:51:00.762 30 DEBUG neutron.plugins.ml2.db [req- > >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] For port 0761ff2f- > >> ff6a-40a5-85b7-c9b40386cbe0, host compute0, got binding levels > >> > [PortBindingLevel(driver='openvswitch',host='compute0',level=0,port_id=0761ff2f- > >> ff6a-40a5-85b7-c9b40386cbe0,segment=NetworkSegment(ea3a27b6- > >> > c3fa-40e3-9937-71b652cba855),segment_id=ea3a27b6-c3fa-40e3-9937-71b652cba855)] > >> get_binding_level_objs > >> /var/lib/kolla/venv/lib/python3.9/site-packages/ > >> neutron/plugins/ml2/db.py:75 > >> 2021-11-25 08:51:00.818 30 INFO neutron.notifiers.nova [-] Nova event > >> response: {'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', > >> 'name': > >> 'network-vif-plugged', 
'status': 'completed', 'tag': > >> '0761ff2f-ff6a-40a5-85b7- > >> c9b40386cbe0', 'code': 200} > >> > >>> > >>> Regards. > >>> > >>> [1]https://paste.opendev.org/show/811273/ > >>> > >>> On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet > >>> > >>> > >>> wrote: > >>> > Hello, > >>> > > >>> > In attachment you can find logs for compute0 and controller0 > >>> (other > >>> > computes and controllers were turned off for this test). > >>> > No OVN used, this stack is based on OVS. > >>> > > >>> > Thank you, > >>> > Michal Arbet > >>> > Openstack Engineer > >>> > > >>> > Ultimum Technologies a.s. > >>> > Na Po???? 1047/26, 11000 Praha 1 > >>> > Czech Republic > >>> > > >>> > +420 604 228 897 > >>> > michal.arbet at ultimum.io > >>> > *https://ultimum.io * > >>> > > >>> > LinkedIn > >>> | Twitter > >>> > | Facebook > >>> > > >>> > > >>> > > >>> > st 24. 11. 2021 v 15:59 odes?latel Rodolfo Alonso Hernandez < > >>> > > >>> > ralonsoh at redhat.com> napsal: > >>> >> Hello Tony: > >>> >> > >>> >> Do you have the Neutron server logs? Do you see the > >>> >> "PortBindingChassisUpdateEvent"? When a port is bound, a > >>> Port_Bound (SB) > >>> >> event is issued and captured in the Neutron server. That will > >>> trigger the > >>> >> port binding process and the "vif-plugged" event. This OVN SB > >>> event > >> should > >>> >> call "set_port_status_up" and that should write "OVN reports > >>> status up > >> for > >>> >> port: %s" in the logs. > >>> >> > >>> >> Of course, this Neutron method will be called only if the > >>> logical switch > >>> >> port is UP. > >>> >> > >>> >> Regards. > >>> >> > >>> >> > >>> >> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu > >>> > >>> >> > >>> >> wrote: > >>> >>> Hi, > >>> >>> > >>> >>> I see such problem from time to time. It's not consistently > >>> >>> reproduceable. 
> >>> >>> ====================== > >>> >>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager > >>> >>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: > >>> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state > >>> the > >> instance > >>> >>> has a pending task (spawning). Skip. > >>> >>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver > >>> >>> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 > >> 3a4c320d64664d9cb6784b7ea52d618a > >>> >>> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: > >>> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for > >>> >>> [('network-vif-plugged', > >>> '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for > >>> >>> instance with vm_state building and task_state spawning.: > >>> >>> eventlet.timeout.Timeout: 300 seconds > >>> >>> ====================== > >>> >>> The VIF/port is activated by OVN ovn-controller to ovn-sb. It > >>> seems > >>> >>> that, either Neutron didn't > >>> >>> capture the update or didn't send message back to > >>> nova-compute. Is there > >>> >>> any known fix for > >>> >>> this problem? > >>> >>> > >>> >>> > >>> >>> Thanks! > >>> >>> Tony > >> > >> > >> -- > >> Slawek Kaplonski > >> Principal Software Engineer > >> Red Hat > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Tue Nov 30 14:32:31 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Tue, 30 Nov 2021 15:32:31 +0100 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron In-Reply-To: References: <5291667.iZASKD2KPV@p1> Message-ID: <723E3R.00N4IXEZ8W5U@est.tech> On Tue, Nov 30 2021 at 03:19:55 PM +0100, Michal Arbet wrote: > Hi, > > Thank you Balazs for investigation, so, the question is, why other > VIFs events are received quickly and that last one no ? Yes, It seems that most of the events came within a minute but the last one did not received within the timeout value. 
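[Editor's note] The log triage described in this thread — listing which ports' "network-vif-plugged" events nova-compute received and which port it timed out waiting for — can be sketched as a small self-contained shell session. The file name and the sample lines are assumptions modelled on the log excerpts quoted above, not a canonical nova log layout; real paths depend on the deployment (e.g. kolla containers):

```shell
# Sketch: which ports got a network-vif-plugged event, and which timed out?
# Sample lines are modelled on the excerpts quoted in this thread.
cat > nova-compute.sample.log <<'EOF'
2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager [...] Processing event network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611
2021-11-25 08:56:10.120 8 WARNING nova.virt.libvirt.driver [...] Timeout waiting for [('network-vif-plugged', 'bf0f9dd2-4fd3-4028-a3dc-e2581ff081d0')]
EOF

echo "events received:"
# Port UUIDs whose plug event reached nova-compute:
grep -oE 'network-vif-plugged-[0-9a-f-]{36}' nova-compute.sample.log | sort -u

echo "events nova timed out waiting for:"
# Port UUIDs that were still pending when the vif_plugging timeout fired:
grep -oE "\('network-vif-plugged', '[0-9a-f-]{36}'\)" nova-compute.sample.log
```

Comparing the two lists points at the port nova was still waiting on when the 300-second timeout fired, which is the situation described for port bf0f9dd2-4fd3-4028-a3dc-e2581ff081d0 above.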
> It looks like a bug, isn't it ? Sorry, I am not as good as you to > review code in deep. It could be but I did not check the neutron side of the missing event. I hope Slaweq or Rodolfo can take that. (Yeah it is pretty unfortunate to ping-pong about this type of issues between nova and neutron :/) > > Btw, did you see message from my colleague Jan Vondra ? I saw it but it is mostly targeting neutron where I don't have deep enough knowledge to comment. Cheers, gibi > It's from email chain [neutron][nova] [kolla] vif plugged timeout > > >>>>>>>>>>>>> > > > Hi guys, > I've been further investigating Michal's (OP) issue, since he is on > his holiday, and I've found out that the issue is not really plugging > the VIF but cleanup after previous port bindings. > > We are creating 6 servers with 2-4 vifs using heat template [0]. We > were hitting some problems with placement so the stack sometimes > failed to create and we had to delete the stack and recreate it. > If we recreate it right after the deletion, the vif plugging timeout > occurs. If we wait some time (approx. 10 minutes) the stack is > created successfully. > > This brings me to believe that there is some issue with deferring the > removal of security groups from unbound ports (somewhere around this > part of code [1]) and it somehow affects the creation of new ports. > However, I am unable to find any lock that could cause this behaviour. > > The only proof I have is that after the stack recreation scenario I > have measured that the process_network_ports [2] function call could > take up to 650 s (varies from 5 s to 651 s in our environment). > > Any idea what could be causing this? 
> [0] https://pastebin.com/infvj4ai > [1] > https://github.com/openstack/neutron/blob/master/neutron/agent/firewall.py#L133 > [2] > https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L2079 > > Jan Vondra > http://ultimum.io > > <<<<<<<<<<<<< > > Thank you, > Michal Arbet (kevko) > > > Michal Arbet > Openstack Engineer > > Ultimum Technologies a.s. > Na Po???? 1047/26, 11000 Praha 1 > Czech Republic > > +420 604 228 897 > michal.arbet at ultimum.io > https://ultimum.io > > LinkedIn | Twitter | Facebook > > > ?t 30. 11. 2021 v 14:28 odes?latel Balazs Gibizer > napsal: >> >> >> On Tue, Nov 30 2021 at 11:16:00 AM +0100, Balazs Gibizer >> wrote: >> > >> > >> > On Thu, Nov 25 2021 at 11:57:45 AM +0100, Slawek Kaplonski >> > wrote: >> >> Hi, >> >> >> >> On czwartek, 25 listopada 2021 10:41:14 CET Rodolfo Alonso >> Hernandez >> >> wrote: >> >>> Hello Michal: >> >>> >> >>> I think this thread is related to OVN. In any case, I've >> analyzed >> >>> your logs >> >>> from the Neutron point of view. Those are basically the key >> points >> >>> in your >> >>> logs: [1]. The Neutron server receives the request to create >> and >> >>> bind a new >> >>> port. Both the DHCP agent and the OVS agent provision the port >> >>> (8a472e87-4a7a-4ad4-9fbb-fd9785136611). >> >>> >> >>> Neutron server sends the "network-vif-plugged" event at >> >>> 08:51:05.635 and >> >>> receives the ACK from Nova at 08:51:05.704. 
Nova server >> creates the >> >>> corresponding event for the instance on the compute0: >> >>> 2021-11-25 08:51:05.692 23 INFO >> >>> nova.api.openstack.compute.server_external_events >> >>> [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb >> >>> 01a75e3a9a9148218916d3beafae2120 >> >>> 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating >> event >> >>> network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for >> >>> instance >> >>> 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0 >> >>> >> >>> Nova compute agent receives this server event: >> >>> 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager >> >>> [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb >> >>> 01a75e3a9a9148218916d3beafae2120 >> >>> 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: >> >>> 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event >> >>> network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 >> >>> _process_instance_event >> >>> >> >>> >> /var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py: >> >> 10205 >> >>> >> >>> Further triagging and log analysis should be done by Nova >> folks to >> >>> understand why Nova compute didn't boot this VM. I fail to >> >>> understand some >> >>> parts. >> >> >> >> Thx Rodolfo. I also took a look at those logs (for server >> >> 09eff2ce-356f-430f- >> >> ab30-5de58f58d698 which had port >> >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 but I >> >> can confirm what You found actually. Port was pretty quickly >> >> provisioned and >> >> it was reported to Nova. I don't know why Nova didn't unpause >> that >> >> vm. >> > >> > I took a look at the logs (thanks for providing it!). Nova did not >> > unpause the VM a there was more than one port to be plugged an >> from >> > one, bf0f9dd2-4fd3-4028-a3dc-e2581ff081d0, there was no >> vif-plugged >> > event received. 
>> > >> > See my grepping of the log to show this >> > https://paste.opendev.org/show/811348/ >> >> >> as a side note I pushed a small patch to nova to log the which event >> was timed out https://review.opendev.org/c/openstack/nova/+/819817 >> >> > >> > Cheers, >> > gibi >> > >> >> >> >> Port creation in Nova: >> >> 2021-11-25 08:50:24.250 8 INFO os_vif [req-6eab0f2a- >> >> d7f4-4251-96f4-177fc0c62212 647a328aef9b4a8b890f102d8f018576 >> >> 49231caa42f34da8a9d10229b5c7a5d8 - default default] Successfully >> >> plugged vif >> >> >> VIFOpenVSwitch(active=False,address=fa:16:3e:3d:cd:b9,bridge_name='br- >> >> int',has_traffic_filtering=True,id=0761ff2f-ff6a-40a5-85b7- >> >> c9b40386cbe0,network=Network(cb4e6be4-9bc7-4c34-95dd- >> >> >> dfc3c8e9d93c),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap0761ff2f- >> >> ff') >> >> >> >> Neutron finished provisioning of that port and sent notification >> to >> >> nova about >> >> 34 seconds later: >> >> 2021-11-25 08:51:00.403 30 DEBUG neutron.plugins.ml2.rpc [req- >> >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Device 0761ff2f- >> >> ff6a-40a5-85b7-c9b40386cbe0 up at agent ovs-agent-compute0 >> >> update_device_up / >> >> >> var/lib/kolla/venv/lib/python3.9/site-packages/neutron/plugins/ml2/rpc.py:278 >> >> 2021-11-25 08:51:00.503 29 DEBUG neutron.db.provisioning_blocks >> [req- >> >> a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Provisioning >> >> complete for port >> >> 67b32ddf-dddd-439e-92be-6a89df49b4af triggered by entity L2. 
>> >> provisioning_complete >> >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ >> >> db/provisioning_blocks.py:139 >> >> 2021-11-25 08:51:00.504 29 DEBUG neutron_lib.callbacks.manager >> [req- >> >> a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Notify callbacks >> >> >> ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] >> >> for port, >> >> provisioning_complete _notify_loop >> >> /var/lib/kolla/venv/lib/python3.9/site- >> >> packages/neutron_lib/callbacks/manager.py:192 >> >> 2021-11-25 08:51:00.505 30 DEBUG neutron.db.provisioning_blocks >> [req- >> >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning for >> port >> >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 completed by entity L2. >> >> provisioning_complete >> >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ >> >> db/provisioning_blocks.py:133 >> >> 2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks >> [req- >> >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning >> >> complete for port >> >> 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2. 
>> >> provisioning_complete >> >> /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ >> >> db/provisioning_blocks.py:139 >> >> 2021-11-25 08:51:00.516 30 DEBUG neutron_lib.callbacks.manager >> [req- >> >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks >> >> >> ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] >> >> for port, >> >> provisioning_complete _notify_loop >> >> /var/lib/kolla/venv/lib/python3.9/site- >> >> packages/neutron_lib/callbacks/manager.py:192 >> >> 2021-11-25 08:51:00.701 30 DEBUG neutron_lib.callbacks.manager >> [req- >> >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks >> >> >> ['neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler-7972195'] >> >> for port, before_update _notify_loop >> >> /var/lib/kolla/venv/lib/python3.9/site- >> >> packages/neutron_lib/callbacks/manager.py:192 >> >> 2021-11-25 08:51:00.746 30 DEBUG neutron.notifiers.nova [-] >> Sending >> >> events: >> >> [{'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': >> >> 'network-vif- >> >> plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7- >> >> c9b40386cbe0'}] send_events >> >> /var/lib/kolla/venv/lib/python3.9/site-packages/ >> >> neutron/notifiers/nova.py:262 >> >> 2021-11-25 08:51:00.762 30 DEBUG neutron.plugins.ml2.db [req- >> >> df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] For port >> 0761ff2f- >> >> ff6a-40a5-85b7-c9b40386cbe0, host compute0, got binding levels >> >> >> [PortBindingLevel(driver='openvswitch',host='compute0',level=0,port_id=0761ff2f- >> >> ff6a-40a5-85b7-c9b40386cbe0,segment=NetworkSegment(ea3a27b6- >> >> >> c3fa-40e3-9937-71b652cba855),segment_id=ea3a27b6-c3fa-40e3-9937-71b652cba855)] >> >> get_binding_level_objs >> >> /var/lib/kolla/venv/lib/python3.9/site-packages/ >> >> neutron/plugins/ml2/db.py:75 >> >> 2021-11-25 08:51:00.818 30 INFO neutron.notifiers.nova [-] Nova >> event >> >> response: {'server_uuid': 
'09eff2ce-356f-430f-ab30-5de58f58d698', >> >> 'name': >> >> 'network-vif-plugged', 'status': 'completed', 'tag': >> >> '0761ff2f-ff6a-40a5-85b7- >> >> c9b40386cbe0', 'code': 200} >> >> >> >>> >> >>> Regards. >> >>> >> >>> [1]https://paste.opendev.org/show/811273/ >> >>> >> >>> On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet >> >>> >> >>> >> >>> wrote: >> >>> > Hello, >> >>> > >> >>> > In attachment you can find logs for compute0 and controller0 >> >>> (other >> >>> > computes and controllers were turned off for this test). >> >>> > No OVN used, this stack is based on OVS. >> >>> > >> >>> > Thank you, >> >>> > Michal Arbet >> >>> > Openstack Engineer >> >>> > >> >>> > Ultimum Technologies a.s. >> >>> > Na Po???? 1047/26, 11000 Praha 1 >> >>> > Czech Republic >> >>> > >> >>> > +420 604 228 897 >> >>> > michal.arbet at ultimum.io >> >>> > *https://ultimum.io * >> >>> > >> >>> > LinkedIn >> >> >>> | Twitter >> >>> > | Facebook >> >>> > >> >>> > >> >>> > >> >>> > st 24. 11. 2021 v 15:59 odes?latel Rodolfo Alonso Hernandez >> < >> >>> > >> >>> > ralonsoh at redhat.com> napsal: >> >>> >> Hello Tony: >> >>> >> >> >>> >> Do you have the Neutron server logs? Do you see the >> >>> >> "PortBindingChassisUpdateEvent"? When a port is bound, a >> >>> Port_Bound (SB) >> >>> >> event is issued and captured in the Neutron server. That >> will >> >>> trigger the >> >>> >> port binding process and the "vif-plugged" event. This OVN >> SB >> >>> event >> >> should >> >>> >> call "set_port_status_up" and that should write "OVN reports >> >>> status up >> >> for >> >>> >> port: %s" in the logs. >> >>> >> >> >>> >> Of course, this Neutron method will be called only if the >> >>> logical switch >> >>> >> port is UP. >> >>> >> >> >>> >> Regards. >> >>> >> >> >>> >> >> >>> >> On Tue, Nov 23, 2021 at 11:59 PM Tony Liu >> >>> >> >>> >> >> >>> >> wrote: >> >>> >>> Hi, >> >>> >>> >> >>> >>> I see such problem from time to time. It's not consistently >> >>> >>> reproduceable. 
>> >>> >>> ====================== >> >>> >>> 2021-11-23 22:16:28.532 7 INFO nova.compute.manager >> >>> >>> [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] >> [instance: >> >>> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During >> sync_power_state >> >>> the >> >> instance >> >>> >>> has a pending task (spawning). Skip. >> >>> >>> 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver >> >>> >>> [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 >> >> 3a4c320d64664d9cb6784b7ea52d618a >> >>> >>> 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] >> [instance: >> >>> >>> 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for >> >>> >>> [('network-vif-plugged', >> >>> '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for >> >>> >>> instance with vm_state building and task_state spawning.: >> >>> >>> eventlet.timeout.Timeout: 300 seconds >> >>> >>> ====================== >> >>> >>> The VIF/port is activated by OVN ovn-controller to ovn-sb. >> It >> >>> seems >> >>> >>> that, either Neutron didn't >> >>> >>> capture the update or didn't send message back to >> >>> nova-compute. Is there >> >>> >>> any known fix for >> >>> >>> this problem? >> >>> >>> >> >>> >>> >> >>> >>> Thanks! >> >>> >>> Tony >> >> >> >> >> >> -- >> >> Slawek Kaplonski >> >> Principal Software Engineer >> >> Red Hat >> > >> > >> > >> >> From dtantsur at redhat.com Tue Nov 30 17:22:31 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 30 Nov 2021 18:22:31 +0100 Subject: [ironic][meeting] Postpone meeting time until end of March 2022 In-Reply-To: References: Message-ID: No objections from me either. On Tue, Nov 30, 2021 at 3:07 PM Riccardo Pittau wrote: > Hello ironicers! > > Due to some overlapping time with other meetings that cause me to attend > to 2 (or sometimes 3.....) 
meetings at the same time, I'd like to propose > to move the time for the weekly ironic upstream meeting 1 hour later, so > from the current 15:00-16:00 UTC to 16:00-17:00 UTC, until the next > Daylight saving time change that should happen at the end of March 2022. > > Thank you! > > Ciao, > Riccardo > > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbauza at redhat.com Tue Nov 30 17:41:45 2021 From: sbauza at redhat.com (Sylvain Bauza) Date: Tue, 30 Nov 2021 18:41:45 +0100 Subject: [nova] Next spec review day on Tuesday Dec 14th Message-ID: Hola, As agreed during today's Nova meeting, we will have another spec review day on Dec 14th. Please sharpen your pens and write your specs in advance, and just be around during this day if you want to discuss with the reviewers. Thanks, -Sylvain -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Tue Nov 30 19:47:02 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 30 Nov 2021 13:47:02 -0600 Subject: python_requires >= 3.8 during Yoga In-Reply-To: <1319EEE3-AB75-495A-8EEF-E2985A4FBB10@binero.com> References: <17d58f75012.106d120141328715.5251672452583667038@ghanshyammann.com> <20211126122942.deyi4tymy5xoodvz@lyarwood-laptop.usersys.redhat.com> <17d5ce532e1.eabdb7481381896.7024061515184177274@ghanshyammann.com> <20211126160515.4tvksnqcz3vaomf4@lyarwood-laptop.usersys.redhat.com> <17d5d0a4f3f.dc8bb48c1384810.8642013929754348916@ghanshyammann.com> <17d5d476490.e30e73db1388640.4406614125429012751@ghanshyammann.com> <20211129091718.fdgdwmowtp3jxf6u@lyarwood-laptop.usersys.redhat.com> <1319EEE3-AB75-495A-8EEF-E2985A4FBB10@binero.com> Message-ID: <17d7262e0ac.be45cbf0116583.6975429409607560951@ghanshyammann.com> ---- On Mon, 29 Nov 2021 04:52:07 -0600 Tobias Urdin wrote ---- > Hello, I agree with the previous statement from Michal. > > The upgrade path in, for example, RDO has been very smooth previously: upgrading to the new OpenStack release, then switching out the distro version afterwards, because they support both during a transition. > Sure, they will do that now as well, but if any project decides to break py36 there will be more work for the RDO team and more arguments against reverting changes, because py38 would be the "real" supported runtime in OpenStack testing. > The transition period is required to not break the upgrade path. > Best regards, Tobias > On 29 Nov 2021, at 11:14, Michał Nasiadka wrote: > Hello, > I'm strongly against dropping py36 support now, unless we're going to find a solution that works on CentOS Stream 8. RHEL 9 is not out, and probably will not be for months - how do we expect users to use Yoga on production deployments (where they use CentOS Linux/equivalents today)?
> Dropping the runtime testing and supporting devstack - and negotiating on a per-project basis whether to support py36 or not - is not a solution. Either Yoga supports py36 as a transition point/release to py38 - or it does not. > In Kolla - we also did not anticipate (and don't think it's a good idea) supporting CentOS Stream 9 in the Yoga release. With the current decision - we are either forced into supporting CentOS Stream 9 (with no alternatives like Rocky Linux/Alma Linux in place - because RHEL 9 is not out) - or dropping CentOS support completely. > If we pursue CS9 - we also need to support migration from CS8 to CS9, and that's also a considerable amount of work - which is unplanned. > Best regards, Michał I agree on all these points; keeping py36 for Yoga can be helpful for most people migrating to the new distro version. I have added it to the TC meeting agenda, which is scheduled for Dec 2nd, and we will discuss this more there. Please feel free to join the TC meeting; details are below: https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Agenda_Suggestions -gmann > > On 29 Nov 2021, at 10:17, Lee Yarwood wrote: > On 26-11-21 11:24:59, Ghanshyam Mann wrote: > ---- On Fri, 26 Nov 2021 10:18:16 -0600 Ghanshyam Mann wrote ---- > ---- On Fri, 26 Nov 2021 10:05:15 -0600 Lee Yarwood wrote ---- > On 26-11-21 09:37:44, Ghanshyam Mann wrote: > ---- On Fri, 26 Nov 2021 06:29:42 -0600 Lee Yarwood wrote ---- > On 26-11-21 10:54:26, Alfredo Moralejo Alonso wrote: > On Thu, Nov 25, 2021 at 10:23 PM Ghanshyam Mann > wrote: > > ---- On Thu, 25 Nov 2021 13:58:28 -0600 Marcin Juszkiewicz < > marcin.juszkiewicz at linaro.org> wrote ---- > On 25.11.2021 at 19:13, Stephen Finucane wrote: > gmann has been helpfully proposing patches to change the > versions of Python we're testing against in Yoga.
I've > suggested that we might want to bump 'python_requires' in > 'setup.cfg' to indicate that we no longer support any version > of Python before 3.8 > > CentOS Stream 8 has Python 3.6 by default and the RDO team is doing the > CS8 -> CS9 migration during the Yoga cycle. Can we postpone it to Z, > when there will be no distribution with Py 3.6 to care about? > > Stupid question that I should know the answer to, but does RDO really > support RPM-based installations anymore? IOW, couldn't we just work around > this by providing CS8 py38-based containers during the upgrade? > > As Marcin posted, the plan in RDO is to support both CentOS Stream 8 and > CentOS Stream 9 in Yoga. This is how we have managed previous major CentOS > version upgrades in the past, providing support for both releases in one > OpenStack version to ease the upgrade, so I'd like to keep Yoga working on > the py3.6 included in CS8 and CS9. > > If this was the plan, why wasn't it made clear to the TC before they > dropped CS8 from the Yoga runtimes? Would it even be possible for the TC > to add CS8 and py36 back into the Yoga runtimes? > > Postponing to Z - do you mean dropping the py3.6 tests, or bumping it > in 'setup.cfg' so that no one can install on py3.6? > > The first one we already did, and as per the Yoga testing runtime we are > targeting centos9-stream[1] in Yoga itself. > > For making 'python_requires' >=py3.8 in 'setup.cfg', I have no > strong opinion on this, but I prefer to be flexible here: 'yes, > OpenStack is installable on py3.6 but we do not test it anymore from > Yoga onwards, so no guarantee'. Our testing runtime's main goal is > that we document the versions we test *at least*, which means > it can work on lower or higher versions too, but we just do not test > them. > > > Might it be possible to keep py3.6 jobs to make sure patches are not > introducing py3.8-only features that would break deployment in CS8?
> > We should keep CS8 and py36 as supported runtimes if we are keeping the > jobs, otherwise this just gets super confusing. > > Yeah, I think it creates confusion, as I can see in this ML thread, so I > agree on keeping 'python_requires' also in sync with what we test. > > Cool, thanks! > > Now, a question on going back to CentOS Stream 8 support in Yoga: is > CentOS Stream 9 a stable release, or is it experimental only? If > stable, then we can keep the latest available version, which would be > CentOS Stream 9. > > I honestly don't know and can't find any docs to point to. > > Our project interface testing doc clearly states 'latest LTS' to > consider for testing[1] whenever we are ready. I am not very strongly > against reverting back to CentOS Stream 8, but we should not add two > versions of the same distro in testing, which can be a lot if we consider > the below three distros. > > How do we expect operators to upgrade between Xena, where CentOS 8 stream > is a supported runtime, and Yoga, where CentOS 9 stream is currently the > equivalent supported runtime, without supporting both for a single > release? > > This is a really good question on the upgrade testing we do upstream, and I remember > it came up and was discussed a lot during the py2.7 drop too: how are we testing the > upgrade from py2.7 to py3? Can we do it in Grenade? We answered that we did > not test it directly, but Stein and Train tested both versions, so there should not be any issue > if you upgrade from there (one of the FAQs in my blog[1]). > > But on distro upgrade testing, as you know, we do not test that upstream, not even > in Grenade, where upgrades are done on the old node's distro only, not from the old distro version to > the new distro version with new code. It is not that we do not want to test it, but if anyone > from any distro would like to set up Grenade for that and maintain it, then we would be more than happy. 
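For projects that choose to keep py3.6 installs working even though upstream no longer tests them, the usual pattern is to guard newer stdlib APIs behind a version check. A minimal, hypothetical sketch (importlib.metadata entered the stdlib in Python 3.8; on 3.6/3.7 the importlib_metadata backport from PyPI fills in), not taken from any particular OpenStack project:

```python
# Illustrative sketch only: keep a module importable on Python 3.6
# while preferring the newer stdlib API where it exists.
import sys

if sys.version_info >= (3, 8):
    from importlib.metadata import version  # stdlib since Python 3.8
else:
    # Python 3.6/3.7 fallback: the PyPI backport of the same API
    from importlib_metadata import version

try:
    print(version("pip"))
except Exception:
    # metadata lookup fails if the package is not installed
    print("pip not installed")
```

This keeps a single code path for callers while confining the version split to the import site.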
> In summary, yes, we cannot guarantee distro upgrade testing from OpenStack upstream testing > due to resource bandwidth issues, but we welcome any help here. > > We discussed with amoralej in the TC IRC channel[1] whether or not to move the testing runtime back to CentOS > Stream 8 and py36. > > As we upstream do not test two versions of a distro in the same release, > amoralej agreed to keep CentOS Stream 9 if one has to be chosen, which is what our > current testing runtime is. So there is no change in the direction of the current > testing runtime and dropping py3.6, but there is the possibility of > some trade-off here. If any py3.6-breaking changes happen, then > it is up to a project's goodwill, bandwidth, or flexibility whether to > accept the fix or not, or even to add a py36 unit test job. As our > testing runtime is the minimum set of things to test and does not put any > maximum limit on testing, any project can extend its testing as per > its bandwidth. > > In summary: > > (This is what we agreed today in the TC channel, but as most of the folks > are on leave today, I will keep it open until next week to see if there are any > objections from the community and will conclude it accordingly.) > > * No change in the Yoga testing runtime: we move to cs9 and drop py36. > * We will not put a hard stop on cs8 support, and we can: > ** Keep Devstack supporting cs8 in Yoga. > ** Negotiate with a project to add a py36 job, or to fix any > py36-breaking changes observed by RDO (or any distro interested in > py36), but that depends on the project's decision and bandwidth. > > Next, how we can improve upgrade testing across distro versions > is something we will explore, to see what we can test to make > upgrades easier. > > I'm against this, as I said in my setup.cfg >= py38 review for > openstack/nova [1]: we either list and support runtimes or we don't. If RDO > and others need CentOS 8 Stream support for a release, then let's include > it, and py36 still, for Yoga and make things explicit. 
> > As I've said elsewhere I think the TC really need to adjust their > thinking on this topic and allow for one OpenStack release where both > the old and new LTS distro release are supported. Ensuring we allow > people to actually upgrade in place and later handle the distro upgrade > itself. > > Cheers, > > Lee > > [1] https://review.opendev.org/c/openstack/nova/+/819415 > > -- > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 > > From gmann at ghanshyammann.com Tue Nov 30 21:09:29 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 30 Nov 2021 15:09:29 -0600 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com> <20211129130957.3poysdeqsufkp3pg@yuggoth.org> Message-ID: <17d72ae5ee2.b80bfd38119150.5945079300853303270@ghanshyammann.com> ---- On Mon, 29 Nov 2021 08:22:31 -0600 Sean Mooney wrote ---- > On Mon, 2021-11-29 at 14:43 +0100, Jean-Philippe Evrard wrote: > > On Mon, Nov 29, 2021, at 14:09, Jeremy Stanley wrote: > > > The primary reason stable branches exist is to make it easier for us > > > to test and publish backports of critical patches to older versions > > > of the software, rather than expecting our downstream consumers to > > > do that work themselves. If you're saying distribution package > > > maintainers are going to do it anyway and ignore our published > > > backports, then dropping the branching model may make sense, but > > > I've seen evidence to suggest that at least some distros do consume > > > our backports directly. > > > > Don't get me wrong, SUSE is consuming those backports, and (at least was) contributing to them. > > And yes, I doubt that RH/SUSE/Canonical are simply consuming those packages without ever adding their patches on a case by case basis. So yes, those distros are already doing part of their work downstream (and/or upstream). 
And for a valid reason: it's part of their job :) > > > > Doesn't mean we, as a whole community, still need to cut the work for every single consumer. > > If we are stretched thin, we need to define priorities. > > > > I believe our aggressive policy in terms of branching is hurting the rest of the ecosystem, that's why I needed to say things out loud. I meant the less we branch, the less we backport, the less painful upgrades we have to deal with. It depends on our definition of _when to branch_ of course. Your example of a "critical patch" might be a good reason to branch. We are maybe in a place where this can be on a case by case basis, or that we should improve that definition? > I actually would not consider our branching aggressive. I actually think it's much longer than the rest of the industry or ecosystem. > Many of the projects we consume release on a monthly or quarterly basis, with some providing API/ABI-breaking releases once a year. > E.g. DPDK only allows ABI breaks in the Q4 release every year, I believe; the kernel selects an LTS branch to maintain every year based on the last release of the year, but has a ~6-week release schedule. > I strongly believe it would be healthier for us to release more often than we do. That does not mean I think we should break version compatibility for our distributed projects more often. > > If we released every 6 weeks or once a quarter, but limited the end-user impact of that so that people could mix/match releases for up to a year or two, that would > be better for our consumers, downstream distributions and developers. > > Developers would have to backport less, downstream could ship/rebase on any of the intermediary releases without breaking compatibility to get features, and consumers could stay on the stable release and > only upgrade once a year, or opt for one of the point releases in a year for new features. 
> Honestly, I can't see how releasing less often will have any effect other than slowing the delivery of features and bug fixes to customers. > I don't think it will help distributions reduce their maintenance, since we will just spend more time backporting features due to the increased wait > time that some of our customers will not want to accept. We still receive feature backport requests for our OSP 16 product based on Train (sometimes 10 and 13, based on Newton/Queens respectively). > > If I tell our large telco customer "OK, we are too far into Yoga to complete that this cycle, so the next upstream release we can target this for is Z, which will release in Q4 2022, and it will > take a year for us to productize that release, so you can expect the feature to be completed in 2023", they will respond with "Well, we need it in 2022, so what's the ETA for a backport?". > If we go from a 6-month upstream cycle to a 12-month one, that conversation changes from "we can deliver in 9 months" to "18 months upstream + packaging". Sure, with a 12-month cycle there is more likelihood > we could fit it into the current cycle, but it is also much, much more painful if we miss a release. There is only a small subset of features that we can backport downstream > without breaking interoperability, so we cannot simply assume we can fall back on that. This is an important point and I think I mentioned it in one of my replies, but you have put it nicely. Slowing the pace of getting features released/available to users will directly hurt many OpenStack consumers. I know a similar case from telco where a 6-month waiting time is OK for them but not 1 year (we asked if it was OK to deliver this feature upstream in the Z cycle and they said no, that is too late). I think there might be many such cases. And people might start doing 'downstream more' due to the late availability of features upstream. The 2nd impact I see here is on contributors, who are always my concern in many forms :). 
We have fewer contributors upstream nowadays, and moving to a 1-year release can impact their upstream role from their company's side, either in implementing / getting any feature released or in helping with contributions in multiple areas. That is just my thought, but anyone managing the upstream contributors' budget for their company can validate it. -gmann > > Stable branches don't necessarily help with that, but not having a very long release cadence does. > > > > Regards, > > JP > > > > > From Arne.Wiebalck at cern.ch Tue Nov 30 21:33:42 2021 From: Arne.Wiebalck at cern.ch (Arne.Wiebalck at cern.ch) Date: Tue, 30 Nov 2021 22:33:42 +0100 Subject: [ironic][meeting] Postpone meeting time until end of March 2022 In-Reply-To: References: Message-ID: <1955640444.77903.1638308022081@cernmail.cern.ch> Sounds good to me as well! > On 11/30/2021 6:22 PM Dmitry Tantsur wrote: > > > No objections from me either. > > On Tue, Nov 30, 2021 at 3:07 PM Riccardo Pittau wrote: > > > > Hello ironicers! > > > > Due to some overlapping time with other meetings that causes me to attend 2 (or sometimes 3.....) meetings at the same time, I'd like to propose to move the time for the weekly ironic upstream meeting 1 hour later, from the current 15:00-16:00 UTC to 16:00-17:00 UTC, until the next daylight saving time change, which should happen at the end of March 2022. > > > > Thank you! > > > > Ciao, > > Riccardo > > > > > > > > > > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From juliaashleykreger at gmail.com Tue Nov 30 21:48:15 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 30 Nov 2021 13:48:15 -0800 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <17d72ae5ee2.b80bfd38119150.5945079300853303270@ghanshyammann.com> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com> <20211129130957.3poysdeqsufkp3pg@yuggoth.org> <17d72ae5ee2.b80bfd38119150.5945079300853303270@ghanshyammann.com> Message-ID: On Tue, Nov 30, 2021 at 1:13 PM Ghanshyam Mann wrote: > > ---- On Mon, 29 Nov 2021 08:22:31 -0600 Sean Mooney wrote ---- > > On Mon, 2021-11-29 at 14:43 +0100, Jean-Philippe Evrard wrote: > > > On Mon, Nov 29, 2021, at 14:09, Jeremy Stanley wrote: > > > > The primary reason stable branches exist is to make it easier for us > > > > to test and publish backports of critical patches to older versions > > > > of the software, rather than expecting our downstream consumers to > > > > do that work themselves. If you're saying distribution package > > > > maintainers are going to do it anyway and ignore our published > > > > backports, then dropping the branching model may make sense, but > > > > I've seen evidence to suggest that at least some distros do consume > > > > our backports directly. > > > > > > Don't get me wrong, SUSE is consuming those backports, and (at least was) contributing to them. > > > And yes, I doubt that RH/SUSE/Canonical are simply consuming those packages without ever adding their patches on a case by case basis. So yes, those distros are already doing part of their work downstream (and/or upstream). And for a valid reason: it's part of their job :) > > > > > > Doesn't mean we, as a whole community, still need to cut the work for every single consumer. > > > If we are stretched thin, we need to define priorities. 
> > > I believe our aggressive policy in terms of branching is hurting the rest of the ecosystem, that's why I needed to say things out loud. I meant the less we branch, the less we backport, the less painful upgrades we have to deal with. It depends on our definition of _when to branch_ of course. Your example of a "critical patch" might be a good reason to branch. We are maybe in a place where this can be on a case by case basis, or that we should improve that definition? > > I actually would not consider our branching aggressive. I actually think it's much longer than the rest of the industry or ecosystem. > > Many of the projects we consume release on a monthly or quarterly basis, with some providing API/ABI-breaking releases once a year. > > E.g. DPDK only allows ABI breaks in the Q4 release every year, I believe; the kernel selects an LTS branch to maintain every year based on the last release of the year, but has a ~6-week release schedule. > > I strongly believe it would be healthier for us to release more often than we do. That does not mean I think we should break version compatibility for our distributed projects more often. > > > > If we released every 6 weeks or once a quarter, but limited the end-user impact of that so that people could mix/match releases for up to a year or two, that would > > be better for our consumers, downstream distributions and developers. > > > > Developers would have to backport less, downstream could ship/rebase on any of the intermediary releases without breaking compatibility to get features, and consumers could stay on the stable release and > > only upgrade once a year, or opt for one of the point releases in a year for new features. > > > > Honestly, I can't see how releasing less often will have any effect other than slowing the delivery of features and bug fixes to customers. 
> > I don't think it will help distributions reduce their maintenance, since we will just spend more time backporting features due to the increased wait > > time that some of our customers will not want to accept. We still receive feature backport requests for our OSP 16 product based on Train (sometimes 10 and 13, based on Newton/Queens respectively). > > > > If I tell our large telco customer "OK, we are too far into Yoga to complete that this cycle, so the next upstream release we can target this for is Z, which will release in Q4 2022, and it will > > take a year for us to productize that release, so you can expect the feature to be completed in 2023", they will respond with "Well, we need it in 2022, so what's the ETA for a backport?". > > If we go from a 6-month upstream cycle to a 12-month one, that conversation changes from "we can deliver in 9 months" to "18 months upstream + packaging". Sure, with a 12-month cycle there is more likelihood > > we could fit it into the current cycle, but it is also much, much more painful if we miss a release. There is only a small subset of features that we can backport downstream > > without breaking interoperability, so we cannot simply assume we can fall back on that. > > This is an important point and I think I mentioned it in one of my replies, but you have put it nicely. Slowing the pace of getting > features released/available to users will directly hurt many OpenStack consumers. I know a similar case from telco where > a 6-month waiting time is OK for them but not 1 year (we asked if it was OK to deliver this feature upstream in the Z cycle and they > said no, that is too late). I think there might be many such cases. And people might start doing 'downstream more' > due to the late availability of features upstream. 
And yet, they won't be able to get the software into the field for just as long, and demands/requirements for feature backports substantially increase vendor production cost by increasing upgrade risk, elongating qualification testing/cycles, and ultimately trickling down to the end operator, who now has to wait even longer. It feels like this is a giant "no-win situation", or for you Star Trek fans, the Kobayashi Maru. > > The 2nd impact I see here is on contributors, who are always my concern in many forms :). We have fewer contributors > upstream nowadays, and moving to a 1-year release can impact their upstream role from their company's side, either in implementing / > getting any feature released or in helping with contributions in multiple areas. That is just my thought, but anyone managing the upstream > contributors' budget for their company can validate it. > > -gmann > > > > > Stable branches don't necessarily help with that, but not having a very long release cadence does. > > > > > > > > Regards, > > > JP > > > > > > > > > > From fungi at yuggoth.org Tue Nov 30 22:09:13 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 30 Nov 2021 22:09:13 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com> <20211129130957.3poysdeqsufkp3pg@yuggoth.org> <17d72ae5ee2.b80bfd38119150.5945079300853303270@ghanshyammann.com> Message-ID: <20211130220913.ydk5zhyvkdl7g7zc@yuggoth.org> On 2021-11-30 13:48:15 -0800 (-0800), Julia Kreger wrote: [...] > It feels like this is a giant "no-win situation", or for you Star > Trek fans, the Kobayashi Maru. [...] So, hacker-Kirk will save us all through cheating? Sounds legit. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From juliaashleykreger at gmail.com Tue Nov 30 22:31:00 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 30 Nov 2021 14:31:00 -0800 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <20211130220913.ydk5zhyvkdl7g7zc@yuggoth.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <8301ef05-4011-43e9-beec-66f6aeeb00f9@www.fastmail.com> <20211129130957.3poysdeqsufkp3pg@yuggoth.org> <17d72ae5ee2.b80bfd38119150.5945079300853303270@ghanshyammann.com> <20211130220913.ydk5zhyvkdl7g7zc@yuggoth.org> Message-ID: On Tue, Nov 30, 2021 at 2:12 PM Jeremy Stanley wrote: > > On 2021-11-30 13:48:15 -0800 (-0800), Julia Kreger wrote: > [...] > > It feels like this is a giant "no-win situation", or for you Star > > Trek fans, the Kobayashi Maru. > [...] > > So, hacker-Kirk will save us all through cheating? Sounds legit. > -- > Jeremy Stanley It feels like we have created a bunch of basically immovable, insurmountable conflicting obstacles. Kind of like self-digging holes. I'm worried not even hacker-Kirk can save us. Well, maybe his answer might actually be to abolish the integrated release so he can not only rescue the operators on the ship, but also beam them the tools they need to move forward. Granted, that is change, and human nature is a thing. :( From ildiko.vancsa at gmail.com Tue Nov 30 23:33:40 2021 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Tue, 30 Nov 2021 15:33:40 -0800 Subject: [neutron][networking][ipv6][dns][ddi] Upcoming OpenInfra Edge Computing Group sessions In-Reply-To: <7F4DD655-C819-4FD6-B022-EEA3FFE33A8B@gmail.com> References: <5993A296-7D67-444B-8D2F-2544C42C984A@gmail.com> <7F4DD655-C819-4FD6-B022-EEA3FFE33A8B@gmail.com> Message-ID: Hi, We had a great session about DNS at the edge yesterday at the OpenInfra Edge Computing Group call. 
In case you missed the call but would like to listen to what was discussed, the recording is already available: https://wiki.openstack.org/wiki/Edge_Computing_Group#Networking_and_DNS_discussion_with_Cricket_Liu_and_Andrew_Wertkin Next Monday (December 6) at 6am PST / 1400 UTC we will be discussing IPv6! If you are interested in joining you can find the dial-in details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings If you cannot make it to the call but have a question or specific topic to discuss related to IPv6, please share it on this thread so we can make sure to touch on it during the call. Thanks and Best Regards, Ildikó > On Nov 29, 2021, at 06:06, Ildiko Vancsa wrote: > > Hi, > > It is a friendly reminder that the DNS discussion is now on!! > > If you are interested in participating please join here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings > > Thanks and Best Regards, > Ildikó > > >> On Nov 26, 2021, at 16:34, Ildiko Vancsa wrote: >> >> Hi, >> >> It is a friendly reminder that we have the Networking and DNS discussion coming up at the OpenInfra Edge Computing Group weekly call on Monday (November 29) at 6am PST / 1400 UTC. >> >> We have invited industry experts, Cricket Liu and Andrew Wertkin, to share their thoughts and experience in this area. But, we also need YOU to join to turn the session into a lively discussion and debate! >> >> We have another networking related session the following week as well: December 6th - Networking and IPv6 discussion with Ed Horley >> >> Beyond the above topics we will have presentations from groups and projects such as Smart Edge and the Industry IoT Consortium (formerly Industrial Internet Consortium). >> >> For the full schedule please see the edge WG's wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group#Upcoming_Topics >> >> Our calls happen every Monday at 6am Pacific Time / 1400 UTC on Zoom. 
You can find all the meeting details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings >> >> Please let me know if you have any questions about the working group or any of the upcoming sessions. >> >> Thanks and Best Regards, >> Ildikó >> >> >>> On Nov 3, 2021, at 18:12, Ildiko Vancsa wrote: >>> >>> Hi, >>> >>> I'm reaching out to you to draw your attention to the amazing lineup of discussion topics for the OpenInfra Edge Computing Group weekly calls up until the end of this year, with industry experts to present and participate in the discussions! >>> >>> I would like to invite and encourage you to join the working group sessions to discuss edge related challenges and solutions in the below areas and more! >>> >>> Some of the sessions to highlight will be continuing the discussions we started at the recent PTG: >>> * November 29th - Networking and DNS discussion with Cricket Liu and Andrew Wertkin >>> * December 6th - Networking and IPv6 discussion with Ed Horley >>> >>> Beyond the above topics we will have presentations from groups and projects such as Smart Edge and the Industry IoT Consortium (formerly Industrial Internet Consortium). >>> >>> For the full schedule please see the edge WG's wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group#Upcoming_Topics >>> >>> Our calls happen every Monday at 6am Pacific Time / 1400 UTC on Zoom. You can find all the meeting details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings >>> >>> Please let me know if you have any questions about the working group or any of the upcoming sessions. >>> >>> Thanks and Best Regards, >>> Ildikó >>> >>> >> >