From amy at demarco.com Mon Nov 1 00:20:01 2021 From: amy at demarco.com (Amy Marrich) Date: Sun, 31 Oct 2021 19:20:01 -0500 Subject: [Diversity] Diversity and Inclusion Meeting Reminder Message-ID: The Diversity & Inclusion WG invites members of all OIF projects to attend our next meeting Monday November 1st, at 17:00 UTC in the #openinfra- diversity channel on OFTC. The agenda can be found at https://etherpad.openstack.org/p/diversity-wg-agenda. Please feel free to add any topics you wish to discuss at the meeting. Thanks, Amy (apotz) -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Nov 1 07:01:22 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 01 Nov 2021 08:01:22 +0100 Subject: Router interfaces are down In-Reply-To: References: Message-ID: <4390784.LvFx2qVVIh@p1> Hi, Please don't drop ML from the thread. You have to go to the node where Your router is hosted and investigate there in the agent's logs. If ports are DOWN, I would start with checking in the L2 agent logs (neutron-ovs-agent or linuxbridge-agent, idk what You are using exactly). If there is no any errors there, You can also check neutron-server logs, why ports aren't set to be UP as well as checking neutron-l3-agent logs on the node where router is hosted. On czwartek, 28 pa?dziernika 2021 18:41:29 CET Jibsan Joel Rosa Toirac wrote: > Here I let you some screenshots of my Network Topology: > > > > El jue, 28 oct 2021 a las 7:43, Jibsan Joel Rosa Toirac () > escribi?: > > Yes it?s neutron router. Well the router is centralized. It is located > > between the subnets and the node, all the subnets will pass through the > > router to internet, but I don?t know what else check to set it up for. I?m > > using a friend?s Openstack instance on a server to check out if I?m missing > > something and both nodes are the same. I will send you a screenshot of my > > network Topology. > > > > Greetings > > > > On Thu, Oct 28, 2021 at 2:33 AM Slawek Kaplonski > > > > wrote: > >> Hi, > >> > >> On ?roda, 27 pa?dziernika 2021 21:39:55 CEST Jibsan Joel Rosa Toirac > >> > >> wrote: > >> > Hello, I'm trying to route all the requests from a private vlan to > >> > internet. I have a private network and all the Virtual Machines inside > >> > >> the > >> > >> > subnet config can do everything between they, but if I ping to > >> > >> Internet, it > >> > >> > doesn't work. > >> > > >> > When I see the router_external it says all the interfaces are DOWN. > >> > >> By router_external, You mean neutron router, right? > >> If so, what kind of router it is, centralized HA or non-HA, or maybe DVR? > >> Is router scheduled properly to some node? You can check that with > >> command > >> like "neutron l3-agent-list-hosting-router ". > >> > >> > I have search in everywhere but I can't find a solution for this. > >> > > >> > Thank you for your time > >> > >> -- > >> Slawek Kaplonski > >> Principal Software Engineer > >> Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From lokendrarathour at gmail.com Mon Nov 1 08:22:07 2021 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Mon, 1 Nov 2021 13:52:07 +0530 Subject: [ Tacker ] Passing a shell script/parameters as a file in cloud config In-Reply-To: References: Message-ID: Hello EveryOne, Any update on this, please. 
-Lokendra On Thu, Oct 28, 2021 at 2:43 PM Lokendra Rathour wrote: > Hi, > *In Tacker, while deploying VNFD can we pass a file ( parameter file) and > keep it at a defined path using cloud-config way?* > > Like in *generic hot template*s, we have the below-mentioned way to pass > a file directly as below: > parameters: > foo: > default: bar > > resources: > > the_server: > type: OS::Nova::Server > properties: > # flavor, image etc > user_data: > str_replace: > template: {get_file: the_server_boot.sh} > params: > $FOO: {get_param: foo} > > > *but when using this approach in Tacker BaseHOT it gives an error saying * > "nstantiation wait failed for vnf 77693e61-c80e-41e0-af9a-a0f702f3a9a7, > error: VNF Create Resource CREATE failed: resources.obsvrnnu62mb: > resources.CAS_0_group.Property error: > resources.soft_script.properties.config: No content found in the "files" > section for get_file path: Files/scripts/install.py > 2021-10-28 00:46:35.677 3853831 ERROR oslo_messaging.rpc.server > " > do we have a defined way to use the hot capability in TACKER? > > Defined Folder Structure for CSAR: > . > ??? BaseHOT > ? ??? default > ? ??? RIN_vnf_hot.yaml > ? ??? nested > ? ??? RIN_0.yaml > ? ??? RIN_1.yaml > ??? Definitions > ? ??? RIN_df_default.yaml > ? ??? RIN_top_vnfd.yaml > ? ??? RIN_types.yaml > ? ??? etsi_nfv_sol001_common_types.yaml > ? ??? etsi_nfv_sol001_vnfd_types.yaml > ??? Files > ? ??? images > ? ??? scripts > ? ??? install.py > ??? Scripts > ??? TOSCA-Metadata > ? ??? TOSCA.meta > ??? UserData > ? ??? __init__.py > ? ??? lcm_user_data.py > > *Objective: * > To pass a file at a defined path on the VDU after the VDU is > instantiated/launched. > > -- > ~ Lokendra > skype: lokendrarathour > > > -- ~ Lokendra www.inertiaspeaks.com www.inertiagroups.com skype: lokendrarathour -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Mon Nov 1 09:03:08 2021 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 1 Nov 2021 09:03:08 +0000 Subject: Openstack ansible VS kolla ansible In-Reply-To: References: Message-ID: I would recommend kolla-ansible, but then I am biased. Ultimately. both are capable of giving you a production-ready OpenStack deployment, you just need to work out which is the best fit for you. Mark On Sat, 30 Oct 2021 at 21:10, A Monster wrote: > > Openstack-ansible uses LXC containers to deploy openstack services , while Kolla uses docker containers instead, which of these two deployment tools should I use for an Openstack deployment, and what are the differences between them. From amonster369 at gmail.com Mon Nov 1 09:29:58 2021 From: amonster369 at gmail.com (A Monster) Date: Mon, 1 Nov 2021 10:29:58 +0100 Subject: Using ceph for openstack storage Message-ID: Thank you for your response. Sadly, I'm talking about actual production, but I'm very limited in terms of hardware. I was thinking about using RAID for controller node as data redundancy, because I had the idea of maximizing the number of nova compute nodes, So basically i thought off using a controller with the following services ( Nova, Neutron, Keystone, Horizon, Glance, Cinder and Swift). Following the configuration you suggested, I would have : - 3 Controllers that are also Ceph Monitors - 9 Nova compute nodes and Ceph OSDs My questions are : - having multiple Ceph monitors is for the sake of redundancy or does it have a performance goal ? - Combining Ceph OSD and Nova compute wouldn't have performance drawbacks or damage the integrity of data stored in each node. 
wouldn't it be better in this case to use two separate servers for swift
and glance and use RAID for data redundancy instead of using Ceph SDS.

Thank you very much for your time.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From zigo at debian.org  Mon Nov  1 12:12:48 2021
From: zigo at debian.org (Thomas Goirand)
Date: Mon, 1 Nov 2021 13:12:48 +0100
Subject: Using ceph for openstack storage
In-Reply-To:
References:
Message-ID: <73528885-aa0b-da12-1e46-4c45a094453e@debian.org>

On 11/1/21 10:29 AM, A Monster wrote:
> Thank you for your response.
>
> Sadly, I'm talking about actual production, but I'm very limited in
> terms of hardware.
>
> I was thinking about using RAID for the controller node as data redundancy,
> because I had the idea of maximizing the number of nova compute nodes,
> So basically I thought of using a controller with the following
> services (Nova, Neutron, Keystone, Horizon, Glance, Cinder and Swift).
> Following the configuration you suggested, I would have :
> - 3 Controllers that are also Ceph Monitors
> - 9 Nova compute nodes and Ceph OSDs
>
> My questions are :
> - having multiple Ceph monitors is for the sake of redundancy or does it
> have a performance goal ?
> - Combining Ceph OSD and Nova compute wouldn't have performance
> drawbacks or damage the integrity of data stored in each node.
>
> wouldn't it be better in this case to use two separate servers for swift
> and glance and use RAID for data redundancy instead of using Ceph SDS.
>
> Thank you very much for your time.
>

Hi,

RAID will protect you from only a single type of failure on your
controllers. RAID is *not* a good idea at all for Ceph or Swift (it will
slow things down, and won't help much with redundancy).

If you need, for example, to upgrade the operating system (for example
because of a kernel security fix), you will have to restart your
controllers, meaning there's going to be API downtime. If you set the
Ceph Mon on the controllers, then you will have the issue with Ceph Mon
not being reachable during the upgrade, meaning you may end up with
stuck I/O on all of your VMs.

Of course, combining Ceph OSD and Nova compute is less nice than having
a dedicated cluster (especially: busy VMs may slow down your Ceph and
increase latency). But considering your constraints, it's still better:
for any serious Ceph setup, you need to be able to "lose" at least 10%
of your Ceph cluster so it can recover without impacting your overall
cluster too much.

The same way, I would suggest running at least the swift-object service
on your compute nodes: it's common to have Swift account + containers
on SSD, to speed it up. It's ok-ish to run account+container on your 3
controllers, IMO.

Again, the piece of advice I'm giving is only valid because of your
constraints, otherwise I would suggest a larger cluster.

I hope this helps,
Cheers,

Thomas Goirand (zigo)

From mihalis68 at gmail.com  Mon Nov  1 12:45:57 2021
From: mihalis68 at gmail.com (Chris Morgan)
Date: Mon, 1 Nov 2021 08:45:57 -0400
Subject: Using ceph for openstack storage
In-Reply-To:
References:
Message-ID: <996D3999-85B9-4244-AB07-AF727CDF5DEC@gmail.com>

VMs and OSDs on the same node ("hyperconverged") is not a good idea in our
experience. We used to run that way but moved to splitting nodes into either
compute or storage. One of our older hyperconverged clusters OOM killed a VM
only last week because ceph used up more memory than when the VM was
scheduled.

You also have different procedures to make a node safe for the compute role
than for the storage role. It's tedious to have to worry about both when
needing to take a node down for maintenance.

Chris Morgan

Sent from my iPhone

> On Nov 1, 2021, at 5:36 AM, A Monster wrote:
>
> Thank you for your response.
>
> Sadly, I'm talking about actual production, but I'm very limited in terms of hardware.
>
> I was thinking about using RAID for the controller node as data redundancy, because I had the idea of maximizing the number of nova compute nodes,
> So basically I thought of using a controller with the following services (Nova, Neutron, Keystone, Horizon, Glance, Cinder and Swift).
> Following the configuration you suggested, I would have :
> - 3 Controllers that are also Ceph Monitors
> - 9 Nova compute nodes and Ceph OSDs
>
> My questions are :
> - having multiple Ceph monitors is for the sake of redundancy or does it have a performance goal ?
> - Combining Ceph OSD and Nova compute wouldn't have performance drawbacks or damage the integrity of data stored in each node.
>
> wouldn't it be better in this case to use two separate servers for swift and glance and use RAID for data redundancy instead of using Ceph SDS.
>
> Thank you very much for your time.
>

From satish.txt at gmail.com  Mon Nov  1 13:11:34 2021
From: satish.txt at gmail.com (Satish Patel)
Date: Mon, 1 Nov 2021 09:11:34 -0400
Subject: is OVS+DPDK useful for general purpose workload
In-Reply-To:
References:
Message-ID:

Thank you Laurent,

Now I fully agree with you that running DPDK only on the host doesn't gain
anything, because the VM guest will be the bottleneck. But most of the
documentation keeps saying you will gain performance, without clarifying what
you said: that it's not for every workload, only for DPDK-based VMs. (I wish
there were a general-purpose virtio-based PMD suitable for all kinds of
workloads.)

The only solutions left are XDP and SR-IOV (SR-IOV is complicated to deploy
because it doesn't support bonding).

On Sun, Oct 31, 2021 at 5:37 PM Laurent Dumont wrote:
>
> Most of the implementations I have seen for OVS-DPDK mean that the VM side would also use DPDK.
>
> Because even from a DPDK perspective at the compute level, the VM will become the bottleneck. 200k PPS with OVS-DPDK + non-DPDK VM is about what you get with OVS + OVSfirewall + non-DPDK VM.
>
> On Sun, Oct 31, 2021 at 12:21 AM Satish Patel wrote:
>>
>> Folks,
>>
>> I have deployed openstack and configured OVS-DPDK on compute nodes for
>> high performance networking. My workload is general purpose workload
>> like running haproxy, mysql, apache and XMPP etc.
>>
>> When I did load testing I found performance was average and after
>> 200kpps packet rate I noticed packet drops. I heard and read that DPDK
>> can handle millions of packets but in my case it's not true. I am using
>> virtio-net in guest vm which processes packets in the kernel so I
>> believe my bottleneck is my guest VM.
>>
>> I don't have any guest based DPDK applications like testpmd etc. does
>> that mean OVS+DPDK isn't useful for my cloud? How do I take advantage
>> of OVS+DPDK with general purpose workload?
>> >> Maybe I have the wrong understanding about DPDK so please help me :) >> >> Thanks >> ~S >> From jonathan.rosser at rd.bbc.co.uk Mon Nov 1 13:46:38 2021 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Mon, 1 Nov 2021 13:46:38 +0000 Subject: Openstack ansible VS kolla ansible In-Reply-To: References: Message-ID: <9f39dc37-6a15-c7ad-acd7-e4e4c4b18da3@rd.bbc.co.uk> On 30/10/2021 21:05, A Monster wrote: > Openstack-ansible uses LXC containers to deploy openstack services You can also do a deployment with no use of container technologies with openstack-ansible if you wish.? Enough people have asked for that so it's a supported deployment model. Pick the tool that suits you, check out the documentation and do a proof-of-concept. They achieve the same thing by different means. Jonathan. From alex.kavanagh at canonical.com Mon Nov 1 14:13:32 2021 From: alex.kavanagh at canonical.com (Alex Kavanagh) Date: Mon, 1 Nov 2021 14:13:32 +0000 Subject: [charms] Yoga PTG Summary Message-ID: Hi All Thanks to all who participated in the PTG discussions. We had some good sessions where we tackled thorny issues in the charms deployments of OpenStack. A brief set of highlights is: - The store holding all the charms (https://jaas.ai/) just happens to be migrating to a new home (https://charmhub.io/) within the timescales of the Yoga cycle and some of the changes in behaviour provide some challenges and also opportunities to both simplify the charms and provide a better operator experience. The discussions highlighted the pain points and the general direction that the migration could take around: - Charm upgrades - OpenStack upgrades - Series upgrades of the underlying operating system. - The charms deployment of OpenStack supports versions back to mitaka (pre yoga) and queens (yoga cycle onwards). This has recently presented challenges around Python 3 support, particularly with py35 being EOL September 2020, and with py36 (bionic LTS) EOL 23rd December 2021. We discussed various strategies and approaches around maintenance, security and support of queens onwards in terms of support and upgrades. - The charms team also maintains a CI infrastructure based on Jenkins and (more recently) self-hosted Zuul to test installation and upgrade of OpenStack solutions. We discussed the further migrations of services from Jenkins to Zuul and issues associated with it. - Discussions around building new charms and refreshing old charms for the new components and features in the previous xena cycle and oncoming yoga cycle. Our raw notes are here: https://etherpad.opendev.org/p/charms-yoga-ptg Thank you again for all the participation in setting the direction and priorities for the yoga cycle. -- Alex Kavanagh - PTL Yoga -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Nov 1 16:04:06 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 01 Nov 2021 11:04:06 -0500 Subject: [all][tc] Continuing the RBAC PTG discussion In-Reply-To: <17cc3068765.11a9586ce162263.9148179786177865922@ghanshyammann.com> References: <17cc3068765.11a9586ce162263.9148179786177865922@ghanshyammann.com> Message-ID: <17cdc3e5964.e42274e1372366.1164114504252831335@ghanshyammann.com> ---- On Wed, 27 Oct 2021 13:32:37 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > As decided in PTG, we will continue the RBAC discussion from where we left in PTG. We will have a video > call next week based on the availability of most of the interested members. 
> > Please vote your available time in below doodle vote by Thursday (or Friday morning central time). > > - https://doodle.com/poll/6xicntb9tu657nz7 As per doodle voting, I have schedule it on Nov 3rd Wed, 15:00 - 16:00 UTC. Below is the link to join the call: https://meet.google.com/uue-adpp-xsm We will be taking notes in this etherpad https://etherpad.opendev.org/p/policy-popup-yoga-ptg -gmann > > NOTE: this is not specific to TC or people working in RBAC work but more to wider community to > get feedback and finalize the direction (like what we did in PTG session). > > Meanwhile, feel free to review the lance's updated proposal for community-wide goal > - https://review.opendev.org/c/openstack/governance/+/815158 > > -gmann > > From manchandavishal143 at gmail.com Mon Nov 1 16:12:49 2021 From: manchandavishal143 at gmail.com (vishal manchanda) Date: Mon, 1 Nov 2021 21:42:49 +0530 Subject: [horizon][dev]Handle multiple login sessions from same user in Horizon In-Reply-To: References: Message-ID: Hi Arthur, Thanks for adding a new blueprint, just approver it. Looking forward to the implementation patches. Feel free to reach out to me or the horizon team for any further queries on IRC (#openstack-horizon) at OFTC n/w. Regards, Vishal Manchanda On Sat, Oct 30, 2021 at 12:47 AM Luz de Avila, Arthur < Arthur.LuzdeAvila at windriver.com> wrote: > Hi everyone, > > In order to improve the system my colleagues and I would like to bring up > a new feature to Horizon. We found out that an user is able to login in > Horizon with the same credentials in multiple devices or/and browsers. This > may be not very secure as the user can login in many different devices > or/and browsers with the same credential. > > Thinking on that, we would like bring up more control to the admin of the > system in a way that the admin can enable or disable the multiple login > sessions according to the needs of the system. > > For a better follow up of this propose, a blueprint has been opened with > more details about the idea and concepts of this and we would like the > onion of the community whether this feature make sense to implement or not. > The blueprint opened on launchpad: > https://blueprints.launchpad.net/horizon/+spec/handle-multiple-login-sessions-from-same-user-in-horizon > > Kind regards, > Arthur Avila > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abraden at verisign.com Mon Nov 1 16:35:50 2021 From: abraden at verisign.com (Braden, Albert) Date: Mon, 1 Nov 2021 16:35:50 +0000 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> Message-ID: <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> Hi Adrian, I don't think I'm qualified to be PTL but I'm willing to help, and I've asked for permission. We aren't using Adjutant at this time because we're on Train and I learned at my last contract that running Adjutant on Train is a hassle, but I hope to start using it after we get to Ussuri. Has anyone else volunteered? -----Original Message----- From: Adrian Turjak Sent: Wednesday, October 27, 2021 1:41 AM To: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] Adjutant needs contributors (and a PTL) to survive! Caution: This email originated from outside the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe. Hello fellow OpenStackers! 
I'm moving on to a different opportunity and my new role will not involve OpenStack, and there sadly isn't anyone at Catalystcloud who will be able to take over project responsibilities for Adjutant any time soon (not that I've been very onto it lately). As such Adjutant needs people to take over, and lead it going forward. I believe the codebase is in a reasonably good position for others to pick up, and I plan to go through and document a few more of my ideas for where it should go in storyboard so some of those plans exist somewhere should people want to pick up from where I left off before going fairly silent upstream. Plus if people want/need to they can reach out to me or add me to code review and chances are I'll comment/review because I do care about the project. Or I may contract some time to it. There are a few clouds running Adjutant, and people who have previously expressed interest in using it, so if you still are, the project isn't in a bad place at all. The code is stable, and the last few major refactors have cleaned up much of my biggest pain points with it. Best of luck! - adriant From yasufum.o at gmail.com Mon Nov 1 16:50:05 2021 From: yasufum.o at gmail.com (yasufum) Date: Tue, 2 Nov 2021 01:50:05 +0900 Subject: [tacker][ptg] Yoga PTG summary Message-ID: <8ba67bb6-3455-e5fd-ee02-9b366cf4301a@gmail.com> Hi everyone, Thank you for participating in the Yoga PTG sessions. We discussed 17 items totally, including 13 specs, and agreed all to implement the proposed features. We also decided to set the spec freeze on 31th Dec. The etherpad of our PTG is at https://etherpad.opendev.org/p/tacker-yoga-ptg Here is a summary of the PTG sessions. *Day 1 1. Introduce for Tacker installer * To reduce uinteresting steps of setting up tacker, we introduce a dedicated installer for developers. * For beginners, we should minimize difficulties of deploying and make them focusing on their usecases. 2. Prometheus monitoring and AutoHeal support for Kubernetes Cluster VNF via FM Interface * Add fault management interface support applied with ETSI SOL003 standards including polling mode and notify mode. * Polling mode responds to resources that is named "Alarms" and "Individual alarm". On the other hand, notify mode responds to resources that is named "Subscriptions" and "Notification endpoint" 3. Support VNF update operations reflecting changes to VNF instances using MgmtDriver * Add update APIs for container VNF instances by using ConfigMap and Secret of k8s. 4. Support CentOS stream * Revise our current incomplete CentOS support for the latest 'stream'. * We will provide some functional tests for the update, but non-voting for a while. 5. Add VNF Package sample for practical use cases * Provide more practical examples of usecase of tacker for users because current usecases in documentation are simple and not enough so much for actual cases. * There are three examples supposed in the proposal, "use multiple deployment flavours", "deploy VNF connected to external network" and "deploy VNF as HA cluster". *Day 2 6. Support handling large query results by ETSI NFV * Introduce an extended feature for reducing a large amount of result for some queries defined in ETSI standards. * The idea is simply dividing the results in several pieces with paging. 7. Support FT of Multi tenant * There are two problems in multi-tenancy case, (1) No restriction in assigning a tenant,(2) Notifying events on different tenants. 
* The policy of assigning a tenant should be clarified and also Notification process under multi-tenancy should be changed. * We need to clear the policy of RBAC (Is it allowed to make admin user can access all resources?) 8. Sample Ansible Driver * There are no management drivers enable VNF Configuration using Ansible. So, add the scripts as samples and docs. * Revise the name of directories for considering conventioins in takcer repo is necessary. * This can be used for 'Add VNF Package sample for practical use cases' proposal. We wii support the sample development. 9. Support Robot Framework API * Currently Tacker functional tests mainly focus on checking various VNF patterns such as a simple VNF, multi VDU, volume attach and affinity set. Tacker community is advancing ETSI NFV standard compliance, and coverage of compliant API testing becomes important. * The proposal is to use Robot Framework to achieve automated API testing and in the first step, we adopt API test code released by ETSI NFV-TST010. 10. Add Tacker-horizon Unit Test Cases * `tacker-horizon` which provides very limited features, such as showing a list of registered VIMs, VNFDs or so. We don't use this feature so much without checking instances of Tacker quickly via intuitive Web GUI way. So, we have not maintained tacker-horizon so actively * One of the reasons why bugs in tacker-horizon have not fixed is that we have no unittests in. Although we can find a bug coincidentally while using the features, but should implement unittests because bugs in horizon are some contextual and not so easy to find by hand. *Day 3 11. Report for ETSI NFV API Conformance Test Results * Share the result of Remote NFV&MEC API Plugtest 2021, Totally 136 API Conformance test sessions were executed for [NFV-SOL003] were executed as well as for NFV-SOL005. * https://hackmd.io/q4DzQ6_2Q0e-TdmtBikVlQ?view 12. Reduce code clone from sol-kubernetes job of FT * There are much similar files can be reduced for functional tests. It makes maintenance more complicated and difficult. The goal is to reduce it to less than 20%. * In particular, since sol_kubernetes's code clone rate is up to 40% , we think it will be better to refactor to make future maintenance easier. * other details: https://hackmd.io/Wo8cBIH_RPe6ll1hNwmx1w?view 13. Reduce unnecessary test codes * It's similar as above item. We have very similar YAML files for definitions and useless template files for tests can be reduced. 14. Enhance NFV SOL_v3 LCM operation * Introduce the latest V2 APIs for LCM operation as below. * Scale VNF (POST /vnf_instances/{vnfInstanceId}/scale) * Heal VNF (POST /vnf_instances/{vnfInstanceId}/heal) * Modify VNF Information (PATCH /vnf_instances/{vnfInstanceId}) * Change External VNF Connectivity (POST /vnf_instances/{vnfInstanceId}/change_ext_conn) 15. Support ETSI NFV-SOL_v3 based error-handling operation * Introduce the latest V2 APIs for error handling operation as below. * Retry operation (POST /vnf_lcm_op_occs/{vnfLcmOpOccId}/retry) * Fail operation (POST /vnf_lcm_op_occs/{vnfLcmOpOccId}/fail) * Rollback operation (POST /vnf_lcm_op_occs/{vnfLcmOpOccId}/rollback) * Test scenario needs to include raising an error, becoming FAILD_TEMP, and executing the ErrorHandling API. * Timer adjusting and using inapporpriate vnf package can cause error. 16. Support ChangeCurrentVNFPackage * Add API for ChangeCurrentVNFPackage by which blue-green deployment and rolling update are supported. * Both of VIMs, OpenStack and Kubernetes, are covered. 17. 
Support heal and scale method in lcm_user_data * Enable to customize stack parameters for heal and scale operations in user script, `user_lcm_data.py` more specifically. * Call the proposed methods if it's existing in the script for the operations. Thanks, Yasufumi From manchandavishal143 at gmail.com Tue Nov 2 05:54:06 2021 From: manchandavishal143 at gmail.com (vishal manchanda) Date: Tue, 2 Nov 2021 11:24:06 +0530 Subject: [horizon] Skip tomorrow Weekly IRC Meeting Message-ID: Hi all, As discussed, during the last weekly meeting[1], there will be no horizon weekly meeting tomorrow. See you next week! Thanks & regards, Vishal Manchanda [1] https://meetings.opendev.org/meetings/horizon/2021/horizon.2021-10-27-15.00.log.html#l-93 -------------- next part -------------- An HTML attachment was scrubbed... URL: From oleg.bondarev at huawei.com Tue Nov 2 07:36:09 2021 From: oleg.bondarev at huawei.com (Oleg Bondarev) Date: Tue, 2 Nov 2021 07:36:09 +0000 Subject: [neutron] Bug Deputy Report Oct 25 - 31 Message-ID: <0de9ec1bc641455da86e3820cdc77fc1@huawei.com> Hi everyone, Please find Bug Deputy report for the week Oct 25 - 31st below. 1 Critical bug in stable/train already in progress; 1 High bug looks for an assignee. Several OVN bug needs some triage from OVN folks + assignees. Critical - https://bugs.launchpad.net/neutron/+bug/1948804 - [stable/train] neutron-tempest-plugin scenario jobs fail "sudo: guestmount: command not found" o In Progress: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/815518 o Assigned to bcafarel High - https://bugs.launchpad.net/neutron/+bug/1948832 - "_disable_ipv6_addressing_on_interface" can't find the interface o Confirmed o Unassigned Medium - https://bugs.launchpad.net/neutron/+bug/1948676 - rpc response timeout for agent report_state is not possible Edit o Fix released: https://review.opendev.org/c/openstack/neutron/+/815310 o Fixed by Tobias Urdin - https://bugs.launchpad.net/bugs/1948642 - Configuration of the ovs controller by neutron-ovs-agent isn't idempotent Edit o In progress: https://review.opendev.org/c/openstack/neutron/+/815255 o Assigned to slaweq - https://bugs.launchpad.net/neutron/+bug/1948891 - [ovn] Using ovsdb-client for MAC_Binding could theoretically block indefinitely o Confirmed o Unassigned - https://bugs.launchpad.net/neutron/+bug/1949059 - ovn-octavia-provider: incorrect router association in NB when network is linked to more than 1 router o Confirmed o Unassigned - https://bugs.launchpad.net/neutron/+bug/1949081 - [OVN] check_for_mcast_flood_reports maintenance task not accounting for localnet "mcast_flood" changes o Confirmed o Assigned to lucasgomes Undecided - https://bugs.launchpad.net/neutron/+bug/1949097 - Cloud-Init cannot contact Meta-Data-Service on Xena with OVN o New o Unassigned - https://bugs.launchpad.net/neutron/+bug/1949202 - ovn-controllers are listed as agents but cannot be disabled o New o Unassigned - https://bugs.launchpad.net/neutron/+bug/1949230 - OVN Octavia provider driver should implement allowed_cidrs to enforce security groups on LB ports o New o Unassigned Invalid - https://bugs.launchpad.net/neutron/+bug/1948656 - toggling explicitly_egress_direct from true to false does not clean flows Thanks, Oleg --- Advanced Software Technology Lab Huawei -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From syedammad83 at gmail.com Tue Nov 2 07:51:04 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 12:51:04 +0500 Subject: [neutron][ovn] Stateless Security Group Message-ID: Hi, I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. I am trying to create stateless security group. But its getting failed with below error message. # openstack security group create --stateless sec02-stateless Error while executing command: BadRequestException: 400, Unrecognized attribute(s) 'stateful' I see below logs in neutron server logs. 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted ('172.16.40.45', 41272) server /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] Request body: {'security_group': {'name': 'sec02-stateless', 'stateful': False, 'description': 'sec02-stateless'}} prepare_request_body /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] An exception happened while processing the request body. The exception message is [Unrecognized attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized attribute(s) 'stateful' 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] create failed (client error): Unrecognized attribute(s) 'stateful' 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 Any advice on how to fix it ? Ammad -------------- next part -------------- An HTML attachment was scrubbed... URL: From iurygregory at gmail.com Tue Nov 2 07:51:46 2021 From: iurygregory at gmail.com (Iury Gregory) Date: Tue, 2 Nov 2021 08:51:46 +0100 Subject: =?UTF-8?Q?=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= Message-ID: Hello everyone! I would like to propose Aija Jaunt?va (irc: ajya) to be added to the sushy-core group. Aija has been in the ironic community for a long time, she has a lot of knowledge about redfish and is always providing good reviews. ironic-cores please vote with +/- 1. -- *Att[]'sIury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the ironic-core and puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 2 08:03:53 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 09:03:53 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: Message-ID: <4691610.31r3eYUQgx@p1> Hi, On wtorek, 2 listopada 2021 08:51:04 CET Ammad Syed wrote: > Hi, > > I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. I > am trying to create stateless security group. But its getting failed with > below error message. 
> > # openstack security group create --stateless sec02-stateless > Error while executing command: BadRequestException: 400, Unrecognized > attribute(s) 'stateful' > > I see below logs in neutron server logs. > > 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > ('172.16.40.45', 41272) server > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] Request body: > {'security_group': {'name': 'sec02-stateless', 'stateful': False, > 'description': 'sec02-stateless'}} prepare_request_body > /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] An exception happened > while processing the request body. The exception message is [Unrecognized > attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > attribute(s) 'stateful' > 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] create failed (client > error): Unrecognized attribute(s) 'stateful' > 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > Any advice on how to fix it ? > > Ammad Do You have 'stateful-security-group' API extension enabled? You can check it with command neutron ext-list If it's not loaded, You can check in the neutron-server logs while it wasn't loaded properly. -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From syedammad83 at gmail.com Tue Nov 2 08:09:14 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 13:09:14 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: <4691610.31r3eYUQgx@p1> References: <4691610.31r3eYUQgx@p1> Message-ID: Hi Slawek, I don't see any output with below command. neutron ext-list | grep stateful-security-group I have checked logs and found below in neutron-server.log. # grep stateful-security-group neutron-server.log 2021-11-02 13:02:20.846 998 DEBUG neutron.api.extensions [req-b022ced2-f365-4ab9-9f61-25e915231e02 - - - - -] Ext name="Stateful security group" alias="stateful-security-group" description="Indicates if the security group is stateful or not" updated="2019-11-26T09:00:00-00:00" _check_extension /usr/lib/python3/dist-packages/neutron/api/extensions.py:416 2021-11-02 13:02:20.846 998 INFO neutron.api.extensions [req-b022ced2-f365-4ab9-9f61-25e915231e02 - - - - -] Extension stateful-security-group not supported by any of loaded plugins Do I need to do any change in neutron server configuration? Ammad On Tue, Nov 2, 2021 at 1:04 PM Slawek Kaplonski wrote: > Hi, > > On wtorek, 2 listopada 2021 08:51:04 CET Ammad Syed wrote: > > Hi, > > > > I have upgraded my lab to latest xena release and ovn 21.09 and ovs > 2.16. I > > am trying to create stateless security group. 
But its getting failed with > > below error message. > > > > # openstack security group create --stateless sec02-stateless > > Error while executing command: BadRequestException: 400, Unrecognized > > attribute(s) 'stateful' > > > > I see below logs in neutron server logs. > > > > 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > > ('172.16.40.45', 41272) server > > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > 98687873a146418eaeeb54a01693669f - default default] Request body: > > {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > 'description': 'sec02-stateless'}} prepare_request_body > > /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > 98687873a146418eaeeb54a01693669f - default default] An exception happened > > while processing the request body. The exception message is [Unrecognized > > attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > attribute(s) 'stateful' > > 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > 98687873a146418eaeeb54a01693669f - default default] create failed (client > > error): Unrecognized attribute(s) 'stateful' > > 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > > /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > > > Any advice on how to fix it ? > > > > Ammad > > Do You have 'stateful-security-group' API extension enabled? You can check > it > with command > > neutron ext-list > > If it's not loaded, You can check in the neutron-server logs while it > wasn't > loaded properly. > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Tue Nov 2 08:25:27 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 13:25:27 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: <4691610.31r3eYUQgx@p1> Message-ID: I have below plugins loaded in neutron.conf core_plugin = ml2 service_plugins = ovn-router, qos, segments, port_forwarding and below extension drivers ml2_conf.ini mechanism_drivers = ovn extension_drivers = port_security, qos Ammad On Tue, Nov 2, 2021 at 1:09 PM Ammad Syed wrote: > Hi Slawek, > > I don't see any output with below command. > > neutron ext-list | grep stateful-security-group > > I have checked logs and found below in neutron-server.log. 
> > # grep stateful-security-group neutron-server.log > > 2021-11-02 13:02:20.846 998 DEBUG neutron.api.extensions > [req-b022ced2-f365-4ab9-9f61-25e915231e02 - - - - -] Ext name="Stateful > security group" alias="stateful-security-group" description="Indicates if > the security group is stateful or not" updated="2019-11-26T09:00:00-00:00" > _check_extension > /usr/lib/python3/dist-packages/neutron/api/extensions.py:416 > 2021-11-02 13:02:20.846 998 INFO neutron.api.extensions > [req-b022ced2-f365-4ab9-9f61-25e915231e02 - - - - -] Extension > stateful-security-group not supported by any of loaded plugins > > Do I need to do any change in neutron server configuration? > > Ammad > > On Tue, Nov 2, 2021 at 1:04 PM Slawek Kaplonski > wrote: > >> Hi, >> >> On wtorek, 2 listopada 2021 08:51:04 CET Ammad Syed wrote: >> > Hi, >> > >> > I have upgraded my lab to latest xena release and ovn 21.09 and ovs >> 2.16. I >> > am trying to create stateless security group. But its getting failed >> with >> > below error message. >> > >> > # openstack security group create --stateless sec02-stateless >> > Error while executing command: BadRequestException: 400, Unrecognized >> > attribute(s) 'stateful' >> > >> > I see below logs in neutron server logs. >> > >> > 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted >> > ('172.16.40.45', 41272) server >> > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 >> > 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base >> > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 >> 19844bf62a7b498eb443508ef150e9b8 >> > 98687873a146418eaeeb54a01693669f - default default] Request body: >> > {'security_group': {'name': 'sec02-stateless', 'stateful': False, >> > 'description': 'sec02-stateless'}} prepare_request_body >> > /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 >> > 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base >> > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 >> 19844bf62a7b498eb443508ef150e9b8 >> > 98687873a146418eaeeb54a01693669f - default default] An exception >> happened >> > while processing the request body. The exception message is >> [Unrecognized >> > attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized >> > attribute(s) 'stateful' >> > 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource >> > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 >> 19844bf62a7b498eb443508ef150e9b8 >> > 98687873a146418eaeeb54a01693669f - default default] create failed >> (client >> > error): Unrecognized attribute(s) 'stateful' >> > 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi >> > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 >> 19844bf62a7b498eb443508ef150e9b8 >> > 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST >> > /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 >> > >> > Any advice on how to fix it ? >> > >> > Ammad >> >> Do You have 'stateful-security-group' API extension enabled? You can >> check it >> with command >> >> neutron ext-list >> >> If it's not loaded, You can check in the neutron-server logs while it >> wasn't >> loaded properly. >> >> -- >> Slawek Kaplonski >> Principal Software Engineer >> Red Hat > > > > -- > Regards, > > > Syed Ammad Ali > -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From katonalala at gmail.com Tue Nov 2 08:25:44 2021 From: katonalala at gmail.com (Lajos Katona) Date: Tue, 2 Nov 2021 09:25:44 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: Message-ID: Hi, statefull security-groups are only available with iptables based drivers: https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/notes/stateful-security-group-04b2902ed9c44e4f.yaml For OVS and OVN we have open RFE, nut as I know at the moment nobody works on them: https://bugs.launchpad.net/neutron/+bug/1885261 https://bugs.launchpad.net/neutron/+bug/1885262 Regards Lajos Katona (lajoskatona) Ammad Syed ezt ?rta (id?pont: 2021. nov. 2., K, 9:00): > Hi, > > I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. > I am trying to create stateless security group. But its getting failed with > below error message. > > # openstack security group create --stateless sec02-stateless > Error while executing command: BadRequestException: 400, Unrecognized > attribute(s) 'stateful' > > I see below logs in neutron server logs. > > 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > ('172.16.40.45', 41272) server > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] Request body: > {'security_group': {'name': 'sec02-stateless', 'stateful': False, > 'description': 'sec02-stateless'}} prepare_request_body > /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] An exception happened > while processing the request body. The exception message is [Unrecognized > attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > attribute(s) 'stateful' > 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] create failed (client > error): Unrecognized attribute(s) 'stateful' > 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > Any advice on how to fix it ? > > Ammad > -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Tue Nov 2 08:29:13 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 13:29:13 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: Message-ID: Thanks Lajos, I was checking the release notes and found that stateless acl is supported by ovn in xena. https://docs.openstack.org/releasenotes/neutron/xena.html#:~:text=Support%20stateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B.%20The%20stateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80%9Callow-stateless%E2%80%9D%20OVN%20ACL%20verb . 
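
For reference, once a security group is created with stateful=False on such a
setup, the mapping can be checked on the OVN side by listing the ACLs of the
port group the OVN driver keeps for that security group. A minimal sketch
(assuming the driver's pg_<security group UUID, dashes replaced by
underscores> port group naming, run somewhere ovn-nbctl can reach the OVN
northbound DB):

# port group name below is an assumption about the driver's naming convention
ovn-nbctl acl-list pg_<security_group_uuid_with_underscores>

The rules of a stateless group should then show the allow-stateless action
instead of allow-related.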
Ammad On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona wrote: > Hi, > statefull security-groups are only available with iptables based drivers: > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/notes/stateful-security-group-04b2902ed9c44e4f.yaml > > For OVS and OVN we have open RFE, nut as I know at the moment nobody works > on them: > https://bugs.launchpad.net/neutron/+bug/1885261 > https://bugs.launchpad.net/neutron/+bug/1885262 > > Regards > Lajos Katona (lajoskatona) > > Ammad Syed ezt ?rta (id?pont: 2021. nov. 2., K, > 9:00): > >> Hi, >> >> I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. >> I am trying to create stateless security group. But its getting failed with >> below error message. >> >> # openstack security group create --stateless sec02-stateless >> Error while executing command: BadRequestException: 400, Unrecognized >> attribute(s) 'stateful' >> >> I see below logs in neutron server logs. >> >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted >> ('172.16.40.45', 41272) server >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 >> 98687873a146418eaeeb54a01693669f - default default] Request body: >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, >> 'description': 'sec02-stateless'}} prepare_request_body >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 >> 98687873a146418eaeeb54a01693669f - default default] An exception happened >> while processing the request body. The exception message is [Unrecognized >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized >> attribute(s) 'stateful' >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 >> 98687873a146418eaeeb54a01693669f - default default] create failed (client >> error): Unrecognized attribute(s) 'stateful' >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 >> >> Any advice on how to fix it ? >> >> Ammad >> > -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 2 08:50:43 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 09:50:43 +0100 Subject: [neutron] CI meeting - Tuesday 02.11.2021 Message-ID: <1800045.tdWV9SEqCh@p1> Hi, As we discussed during the PTG and in the last week's CI meeting, this week's meeting will be a video call. Please join https://meetpad.opendev.org/neutron-ci-meetings at 300pm UTC if You are interested in the Neutron CI. We will also keep meeting opened in the #openstack-neutron irc channel in case if anyone would like to participate that way. Agenda for the meeting is at https://etherpad.opendev.org/p/neutron-ci-meetings -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. 
URL: From elfosardo at gmail.com Tue Nov 2 08:51:54 2021 From: elfosardo at gmail.com (Riccardo Pittau) Date: Tue, 2 Nov 2021 09:51:54 +0100 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: +1 Aija has done a great job so far :) On Tue, Nov 2, 2021 at 9:00 AM Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jaunt?va (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > > > *Att[]'sIury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 2 08:54:07 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 09:54:07 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: Message-ID: <2800702.e9J7NaK4W3@p1> Hi, On wtorek, 2 listopada 2021 09:29:13 CET Ammad Syed wrote: > Thanks Lajos, > > I was checking the release notes and found that stateless acl is supported > by ovn in xena. > > https://docs.openstack.org/releasenotes/neutron/ xena.html#:~:text=Support%20st > ateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B. %20The%20st > ateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80%9C > allow-stateless%E2%80%9D%20OVN%20ACL%20verb . It should be supported by the OVN driver now IIRC. Maybe we forgot about adding this extension to the list: https://github.com/openstack/neutron/blob/ master/neutron/common/ovn/extensions.py#L93 Can You try to add it there and see if the extension will be loaded then? > > Ammad > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona wrote: > > Hi, > > statefull security-groups are only available with iptables based drivers: > > > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/ note > > s/stateful-security-group-04b2902ed9c44e4f.yaml > > > > For OVS and OVN we have open RFE, nut as I know at the moment nobody works > > on them: > > https://bugs.launchpad.net/neutron/+bug/1885261 > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > Regards > > Lajos Katona (lajoskatona) > > > > Ammad Syed ezt ?rta (id?pont: 2021. nov. 2., K, > > > > 9:00): > >> Hi, > >> > >> I have upgraded my lab to latest xena release and ovn 21.09 and ovs 2.16. > >> I am trying to create stateless security group. But its getting failed with > >> below error message. > >> > >> # openstack security group create --stateless sec02-stateless > >> Error while executing command: BadRequestException: 400, Unrecognized > >> attribute(s) 'stateful' > >> > >> I see below logs in neutron server logs. 
> >> > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > >> ('172.16.40.45', 41272) server > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > >> 'description': 'sec02-stateless'}} prepare_request_body > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > >> 98687873a146418eaeeb54a01693669f - default default] An exception happened > >> while processing the request body. The exception message is [Unrecognized > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > >> attribute(s) 'stateful' > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > >> 98687873a146418eaeeb54a01693669f - default default] create failed (client > >> error): Unrecognized attribute(s) 'stateful' > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 19844bf62a7b498eb443508ef150e9b8 > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > >> > >> Any advice on how to fix it ? > >> > >> Ammad -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From dtantsur at redhat.com Tue Nov 2 08:58:47 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 2 Nov 2021 09:58:47 +0100 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: On Tue, Nov 2, 2021 at 8:58 AM Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jaunt?va (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > Wait, why just 1? I have the whole +2! :) Aija has been extremely helpful in all things around Redfish, I feel absolutely confident in trusting her the core rights. Welcome! Dmitry > > -- > > > *Att[]'sIury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Tue Nov 2 09:04:40 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 14:04:40 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: <2800702.e9J7NaK4W3@p1> References: <2800702.e9J7NaK4W3@p1> Message-ID: Hi Slawek, Yes, after adding extension, SG created with stateful=false. 
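
For context, "adding extension" here means appending the
'stateful-security-group' alias to the supported-extensions list in the
neutron/common/ovn/extensions.py file linked above and restarting
neutron-server. A quick sanity check after such an edit (the
ML2_SUPPORTED_API_EXTENSIONS constant name is an assumption based on that
file, adjust if your version differs):

# assumption: module path and list constant match the linked extensions.py
python3 -c "from neutron.common.ovn import extensions; print('stateful-security-group' in extensions.ML2_SUPPORTED_API_EXTENSIONS)"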
# neutron ext-list | grep stateful-security-group neutron CLI is deprecated and will be removed in the Z cycle. Use openstack CLI instead. | stateful-security-group | Stateful security group # openstack security group create --stateless sec02-stateless +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Field | Value | +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | created_at | 2021-11-02T09:02:42Z | | description | sec02-stateless | | id | 29c28678-9a03-496c-8157-4afbcdc8f2af | | name | sec02-stateless | | project_id | 98687873a146418eaeeb54a01693669f | | revision_number | 1 | | rules | created_at='2021-11-02T09:02:42Z', direction='egress', ethertype='IPv6', id='17079c04-dc1d-4fbd-9f15-e79c6e585932', standard_attr_id='2863', updated_at='2021-11-02T09:02:42Z' | | | created_at='2021-11-02T09:02:42Z', direction='egress', ethertype='IPv4', id='fadfbf09-f759-453d-b493-e6f73077113a', standard_attr_id='2860', updated_at='2021-11-02T09:02:42Z' | | stateful | False | | tags | [] | | updated_at | 2021-11-02T09:02:42Z | +-----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Let me test this feature further. Ammad On Tue, Nov 2, 2021 at 1:54 PM Slawek Kaplonski wrote: > Hi, > > On wtorek, 2 listopada 2021 09:29:13 CET Ammad Syed wrote: > > Thanks Lajos, > > > > I was checking the release notes and found that stateless acl is > supported > > by ovn in xena. > > > > https://docs.openstack.org/releasenotes/neutron/ > xena.html#:~:text=Support%20st > > ateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B. > %20The%20st > > > > ateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80%9C > > allow-stateless%E2%80%9D%20OVN%20ACL%20verb . > > It should be supported by the OVN driver now IIRC. Maybe we forgot about > adding this extension to the list: > https://github.com/openstack/neutron/blob/ > master/neutron/common/ovn/extensions.py#L93 > > Can You try to add it there and see if the extension will be loaded then? > > > > > Ammad > > > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona > wrote: > > > Hi, > > > statefull security-groups are only available with iptables based > drivers: > > > > > > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/ > note > > > s/stateful-security-group-04b2902ed9c44e4f.yaml > > > > > > For OVS and OVN we have open RFE, nut as I know at the moment nobody > works > > > on them: > > > https://bugs.launchpad.net/neutron/+bug/1885261 > > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > > > Regards > > > Lajos Katona (lajoskatona) > > > > > > Ammad Syed ezt ?rta (id?pont: 2021. nov. 2., > K, > > > > > > 9:00): > > >> Hi, > > >> > > >> I have upgraded my lab to latest xena release and ovn 21.09 and ovs > 2.16. > > >> I am trying to create stateless security group. But its getting > failed > with > > >> below error message. > > >> > > >> # openstack security group create --stateless sec02-stateless > > >> Error while executing command: BadRequestException: 400, Unrecognized > > >> attribute(s) 'stateful' > > >> > > >> I see below logs in neutron server logs. 
> > >> > > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > > >> ('172.16.40.45', 41272) server > > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > >> 'description': 'sec02-stateless'}} prepare_request_body > > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > >> 98687873a146418eaeeb54a01693669f - default default] An exception > happened > > >> while processing the request body. The exception message is > [Unrecognized > > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > >> attribute(s) 'stateful' > > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > >> 98687873a146418eaeeb54a01693669f - default default] create failed > (client > > >> error): Unrecognized attribute(s) 'stateful' > > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > 19844bf62a7b498eb443508ef150e9b8 > > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > >> > > >> Any advice on how to fix it ? > > >> > > >> Ammad > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 2 09:45:38 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 10:45:38 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: <2800702.e9J7NaK4W3@p1> Message-ID: <2724584.mvXUDI8C0e@p1> Hi, On wtorek, 2 listopada 2021 10:04:40 CET Ammad Syed wrote: > Hi Slawek, > > Yes, after adding extension, SG created with stateful=false. That's good. Can You report an Launchpad bug for that? And You can also propose that change as fix for that bug too :) > > # neutron ext-list | grep stateful-security-group > neutron CLI is deprecated and will be removed in the Z cycle. Use openstack > CLI instead. 
> > | stateful-security-group | Stateful security group > > # openstack security group create --stateless sec02-stateless > +----------------- +----------------------------------------------------------- > ------------------------------------------------------------------------------ > ---------------------------------------+ > | Field | Value > > +----------------- +----------------------------------------------------------- > ------------------------------------------------------------------------------ > ---------------------------------------+ > | created_at | 2021-11-02T09:02:42Z > | > | > | description | sec02-stateless > | > | > | id | 29c28678-9a03-496c-8157-4afbcdc8f2af > | > | > | name | sec02-stateless > | > | > | project_id | 98687873a146418eaeeb54a01693669f > | > | > | revision_number | 1 > | > | > | rules | created_at='2021-11-02T09:02:42Z', direction='egress', > > ethertype='IPv6', id='17079c04-dc1d-4fbd-9f15-e79c6e585932', > standard_attr_id='2863', updated_at='2021-11-02T09:02:42Z' | > > | | created_at='2021-11-02T09:02:42Z', direction='egress', > > ethertype='IPv4', id='fadfbf09-f759-453d-b493-e6f73077113a', > standard_attr_id='2860', updated_at='2021-11-02T09:02:42Z' | > > | stateful | False > | > | > | tags | [] > | > | > | updated_at | 2021-11-02T09:02:42Z > > +----------------- +----------------------------------------------------------- > ------------------------------------------------------------------------------ > ---------------------------------------+ > > Let me test this feature further. > > Ammad > > On Tue, Nov 2, 2021 at 1:54 PM Slawek Kaplonski wrote: > > Hi, > > > > On wtorek, 2 listopada 2021 09:29:13 CET Ammad Syed wrote: > > > Thanks Lajos, > > > > > > I was checking the release notes and found that stateless acl is > > > > supported > > > > > by ovn in xena. > > > > > > https://docs.openstack.org/releasenotes/neutron/ > > > > xena.html#:~:text=Support%20st > > > > > ateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B. > > > > %20The%20st > > > > > > ateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80% > > 9C> > > > allow-stateless%E2%80%9D%20OVN%20ACL%20verb . > > > > It should be supported by the OVN driver now IIRC. Maybe we forgot about > > adding this extension to the list: > > https://github.com/openstack/neutron/blob/ > > master/neutron/common/ovn/extensions.py#L93 > > > ons.py#L93> Can You try to add it there and see if the extension will be > > loaded then?> > > > Ammad > > > > > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona > > > > wrote: > > > > Hi, > > > > statefull security-groups are only available with iptables based > > > > drivers: > > > > > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/ > > note > > > > > > s/stateful-security-group-04b2902ed9c44e4f.yaml > > > > > > > > For OVS and OVN we have open RFE, nut as I know at the moment nobody > > > > works > > > > > > on them: > > > > https://bugs.launchpad.net/neutron/+bug/1885261 > > > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > > > > > Regards > > > > Lajos Katona (lajoskatona) > > > > > > > > Ammad Syed ezt ?rta (id?pont: 2021. nov. 2., > > > > K, > > > > > > 9:00): > > > >> Hi, > > > >> > > > >> I have upgraded my lab to latest xena release and ovn 21.09 and ovs > > > > 2.16. > > > > > >> I am trying to create stateless security group. But its getting > > > > failed > > with > > > > > >> below error message. 
> > > >> > > > >> # openstack security group create --stateless sec02-stateless > > > >> Error while executing command: BadRequestException: 400, Unrecognized > > > >> attribute(s) 'stateful' > > > >> > > > >> I see below logs in neutron server logs. > > > >> > > > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) accepted > > > >> ('172.16.40.45', 41272) server > > > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > > > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > > >> 'description': 'sec02-stateless'}} prepare_request_body > > > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > >> 98687873a146418eaeeb54a01693669f - default default] An exception > > > > happened > > > > > >> while processing the request body. The exception message is > > > > [Unrecognized > > > > > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > > >> attribute(s) 'stateful' > > > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > >> 98687873a146418eaeeb54a01693669f - default default] create failed > > > > (client > > > > > >> error): Unrecognized attribute(s) 'stateful' > > > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 "POST > > > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: 0.2455938 > > > >> > > > >> Any advice on how to fix it ? > > > >> > > > >> Ammad > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From swogatpradhan22 at gmail.com Tue Nov 2 09:59:27 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Tue, 2 Nov 2021 15:29:27 +0530 Subject: [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria Message-ID: Hi, I have 2 openstack setups Setup1. Openstack queens + ceph nautilaus Setup2. Openstack victoria + ceph octopus So, I am trying to migrate some VM's (windows and linux) from Setup1 to Setup2. For migrating the VM's i am using rbd export on setup1 and then rbd import on setup2. I have successfully migrated 21 VM's. i am now facing an issue in the 22nd vm which, after migrating the VM the vm is stuck in the windows logo screen and not moving forward, and i can't seem to understand how to approach it. Attached the instance.xml files of both queens and victoria setup's of the same VM. With regards, Swogat pradhan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: queens Type: application/octet-stream Size: 5191 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: victoria Type: application/octet-stream Size: 5320 bytes Desc: not available URL: From swogatpradhan22 at gmail.com Tue Nov 2 10:01:38 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Tue, 2 Nov 2021 15:31:38 +0530 Subject: [Update] [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria In-Reply-To: References: Message-ID: Hi, I have even tried uploading the Primary drive to glance in queens setup and launching VM using the said glance image in the Victoria setup, but still I am facing the same problem. On Tue, Nov 2, 2021 at 3:29 PM Swogat Pradhan wrote: > Hi, > I have 2 openstack setups > Setup1. Openstack queens + ceph nautilaus > Setup2. Openstack victoria + ceph octopus > > So, I am trying to migrate some VM's (windows and linux) from Setup1 to > Setup2. > For migrating the VM's i am using rbd export on setup1 and then rbd import > on setup2. > > I have successfully migrated 21 VM's. > > i am now facing an issue in the 22nd vm which, after migrating the VM the > vm is stuck in the windows logo screen and not moving forward, and i can't > seem to understand how to approach it. > > > Attached the instance.xml files of both queens and victoria setup's of the > same VM. > > With regards, > Swogat pradhan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arne.wiebalck at cern.ch Tue Nov 2 10:16:57 2021 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Tue, 2 Nov 2021 11:16:57 +0100 Subject: =?UTF-8?Q?Re=3a_=5bironic=5d_Proposing_Aija_Jaunt=c4=93va_for_sushy?= =?UTF-8?Q?-core?= In-Reply-To: References: Message-ID: <2c67fc97-fabf-6603-d9fe-96a53297519f@cern.ch> +1 Great job, Aija! On 02.11.21 08:51, Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jaunt?va (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > /Att[]'s > Iury Gregory Melo Ferreira > //MSc in Computer Science at UFCG > / > /Part of the ironic-core and puppet-manager-core team in OpenStack/ > //Software Engineer at Red Hat Czech// > /Social/:https://www.linkedin.com/in/iurygregory > > /E-mail: iurygregory at gmail.com / From syedammad83 at gmail.com Tue Nov 2 10:24:20 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 2 Nov 2021 15:24:20 +0500 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: <2724584.mvXUDI8C0e@p1> References: <2800702.e9J7NaK4W3@p1> <2724584.mvXUDI8C0e@p1> Message-ID: Hi, I have reported the bug but not sure how to propose that change. Any guide to propose change would be highly appreciated. https://bugs.launchpad.net/neutron/+bug/1949451 On Tue, Nov 2, 2021 at 2:45 PM Slawek Kaplonski wrote: > Hi, > > On wtorek, 2 listopada 2021 10:04:40 CET Ammad Syed wrote: > > Hi Slawek, > > > > Yes, after adding extension, SG created with stateful=false. > > That's good. Can You report an Launchpad bug for that? And You can also > propose that change as fix for that bug too :) > > > > > # neutron ext-list | grep stateful-security-group > > neutron CLI is deprecated and will be removed in the Z cycle. Use > openstack > > CLI instead. 
> > > > | stateful-security-group | Stateful security group > > > > # openstack security group create --stateless sec02-stateless > > +----------------- > +----------------------------------------------------------- > > > > ------------------------------------------------------------------------------ > > ---------------------------------------+ > > | Field | Value > > > > +----------------- > +----------------------------------------------------------- > > > > ------------------------------------------------------------------------------ > > ---------------------------------------+ > > | created_at | 2021-11-02T09:02:42Z > > | > > | > > | description | sec02-stateless > > | > > | > > | id | 29c28678-9a03-496c-8157-4afbcdc8f2af > > | > > | > > | name | sec02-stateless > > | > > | > > | project_id | 98687873a146418eaeeb54a01693669f > > | > > | > > | revision_number | 1 > > | > > | > > | rules | created_at='2021-11-02T09:02:42Z', > direction='egress', > > > > ethertype='IPv6', id='17079c04-dc1d-4fbd-9f15-e79c6e585932', > > standard_attr_id='2863', updated_at='2021-11-02T09:02:42Z' | > > > > | | created_at='2021-11-02T09:02:42Z', > direction='egress', > > > > ethertype='IPv4', id='fadfbf09-f759-453d-b493-e6f73077113a', > > standard_attr_id='2860', updated_at='2021-11-02T09:02:42Z' | > > > > | stateful | False > > | > > | > > | tags | [] > > | > > | > > | updated_at | 2021-11-02T09:02:42Z > > > > +----------------- > +----------------------------------------------------------- > > > > ------------------------------------------------------------------------------ > > ---------------------------------------+ > > > > Let me test this feature further. > > > > Ammad > > > > On Tue, Nov 2, 2021 at 1:54 PM Slawek Kaplonski > wrote: > > > Hi, > > > > > > On wtorek, 2 listopada 2021 09:29:13 CET Ammad Syed wrote: > > > > Thanks Lajos, > > > > > > > > I was checking the release notes and found that stateless acl is > > > > > > supported > > > > > > > by ovn in xena. > > > > > > > > https://docs.openstack.org/releasenotes/neutron/ > > > > > > xena.html#:~:text=Support%20st > > > > > > > ateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B. > > > > > > %20The%20st > > > > > > > > > > > ateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80% > > > 9C> > > > > allow-stateless%E2%80%9D%20OVN%20ACL%20verb . > > > > > > It should be supported by the OVN driver now IIRC. Maybe we forgot > about > > > adding this extension to the list: > > > https://github.com/openstack/neutron/blob/ > > > master/neutron/common/ovn/extensions.py#L93 > > > extensi > > > ons.py#L93> Can You try to add it there and see if the extension will > be > > > loaded then?> > > > > Ammad > > > > > > > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona > > > > > > wrote: > > > > > Hi, > > > > > statefull security-groups are only available with iptables based > > > > > > drivers: > > > > > > > > > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/ > > > note > > > > > > > > s/stateful-security-group-04b2902ed9c44e4f.yaml > > > > > > > > > > For OVS and OVN we have open RFE, nut as I know at the moment > nobody > > > > > > works > > > > > > > > on them: > > > > > https://bugs.launchpad.net/neutron/+bug/1885261 > > > > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > > > > > > > Regards > > > > > Lajos Katona (lajoskatona) > > > > > > > > > > Ammad Syed ezt ?rta (id?pont: 2021. nov. 
> 2., > > > > > > K, > > > > > > > > 9:00): > > > > >> Hi, > > > > >> > > > > >> I have upgraded my lab to latest xena release and ovn 21.09 and > ovs > > > > > > 2.16. > > > > > > > >> I am trying to create stateless security group. But its getting > > > > > > failed > > > with > > > > > > > >> below error message. > > > > >> > > > > >> # openstack security group create --stateless sec02-stateless > > > > >> Error while executing command: BadRequestException: 400, > Unrecognized > > > > >> attribute(s) 'stateful' > > > > >> > > > > >> I see below logs in neutron server logs. > > > > >> > > > > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) > accepted > > > > >> ('172.16.40.45', 41272) server > > > > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > > > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > > > > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > > > >> 'description': 'sec02-stateless'}} prepare_request_body > > > > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > > > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] An exception > > > > > > happened > > > > > > > >> while processing the request body. The exception message is > > > > > > [Unrecognized > > > > > > > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > > > >> attribute(s) 'stateful' > > > > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] create failed > > > > > > (client > > > > > > > >> error): Unrecognized attribute(s) 'stateful' > > > > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 > "POST > > > > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: > 0.2455938 > > > > >> > > > > >> Any advice on how to fix it ? > > > > >> > > > > >> Ammad > > > > > > -- > > > Slawek Kaplonski > > > Principal Software Engineer > > > Red Hat > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 2 11:14:26 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 02 Nov 2021 12:14:26 +0100 Subject: [neutron][ovn] Stateless Security Group In-Reply-To: References: <2724584.mvXUDI8C0e@p1> Message-ID: <2836220.mvXUDI8C0e@p1> Hi, On wtorek, 2 listopada 2021 11:24:20 CET Ammad Syed wrote: > Hi, > > I have reported the bug but not sure how to propose that change. Any guide > to propose change would be highly appreciated. Please go through https://docs.openstack.org/contributors/code-and-documentation/quick-start.html as it should be good start :) If You will have any questions, You can reach out to me on IRC. I'm slaweq there and You can catch me on the #openstack-neutron channel. 
> > https://bugs.launchpad.net/neutron/+bug/1949451 Thx > > On Tue, Nov 2, 2021 at 2:45 PM Slawek Kaplonski wrote: > > Hi, > > > > On wtorek, 2 listopada 2021 10:04:40 CET Ammad Syed wrote: > > > Hi Slawek, > > > > > > Yes, after adding extension, SG created with stateful=false. > > > > That's good. Can You report an Launchpad bug for that? And You can also > > propose that change as fix for that bug too :) > > > > > # neutron ext-list | grep stateful-security-group > > > neutron CLI is deprecated and will be removed in the Z cycle. Use > > > > openstack > > > > > CLI instead. > > > > > > | stateful-security-group | Stateful security group > > > > > > # openstack security group create --stateless sec02-stateless > > > +----------------- > > > > +----------------------------------------------------------- > > > > > > ---------------------------------------------------------------------------- > > --> > > > ---------------------------------------+ > > > > > > | Field | Value > > > > > > +----------------- > > > > +----------------------------------------------------------- > > > > > > ---------------------------------------------------------------------------- > > --> > > > ---------------------------------------+ > > > > > > | created_at | 2021-11-02T09:02:42Z > > > | > > > | > > > | description | sec02-stateless > > > | > > > | > > > | id | 29c28678-9a03-496c-8157-4afbcdc8f2af > > > | > > > | > > > | name | sec02-stateless > > > | > > > | > > > | project_id | 98687873a146418eaeeb54a01693669f > > > | > > > | > > > | revision_number | 1 > > > | > > > | > > > | rules | created_at='2021-11-02T09:02:42Z', > > > > direction='egress', > > > > > ethertype='IPv6', id='17079c04-dc1d-4fbd-9f15-e79c6e585932', > > > standard_attr_id='2863', updated_at='2021-11-02T09:02:42Z' | > > > > > > | | created_at='2021-11-02T09:02:42Z', > > > > direction='egress', > > > > > ethertype='IPv4', id='fadfbf09-f759-453d-b493-e6f73077113a', > > > standard_attr_id='2860', updated_at='2021-11-02T09:02:42Z' | > > > > > > | stateful | False > > > | > > > | > > > | tags | [] > > > | > > > | > > > | updated_at | 2021-11-02T09:02:42Z > > > > > > +----------------- > > > > +----------------------------------------------------------- > > > > > > ---------------------------------------------------------------------------- > > --> > > > ---------------------------------------+ > > > > > > Let me test this feature further. > > > > > > Ammad > > > > > > On Tue, Nov 2, 2021 at 1:54 PM Slawek Kaplonski > > > > wrote: > > > > Hi, > > > > > > > > On wtorek, 2 listopada 2021 09:29:13 CET Ammad Syed wrote: > > > > > Thanks Lajos, > > > > > > > > > > I was checking the release notes and found that stateless acl is > > > > > > > > supported > > > > > > > > > by ovn in xena. > > > > > > > > > > https://docs.openstack.org/releasenotes/neutron/ > > > > > > > > xena.html#:~:text=Support%20st > > > > > > > > > ateless%20security%20groups%20with%20the%20latest%20OVN%2021.06%2B. > > > > > > > > %20The%20st > > > > ateful%3DFalse%20security%20groups%20are%20mapped%20to%20the%20new%20%E2%80% > > > > > > 9C> > > > > > > > > > allow-stateless%E2%80%9D%20OVN%20ACL%20verb . > > > > > > > > It should be supported by the OVN driver now IIRC. 
Maybe we forgot > > > > about > > > > > > adding this extension to the list: > > > > https://github.com/openstack/neutron/blob/ > > > > master/neutron/common/ovn/extensions.py#L93 > > > > > > > extensi > > > > > > ons.py#L93> Can You try to add it there and see if the extension will > > > > be > > > > > > loaded then?> > > > > > > > > > Ammad > > > > > > > > > > On Tue, Nov 2, 2021 at 1:25 PM Lajos Katona > > > > > > > > wrote: > > > > > > Hi, > > > > > > statefull security-groups are only available with iptables based > > > > > > > > drivers: > > https://review.opendev.org/c/openstack/neutron/+/572767/53/releasenotes/ > > > > > > note > > > > > > > > > > s/stateful-security-group-04b2902ed9c44e4f.yaml > > > > > > > > > > > > For OVS and OVN we have open RFE, nut as I know at the moment > > > > nobody > > > > > > works > > > > > > > > > > on them: > > > > > > https://bugs.launchpad.net/neutron/+bug/1885261 > > > > > > https://bugs.launchpad.net/neutron/+bug/1885262 > > > > > > > > > > > > Regards > > > > > > Lajos Katona (lajoskatona) > > > > > > > > > > > > Ammad Syed ezt ?rta (id?pont: 2021. nov. > > > > 2., > > > > > > K, > > > > > > > > > > 9:00): > > > > > >> Hi, > > > > > >> > > > > > >> I have upgraded my lab to latest xena release and ovn 21.09 and > > > > ovs > > > > > > 2.16. > > > > > > > > > >> I am trying to create stateless security group. But its getting > > > > > > > > failed > > > > with > > > > > > > > > >> below error message. > > > > > >> > > > > > >> # openstack security group create --stateless sec02-stateless > > > > > >> Error while executing command: BadRequestException: 400, > > > > Unrecognized > > > > > > > >> attribute(s) 'stateful' > > > > > >> > > > > > >> I see below logs in neutron server logs. > > > > > >> > > > > > >> 2021-11-02 12:47:41.921 1346 DEBUG neutron.wsgi [-] (1346) > > > > accepted > > > > > > > >> ('172.16.40.45', 41272) server > > > > > >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > > > > >> 2021-11-02 12:47:42.166 1346 DEBUG neutron.api.v2.base > > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] Request body: > > > > > >> {'security_group': {'name': 'sec02-stateless', 'stateful': False, > > > > > >> 'description': 'sec02-stateless'}} prepare_request_body > > > > > >> /usr/lib/python3/dist-packages/neutron/api/v2/base.py:729 > > > > > >> 2021-11-02 12:47:42.167 1346 WARNING neutron.api.v2.base > > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] An exception > > > > > > > > happened > > > > > > > > > >> while processing the request body. 
The exception message is > > > > > > > > [Unrecognized > > > > > > > > > >> attribute(s) 'stateful'].: webob.exc.HTTPBadRequest: Unrecognized > > > > > >> attribute(s) 'stateful' > > > > > >> 2021-11-02 12:47:42.167 1346 INFO neutron.api.v2.resource > > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] create failed > > > > > > > > (client > > > > > > > > > >> error): Unrecognized attribute(s) 'stateful' > > > > > >> 2021-11-02 12:47:42.168 1346 INFO neutron.wsgi > > > > > >> [req-b6a37fff-090f-4754-9df7-6e4314ed9481 > > > > > > > > 19844bf62a7b498eb443508ef150e9b8 > > > > > > > > > >> 98687873a146418eaeeb54a01693669f - default default] 172.16.40.45 > > > > "POST > > > > > > > >> /v2.0/security-groups HTTP/1.1" status: 400 len: 317 time: > > 0.2455938 > > > > > > > >> Any advice on how to fix it ? > > > > > >> > > > > > >> Ammad > > > > > > > > -- > > > > Slawek Kaplonski > > > > Principal Software Engineer > > > > Red Hat > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From juliaashleykreger at gmail.com Tue Nov 2 13:00:24 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 2 Nov 2021 07:00:24 -0600 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: +2 from me! On Tue, Nov 2, 2021 at 1:57 AM Iury Gregory wrote: > > Hello everyone! > > I would like to propose Aija Jaunt?va (irc: ajya) to be added to the sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > Att[]'s > Iury Gregory Melo Ferreira > MSc in Computer Science at UFCG > Part of the ironic-core and puppet-manager-core team in OpenStack > Software Engineer at Red Hat Czech > Social: https://www.linkedin.com/in/iurygregory > E-mail: iurygregory at gmail.com From abraden at verisign.com Tue Nov 2 13:19:34 2021 From: abraden at verisign.com (Braden, Albert) Date: Tue, 2 Nov 2021 13:19:34 +0000 Subject: [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria In-Reply-To: References: Message-ID: I?d like to experiment with migrating VMs between clusters. Where can I find a document that describes the procedure? From: Swogat Pradhan Sent: Tuesday, November 2, 2021 5:59 AM To: OpenStack Discuss Subject: [EXTERNAL] [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria Caution: This email originated from outside the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe. Hi, I have 2 openstack setups Setup1. Openstack queens + ceph nautilaus Setup2. Openstack victoria + ceph octopus So, I am trying to migrate some VM's (windows and linux) from Setup1 to Setup2. For migrating the VM's i am using rbd export on setup1 and then rbd import on setup2. I have successfully migrated 21 VM's. i am now facing an issue in the 22nd vm which, after migrating the VM the vm is stuck in the windows logo screen and not moving forward, and i can't seem to understand how to approach it. 
Attached the instance.xml files of both queens and victoria setup's of the same VM. With regards, Swogat pradhan -------------- next part -------------- An HTML attachment was scrubbed... URL: From openinfradn at gmail.com Tue Nov 2 13:43:37 2021 From: openinfradn at gmail.com (open infra) Date: Tue, 2 Nov 2021 19:13:37 +0530 Subject: Accessing Openstack using v3 API Message-ID: Hi, I was trying to use the openstack environment using the v3 client API as mentioned in [2] documentation but ended up with an error as mentioned in [1]. I can access the same environment v3 API using curl. Am I missing something? [1] https://paste.opendev.org/show/810336/ [2] https://docs.openstack.org/python-keystoneclient/latest/using-api-v3.html Regards, Danishka -------------- next part -------------- An HTML attachment was scrubbed... URL: From bkslash at poczta.onet.pl Tue Nov 2 14:01:22 2021 From: bkslash at poczta.onet.pl (Adam Tomas) Date: Tue, 2 Nov 2021 15:01:22 +0100 Subject: [monasca][kolla-ansible] HTTPUnprocessableEntity: Dimension value must be 255 characters or less Message-ID: <69CC3C6B-4C50-4735-B65D-79C05E258B79@poczta.onet.pl> Hi, in my test deployment I get following error messages for each metric at each metric collection (every 60s in my case). Of course whole URL (i.e. /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id) is longer than 255 characters, but shouldn?t this URL be processed in smaller chunks? As I see the validation is done by monasca_api, so in case it fails no data is processed by monasca_log_transformer, right? Limiting metric name lenght won?t change this situation (in this example metric name is 22 characters long, while the whole URL is 285 characters long, so 285-22 is still > 255). How can I avoid this error? 
Best regards Adam Tomas 2021-11-02 14:19:44.935 693 ERROR monasca_api.v2.common.bulk_processor [req-a251xxxx 60bexxxx 27bexxxxx - default default] Log transformation failed, rejecting log: monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor [req-a251xxxx 60bexxxx 27bexxxx - default default] Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less: monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor Traceback (most recent call last): 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/v2/common/bulk_processor.py", line 81, in _transform_message 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor dimensions=self._get_dimensions(log_element, 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/v2/common/bulk_processor.py", line 129, in _get_dimensions 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor validation.validate_dimensions(local_dims) 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py", line 138, in validate_dimensions 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor _validate_dimension_value(dim_value) 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py", line 114, in _validate_dimension_value 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor raise exceptions.HTTPUnprocessableEntity( 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor From fungi at yuggoth.org Tue Nov 2 14:01:37 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 2 Nov 2021 14:01:37 +0000 Subject: Accessing Openstack using v3 API In-Reply-To: References: Message-ID: <20211102140136.dyinvai5jbsqxzjy@yuggoth.org> On 2021-11-02 19:13:37 +0530 (+0530), open infra wrote: > I was trying to 
use the openstack environment using the v3 client > API as mentioned in [2] documentation but ended up with an error > as mentioned in [1]. > > I can access the same environment v3 API using curl. Am I missing > something? [...] While I'm not sure what the cause of the HTTP Unauthorized exception might be (perhaps turning on debugging options might help narrow it down?), I suspect you may have an easier time developing against the unified OpenStackSDK rather than the individual per-service client libraries. For some simple examples of configuring and connecting, see the Getting Started chapter: https://docs.openstack.org/openstacksdk/xena/user/guides/intro.html Hope that helps! -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Tue Nov 2 14:49:11 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 02 Nov 2021 09:49:11 -0500 Subject: [all][tc] Technical Committee next weekly meeting on Nov 4th at 1500 UTC Message-ID: <17ce1201f97.f3ced5a3438820.2030079520319769825@ghanshyammann.com> Hello Everyone, Technical Committee's next weekly meeting is scheduled for Nov 4th at 1500 UTC. If you would like to add topics for discussion, please add them to the below wiki page by Wednesday, Nov 3rd, at 2100 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting -gmann From openinfradn at gmail.com Tue Nov 2 15:55:01 2021 From: openinfradn at gmail.com (open infra) Date: Tue, 2 Nov 2021 21:25:01 +0530 Subject: Accessing Openstack using v3 API In-Reply-To: <20211102140136.dyinvai5jbsqxzjy@yuggoth.org> References: <20211102140136.dyinvai5jbsqxzjy@yuggoth.org> Message-ID: Thank you Jeremy! On Tue, Nov 2, 2021 at 7:42 PM Jeremy Stanley wrote: > On 2021-11-02 19:13:37 +0530 (+0530), open infra wrote: > > I was trying to use the openstack environment using the v3 client > > API as mentioned in [2] documentation but ended up with an error > > as mentioned in [1]. > > > > I can access the same environment v3 API using curl. Am I missing > > something? > [...] > > While I'm not sure what the cause of the HTTP Unauthorized exception > might be (perhaps turning on debugging options might help narrow it > down?), I suspect you may have an easier time developing against the > unified OpenStackSDK rather than the individual per-service client > libraries. For some simple examples of configuring and connecting, > see the Getting Started chapter: > > https://docs.openstack.org/openstacksdk/xena/user/guides/intro.html > > Hope that helps! > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From swogatpradhan22 at gmail.com Wed Nov 3 05:13:03 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Wed, 3 Nov 2021 10:43:03 +0530 Subject: [Openstack-victoria] [Openstack-queens] migration of VM from openstack queens to victoria In-Reply-To: References: Message-ID: Hi, I think there is a document for cinder migration which shows how to migrate the Volumes from one cluster to another. But that is not the process i am following, i am using rbd export and import in ceph to migrate the volumes and then recreate instances from the said volumes. Aside, I would like some input from the community on the issue i am currently facing. Thank you On Tue, Nov 2, 2021 at 6:49 PM Braden, Albert wrote: > I?d like to experiment with migrating VMs between clusters. 
Where can I > find a document that describes the procedure? > > > > > > *From:* Swogat Pradhan > *Sent:* Tuesday, November 2, 2021 5:59 AM > *To:* OpenStack Discuss > *Subject:* [EXTERNAL] [Openstack-victoria] [Openstack-queens] migration > of VM from openstack queens to victoria > > > > *Caution:* This email originated from outside the organization. Do not > click links or open attachments unless you recognize the sender and know > the content is safe. > > Hi, > > I have 2 openstack setups > > Setup1. Openstack queens + ceph nautilaus > > Setup2. Openstack victoria + ceph octopus > > > > So, I am trying to migrate some VM's (windows and linux) from Setup1 to > Setup2. > > For migrating the VM's i am using rbd export on setup1 and then rbd import > on setup2. > > > > I have successfully migrated 21 VM's. > > > > i am now facing an issue in the 22nd vm which, after migrating the VM the > vm is stuck in the windows logo screen and not moving forward, and i can't > seem to understand how to approach it. > > > > > > Attached the instance.xml files of both queens and victoria setup's of the > same VM. > > > > With regards, > > Swogat pradhan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmohankumar1011 at gmail.com Wed Nov 3 06:24:03 2021 From: nmohankumar1011 at gmail.com (Mohan Kumar) Date: Wed, 3 Nov 2021 11:54:03 +0530 Subject: [neutron][ovs] Different Cookie with Remote Security group Message-ID: Team, I've created a security group with a remote security group attached to it. openstack security group rule create --ethertype=IPv4 --protocol tcp > --remote-group server-grp --ingress --dst-port=22 application-grp when I create VM with server-grp , I can see two cookies in br-int ovs bridge, one with default cookie and another new cookie with conjunction flows, is it expected behavior? ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | awk '{print $1}' | sort -n | uniq cookie=0x12b7adbe73614620, cookie=0x207d02d6bbcf499e, NXST_FLOW ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | grep "0x207d02d6bbcf499e" cookie=0x207d02d6bbcf499e, duration=335.421s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(8,1/2) cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(9,1/2) cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(16,1/2) cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(17,1/2) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Wed Nov 3 07:40:14 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Wed, 3 Nov 2021 08:40:14 +0100 Subject: [neutron][ovs] Different Cookie with Remote Security group In-Reply-To: References: Message-ID: Hello Mohan: Each OVS agent extension (QoS) or service (firewall), will have its own "OVSCookieBridge" instance of a specific bridge; in this case br-int. That means any service or extension will write OF rules using its own cookie. 
By default, if you are only using OVS FW (no QoS or any other extension), only one cookie should be present in br-int. Any new rule added to the FW should have the same cookie as the other present rules. Did you restart the agent? When the OVS agent is restarted, all rules are set again, deleting the previous ones. The OVS agent generates new cookies each time. No stale flows should remain after the restart. Regards. On Wed, Nov 3, 2021 at 7:33 AM Mohan Kumar wrote: > Team, > > I've created a security group with a remote security group attached to > it. > > openstack security group rule create --ethertype=IPv4 --protocol tcp >> --remote-group server-grp --ingress --dst-port=22 application-grp > > > when I create VM with server-grp , I can see two cookies in br-int ovs > bridge, one with default cookie and another new cookie with conjunction > flows, is it expected behavior? > > ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | awk '{print $1}' | sort -n | uniq > cookie=0x12b7adbe73614620, > cookie=0x207d02d6bbcf499e, > NXST_FLOW > ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | grep "0x207d02d6bbcf499e" > cookie=0x207d02d6bbcf499e, duration=335.421s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(8,1/2) > cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(9,1/2) > cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(16,1/2) > cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(17,1/2) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nmohankumar1011 at gmail.com Wed Nov 3 07:52:57 2021 From: nmohankumar1011 at gmail.com (Mohan Kumar) Date: Wed, 3 Nov 2021 13:22:57 +0530 Subject: [neutron][ovs] Different Cookie with Remote Security group In-Reply-To: References: Message-ID: Hi Rodolfo, Thanks for the reply, restart the agent fixed the issue. The cookies are replaced with a new single cookie. It look weird to me when I saw multiple cookies on the bridge. Do you think it should be fixed? Thanks ., Mohankumar N On Wed, Nov 3, 2021 at 1:10 PM Rodolfo Alonso Hernandez wrote: > Hello Mohan: > > Each OVS agent extension (QoS) or service (firewall), will have its own > "OVSCookieBridge" instance of a specific bridge; in this case br-int. That > means any service or extension will write OF rules using its own cookie. > > By default, if you are only using OVS FW (no QoS or any other extension), > only one cookie should be present in br-int. Any new rule added to the FW > should have the same cookie as the other present rules. > > Did you restart the agent? When the OVS agent is restarted, all rules are > set again, deleting the previous ones. The OVS agent generates new cookies > each time. No stale flows should remain after the restart. > > Regards. > > On Wed, Nov 3, 2021 at 7:33 AM Mohan Kumar > wrote: > >> Team, >> >> I've created a security group with a remote security group attached >> to it. 
>> >> openstack security group rule create --ethertype=IPv4 --protocol tcp >>> --remote-group server-grp --ingress --dst-port=22 application-grp >> >> >> when I create VM with server-grp , I can see two cookies in br-int ovs >> bridge, one with default cookie and another new cookie with conjunction >> flows, is it expected behavior? >> >> ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | awk '{print $1}' | sort -n | uniq >> cookie=0x12b7adbe73614620, >> cookie=0x207d02d6bbcf499e, >> NXST_FLOW >> ubuntu at req-generic-65-d012-mn:~$ sudo ovs-ofctl dump-flows br-int | grep "0x207d02d6bbcf499e" >> cookie=0x207d02d6bbcf499e, duration=335.421s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(8,1/2) >> cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ipv6,reg6=0x1,ipv6_src=fdd8:a3ce:31ae:0:f816:3eff:fe8a:f751 actions=conjunction(9,1/2) >> cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+est-rel-rpl,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(16,1/2) >> cookie=0x207d02d6bbcf499e, duration=335.420s, table=82, n_packets=0, n_bytes=0, idle_age=335, priority=70,ct_state=+new-est,ip,reg6=0x1,nw_src=10.0.0.38 actions=conjunction(17,1/2) >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bkslash at poczta.onet.pl Wed Nov 3 08:00:59 2021 From: bkslash at poczta.onet.pl (Adam Tomas) Date: Wed, 3 Nov 2021 09:00:59 +0100 Subject: [monasca][kolla-ansible] HTTPUnprocessableEntity: Dimension value must be 255 characters or less In-Reply-To: <69CC3C6B-4C50-4735-B65D-79C05E258B79@poczta.onet.pl> References: <69CC3C6B-4C50-4735-B65D-79C05E258B79@poczta.onet.pl> Message-ID: <7DE1CD9A-E1D0-48C6-A3C5-ED4B69F337D7@poczta.onet.pl> Hi again, I did some experiment: in /var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py I changed the default value from 255 to 300: DIMENSION_VALUE_CONSTRAINTS = { 'MAX_LENGTH': 300 } and restarted the container. And now there are no value validation errors, but how to check if this max value do not cause problems in monasca_log_transformer or any other service in the stack (kafka/monasca_persister/elasticsearch)? Thanks in advance for any help with this issue? Best regards Adam Tomas > Wiadomo?? napisana przez Adam Tomas w dniu 02.11.2021, o godz. 15:01: > > Hi, > > in my test deployment I get following error messages for each metric at each metric collection (every 60s in my case). Of course whole URL (i.e. /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id) is longer than 255 characters, but shouldn?t this URL be processed in smaller chunks? As I see the validation is done by monasca_api, so in case it fails no data is processed by monasca_log_transformer, right? Limiting metric name lenght won?t change this situation (in this example metric name is 22 characters long, while the whole URL is 285 characters long, so 285-22 is still > 255). > > How can I avoid this error? 
> Best regards > > Adam Tomas > > > 2021-11-02 14:19:44.935 693 ERROR monasca_api.v2.common.bulk_processor [req-a251xxxx 60bexxxx 27bexxxxx - default default] Log transformation failed, rejecting log: monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor [req-a251xxxx 60bexxxx 27bexxxx - default default] Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less: monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor Traceback (most recent call last): > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/v2/common/bulk_processor.py", line 81, in _transform_message > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor dimensions=self._get_dimensions(log_element, > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/v2/common/bulk_processor.py", line 129, in _get_dimensions > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor validation.validate_dimensions(local_dims) > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py", line 138, in validate_dimensions > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor _validate_dimension_value(dim_value) > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor File "/var/lib/kolla/venv/lib/python3.8/site-packages/monasca_api/api/core/log/validation.py", line 114, in _validate_dimension_value > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor raise exceptions.HTTPUnprocessableEntity( > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor monasca_api.api.core.log.exceptions.HTTPUnprocessableEntity: Dimension value /v2.0/metrics/statistics?name=transfer.region2.size&merge_metrics=True&dimensions=project_id%3Afcfaxxxx&start_time=2021-10-31+02%3A55%3A00%2B02%3A00&end_time=2021-10-31+02%3A00%3A00%2B01%3A00&period=-3300&statistics=avg&group_by=project_id&group_by=resource_id must be 255 characters or less > 2021-11-02 14:19:44.936 693 ERROR monasca_api.v2.common.bulk_processor From tkajinam at redhat.com Wed Nov 3 10:49:25 2021 From: tkajinam at redhat.com (Takashi Kajinami) Date: Wed, 3 Nov 2021 19:49:25 +0900 Subject: [puppet] Propose retiring puppet-senlin Message-ID: Hello, I remember I raised a similar discussion recently[1] but we need the same for a different 
module. puppet-selin was introduced back in 2018, but the module has had only the portion made by cookiecutter and has no capability to manage fundamental resources yet. Because we haven't seen any interest in creating implementations to support even basic usage, I'll propose retiring this module. I'll be open for any feedback for a while, and will propose a series of patches for retirement if no concern is raised here for one week. Thank you, Takashi [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-September/024687.html [2] https://opendev.org/openstack/puppet-senlin -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.com Wed Nov 3 11:05:34 2021 From: tobias.urdin at binero.com (Tobias Urdin) Date: Wed, 3 Nov 2021 11:05:34 +0000 Subject: [puppet] Propose retiring puppet-senlin In-Reply-To: References: Message-ID: +1 for retiring Best regards Tobias On 3 Nov 2021, at 11:49, Takashi Kajinami > wrote: Hello, I remember I raised a similar discussion recently[1] but we need the same for a different module. puppet-selin was introduced back in 2018, but the module has had only the portion made by cookiecutter and has no capability to manage fundamental resources yet. Because we haven't seen any interest in creating implementations to support even basic usage, I'll propose retiring this module. I'll be open for any feedback for a while, and will propose a series of patches for retirement if no concern is raised here for one week. Thank you, Takashi [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-September/024687.html [2] https://opendev.org/openstack/puppet-senlin -------------- next part -------------- An HTML attachment was scrubbed... URL: From senrique at redhat.com Wed Nov 3 13:08:15 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 3 Nov 2021 10:08:15 -0300 Subject: [cinder] Bug deputy report for week of 11-03-2021 Message-ID: This is a bug report from 10-27-2021-15-09 to 11-03-2021. Agenda: https://etherpad.opendev.org/p/cinder-bug-squad-meeting ----------------------------------------------------------------------------------------- Medium - https://bugs.launchpad.net/cinder/+bug/1949313 'Cinder backup tries to restore to the wrong host'. Assigned to Sam Morrison. - https://bugs.launchpad.net/cinder/+bug/1949189 'NfsDriver' object has no attribute 'manage_existing_get_size'. Unassigned. - https://bugs.launchpad.net/cinder/+bug/1946483 'Error when deleting encrypted volume backup from another project'. Assigned to Pavlo Shchelokovskyy. - https://bugs.launchpad.net/cinder/+bug/1948962 'Quotas: Some operations fail with a 255 volume type name'. Assigned to Gorka. - https://bugs.launchpad.net/cinder/+bug/1949061 '[Storwize] Retype of a non-rep mirror volume to mirror- volume-type with different mirror_pool is failing'. Assigned to Mounika Sreeram. Low - https://bugs.launchpad.net/cinder/+bug/1798589 'No available space check for image_conversion_dir in cinder-volume on upload-to-image'. Unassigned. Cheers -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From opensrloo at gmail.com Wed Nov 3 13:26:10 2021 From: opensrloo at gmail.com (Ruby Loo) Date: Wed, 3 Nov 2021 09:26:10 -0400 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: Yay Aija! 
+1 :) --ruby On Tue, Nov 2, 2021 at 3:54 AM Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jaunt?va (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > > > *Att[]'sIury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pbasaras at gmail.com Wed Nov 3 13:26:25 2021 From: pbasaras at gmail.com (Pavlos Basaras) Date: Wed, 3 Nov 2021 15:26:25 +0200 Subject: Attach Nvidia Jetson as compute node to openstack (aarch64) Message-ID: Hello everyone, I am relatively new to the community, thanks in advance for your time and help. I have an openstack cluster ready, working with several compute nodes based on x86 architecture. My current installation is based on Ussuri. I have recently acquired a couple of nvidia Jetson devices ( https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-agx-xavier/) which I want to connect to the cluster. The arm cpu model name is ARMv8 Processor rev 0 (v8l). CPU flags on the arm architecture are different and hence egrep -c '(vmx|svm)' /proc/cpuinfo is empty to the best of my knowledge. However, i re-build the custom linux kernel (i.e, Tegra) of the Jetson device, enabling the KVM module. So once i used: *sudo kvm-ok* INFO: /dev/kvm exists KVM acceleration can be used *dmesg | grep -i kvm* [ 1.372478] kvm [1]: 16-bit VMID [ 1.372489] kvm [1]: IDMAP page: 80fa1000 [ 1.372498] kvm [1]: HYP VA range: 4000000000:7fffffffff [ 1.374185] kvm [1]: Hyp mode initialized successfully [ 1.374299] kvm [1]: vgic-v2 at 3884000 [ 1.374763] kvm [1]: vgic interrupt IRQ1 [ 1.374790] kvm [1]: virtual timer IRQ4 *dmesg | grep -i 'CPU features'* [ 0.687366] CPU features: detected feature: Privileged Access Never [ 0.687372] CPU features: detected feature: LSE atomic instructions [ 0.687378] CPU features: detected feature: User Access Override [ 0.687385] CPU features: detected feature: 32-bit EL0 Support Does this suffice to say that I can use KVM with libvirt for the nova openstack? The version of libvirt is 6.0.0. >From the nova logs i see the following with *virt_type = qemu* WARNING nova.virt.libvirt.driver [-] The libvirt driver is not tested on qemu/aarch64 by the OpenStack project and thus its quality can not be ensured. For more information, see: https://docs.openstack.org/nova/latest/user/support-matrix.html And as it detects the aarch64 device: CPU mode "host-passthrough" was chosen If i try to launch an instance i get the following error with qemu libvirt.libvirtError: unsupported configuration: CPU mode 'host-passthrough' for aarch64 qemu domain on aarch64 host is not supported by hypervisor where as i i use *virt_type= kvm* the error after trying to launch an instance is libvirt.libvirtError: unsupported configuration: Emulator '/usr/bin/qemu-system-aarch64' does not support virt type 'kvm' Any advice on how to proceed? all the best Pavlos. -------------- next part -------------- An HTML attachment was scrubbed... 
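For the two libvirt errors quoted above, it can help to take nova out of the picture and query libvirt directly on the Jetson. A minimal diagnostic sketch, assuming the libvirt client tools are installed and the standard aarch64 'virt' machine type; the cpu_mode remark at the end is an assumption to verify, not a confirmed fix:

# Check that libvirt sees /dev/kvm and the host virtualisation support it needs.
virt-host-validate qemu

# Ask libvirt whether a KVM-accelerated aarch64 guest is possible at all;
# if this fails, nova's virt_type = kvm cannot work either.
virsh domcapabilities --virttype kvm --arch aarch64 --machine virt

# With plain emulation (virt_type = qemu), libvirt rejects host-passthrough on
# aarch64, so [libvirt] cpu_mode in nova.conf would need a value other than
# host-passthrough (for example a named CPU model via cpu_models).

If virsh reports KVM as unavailable even though /dev/kvm exists, restarting libvirtd after the KVM modules are loaded is worth trying, since libvirt caches its capability probing.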
URL: From emilien at redhat.com Wed Nov 3 16:13:03 2021 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 3 Nov 2021 12:13:03 -0400 Subject: [sdk][tc] A new home for Gophercloud Message-ID: Hi everyone, The Gophercloud project is a stable OpenStack SDK in Golang, which has been widely used by many communities now, including Terraform and all the Kubernetes on OpenStack projects. The project was initiated by Rackspace in 2013 and has already successfully managed their departure as a principal contributor. Fast forward to 2021, Joe Topjian (major maintainer) who used to be an active contributor in Puppet modules and also a voice in the OpenStack operators space, has reached out to me so we can discuss a transition for maintenance: https://github.com/gophercloud/gophercloud/issues/2246 We have discussed this internally and here are our notes: https://github.com/gophercloud/gophercloud/issues/2246#issuecomment-957589400 . To make it short, we are figuring out whether it would make sense to find a new home for the project and if yes, where. The main reason we're reaching out to the opendev community first is because we think this is the most logical place to host the project, alongside OpenStack: Some ideas: - The project would potentially have more visibility in the community, it?s a SDK therefore strongly relying on OpenStack APIs stability. - Use (some) opendev tools, mainly Zuul & nodepool resources - integrate with other projects. - Governance: - Gerrit / not Gerrit: we don?t think we would move to Gerrit yet, as the existing contributors are probably more used to the Github workflow, and we clearly don?t want to lose anyone in the process). - IRC: we could have an IRC channel, potentially Slack as well. Many things to figure out and before we answer these questions, we would like to poll the community: what do you think? Have you contributed to the project? What's your feeling about the ideas here? We welcome any feedback and will take it in consideration in our discussions. Thanks everyone, -- Emilien Macchi, on behalf of the Gophercloud contributors -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Wed Nov 3 16:22:41 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Wed, 3 Nov 2021 17:22:41 +0100 Subject: [ptl][release][stable][EM] Extended Maintenance - Ussuri In-Reply-To: References: Message-ID: Hi, This is a reminder: end of next week is the planned deadline for the transition. (I've updated the lists of open & unreleased changes per project where there were any changes. See [2]) Thanks, El?d On 2021. 10. 11. 18:26, El?d Ill?s wrote: > Hi, > > As Xena was released last week and we are in a less busy period, now > it is a good time to call your attention to the following: > > In a month Ussuri is planned to transition to Extended Maintenance > phase [1] (planned date: 2021-11-12). > > I have generated the list of the current *open* and *unreleased* > changes in stable/ussuri for the follows-policy tagged repositories > [2] (where there are such patches). These lists could help the teams > who are planning to do a *final* release on Ussuri before moving > stable/ussuri branches to Extended Maintenance. Feel free to edit and > extend these lists to track your progress! > > * At the transition date the Release Team will tag the *latest* Ussuri > releases of repositories with *ussuri-em* tag. 
> * After the transition stable/ussuri will be still open for bug fixes, > but there won't be official releases anymore. > > *NOTE*: teams, please focus on wrapping up your libraries first if > there is any concern about the changes, in order to avoid broken > (final!) releases! > > Thanks, > > El?d > > [1] https://releases.openstack.org/ > [2] https://etherpad.opendev.org/p/ussuri-final-release-before-em > > > From artem.goncharov at gmail.com Wed Nov 3 16:25:29 2021 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Wed, 3 Nov 2021 17:25:29 +0100 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: Message-ID: <6E557306-481C-4A71-B6C8-886CCA76CCED@gmail.com> Hey, With openstacksdk maintainer hat and without any hat I would clearly welcome this move. Regards, Artem > On 3. Nov 2021, at 17:13, Emilien Macchi wrote: > > Hi everyone, > > The Gophercloud project is a stable OpenStack SDK in Golang, which has been widely used by many communities now, including Terraform and all the Kubernetes on OpenStack projects. > The project was initiated by Rackspace in 2013 and has already successfully managed their departure as a principal contributor. Fast forward to 2021, Joe Topjian (major maintainer) who used to be an active contributor in Puppet modules and also a voice in the OpenStack operators space, has reached out to me so we can discuss a transition for maintenance: https://github.com/gophercloud/gophercloud/issues/2246 > We have discussed this internally and here are our notes: https://github.com/gophercloud/gophercloud/issues/2246#issuecomment-957589400 . > > To make it short, we are figuring out whether it would make sense to find a new home for the project and if yes, where. > The main reason we're reaching out to the opendev community first is because we think this is the most logical place to host the project, alongside OpenStack: > > Some ideas: > The project would potentially have more visibility in the community, it?s a SDK therefore strongly relying on OpenStack APIs stability. > Use (some) opendev tools, mainly Zuul & nodepool resources - integrate with other projects. > Governance: > Gerrit / not Gerrit: we don?t think we would move to Gerrit yet, as the existing contributors are probably more used to the Github workflow, and we clearly don?t want to lose anyone in the process). > IRC: we could have an IRC channel, potentially Slack as well. > > Many things to figure out and before we answer these questions, we would like to poll the community: what do you think? Have you contributed to the project? What's your feeling about the ideas here? > We welcome any feedback and will take it in consideration in our discussions. > > Thanks everyone, > -- > Emilien Macchi, on behalf of the Gophercloud contributors -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Wed Nov 3 16:40:54 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 3 Nov 2021 16:40:54 +0000 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: Message-ID: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> On 2021-11-03 12:13:03 -0400 (-0400), Emilien Macchi wrote: [...] > To make it short, we are figuring out whether it would make sense > to find a new home for the project and if yes, where. > > The main reason we're reaching out to the opendev community first > is because we think this is the most logical place to host the > project, alongside OpenStack: [...] 
> Use (some) opendev tools, mainly Zuul & nodepool resources - integrate > with other projects. > - > > Governance: > - > > Gerrit / not Gerrit: we don?t think we would move to Gerrit yet, as the > existing contributors are probably more used to the Github workflow, and we > clearly don?t want to lose anyone in the process). [...] Hosting the project in the OpenDev Collaboratory would mean having its source code in our review.opendev.org Gerrit service as the primary reference copy and updating it through change proposals using a Gerrit workflow. You could replicate code to GitHub as is done for OpenStack's repositories, but the code copy on GitHub would merely serve as a read-only mirror. While the Zuul software does have a GitHub driver and OpenDev connects their zuul.opendev.org deployment to GitHub in order to provide advisory testing to dependencies of projects hosted in OpenDev, the OpenDev sysadmins concluded that gating projects hosted outside of the OpenDev Collaboratory's Gerrit instance (e.g., on GitHub) is not something we were able to support sustainably: http://lists.openstack.org/pipermail/openstack-infra/2019-January/006269.html This was based on experiences trying to work with the Kata community, and the "experiment" referenced in that mailing list post eventually concluded with the removal of remaining Kata project configuration when https://review.opendev.org/744687 merged approximately 15 months ago. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From james.slagle at gmail.com Wed Nov 3 16:52:35 2021 From: james.slagle at gmail.com (James Slagle) Date: Wed, 3 Nov 2021 12:52:35 -0400 Subject: [TripleO] Core team cleanup Message-ID: Hello, I took a look at our core team, "tripleo-core" in gerrit. We have a few individuals who I feel have moved on from TripleO in their focus. I looked at the reviews from stackalytics.io for the last 180 days[1]. These individuals have less than 6 reviews, which is about 1 review a month: Bob Fournier Dan Sneddon Dmitry Tantsur Ji?? Str?nsk? Juan Antonio Osorio Robles Marius Cornea These individuals have publicly expressed that they are moving on from TripleO: Michele Baldessari wes hayutin I'd like to propose we remove these folks from our core team, while thanking them for their contributions. I'll also note that I'd still value +1/-1 from these folks with a lot of significance, and encourage them to review their areas of expertise! If anyone on the list plans to start reviewing in TripleO again, then I also think we can postpone the removal for the time being and re-evaluate later. Please let me know if that's the case. Please reply and let me know any agreements or concerns with this change. Thank you! [1] https://www.stackalytics.io/report/contribution?module=tripleo-group&project_type=openstack&days=180 -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.slagle at gmail.com Wed Nov 3 17:02:29 2021 From: james.slagle at gmail.com (James Slagle) Date: Wed, 3 Nov 2021 13:02:29 -0400 Subject: [TripleO] Branching our documentation Message-ID: Hello TripleO Owls, Our documentation, particularly the deploy guide, has become overly complex with all the branches that TripleO has supported over the years. I see notes that document functionality specific to releases all the way back to Mitaka! 
In ancient history, we made the decision to not branch our documentation because the work to maintain multiple branches outweighed the effort to just document all releases at once in the same branch. I think the scale has now tipped in the other direction. I propose that we create a stable/wallaby in tripleo-docs, and begin making the master branch specific to Yoga. This would also mean we could clean up all the old notes and admonitions about previous releases on the master branch. Going forward, if you make a change to the docs that applies to Yoga and Wallaby, you would need to backport that change to stable/wallaby. If you needed to make a change that applied only to Wallaby (or a prior release), you would make that change only on stable/wallaby. I'm not sure of all the plumbing required to make it so that the OpenStack Wallaby deploy guide builds from stable/wallaby, etc, but I wanted to get the idea out there first for feedback. Please let me know your thoughts. Thank you! -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Nov 3 17:13:10 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 03 Nov 2021 12:13:10 -0500 Subject: [all][tc] Continuing the RBAC PTG discussion In-Reply-To: <17cdc3e5964.e42274e1372366.1164114504252831335@ghanshyammann.com> References: <17cc3068765.11a9586ce162263.9148179786177865922@ghanshyammann.com> <17cdc3e5964.e42274e1372366.1164114504252831335@ghanshyammann.com> Message-ID: <17ce6ca4f3c.11a0d934542096.8078141879504981729@ghanshyammann.com> Hello Everyone, We figured out a lot of things in today call and for other open queries or goal stuff, we will continue the discussion next week on Wed Nov 10th, 15:00 - 16:00 UTC. Below is the link to join the call: https://meet.google.com/uue-adpp-xsm -gmann ---- On Mon, 01 Nov 2021 11:04:06 -0500 Ghanshyam Mann wrote ---- > ---- On Wed, 27 Oct 2021 13:32:37 -0500 Ghanshyam Mann wrote ---- > > Hello Everyone, > > > > As decided in PTG, we will continue the RBAC discussion from where we left in PTG. We will have a video > > call next week based on the availability of most of the interested members. > > > > Please vote your available time in below doodle vote by Thursday (or Friday morning central time). > > > > - https://doodle.com/poll/6xicntb9tu657nz7 > > As per doodle voting, I have schedule it on Nov 3rd Wed, 15:00 - 16:00 UTC. > > Below is the link to join the call: > > https://meet.google.com/uue-adpp-xsm > > We will be taking notes in this etherpad https://etherpad.opendev.org/p/policy-popup-yoga-ptg > > -gmann > > > > > NOTE: this is not specific to TC or people working in RBAC work but more to wider community to > > get feedback and finalize the direction (like what we did in PTG session). > > > > Meanwhile, feel free to review the lance's updated proposal for community-wide goal > > - https://review.opendev.org/c/openstack/governance/+/815158 > > > > -gmann > > > > > > From smooney at redhat.com Wed Nov 3 17:33:45 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 03 Nov 2021 17:33:45 +0000 Subject: [TripleO] Branching our documentation In-Reply-To: References: Message-ID: <60d72e92da6bab4ded806ea05232e5c05c476eb8.camel@redhat.com> On Wed, 2021-11-03 at 13:02 -0400, James Slagle wrote: > Hello TripleO Owls, > > Our documentation, particularly the deploy guide, has become overly complex > with all the branches that TripleO has supported over the years. 
I see > notes that document functionality specific to releases all the way back to > Mitaka! > > In ancient history, we made the decision to not branch our documentation > because the work to maintain multiple branches outweighed the effort to > just document all releases at once in the same branch. > > I think the scale has now tipped in the other direction. I propose that we > create a stable/wallaby in tripleo-docs, and begin making the master branch > specific to Yoga. This would also mean we could clean up all the old notes > and admonitions about previous releases on the master branch. +1 as some one that very really uses ooo and who always need to look at the documentation when i do try to use it i find the current docs very hard to parse due to all the differnt release annotations inline. the ooo docs themselve are not actully that extensive upstream and you can read all or most of them in one afternoon but parsing them and the parts that apply to the relase you are trying to deploy is a lot more effort then the branached docs in other projects. i think this definetly help the new user and might also help those that are more expirnce with ooo too. > > Going forward, if you make a change to the docs that applies to Yoga and > Wallaby, you would need to backport that change to stable/wallaby. > > If you needed to make a change that applied only to Wallaby (or a prior > release), you would make that change only on stable/wallaby. > > I'm not sure of all the plumbing required to make it so that the OpenStack > Wallaby deploy guide builds from stable/wallaby, etc, but I wanted to get > the idea out there first for feedback. Please let me know your thoughts. > Thank you! > From johfulto at redhat.com Wed Nov 3 18:15:25 2021 From: johfulto at redhat.com (John Fulton) Date: Wed, 3 Nov 2021 14:15:25 -0400 Subject: [TripleO] Branching our documentation In-Reply-To: <60d72e92da6bab4ded806ea05232e5c05c476eb8.camel@redhat.com> References: <60d72e92da6bab4ded806ea05232e5c05c476eb8.camel@redhat.com> Message-ID: On Wed, Nov 3, 2021 at 1:35 PM Sean Mooney wrote: > > On Wed, 2021-11-03 at 13:02 -0400, James Slagle wrote: > > Hello TripleO Owls, > > > > Our documentation, particularly the deploy guide, has become overly complex > > with all the branches that TripleO has supported over the years. I see > > notes that document functionality specific to releases all the way back to > > Mitaka! > > > > In ancient history, we made the decision to not branch our documentation > > because the work to maintain multiple branches outweighed the effort to > > just document all releases at once in the same branch. > > > > I think the scale has now tipped in the other direction. I propose that we > > create a stable/wallaby in tripleo-docs, and begin making the master branch > > specific to Yoga. This would also mean we could clean up all the old notes > > and admonitions about previous releases on the master branch. > +1 as some one that very really uses ooo and who always need to look at the documentation > when i do try to use it i find the current docs very hard to parse due to all the differnt release annotations > inline. the ooo docs themselve are not actully that extensive upstream and you can read all or most of them in one afternoon > but parsing them and the parts that apply to the relase you are trying to deploy is a lot more effort then the branached > docs in other projects. i think this definetly help the new user and might also help those that are more expirnce with ooo too. 
So stable/wallaby would have the notes and admonitions about previous releases (which are useful if you're using an older version)? Then we could then make a smaller main branch which is leaner and focussed on Yoga? That sounds good to me. I'd be happy to help with the clean up after the branch is made. John > > > > Going forward, if you make a change to the docs that applies to Yoga and > > Wallaby, you would need to backport that change to stable/wallaby. > > > > If you needed to make a change that applied only to Wallaby (or a prior > > release), you would make that change only on stable/wallaby. > > > > I'm not sure of all the plumbing required to make it so that the OpenStack > > Wallaby deploy guide builds from stable/wallaby, etc, but I wanted to get > > the idea out there first for feedback. Please let me know your thoughts. > > Thank you! > > > > From beagles at redhat.com Wed Nov 3 18:45:55 2021 From: beagles at redhat.com (Brent Eagles) Date: Wed, 3 Nov 2021 16:15:55 -0230 Subject: [TripleO] Branching our documentation In-Reply-To: References: Message-ID: On Wed, Nov 03, 2021 at 01:02:29PM -0400, James Slagle wrote: > Hello TripleO Owls, > > Our documentation, particularly the deploy guide, has become overly complex > with all the branches that TripleO has supported over the years. I see > notes that document functionality specific to releases all the way back to > Mitaka! > > In ancient history, we made the decision to not branch our documentation > because the work to maintain multiple branches outweighed the effort to > just document all releases at once in the same branch. > > I think the scale has now tipped in the other direction. I propose that we > create a stable/wallaby in tripleo-docs, and begin making the master branch > specific to Yoga. This would also mean we could clean up all the old notes > and admonitions about previous releases on the master branch. Agreed! As it stands, even wallaby is quite different than victoria and earlier for the overcloud deployment and https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployment/install_overcloud.html isn't particular helpful after a point. I've tried to update it a few times with the "bare essential" steps but it feels so badly forked, I couldn't follow it. > Going forward, if you make a change to the docs that applies to Yoga and > Wallaby, you would need to backport that change to stable/wallaby. > > If you needed to make a change that applied only to Wallaby (or a prior > release), you would make that change only on stable/wallaby. > > I'm not sure of all the plumbing required to make it so that the OpenStack > Wallaby deploy guide builds from stable/wallaby, etc, but I wanted to get > the idea out there first for feedback. Please let me know your thoughts. > Thank you! > > -- > -- James Slagle > -- I think it's a great idea! Cheers, Brent -- Brent Eagles Principal Software Engineer Red Hat Inc. From gauurav.sabharwal at in.ibm.com Wed Nov 3 04:50:25 2021 From: gauurav.sabharwal at in.ibm.com (Gauurav Sabharwal1) Date: Wed, 3 Nov 2021 10:20:25 +0530 Subject: [cinder] : SAN migration Message-ID: Hi Experts , I need some expert advise of one of the scenario, I have multiple isolated OpenStack cluster running with train & rocky edition. Each OpenStack cluster environment have it's own isolated infrastructure of SAN ( CISCO fabric ) & Storage ( HP, EMC & IBM). Now company planning to refresh their SAN infrastructure. By procuring new Brocade SAN switches. 
But there are some migration relevant challenges we have. As we understand under one cinder instance only one typer of FC zone manager is supported . Currently customer configured & managing CISCO . Is it possible to configure two different vendor FC Zone manager under one cinder instance. Migration of SAN zoning is supposedly going to be happen offline way from OpenStack point of view. We will be migrating all ports of each existing cisco fabric to Brocade with zone configuration using brocade CLI. Our main concern is that after migration How CINDER DB update new zone info & path via Brocade SAN. Regards Gauurav Sabharwal IBM India Pvt. Ltd. IBM towers Ground floor, Block -A , Plot number 26, Sector 62, Noida Gautam budhnagar UP-201307. Email:gauurav.sabharwal at in.ibm.com Mobile No.: +91-9910159277 -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.slagle at gmail.com Wed Nov 3 19:01:16 2021 From: james.slagle at gmail.com (James Slagle) Date: Wed, 3 Nov 2021 15:01:16 -0400 Subject: [TripleO] Branching our documentation In-Reply-To: References: <60d72e92da6bab4ded806ea05232e5c05c476eb8.camel@redhat.com> Message-ID: On Wed, Nov 3, 2021 at 2:15 PM John Fulton wrote: > On Wed, Nov 3, 2021 at 1:35 PM Sean Mooney wrote: > > > > On Wed, 2021-11-03 at 13:02 -0400, James Slagle wrote: > > > Hello TripleO Owls, > > > > > > Our documentation, particularly the deploy guide, has become overly > complex > > > with all the branches that TripleO has supported over the years. I see > > > notes that document functionality specific to releases all the way > back to > > > Mitaka! > > > > > > In ancient history, we made the decision to not branch our > documentation > > > because the work to maintain multiple branches outweighed the effort to > > > just document all releases at once in the same branch. > > > > > > I think the scale has now tipped in the other direction. I propose > that we > > > create a stable/wallaby in tripleo-docs, and begin making the master > branch > > > specific to Yoga. This would also mean we could clean up all the old > notes > > > and admonitions about previous releases on the master branch. > > +1 as some one that very really uses ooo and who always need to look at > the documentation > > when i do try to use it i find the current docs very hard to parse due > to all the differnt release annotations > > inline. the ooo docs themselve are not actully that extensive upstream > and you can read all or most of them in one afternoon > > but parsing them and the parts that apply to the relase you are trying > to deploy is a lot more effort then the branached > > docs in other projects. i think this definetly help the new user and > might also help those that are more expirnce with ooo too. > > So stable/wallaby would have the notes and admonitions about previous > releases (which are useful if you're using an older version)? > Then we could then make a smaller main branch which is leaner and > focussed on Yoga? > That is correct. stable/wallaby would be for Wallaby and all prior versions. Essentially, as the docs are now. Master would be Yoga only. When Yoga is done, we'd branch stable/yoga and master would become Z*. -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Wed Nov 3 21:06:53 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 03 Nov 2021 16:06:53 -0500 Subject: [all][tc] Technical Committee next weekly meeting on Nov 4th at 1500 UTC In-Reply-To: <17ce1201f97.f3ced5a3438820.2030079520319769825@ghanshyammann.com> References: <17ce1201f97.f3ced5a3438820.2030079520319769825@ghanshyammann.com> Message-ID: <17ce7a0489a.f7f41b7b51390.8778486736180024566@ghanshyammann.com> Hello Everyone, Below is the agenda for Tomorrow's TC video meeting schedule at 1500 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting == Agenda for tomorrow's TC meeting == * Roll call * Follow up on past action items * Yoga tracker ** https://etherpad.opendev.org/p/tc-yoga-tracker * Gate health check * Release management team position on a longer release cycle (ttx) ** https://etherpad.opendev.org/p/relmgt-position-1y-releases * Stable team process change ** https://review.opendev.org/c/openstack/governance/+/810721 * Adjutant need PTLs and maintainers ** http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html * Office hour: continue or stop? ** https://meetings.opendev.org/#Technical_Committee_Office_hours * newsletter ** https://etherpad.opendev.org/p/newsletter-openstack-news * Pain Point targeting ** https://etherpad.opendev.org/p/pain-point-elimination * Open Reviews ** https://review.opendev.org/q/projects:openstack/governance+is:open -gmann ---- On Tue, 02 Nov 2021 09:49:11 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > Technical Committee's next weekly meeting is scheduled for Nov 4th at 1500 UTC. > > If you would like to add topics for discussion, please add them to the below wiki page by > Wednesday, Nov 3rd, at 2100 UTC. > > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > -gmann > > From akanevsk at redhat.com Wed Nov 3 22:12:56 2021 From: akanevsk at redhat.com (Arkady Kanevsky) Date: Wed, 3 Nov 2021 17:12:56 -0500 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: +1 On Wed, Nov 3, 2021 at 8:28 AM Ruby Loo wrote: > Yay Aija! +1 :) > > --ruby > > On Tue, Nov 2, 2021 at 3:54 AM Iury Gregory wrote: > >> Hello everyone! >> >> I would like to propose Aija Jaunt?va (irc: ajya) to be added to the >> sushy-core group. >> Aija has been in the ironic community for a long time, she has a lot of >> knowledge about redfish and is always providing good reviews. >> >> ironic-cores please vote with +/- 1. >> >> -- >> >> >> *Att[]'sIury Gregory Melo Ferreira * >> *MSc in Computer Science at UFCG* >> *Part of the ironic-core and puppet-manager-core team in OpenStack* >> *Software Engineer at Red Hat Czech* >> *Social*: https://www.linkedin.com/in/iurygregory >> *E-mail: iurygregory at gmail.com * >> > -- Arkady Kanevsky, Ph.D. Phone: 972 707-6456 Corporate Phone: 919 729-5744 ext. 8176456 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ildiko.vancsa at gmail.com Thu Nov 4 01:12:56 2021 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Wed, 3 Nov 2021 18:12:56 -0700 Subject: [neutron][networking][ipv6][dns][ddi] Upcoming OpenInfra Edge Computing Group sessions Message-ID: Hi, I?m reaching out to you to draw your attention to the amazing lineup of discussion topics for the OpenInfra Edge Computing Group weekly calls up until the end of this year with industry experts to present and participate in the discussions! 
I would like to invite and encourage you to join the working group sessions to discuss edge related challenges and solutions in the below areas and more! Some of the sessions to highlight will be continuing the discussions we started at the recent PTG: * November 29th - Networking and DNS discussion with Cricket Liu and Andrew Wertkin * December 6th - Networking and IPv6 discussion with Ed Horley Beyond the above topics we will have presentations from groups and projects such as Smart Edge and the Industry IoT Consortium (Formerly Industrial Internet Consortium). For the full schedule please see the edge WG?s wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group#Upcoming_Topics Our calls happen every Monday at 6am Pacific Time / 1400 UTC on Zoom. You can find all the meeting details here: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings Please let me know if you have any questions about the working group or any of the upcoming sessions. Thanks and Best Regards, Ildik? From jaosorior at redhat.com Thu Nov 4 06:03:22 2021 From: jaosorior at redhat.com (Juan Osorio Robles) Date: Thu, 4 Nov 2021 08:03:22 +0200 Subject: [TripleO] Core team cleanup In-Reply-To: References: Message-ID: Hey, I meant to send an email some time ago but I didn't. Sorry about that. Yes, please remove me from the core team. My responsibilities have changed to the point where I no longer work with OpenStack. I'd still be happy to review commits if asked :) especially if I'm acquainted with the area. Thanks for updating the list Best regards On Wed, 3 Nov 2021 at 18:55, James Slagle wrote: > Hello, I took a look at our core team, "tripleo-core" in gerrit. We have a > few individuals who I feel have moved on from TripleO in their focus. I > looked at the reviews from stackalytics.io for the last 180 days[1]. > > These individuals have less than 6 reviews, which is about 1 review a > month: > Bob Fournier > Dan Sneddon > Dmitry Tantsur > Ji?? Str?nsk? > Juan Antonio Osorio Robles > Marius Cornea > > These individuals have publicly expressed that they are moving on from > TripleO: > Michele Baldessari > wes hayutin > > I'd like to propose we remove these folks from our core team, while > thanking them for their contributions. I'll also note that I'd still value > +1/-1 from these folks with a lot of significance, and encourage them to > review their areas of expertise! > > If anyone on the list plans to start reviewing in TripleO again, then I > also think we can postpone the removal for the time being and re-evaluate > later. Please let me know if that's the case. > > Please reply and let me know any agreements or concerns with this change. > > Thank you! > > [1] > https://www.stackalytics.io/report/contribution?module=tripleo-group&project_type=openstack&days=180 > > -- > -- James Slagle > -- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jichengke2011 at gmail.com Thu Nov 4 06:23:15 2021 From: jichengke2011 at gmail.com (chengke ji) Date: Thu, 4 Nov 2021 14:23:15 +0800 Subject: [barbican] Simple Crypto Plugin kek issue In-Reply-To: References: Message-ID: You should remove old data( project kek) in table kek_data(barbican), and your project kek will issued with your new master kek. Ammad Syed ?2021?10?29??? ??4:04??? > Hi, > > I have installed barbican and using it with openstack magnum. When I am > using the default kek describe in document below, works fine and magnum > cluster creation goes successful. 
> > https://docs.openstack.org/barbican/latest/install/barbican-backend.html > > But when I generate a new kek with below command. > > python3 -c "from cryptography.fernet import Fernet ; key = Fernet.generate_key(); print(key)" > > > and put it in barbican.conf, the magnum cluster failed to create and I see > below logs in barbican. > > 2021-10-29 12:53:28.932 568554 INFO barbican.plugin.crypto.simple_crypto > [req-aaac01e9-82af-421b-b85a-ff998d904972 ad702ac807f44c73a32a9b7a795b693c > d782069f335041138f0cb141fde9933f - default default] Software Only Crypto > initialized > 2021-10-29 12:53:28.932 568554 DEBUG barbican.model.repositories > [req-aaac01e9-82af-421b-b85a-ff998d904972 ad702ac807f44c73a32a9b7a795b693c > d782069f335041138f0cb141fde9933f - default default] Getting session... > get_session > /usr/lib/python3/dist-packages/barbican/model/repositories.py:364 > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > [req-aaac01e9-82af-421b-b85a-ff998d904972 ad702ac807f44c73a32a9b7a795b693c > d782069f335041138f0cb141fde9933f - default default] Secret creation failure > seen - please contact site administrator.: cryptography.fernet.InvalidToken > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers Traceback > (most recent call last): > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/fernet.py", line 113, in > _verify_signature > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > h.verify(data[-32:]) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/hazmat/primitives/hmac.py", > line 70, in verify > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > ctx.verify(signature) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/hazmat/backends/openssl/hmac.py", > line 76, in verify > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers raise > InvalidSignature("Signature did not match digest.") > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > cryptography.exceptions.InvalidSignature: Signature did not match digest. 
> 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers During > handling of the above exception, another exception occurred: > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers Traceback > (most recent call last): > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/api/controllers/__init__.py", line > 102, in handler > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > fn(inst, *args, **kwargs) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/api/controllers/__init__.py", line > 88, in enforcer > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > fn(inst, *args, **kwargs) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/api/controllers/__init__.py", line > 150, in content_types_enforcer > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > fn(inst, *args, **kwargs) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/api/controllers/secrets.py", line > 456, in on_post > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > new_secret, transport_key_model = plugin.store_secret( > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/resources.py", line 108, in > store_secret > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > secret_metadata = _store_secret_using_plugin(store_plugin, secret_dto, > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/resources.py", line 279, in > _store_secret_using_plugin > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > secret_metadata = store_plugin.store_secret(secret_dto, context) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/store_crypto.py", line 96, > in store_secret > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > response_dto = encrypting_plugin.encrypt( > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/crypto/simple_crypto.py", > line 76, in encrypt > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers kek = > self._get_kek(kek_meta_dto) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/barbican/plugin/crypto/simple_crypto.py", > line 73, in _get_kek > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > encryptor.decrypt(kek_meta_dto.plugin_meta) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/fernet.py", line 76, in decrypt > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers return > self._decrypt_data(data, timestamp, ttl, int(time.time())) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/fernet.py", line 125, in > _decrypt_data > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > self._verify_signature(data) > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers File > "/usr/lib/python3/dist-packages/cryptography/fernet.py", line 115, 
in > _verify_signature > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers raise > InvalidToken > 2021-10-29 12:53:28.991 568554 ERROR barbican.api.controllers > cryptography.fernet.InvalidToken > > Any advise how to fix it ? > > - Ammad > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stendulker at gmail.com Thu Nov 4 09:06:56 2021 From: stendulker at gmail.com (Shivanand Tendulker) Date: Thu, 4 Nov 2021 14:36:56 +0530 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: +1. Great job Auja !! On Tue, Nov 2, 2021 at 1:28 PM Iury Gregory wrote: > Hello everyone! > > I would like to propose Aija Jaunt?va (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > > > *Att[]'sIury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arne.wiebalck at cern.ch Thu Nov 4 09:18:00 2021 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Thu, 4 Nov 2021 10:18:00 +0100 Subject: [baremetal-sig][ironic] Tue Nov 9, 2021, 2pm UTC: Hardware burn-in Message-ID: Dear all, The Bare Metal SIG will meet next week on Tue Nov 9, 2021, at 2pm UTC on zoom. The meeting will feature a "topic-of-the-day" presentation by Emmanouil Bagakis (CERN) on "Hardware Burn-in with Ironic" As usual, all details on https://etherpad.opendev.org/p/bare-metal-sig Everyone is welcome! Cheers, Arne From ashrodri at redhat.com Thu Nov 4 15:15:13 2021 From: ashrodri at redhat.com (Ashley Rodriguez) Date: Thu, 4 Nov 2021 11:15:13 -0400 Subject: [docs] Double headings on every page In-Reply-To: References: Message-ID: Hi! Recently I've noticed that the h1 text on the manila API ref no longer shows up in the main text body, though it is still listed in the navigation bar on the left hand side. I've noticed a similar issue on the Ironic, Nova, Neutron API guides as well. I think the change you mentioned caused this, though I'm not completely sure since not all API guides are affected. For example, Swift's API guide still works as expected with visible header 1s identifying the resource, and each method being listed in the nav bar as well. Cinder's API guide makes use of h2 instead, and thus doesn't have each method listed in the nav bar, though the titles are visible. What I'd like to see in the manila API guide is each resource shown as h1 both in the main body and in the navigation bar, along with each corresponding method. If I were to change the resource titles to h2 I would be able to see the title itself but lose the methods in the nav bar, and yet keeping them as h1 means the side bar works but the main text is missing titles. I don't have much experience with front-end work so I'd really appreciate any guidance you can offer to fix this. Thanks, Ashley On Tue, Aug 10, 2021 at 8:05 PM Peter Matulis wrote: > I'm now seeing this symptom on several projects in the 'latest' release > branch. 
Such as: > > https://docs.openstack.org/nova/latest/ > > My browser exposes the following error: > > [image: image.png] > > We patched [1] our project as a workaround. > > Could one of Stephen's commits [2] be involved? > > [1]: > https://review.opendev.org/c/openstack/charm-deployment-guide/+/803531 > [2]: > https://opendev.org/openstack/openstackdocstheme/commit/08461c5311aa692088a27eb40a87965fd8515aba > > On Thu, Jun 10, 2021 at 3:51 PM Peter Matulis > wrote: > >> Hi Stephen. Did you ever get to circle back to this? >> >> On Fri, May 14, 2021 at 7:34 AM Stephen Finucane >> wrote: >> >>> On Tue, 2021-05-11 at 11:14 -0400, Peter Matulis wrote: >>> >>> Hi, I'm hitting an oddity in one of my projects where the titles of all >>> pages show up twice. >>> >>> Example: >>> >>> >>> https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/wallaby/app-nova-cells.html >>> >>> Source file is here: >>> >>> >>> https://opendev.org/openstack/charm-deployment-guide/src/branch/master/deploy-guide/source/app-nova-cells.rst >>> >>> Does anyone see what can be causing this? It appears to happen only for >>> the current stable release ('wallaby') and 'latest'. >>> >>> Thanks, >>> Peter >>> >>> >>> I suspect you're bumping into issues introduced by a new version of >>> Sphinx or docutils (new versions of both were released recently). >>> >>> Comparing the current nova docs [1] to what you have, I see the >>> duplicate

<h1> element is present but hidden by the following CSS rule:
>>>
>>> .docs-body .section h1 {
>>>     display: none;
>>> }
>>>
>>> That works because we have the following HTML in the nova docs:
>>>
>>> <div class="section" id="extra-specs">
>>> <h1>Extra Specs¶</h1>
>>> ...
>>> </div>
>>>
>>> while the docs you linked are using the HTML5 semantic '<section>' tag:
>>>
>>> <section id="nova-cells">
>>> <h1>Nova Cells¶</h1>
>>> ...
>>> </section>
>>> >>> >>> So to fix this, we'll have to update the openstackdocstheme to handle >>> these changes. I can try to take a look at this next week but I really >>> wouldn't mind if someone beat me to it. >>> >>> Stephen >>> >>> [1] >>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 7527 bytes Desc: not available URL: From artem.goncharov at gmail.com Thu Nov 4 15:32:31 2021 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Thu, 4 Nov 2021 16:32:31 +0100 Subject: [docs] Double headings on every page In-Reply-To: References: Message-ID: openstackdocstheme has fixed this issue already some months ago [1]. If you still see this issue on fresh docs this would hint to me that you use Sphinx>=4 but openstackdocstheme <2.3.1 [1]: https://review.opendev.org/c/openstack/openstackdocstheme/+/798897 Regards, Artem > On 4. Nov 2021, at 16:15, Ashley Rodriguez wrote: > > Hi! > > Recently I've noticed that the h1 text on the manila API ref no longer shows up in the main text body, though it is still listed in the navigation bar on the left hand side. > I've noticed a similar issue on the Ironic, Nova, Neutron API guides as well. > I think the change you mentioned caused this, though I'm not completely sure since not all API guides are affected. > For example, Swift's API guide still works as expected with visible header 1s identifying the resource, and each method being listed in the nav bar as well. > Cinder's API guide makes use of h2 instead, and thus doesn't have each method listed in the nav bar, though the titles are visible. > What I'd like to see in the manila API guide is each resource shown as h1 both in the main body and in the navigation bar, along with each corresponding method. > If I were to change the resource titles to h2 I would be able to see the title itself but lose the methods in the nav bar, and yet keeping them as h1 means the side bar works but the main text is missing titles. > I don't have much experience with front-end work so I'd really appreciate any guidance you can offer to fix this. > > Thanks, > Ashley > > On Tue, Aug 10, 2021 at 8:05 PM Peter Matulis > wrote: > I'm now seeing this symptom on several projects in the 'latest' release branch. Such as: > > https://docs.openstack.org/nova/latest/ > > My browser exposes the following error: > > > > We patched [1] our project as a workaround. > > Could one of Stephen's commits [2] be involved? > > [1]: https://review.opendev.org/c/openstack/charm-deployment-guide/+/803531 > [2]: https://opendev.org/openstack/openstackdocstheme/commit/08461c5311aa692088a27eb40a87965fd8515aba > On Thu, Jun 10, 2021 at 3:51 PM Peter Matulis > wrote: > Hi Stephen. Did you ever get to circle back to this? > > On Fri, May 14, 2021 at 7:34 AM Stephen Finucane > wrote: > On Tue, 2021-05-11 at 11:14 -0400, Peter Matulis wrote: >> Hi, I'm hitting an oddity in one of my projects where the titles of all pages show up twice. >> >> Example: >> >> https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/wallaby/app-nova-cells.html >> >> Source file is here: >> >> https://opendev.org/openstack/charm-deployment-guide/src/branch/master/deploy-guide/source/app-nova-cells.rst >> >> Does anyone see what can be causing this? It appears to happen only for the current stable release ('wallaby') and 'latest'. 
>> >> Thanks, >> Peter > > I suspect you're bumping into issues introduced by a new version of Sphinx or docutils (new versions of both were released recently). > > Comparing the current nova docs [1] to what you have, I see the duplicate

<h1> element is present but hidden by the following CSS rule:
>
> .docs-body .section h1 {
>     display: none;
> }
>
> That works because we have the following HTML in the nova docs:
>
> <div class="section" id="extra-specs">
> <h1>Extra Specs¶</h1>
> ...
> </div>
>
> while the docs you linked are using the HTML5 semantic '<section>' tag:
>
> <section id="nova-cells">
> <h1>Nova Cells¶</h1>
> ...
> </section>
> > So to fix this, we'll have to update the openstackdocstheme to handle these changes. I can try to take a look at this next week but I really wouldn't mind if someone beat me to it. > > Stephen > > [1] https://docs.openstack.org/nova/latest/configuration/extra-specs.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Thu Nov 4 16:17:00 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 4 Nov 2021 17:17:00 +0100 Subject: [neutron] Drivers meeting - Friday 5.11.2021 - cancelled Message-ID: Hi Neutron Drivers! Due to the lack of agenda, let's cancel tomorrow's drivers meeting. See You on the meeting next week. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Thu Nov 4 16:23:58 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 4 Nov 2021 16:23:58 +0000 Subject: [docs] Double headings on every page In-Reply-To: References: Message-ID: <20211104162357.36phjz3mj6czoxjp@yuggoth.org> On 2021-11-04 11:15:13 -0400 (-0400), Ashley Rodriguez wrote: > Recently I've noticed that the h1 text on the manila API ref no > longer shows up in the main text body, though it is still listed > in the navigation bar on the left hand side. [...] Can you provide specific URLs? It's a little hard to be sure I understand what you're describing. For example, if you're talking about https://docs.openstack.org/api-ref/shared-file-system/ then when I pull it up in my browser I see "Shared File Systems API" at the top of the main text column. That is H1 level text. I think you may be saying that when there's more than one H1 element in the page, all subsequent H1 elements are hidden? For example, I notice that "API Versions" is the second H1 element, and while it appears in the navigation list in the left column, it does not show up inline in the main text column. This is because OpenStackDocsTheme intentionally hides H1 in the Sphinx output so that it can present a custom styled version of it at the top of the text. This works just fine when there is one and only one top-level (H1) heading element, but breaks down if a document contains more than one. > What I'd like to see in the manila API guide is each resource > shown as h1 both in the main body and in the navigation bar, along > with each corresponding method. If I were to change the resource > titles to h2 I would be able to see the title itself but lose the > methods in the nav bar, and yet keeping them as h1 means the side > bar works but the main text is missing titles. [...] Whether they're H1 or H2 level seems more like an implementation detail. What you actually seem to want is just for the document section titles *and* the methods within them to both appear at different levels in the navigation column, right? That seems like something we should be able to solve in the sidebar here: https://opendev.org/openstack/openstackdocstheme/src/branch/master/openstackdocstheme/theme/openstackdocs/sidebartoc.html You might want to play with setting theme_sidebar_mode to "toc" since it looks like "toctree" implicitly limits the maxdepth to 2 heading levels (H1 and H2 essentially). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
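Artem's note about openstackdocstheme < 2.3.1 with Sphinx >= 4 and Jeremy's pointer at the sidebar template can both be checked locally before changing anything. A short sketch, assuming the guide is built inside a virtualenv; the exact theme option name is an assumption here:

# Confirm which Sphinx and openstackdocstheme versions the docs build uses;
# per the reply above, Sphinx >= 4 wants openstackdocstheme >= 2.3.1.
pip list | grep -iE '^(sphinx|openstackdocstheme)'

# The sidebar depth behaviour is a theme option, so the place to experiment is
# html_theme_options in the affected guide's conf.py, e.g. (assumed option name):
#   html_theme_options = {'sidebar_mode': 'toc'}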
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rosmaita.fossdev at gmail.com Thu Nov 4 19:19:00 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 4 Nov 2021 15:19:00 -0400 Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc Message-ID: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> By popular demand (really!), I'm scheduling a RBD driver review festival for next week. It's a community driver, and we've got a backlog of patches: https://review.opendev.org/q/project:openstack/cinder+status:open+file:cinder/volume/drivers/rbd.py If your patch is currently in merge conflict, it would be helpful if you could get conflicts resolved before the festival. Also, if you have questions about comments that have been left on your patch, this would be a good time to get them answered. who: Everyone! what: The Cinder Festival of RBD Driver Reviews when: Thursday 11 November 2021 from 1500-1600 UTC where: https://meet.google.com/fsb-qkfc-qun etherpad: https://etherpad.opendev.org/p/cinder-festival-of-driver-reviews (Note that we're trying google meet for this session.) From thierry at openstack.org Fri Nov 5 14:26:13 2021 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 5 Nov 2021 15:26:13 +0100 Subject: [all][tc] Relmgt team position on release cadence Message-ID: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Hi everyone, The (long) document below reflects the current position of the release management team on a popular question: should the OpenStack release cadence be changed? Please note that we only address the release management / stable branch management facet of the problem. There are other dimensions to take into account (governance, feature deprecation, supported distros...) to get a complete view of the debate. Introduction ------------ The subject of how often OpenStack should be released has been regularly debated in the OpenStack community. OpenStack started with a 3-month release cycle, then switched to 6-month release cycle starting with Diablo. It is often thought of a release management decision, but it is actually a much larger topic: a release cadence is a trade-off between pressure to release more often and pressure to release less often, coming in from a lot of different stakeholders. In OpenStack, it is ultimately a Technical Committee decision. But that decision is informed by the position of a number of stakeholders. This document gives historical context and describes the current release management team position. The current trade-off --------------------- The main pressure to release more often is to make features available to users faster. Developers get a faster feedback loop, hardware vendors ensure software is compatible with their latest products, and users get exciting new features. "Release early, release often" is a best practice in our industry -- we should generally aim at releasing as often as possible. But that is counterbalanced by pressure to release less often. From a development perspective, each release cycle comes with some process overhead. On the integrators side, a new release means packaging and validation work. On the users side, it means pressure to upgrade. To justify that cost, there needs to be enough user-visible benefit (like new features) in a given release. For the last 10 years for OpenStack, that balance has been around six months. 
Six months let us accumulate enough new development that it was worth upgrading to / integrating the new version, while giving enough time to actually do the work. It also aligned well with Foundation events cadence, allowing to synchronize in-person developer meetings date with start of cycles. What changed ------------ The major recent change affecting this trade-off is that the pace of new development in OpenStack slowed down. The rhythm of changes was divided by 3 between 2015 and 2021, reflecting that OpenStack is now a mature and stable solution, where accessing the latest features is no longer a major driver. That reduces some of the pressure for releasing more often. At the same time, we have more users every day, with larger and larger deployments, and keeping those clusters constantly up to date is an operational challenge. That increases the pressure to release less often. In essence, OpenStack is becoming much more like a LTS distribution than a web browser -- something users like moving slow. Over the past years, project teams also increasingly decoupled individual components from the "coordinated release". More and more components opted for an independent or intermediary-released model, where they can put out releases in the middle of a cycle, making new features available to their users. This increasingly opens up the possibility of a longer "coordinated release" which would still allow development teams to follow "release early, release often" best practices. All that recent evolution means it is (again) time to reconsider if the 6-month cadence is what serves our community best, and in particular if a longer release cadence would not suit us better. The release management team position on the debate -------------------------------------------------- While releasing less often would definitely reduce the load on the release management team, most of the team work being automated, we do not think it should be a major factor in motivating the decision. We should not adjust the cadence too often though, as there is a one-time cost in switching our processes. In terms of impact, we expect that a switch to a longer cycle will encourage more project teams to adopt a "with-intermediary" release model (rather than the traditional "with-rc" single release per cycle), which may lead to abandoning the latter, hence simplifying our processes. Longer cycles might also discourage people to commit to PTL or release liaison work. We'd probably need to manage expectations there, and encourage more frequent switches (or create alternate models). If the decision is made to switch to a longer cycle, the release management team recommends to switch to one year directly. That would avoid changing it again anytime soon, and synchronizing on a calendar year is much simpler to follow and communicate. We also recommend announcing the change well in advance. We currently have an opportunity of making the switch when we reach the end of the release naming alphabet, which would also greatly simplify the communications around the change. Finally, it is worth mentioning the impact on the stable branch work. Releasing less often would likely impact the number of stable branches that we keep on maintaining, so that we do not go too much in the past (and hit unmaintained distributions or long-gone dependencies). We currently maintain releases for 18 months before they switch to extended maintenance, which results in between 3 and 4 releases being maintained at the same time. 
We'd recommend switching to maintaining one-year releases for 24 months, which would result in between 2 and 3 releases being maintained at the same time. Such a change would lead to longer maintenance for our users while reducing backporting work for our developers. -- Thierry Carrez (ttx) On behalf of the OpenStack Release Management team From pbasaras at gmail.com Fri Nov 5 14:43:38 2021 From: pbasaras at gmail.com (Pavlos Basaras) Date: Fri, 5 Nov 2021 16:43:38 +0200 Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available" Message-ID: Hello, I have an Openstack cluster, with basic services and functionality working based on ussuri release. I am trying to install the Zun service to be able to deploy containers, following [controller] -- https://docs.openstack.org/zun/ussuri/install/controller-install.html and [compute] -- https://docs.openstack.org/zun/ussuri/install/compute-install.html I used the git branch based on ussuri for all components. I veryfined kuryr-libnetwork operation issuing from the compute node # docker network create --driver kuryr --ipam-driver kuryr --subnet 10.10.0.0/16 --gateway=10.10.0.1 test_net and seeing the network created successfully, etc. I am not very sure about the zun.conf file. What is the "endpoint_type = internalURL" parameter? Do I need to change internalURL? >From sudo systemctl status zun-compute i see: Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, *args, **kw) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return attempt.get(self._wrap_exception) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task six.reraise(self.value[0], self.value[1], self.value[2]) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise value Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), attempt_number, False) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", line 350, in _update_to_placement Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task context, node_rp_uuid, name=compute_node.hostname) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR 
oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 846, in get_provider_tree_and_ensure_r Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 628, in _ensure_resource_provider Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 514, in _create_resource_provider Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task global_request_id=context.global_id) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 225, in post Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task headers=headers, logger=LOG) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.request(url, 'POST', **kwargs) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in request Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.session.request(url, method, **kwargs) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in request Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise exceptions.from_response(resp, method, url) *Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded)* What is this problem? any advice? I used the default configuration values ([keystone_auth] and [keystone_authtoken]) values based on the configuration from the above links. 
Aslo from the controller *openstack appcontainer service list* +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ | Id | Host | Binary | State | Disabled | Disabled Reason | Updated At | Availability Zone | +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ | 1 | compute5 | zun-compute | up | False | None | 2021-11-05T14:39:01.000000 | nova | +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ *openstack appcontainer host show compute5* +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Field | Value | +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | uuid | ee3a5b44-8ffa-463e-939d-0c61868a596f | | links | [{'href': ' http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'self'}, {'href': ' http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'bookmark'}] | | hostname | compute5 | | mem_total | 7975 | | mem_used | 0 | | total_containers | 1 | | cpus | 10 | | cpu_used | 0.0 | | architecture | x86_64 | | os_type | linux | | os | Ubuntu 18.04.6 LTS | | kernel_version | 4.15.0-161-generic | | labels | {} | | disk_total | 63 | | disk_used | 0 | | disk_quota_supported | False | | runtimes | ['io.containerd.runc.v2', 'io.containerd.runtime.v1.linux', 'runc'] | | enable_cpu_pinning | False | +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ seems to work fine. However when i issue e.g., openstack appcontainer run --name container --net network=$NET_ID cirros ping 8.8.8.8 i get the error: | status_reason | There are not enough hosts available. Any ideas? One final thing is that I did see in the Horizon dashboard the container tab, to be able to deploy containers from horizon. Is there an extra configuration for this? sorry for the long mail. best, Pavlos -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbauza at redhat.com Fri Nov 5 15:17:48 2021 From: sbauza at redhat.com (Sylvain Bauza) Date: Fri, 5 Nov 2021 16:17:48 +0100 Subject: [nova][placement] Spec review day on Nov 16th Message-ID: As agreed on our last Nova meeting [1], please sharpen your pen and prepare your specs ahead of time as we'll have a spec review day on Nov 16th Reminder: the idea of a spec review day is to ensure that contributors and reviewers are available on the same day for prioritizing Gerrit comments and IRC discussions about specs in order to facilitate and accelerate the reviewing of open specs. If you care about some fancy new feature, please make sure your spec is ready for review on time and you are somehow joinable so reviewers can ping you, or you are able to quickly reply on their comments and ideally propose a new revision if needed. Nova cores, I appreciate your dedication about specs on this particular day. 
-Sylvain [1] https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-02-16.00.log.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Fri Nov 5 15:53:25 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 5 Nov 2021 11:53:25 -0400 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: On Fri, Nov 5, 2021 at 10:39 AM Thierry Carrez wrote: > > Hi everyone, > > The (long) document below reflects the current position of the release > management team on a popular question: should the OpenStack release > cadence be changed? Please note that we only address the release > management / stable branch management facet of the problem. There are > other dimensions to take into account (governance, feature deprecation, > supported distros...) to get a complete view of the debate. > > Introduction > ------------ > > The subject of how often OpenStack should be released has been regularly > debated in the OpenStack community. OpenStack started with a 3-month > release cycle, then switched to 6-month release cycle starting with > Diablo. It is often thought of a release management decision, but it is > actually a much larger topic: a release cadence is a trade-off between > pressure to release more often and pressure to release less often, > coming in from a lot of different stakeholders. In OpenStack, it is > ultimately a Technical Committee decision. But that decision is informed > by the position of a number of stakeholders. This document gives > historical context and describes the current release management team > position. > > The current trade-off > --------------------- > > The main pressure to release more often is to make features available to > users faster. Developers get a faster feedback loop, hardware vendors > ensure software is compatible with their latest products, and users get > exciting new features. "Release early, release often" is a best practice > in our industry -- we should generally aim at releasing as often as > possible. > > But that is counterbalanced by pressure to release less often. From a > development perspective, each release cycle comes with some process > overhead. On the integrators side, a new release means packaging and > validation work. On the users side, it means pressure to upgrade. To > justify that cost, there needs to be enough user-visible benefit (like > new features) in a given release. > > For the last 10 years for OpenStack, that balance has been around six > months. Six months let us accumulate enough new development that it was > worth upgrading to / integrating the new version, while giving enough > time to actually do the work. It also aligned well with Foundation > events cadence, allowing to synchronize in-person developer meetings > date with start of cycles. > > What changed > ------------ > > The major recent change affecting this trade-off is that the pace of new > development in OpenStack slowed down. The rhythm of changes was divided > by 3 between 2015 and 2021, reflecting that OpenStack is now a mature > and stable solution, where accessing the latest features is no longer a > major driver. That reduces some of the pressure for releasing more > often. At the same time, we have more users every day, with larger and > larger deployments, and keeping those clusters constantly up to date is > an operational challenge. 
That increases the pressure to release less > often. In essence, OpenStack is becoming much more like a LTS > distribution than a web browser -- something users like moving slow. > > Over the past years, project teams also increasingly decoupled > individual components from the "coordinated release". More and more > components opted for an independent or intermediary-released model, > where they can put out releases in the middle of a cycle, making new > features available to their users. This increasingly opens up the > possibility of a longer "coordinated release" which would still allow > development teams to follow "release early, release often" best > practices. All that recent evolution means it is (again) time to > reconsider if the 6-month cadence is what serves our community best, and > in particular if a longer release cadence would not suit us better. > > The release management team position on the debate > -------------------------------------------------- > > While releasing less often would definitely reduce the load on the > release management team, most of the team work being automated, we do > not think it should be a major factor in motivating the decision. We > should not adjust the cadence too often though, as there is a one-time > cost in switching our processes. In terms of impact, we expect that a > switch to a longer cycle will encourage more project teams to adopt a > "with-intermediary" release model (rather than the traditional "with-rc" > single release per cycle), which may lead to abandoning the latter, > hence simplifying our processes. Longer cycles might also discourage > people to commit to PTL or release liaison work. We'd probably need to > manage expectations there, and encourage more frequent switches (or > create alternate models). > > If the decision is made to switch to a longer cycle, the release > management team recommends to switch to one year directly. That would > avoid changing it again anytime soon, and synchronizing on a calendar > year is much simpler to follow and communicate. We also recommend > announcing the change well in advance. We currently have an opportunity > of making the switch when we reach the end of the release naming > alphabet, which would also greatly simplify the communications around > the change. > > Finally, it is worth mentioning the impact on the stable branch work. > Releasing less often would likely impact the number of stable branches > that we keep on maintaining, so that we do not go too much in the past > (and hit unmaintained distributions or long-gone dependencies). We > currently maintain releases for 18 months before they switch to extended > maintenance, which results in between 3 and 4 releases being maintained > at the same time. We'd recommend switching to maintaining one-year > releases for 24 months, which would result in between 2 and 3 releases > being maintained at the same time. Such a change would lead to longer > maintenance for our users while reducing backporting work for our > developers. > Thanks for the write up Thierry. I wonder what are the thoughts of the community of having LTS + normal releases so that we can have the power of both? I guess that is essentially what we have with EM, but I guess we could introduce a way to ensure that operators can just upgrade LTS to LTS. It can complicate things a bit from a CI and project management side, but I think it could solve the problem for both sides that need want new features + those who want stability? 
> -- > Thierry Carrez (ttx) > On behalf of the OpenStack Release Management team > -- Mohammed Naser VEXXHOST, Inc. From fungi at yuggoth.org Fri Nov 5 16:18:31 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 5 Nov 2021 16:18:31 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: <20211105161830.2fiv6jjdn52fwrxf@yuggoth.org> On 2021-11-05 11:53:25 -0400 (-0400), Mohammed Naser wrote: [...] > I wonder what are the thoughts of the community of having LTS + > normal releases so that we can have the power of both? I guess > that is essentially what we have with EM, but I guess we could > introduce a way to ensure that operators can just upgrade LTS to > LTS. > > It can complicate things a bit from a CI and project management > side, but I think it could solve the problem for both sides that > need want new features + those who want stability? This is really just another way of suggesting we solve the skip-level upgrades problem, since we can't really test fast-forward upgrades through so-called "non-LTS" versions once we abandon them. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From smooney at redhat.com Fri Nov 5 17:47:13 2021 From: smooney at redhat.com (Sean Mooney) Date: Fri, 05 Nov 2021 17:47:13 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: On Fri, 2021-11-05 at 11:53 -0400, Mohammed Naser wrote: > On Fri, Nov 5, 2021 at 10:39 AM Thierry Carrez wrote: > > > > Hi everyone, > > > > The (long) document below reflects the current position of the release > > management team on a popular question: should the OpenStack release > > cadence be changed? Please note that we only address the release > > management / stable branch management facet of the problem. There are > > other dimensions to take into account (governance, feature deprecation, > > supported distros...) to get a complete view of the debate. > > > > Introduction > > ------------ > > > > The subject of how often OpenStack should be released has been regularly > > debated in the OpenStack community. OpenStack started with a 3-month > > release cycle, then switched to 6-month release cycle starting with > > Diablo. It is often thought of a release management decision, but it is > > actually a much larger topic: a release cadence is a trade-off between > > pressure to release more often and pressure to release less often, > > coming in from a lot of different stakeholders. In OpenStack, it is > > ultimately a Technical Committee decision. But that decision is informed > > by the position of a number of stakeholders. This document gives > > historical context and describes the current release management team > > position. > > > > The current trade-off > > --------------------- > > > > The main pressure to release more often is to make features available to > > users faster. Developers get a faster feedback loop, hardware vendors > > ensure software is compatible with their latest products, and users get > > exciting new features. "Release early, release often" is a best practice > > in our industry -- we should generally aim at releasing as often as > > possible. > > > > But that is counterbalanced by pressure to release less often. 
From a > > development perspective, each release cycle comes with some process > > overhead. On the integrators side, a new release means packaging and > > validation work. On the users side, it means pressure to upgrade. To > > justify that cost, there needs to be enough user-visible benefit (like > > new features) in a given release. > > > > For the last 10 years for OpenStack, that balance has been around six > > months. Six months let us accumulate enough new development that it was > > worth upgrading to / integrating the new version, while giving enough > > time to actually do the work. It also aligned well with Foundation > > events cadence, allowing to synchronize in-person developer meetings > > date with start of cycles. > > > > What changed > > ------------ > > > > The major recent change affecting this trade-off is that the pace of new > > development in OpenStack slowed down. The rhythm of changes was divided > > by 3 between 2015 and 2021, reflecting that OpenStack is now a mature > > and stable solution, where accessing the latest features is no longer a > > major driver. That reduces some of the pressure for releasing more > > often. At the same time, we have more users every day, with larger and > > larger deployments, and keeping those clusters constantly up to date is > > an operational challenge. That increases the pressure to release less > > often. In essence, OpenStack is becoming much more like a LTS > > distribution than a web browser -- something users like moving slow. > > > > Over the past years, project teams also increasingly decoupled > > individual components from the "coordinated release". More and more > > components opted for an independent or intermediary-released model, > > where they can put out releases in the middle of a cycle, making new > > features available to their users. This increasingly opens up the > > possibility of a longer "coordinated release" which would still allow > > development teams to follow "release early, release often" best > > practices. All that recent evolution means it is (again) time to > > reconsider if the 6-month cadence is what serves our community best, and > > in particular if a longer release cadence would not suit us better. > > > > The release management team position on the debate > > -------------------------------------------------- > > > > While releasing less often would definitely reduce the load on the > > release management team, most of the team work being automated, we do > > not think it should be a major factor in motivating the decision. We > > should not adjust the cadence too often though, as there is a one-time > > cost in switching our processes. In terms of impact, we expect that a > > switch to a longer cycle will encourage more project teams to adopt a > > "with-intermediary" release model (rather than the traditional "with-rc" > > single release per cycle), which may lead to abandoning the latter, > > hence simplifying our processes. Longer cycles might also discourage > > people to commit to PTL or release liaison work. We'd probably need to > > manage expectations there, and encourage more frequent switches (or > > create alternate models). > > > > If the decision is made to switch to a longer cycle, the release > > management team recommends to switch to one year directly. That would > > avoid changing it again anytime soon, and synchronizing on a calendar > > year is much simpler to follow and communicate. We also recommend > > announcing the change well in advance. 
We currently have an opportunity
> > of making the switch when we reach the end of the release naming
> > alphabet, which would also greatly simplify the communications around
> > the change.
> >
> > Finally, it is worth mentioning the impact on the stable branch work.
> > Releasing less often would likely impact the number of stable branches
> > that we keep on maintaining, so that we do not go too much in the past
> > (and hit unmaintained distributions or long-gone dependencies). We
> > currently maintain releases for 18 months before they switch to extended
> > maintenance, which results in between 3 and 4 releases being maintained
> > at the same time. We'd recommend switching to maintaining one-year
> > releases for 24 months, which would result in between 2 and 3 releases
> > being maintained at the same time. Such a change would lead to longer
> > maintenance for our users while reducing backporting work for our
> > developers.
>
> Thanks for the write up Thierry.
>
> I wonder what are the thoughts of the community of having LTS + normal releases
> so that we can have the power of both? I guess that is essentially what we have
> with EM, but I guess we could introduce a way to ensure that operators can just
> upgrade LTS to LTS.

If we were to introduce LTS releases we would have to agree on what they were as a community, and we would need to support rolling upgrades between LTS versions. That would require all distributed projects like nova to ensure that LTS-to-LTS RPC and DB compatibility is maintained, instead of the current N+1 guarantees we have today. I know that would make some downstream happy, as perhaps we could align our FFU support with the LTS cadence, but I wouldn't hold my breath on that.

As a developer I would personally prefer to have a shorter cycle upstream with upgrades supported across more than N+1, e.g. release every 2 months but keep rolling upgrade compatibility for at least 12 months or something like that. The release-with-intermediary lifecycle can enable that while still allowing us to have a longer or shorter planning horizon depending on the project and its velocity.

> It can complicate things a bit from a CI and project management side,
> but I think it could solve the problem for both sides that need want new features +
> those who want stability?

It might, but I suspect that it will still not align with distros: Canonical have a new LTS every 2 years and Red Hat has a new release every 18 months or so, based on every 3rd release. The LTS idea I think has merit, but we would likely have to maintain at least 2 LTS releases in parallel to make it work.

So something like 1 LTS release a year, maintained for 2 years, with normal releases every 6 months that are only maintained for 6 months; each project would keep rolling upgrade compatibility, ideally between LTS releases rather than N+1 as the new minimum. The implication of this is that we would want to have grenade jobs testing latest-LTS-to-master upgrade compatibility in addition to N to N+1 where those differ.
> >
> > Thierry Carrez (ttx)
> > On behalf of the OpenStack Release Management team

From smooney at redhat.com  Fri Nov  5 17:53:41 2021
From: smooney at redhat.com (Sean Mooney)
Date: Fri, 05 Nov 2021 17:53:41 +0000
Subject: [all][tc] Relmgt team position on release cadence
In-Reply-To: <20211105161830.2fiv6jjdn52fwrxf@yuggoth.org>
References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <20211105161830.2fiv6jjdn52fwrxf@yuggoth.org>
Message-ID: <375451085e161c0a1af037576ba35540a5ddae41.camel@redhat.com>

On Fri, 2021-11-05 at 16:18 +0000, Jeremy Stanley wrote:
> On 2021-11-05 11:53:25 -0400 (-0400), Mohammed Naser wrote:
> [...]
> > I wonder what are the thoughts of the community of having LTS +
> > normal releases so that we can have the power of both? I guess
> > that is essentially what we have with EM, but I guess we could
> > introduce a way to ensure that operators can just upgrade LTS to
> > LTS.
> >
> > It can complicate things a bit from a CI and project management
> > side, but I think it could solve the problem for both sides that
> > need want new features + those who want stability?
>
> This is really just another way of suggesting we solve the
> skip-level upgrades problem, since we can't really test fast-forward
> upgrades through so-called "non-LTS" versions once we abandon them.

Well, realistically I don't think the customers that are pushing us to support skip-level upgrades or fast-forward upgrades will be able to work with a cadence of 1 release a year, so I would expect us to still need to consider skip-level upgrades between LTS-2 and the new LTS. We have several customers that need at least 12 months to complete certification of all of their workloads on a new cloud, so OpenStack distros will still have to support those customers that really need a 2-yearly or longer upgrade cadence even if we had an LTS release every year.

There are many other users of OpenStack that can effectively live at head, CERN and Vexxhost being two examples that the 1-year cycle might suit well, but for our telco and financial customers 12 months is still a short upgrade horizon for them.

From dms at danplanet.com  Fri Nov  5 18:21:51 2021
From: dms at danplanet.com (Dan Smith)
Date: Fri, 05 Nov 2021 11:21:51 -0700
Subject: [all][tc] Relmgt team position on release cadence
In-Reply-To: (Sean Mooney's message of "Fri, 05 Nov 2021 17:47:13 +0000")
References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org>
Message-ID:

> I know that would make some downstream happy, as perhaps we could align
> our FFU support with the LTS cadence, but I wouldn't hold my breath on
> that.

Except any downstream that is unable to align on the LTS schedule either permanently or temporarily would have to wait a full extra year to resync, which would make them decidedly unhappy I think. I'm sure some distros have had to realign downstream releases to "the next" upstream one more than once, so... :)

> As a developer I would personally prefer to have a shorter cycle
> upstream with upgrades supported across more than N+1, e.g. release
> every 2 months but keep rolling upgrade compatibility for at least 12
> months or something like that. The release-with-intermediary lifecycle
> can enable that while still allowing us to have a longer or shorter
> planning horizon depending on the project and its velocity.

This has the same problem as you highlighted above, which is that we all have to agree on the same 12 months that we're supporting that span, otherwise this collapses to just the intersection of any two projects' windows.
--Dan From fungi at yuggoth.org Fri Nov 5 18:25:29 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 5 Nov 2021 18:25:29 +0000 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: <20211105182528.fgpjp6ax6cgficz2@yuggoth.org> On 2021-11-05 17:47:13 +0000 (+0000), Sean Mooney wrote: [...] > if we were to intoduce LTS release we would have to agree on what > they were as a compunity and we would need to support roling > upgrade between LTS versions [...] Yes, but what about upgrades between LTS and non-LTS versions (from or to)? Do we test all those as well? And if we don't, are users likely to want to use the non-LTS versions at all knowing they might be unable to cleanly update from them to an LTS version later on? > so something like 1 lts release a year maintained for 2 years > with normal release every 6 months that are only maintianed for 6 > months [...] To restate what I said in my other reply, this assumes a future where skip-level upgrades are possible. Otherwise what happens with a series of releases like A,b,C,d,E where A/C/E are the LTS releases and b/d are the non-LTS releases and someone who's using A wants to upgrade to C but we've already stopped maintaining b and can't guarantee it's even installable any longer? If the LTS idea is interesting to people, then we should take a step back and work on switching from FFU to SLU first. If we can't solve that, then there's no point to having non-LTS releases. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Fri Nov 5 19:07:47 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 05 Nov 2021 14:07:47 -0500 Subject: [all][tc] What's happening in Technical Committee: summary 5th Nov, 21: Reading: 10 min Message-ID: <17cf17ff56d.dca4cf29182418.6179612577638751365@ghanshyammann.com> Hello Everyone, Here is this week's summary of the Technical Committee activities. 1. TC Meetings: ============ * TC this week IRC meeting held on Nov 5th Thursday. * Most of the meeting discussions are summarized below (Completed or in-progress activities section). Meeting full recording are available @ - https://www.youtube.com/watch?v=4bt6iNaR3oI - IRC chat summary (topics and links): https://meetings.opendev.org/meetings/tc/2021/tc.2021-11-04-15.02.log.html * Next week's meeting will be on IRC on Nov 11th, Thursday 15:00 UTC, feel free the topic on agenda[1] by Nov 10th. 2. What we completed this week: ========================= * Retired kolla-cli [2] * Removed oslo independent deliverables from stable policy[3] * Magnum PTL is changed from Feilong Wang to Spryos[4] 3. Activities In progress: ================== TC Tracker for Yoga cycle ------------------------------ * I have started the etherpad to collect the Yoga cycle targets for TC[5]. Open Reviews ----------------- * Eleven open reviews for ongoing activities[6]. RBAC discussion: continuing from PTG ---------------------------------------------- We had a discussion on Wed and sort out the many points especially on get-all-project resources or system and project scoped isolation. Complete notes are in this etherpad[7]. Lance is re-working on the goal and new implementation/details, feel free to review it for more feedback[8]. 
We will continue the discussion next Wed, Nov 10th, at the same time[9]; feel free to join with any additional query related to your project, and we will also discuss which bits we can do in the Yoga cycle and which to move to later targets.

Decouple the community-wide goals from cycle release
-----------------------------------------------------------------
* As discussed in the PTG, we are de-coupling the community-wide goals from the release cycle so that we can improve the community-wide goal selection and its completion workflow where goals are multi-cycle, like RBAC etc.
* The proposal is up, please review and add your feedback[10].

Yoga release community-wide goal
-----------------------------------------
* With the discussion on RBAC continuing, we are re-working the RBAC goal; please wait until we finalize the implementation[11]
* There is one more goal proposal, for 'FIPS compatibility and compliance'[12].

Adjutant needs maintainers and PTLs
-------------------------------------------
You might have seen the email[13] from Adrian calling for maintainers for the Adjutant project; please reply to that email, or here, or on the openstack-tc channel if you are interested in maintaining this project. We will wait for next week as well, and if no one shows up to help then we will start the retirement process.

New project 'Skyline' proposal
------------------------------------
* We discussed it in the TC PTG and there are a few open points about python packaging, repos, and the plugins plan which we are discussing on the ML.
* Still waiting for the Skyline team to work on the above points[14].

Updating the Yoga testing runtime
----------------------------------------
* As centos stream 9 is released, I have updated the Yoga testing runtime[15] with:
1. Add Debian 11 as a tested distro
2. Change centos stream 8 -> centos stream 9
3. Bump the lowest python version to test to 3.8 and the highest to python 3.9

Stable Core team process change
---------------------------------------
* The current proposal is under review[16]. Feel free to provide early feedback if you have any.
* This is ready to merge and I will do that early next week; feel free to add feedback if any.

Merging 'Technical Writing' SIG Chair/Maintainers to TC
------------------------------------------------------------------
* Work to merge this SIG into the TC is up for review[17].
* No response on usage of, or help maintaining, the openstack/training-labs repo[18], and in the TC meeting we decided to retire it next week. If you use it or would like to maintain it, this is the last chance to raise your hand.

TC tags analysis
-------------------
* TC agreed to remove the framework and it is communicated on the ML[19].

Project updates
-------------------
* Retiring js-openstack-lib [20]

4. How to contact the TC:
====================
If you would like to discuss or give feedback to the TC, you can reach out to us in multiple ways:

1. Email: you can send an email with the tag [tc] on the openstack-discuss ML[21].
2. Weekly meeting: The Technical Committee conducts a weekly meeting every Thursday at 15 UTC [22]
3. Office hours: The Technical Committee offers a weekly office hour every Tuesday at 0100 UTC [23]
4. Ping us using the 'tc-members' nickname on the #openstack-tc IRC channel.
[1] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [2] https://review.opendev.org/c/openstack/governance/+/814603 [3] https://review.opendev.org/c/openstack/governance/+/816828 [4] https://review.opendev.org/c/openstack/governance/+/816431 [5] https://etherpad.opendev.org/p/tc-yoga-tracker [6] https://review.opendev.org/q/projects:openstack/governance+status:open [7] https://etherpad.opendev.org/p/policy-popup-yoga-ptg [8] https://review.opendev.org/c/openstack/governance/+/815158 [9] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025666.html [10] https://review.opendev.org/c/openstack/governance/+/816387 [11] https://review.opendev.org/c/openstack/governance/+/815158 [12] https://review.opendev.org/c/openstack/governance/+/816587 [13] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html [14] https://review.opendev.org/c/openstack/governance/+/814037 [15] https://review.opendev.org/c/openstack/governance/+/815851 [16] https://review.opendev.org/c/openstack/governance/+/810721 [17] https://review.opendev.org/c/openstack/governance/+/815869 [18] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025586.html [19] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025571.html [20] https://review.opendev.org/c/openstack/governance/+/807163 [21] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss [22] http://eavesdrop.openstack.org/#Technical_Committee_Meeting [23] http://eavesdrop.openstack.org/#Technical_Committee_Office_hours -gmann From jasonanderson at uchicago.edu Fri Nov 5 22:54:31 2021 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Fri, 5 Nov 2021 22:54:31 +0000 Subject: [kuryr] Using kuryr-kubernetes CNI without neutron agent(s)? In-Reply-To: References: <9FF2989B-69FA-494B-B60A-B066E5BF13DA@uchicago.edu> <1043CEB9-7386-4842-8912-9DE021DB9BD0@uchicago.edu> Message-ID: <9F45DC15-94D8-4112-B0A2-8C749E64F493@uchicago.edu> Hi Micha?, I continue to appreciate the information you are providing. I?ve been doing some more research into the landscape of systems and had a few follow-up questions. I?ve also left some clarifying remarks if you are interested. I?m currently evaluating OVN, haven?t used it before and there?s a bit of a learning curve ;) However it seems like it may solve a good part of the problem by removing RabbitMQ and reducing the privileges of the edge host w.r.t. network config. Now I?m looking at kuryr-kubernetes. 1. What is the difference between kuryr and kuryr-kubernetes? I have used kuryr-libnetwork before, in conjunction with kuryr-server (which I think is provided via the main kuryr project?). I am using Kolla Ansible so was spared some of the details on installation. I understand kuryr-libnetwork is basically ?kuryr for Docker? while kuryr-kubernetes is ?kuryr for K8s?, but that leaves me confused about what exactly the kuryr repo is. 2. A current idea is to have several edge ?compute? nodes that will run a lightweight k8s kubelet such as k3s. OVN will provide networking to the edge nodes, controlled from the central site. I would then place kuryr-k8s-controller on the central site and kuryr-cni-daemon on all the edge nodes. My question is: could users create their own Neutron networks (w/ their own Geneve segment) and launch pods connected on that network, and have those pods effectively be isolated from other pods in the topology? As in, can k8s be told that pod A should launch on network A?, and pod B on network B?? 
Or is there an assumption that from Neutron's perspective all pods are always on a single Neutron network?

Cheers, and thanks!
/Jason

On Oct 27, 2021, at 12:03 PM, Michał Dulko wrote:

Hm, so a mixed OpenStack-K8s edge setup, where edge sites are Kubernetes deployments? We've taken a look at some edge use cases with Kuryr, and one problem people see is that if an edge site becomes disconnected from the main site, Kuryr will not allow creation of new Pods and Services as it needs connection to the Neutron and Octavia APIs for that. If that's not a problem, have you given any thought to running distributed compute nodes [1] as edge sites and then Kubernetes on top of them? This architecture should be doable with Kuryr (probably with minor changes).

Sort of! I work in research infrastructure and we are building an IoT/edge testbed for computer science researchers who wish to do research in edge computing. It's a bit mad science-y. We are buying and configuring relatively high-powered edge devices such as Raspberry Pis and Jetson Nanos and making them available for experimentation at a variety of sites. Separately, the platform supports any owner of a supported device to have it managed by the testbed (i.e., they can use our interfaces to launch containers on it and connect it logically to other devices / resources in the cloud.) Distributed compute node looks a bit too heavy for this highly dynamic use-case, but thank you for sharing.

Anyways, one might ask why Neutron at all. I am hopeful we can get some interesting properties such as network isolation and the ability to bridge traffic from containers across other layer 2 links such as those provided by AL2S. OVN may help if it can remove the need for RabbitMQ, which is probably the most difficult aspect to remove from OpenStack's dependencies/assumptions, yet also one of the most pernicious from a security angle, as an untrusted worker node can easily corrupt the control plane.

It's just Kuryr which needs access to the credentials, so possibly you should be able to isolate them, but I get the point, containers are worse at isolation than VMs.

I'm less worried about the mechanism for isolation on the host and more the amount of privileged information the host must keep secure, and the impact of that information being compromised. Because our experimental target system involves container engines maintained externally to the core site, the risk of compromise on the edge is high. I am searching for an architecture that greatly limits the blast radius of such a compromise. Currently if we use standard Neutron networking + Kuryr, we must give RabbitMQ credentials and others to the container engines on the edge, which papers such as http://seclab.cs.sunysb.edu/seclab/pubs/asiaccs16.pdf have documented as a trivial escalation path. For this reason, narrowing the scope of what state the edge hosts can influence on the core site is paramount. Re: admin creds, maybe it is possible to carefully craft a role that only works for some Neutron operations and put that on the worker nodes. I will explore.

I think those settings [2] are what would require the highest Neutron permissions in the baremetal case.

Thanks - so it will need to create and delete ports. This may be acceptable; without some additional API proxy layer for the edge hosts, a malicious edge host could create bogus ports and delete good ones, but that is a much smaller level of impact. I think we could create a role that only allowed such operations and generate per-host credentials.
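(A rough sketch of the per-host credential idea above, using only standard Keystone CLI commands; the project, role, and user names are placeholders. Actually narrowing what the role can do in Neutron would then be a matter of tuning rules such as create_port, update_port and delete_port in Neutron's policy.yaml, which is deliberately not shown here because it needs careful design to avoid affecting other users.)

# Dedicated project and low-privilege role for edge hosts (names are placeholders)
openstack project create edge-site-01
openstack role create kuryr-edge

# One user per edge host, scoped only to that project, so a compromised
# host only exposes credentials limited to its own project
openstack user create --project edge-site-01 --password-prompt edge-host-01
openstack role add --user edge-host-01 --project edge-site-01 kuryr-edge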
[1] https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/distributed_compute_node.html [2] https://opendev.org/openstack/kuryr-kubernetes/src/branch/master/kuryr_kubernetes/controller/drivers/neutron_vif.py#L125-L127 Cheers! [1] https://docs.openstack.org/kuryr-kubernetes/latest/nested_vlan_mode.html Thanks, Micha? Thanks! Jason Anderson --- Chameleon DevOps Lead Department of Computer Science, University of Chicago Mathematics and Computer Science, Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Sat Nov 6 15:22:23 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sat, 6 Nov 2021 16:22:23 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: On Fri, Nov 5, 2021 at 5:04 PM Mohammed Naser wrote: > On Fri, Nov 5, 2021 at 10:39 AM Thierry Carrez > wrote: > > > > Hi everyone, > > > > The (long) document below reflects the current position of the release > > management team on a popular question: should the OpenStack release > > cadence be changed? Please note that we only address the release > > management / stable branch management facet of the problem. There are > > other dimensions to take into account (governance, feature deprecation, > > supported distros...) to get a complete view of the debate. > > > > Introduction > > ------------ > > > > The subject of how often OpenStack should be released has been regularly > > debated in the OpenStack community. OpenStack started with a 3-month > > release cycle, then switched to 6-month release cycle starting with > > Diablo. It is often thought of a release management decision, but it is > > actually a much larger topic: a release cadence is a trade-off between > > pressure to release more often and pressure to release less often, > > coming in from a lot of different stakeholders. In OpenStack, it is > > ultimately a Technical Committee decision. But that decision is informed > > by the position of a number of stakeholders. This document gives > > historical context and describes the current release management team > > position. > > > > The current trade-off > > --------------------- > > > > The main pressure to release more often is to make features available to > > users faster. Developers get a faster feedback loop, hardware vendors > > ensure software is compatible with their latest products, and users get > > exciting new features. "Release early, release often" is a best practice > > in our industry -- we should generally aim at releasing as often as > > possible. > > > > But that is counterbalanced by pressure to release less often. From a > > development perspective, each release cycle comes with some process > > overhead. On the integrators side, a new release means packaging and > > validation work. On the users side, it means pressure to upgrade. To > > justify that cost, there needs to be enough user-visible benefit (like > > new features) in a given release. > > > > For the last 10 years for OpenStack, that balance has been around six > > months. Six months let us accumulate enough new development that it was > > worth upgrading to / integrating the new version, while giving enough > > time to actually do the work. It also aligned well with Foundation > > events cadence, allowing to synchronize in-person developer meetings > > date with start of cycles. 
> > > > What changed > > ------------ > > > > The major recent change affecting this trade-off is that the pace of new > > development in OpenStack slowed down. The rhythm of changes was divided > > by 3 between 2015 and 2021, reflecting that OpenStack is now a mature > > and stable solution, where accessing the latest features is no longer a > > major driver. That reduces some of the pressure for releasing more > > often. At the same time, we have more users every day, with larger and > > larger deployments, and keeping those clusters constantly up to date is > > an operational challenge. That increases the pressure to release less > > often. In essence, OpenStack is becoming much more like a LTS > > distribution than a web browser -- something users like moving slow. > > > > Over the past years, project teams also increasingly decoupled > > individual components from the "coordinated release". More and more > > components opted for an independent or intermediary-released model, > > where they can put out releases in the middle of a cycle, making new > > features available to their users. This increasingly opens up the > > possibility of a longer "coordinated release" which would still allow > > development teams to follow "release early, release often" best > > practices. All that recent evolution means it is (again) time to > > reconsider if the 6-month cadence is what serves our community best, and > > in particular if a longer release cadence would not suit us better. > > > > The release management team position on the debate > > -------------------------------------------------- > > > > While releasing less often would definitely reduce the load on the > > release management team, most of the team work being automated, we do > > not think it should be a major factor in motivating the decision. We > > should not adjust the cadence too often though, as there is a one-time > > cost in switching our processes. In terms of impact, we expect that a > > switch to a longer cycle will encourage more project teams to adopt a > > "with-intermediary" release model (rather than the traditional "with-rc" > > single release per cycle), which may lead to abandoning the latter, > > hence simplifying our processes. Longer cycles might also discourage > > people to commit to PTL or release liaison work. We'd probably need to > > manage expectations there, and encourage more frequent switches (or > > create alternate models). > > > > If the decision is made to switch to a longer cycle, the release > > management team recommends to switch to one year directly. That would > > avoid changing it again anytime soon, and synchronizing on a calendar > > year is much simpler to follow and communicate. We also recommend > > announcing the change well in advance. We currently have an opportunity > > of making the switch when we reach the end of the release naming > > alphabet, which would also greatly simplify the communications around > > the change. > > > > Finally, it is worth mentioning the impact on the stable branch work. > > Releasing less often would likely impact the number of stable branches > > that we keep on maintaining, so that we do not go too much in the past > > (and hit unmaintained distributions or long-gone dependencies). We > > currently maintain releases for 18 months before they switch to extended > > maintenance, which results in between 3 and 4 releases being maintained > > at the same time. 
We'd recommend switching to maintaining one-year > > releases for 24 months, which would result in between 2 and 3 releases > > being maintained at the same time. Such a change would lead to longer > > maintenance for our users while reducing backporting work for our > > developers. > > > > Thanks for the write up Thierry. > ++ very well written! > > I wonder what are the thoughts of the community of having LTS + normal > releases > so that we can have the power of both? I guess that is essentially what > we have > with EM, but I guess we could introduce a way to ensure that operators can > just > upgrade LTS to LTS. > This is basically what Ironic does: we release with the rest of OpenStack, but we also do 2 more releases per cycle with their own bugfix/X.Y branches. Dmitry > > It can complicate things a bit from a CI and project management side, > but I think it > could solve the problem for both sides that need want new features + > those who want > stability? > > > -- > > Thierry Carrez (ttx) > > On behalf of the OpenStack Release Management team > > > > > -- > Mohammed Naser > VEXXHOST, Inc. > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Sat Nov 6 15:25:56 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sat, 6 Nov 2021 16:25:56 +0100 Subject: [TripleO] Core team cleanup In-Reply-To: References: Message-ID: On Wed, Nov 3, 2021 at 5:57 PM James Slagle wrote: > Hello, I took a look at our core team, "tripleo-core" in gerrit. We have a > few individuals who I feel have moved on from TripleO in their focus. I > looked at the reviews from stackalytics.io for the last 180 days[1]. > > These individuals have less than 6 reviews, which is about 1 review a > month: > Bob Fournier > Dan Sneddon > Dmitry Tantsur > +1. yeah, sorry for that. I have been trying to keep an eye on TripleO things, but with my new OpenShift responsibilities it's pretty much impossible. I guess it's the same for Bob. I'm still available for questions and reviews if someone needs me. Dmitry > Ji?? Str?nsk? > Juan Antonio Osorio Robles > Marius Cornea > > These individuals have publicly expressed that they are moving on from > TripleO: > Michele Baldessari > wes hayutin > > I'd like to propose we remove these folks from our core team, while > thanking them for their contributions. I'll also note that I'd still value > +1/-1 from these folks with a lot of significance, and encourage them to > review their areas of expertise! > > If anyone on the list plans to start reviewing in TripleO again, then I > also think we can postpone the removal for the time being and re-evaluate > later. Please let me know if that's the case. > > Please reply and let me know any agreements or concerns with this change. > > Thank you! > > [1] > https://www.stackalytics.io/report/contribution?module=tripleo-group&project_type=openstack&days=180 > > -- > -- James Slagle > -- > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From fungi at yuggoth.org  Sat Nov  6 16:25:48 2021
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Sat, 6 Nov 2021 16:25:48 +0000
Subject: [all][tc] Relmgt team position on release cadence
In-Reply-To:
References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org>
Message-ID: <20211106162547.odvooimmcr6r7ql7@yuggoth.org>

On 2021-11-06 16:22:23 +0100 (+0100), Dmitry Tantsur wrote:
[...]
> This is basically what Ironic does: we release with the rest of OpenStack,
> but we also do 2 more releases per cycle with their own bugfix/X.Y branches.
[...]

Do you expect users to be able to upgrade between those, and if so is that tested?
--
Jeremy Stanley

From kira034 at 163.com  Sat Nov  6 08:24:15 2021
From: kira034 at 163.com (Hongbin Lu)
Date: Sat, 6 Nov 2021 16:24:15 +0800 (CST)
Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available"
In-Reply-To:
References:
Message-ID: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com>

Hi,

It looks like zun-compute wants to create a resource provider in placement, but placement returned a 409 response. I would suggest you check placement's logs. My best guess is that a resource provider with the same name was already created, so placement returned 409. If this is the case, simply removing those resources and restarting the zun-compute service should resolve the problem.

Best regards,
Hongbin

At 2021-11-05 22:43:38, "Pavlos Basaras" wrote:

Hello,

I have an Openstack cluster, with basic services and functionality working based on ussuri release. I am trying to install the Zun service to be able to deploy containers, following [controller] -- https://docs.openstack.org/zun/ussuri/install/controller-install.html and [compute] -- https://docs.openstack.org/zun/ussuri/install/compute-install.html

I used the git branch based on ussuri for all components.

I verified kuryr-libnetwork operation issuing from the compute node
# docker network create --driver kuryr --ipam-driver kuryr --subnet 10.10.0.0/16 --gateway=10.10.0.1 test_net
and seeing the network created successfully, etc.

I am not very sure about the zun.conf file. What is the "endpoint_type = internalURL" parameter? Do I need to change internalURL?
From sudo systemctl status zun-compute i see: Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, *args, **kw) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return attempt.get(self._wrap_exception) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task six.reraise(self.value[0], self.value[1], self.value[2]) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise value Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), attempt_number, False) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", line 350, in _update_to_placement Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task context, node_rp_uuid, name=compute_node.hostname) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 846, in get_provider_tree_and_ensure_r Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 628, in _ensure_resource_provider Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 514, in _create_resource_provider Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task global_request_id=context.global_id) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 225, in post Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task headers=headers, logger=LOG) Nov 05 
14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.request(url, 'POST', **kwargs) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in request Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.session.request(url, method, **kwargs) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in request Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise exceptions.from_response(resp, method, url) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded) What is this problem? any advice? I used the default configuration values ([keystone_auth] and [keystone_authtoken]) values based on the configuration from the above links. Aslo from the controller openstack appcontainer service list +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ | Id | Host | Binary | State | Disabled | Disabled Reason | Updated At | Availability Zone | +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ | 1 | compute5 | zun-compute | up | False | None | 2021-11-05T14:39:01.000000 | nova | +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ openstack appcontainer host show compute5 +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Field | Value | +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | uuid | ee3a5b44-8ffa-463e-939d-0c61868a596f | | links | [{'href': 'http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'self'}, {'href': 'http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'bookmark'}] | | hostname | compute5 | | mem_total | 7975 | | mem_used | 0 | | total_containers | 1 | | cpus | 10 | | cpu_used | 0.0 | | architecture | x86_64 | | os_type | linux | | os | Ubuntu 18.04.6 LTS | | kernel_version | 4.15.0-161-generic | | labels | {} | | disk_total | 63 | | disk_used | 0 | | disk_quota_supported | False | | runtimes | ['io.containerd.runc.v2', 'io.containerd.runtime.v1.linux', 'runc'] | | enable_cpu_pinning | False | +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ seems to work fine. 
However when i issue e.g., openstack appcontainer run --name container --net network=$NET_ID cirros ping 8.8.8.8 i get the error: | status_reason | There are not enough hosts available. Any ideas? One final thing is that I did see in the Horizon dashboard the container tab, to be able to deploy containers from horizon. Is there an extra configuration for this? sorry for the long mail. best, Pavlos -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Sun Nov 7 14:04:59 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sun, 7 Nov 2021 15:04:59 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <20211106162547.odvooimmcr6r7ql7@yuggoth.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <20211106162547.odvooimmcr6r7ql7@yuggoth.org> Message-ID: Hi, On Sat, Nov 6, 2021 at 7:32 PM Jeremy Stanley wrote: > On 2021-11-06 16:22:23 +0100 (+0100), Dmitry Tantsur wrote: > [...] > > This is basically what Ironic does: we release with the rest of > OpenStack, > > but we also do 2 more releases per cycle with their own bugfix/X.Y > branches. > [...] > > Do you expect users to be able to upgrade between those, and if so > is that tested? > We prefer to think that upgrades are supported, and we're ready to fix bugs when they arise, but we don't actively test that. Not that we don't want to, mostly because of understaffing, CI stability and the fact that grenade is painful enough as it is. Dmitry > -- > Jeremy Stanley > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Sun Nov 7 16:49:54 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Sun, 07 Nov 2021 17:49:54 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: <2685967.mvXUDI8C0e@p1> Hi, On pi?tek, 5 listopada 2021 18:47:13 CET Sean Mooney wrote: > On Fri, 2021-11-05 at 11:53 -0400, Mohammed Naser wrote: > > On Fri, Nov 5, 2021 at 10:39 AM Thierry Carrez wrote: > > > Hi everyone, > > > > > > The (long) document below reflects the current position of the release > > > management team on a popular question: should the OpenStack release > > > cadence be changed? Please note that we only address the release > > > management / stable branch management facet of the problem. There are > > > other dimensions to take into account (governance, feature deprecation, > > > supported distros...) to get a complete view of the debate. > > > > > > Introduction > > > ------------ > > > > > > The subject of how often OpenStack should be released has been regularly > > > debated in the OpenStack community. OpenStack started with a 3-month > > > release cycle, then switched to 6-month release cycle starting with > > > Diablo. It is often thought of a release management decision, but it is > > > actually a much larger topic: a release cadence is a trade-off between > > > pressure to release more often and pressure to release less often, > > > coming in from a lot of different stakeholders. In OpenStack, it is > > > ultimately a Technical Committee decision. But that decision is informed > > > by the position of a number of stakeholders. 
This document gives > > > historical context and describes the current release management team > > > position. > > > > > > The current trade-off > > > --------------------- > > > > > > The main pressure to release more often is to make features available to > > > users faster. Developers get a faster feedback loop, hardware vendors > > > ensure software is compatible with their latest products, and users get > > > exciting new features. "Release early, release often" is a best practice > > > in our industry -- we should generally aim at releasing as often as > > > possible. > > > > > > But that is counterbalanced by pressure to release less often. From a > > > development perspective, each release cycle comes with some process > > > overhead. On the integrators side, a new release means packaging and > > > validation work. On the users side, it means pressure to upgrade. To > > > justify that cost, there needs to be enough user-visible benefit (like > > > new features) in a given release. > > > > > > For the last 10 years for OpenStack, that balance has been around six > > > months. Six months let us accumulate enough new development that it was > > > worth upgrading to / integrating the new version, while giving enough > > > time to actually do the work. It also aligned well with Foundation > > > events cadence, allowing to synchronize in-person developer meetings > > > date with start of cycles. > > > > > > What changed > > > ------------ > > > > > > The major recent change affecting this trade-off is that the pace of new > > > development in OpenStack slowed down. The rhythm of changes was divided > > > by 3 between 2015 and 2021, reflecting that OpenStack is now a mature > > > and stable solution, where accessing the latest features is no longer a > > > major driver. That reduces some of the pressure for releasing more > > > often. At the same time, we have more users every day, with larger and > > > larger deployments, and keeping those clusters constantly up to date is > > > an operational challenge. That increases the pressure to release less > > > often. In essence, OpenStack is becoming much more like a LTS > > > distribution than a web browser -- something users like moving slow. > > > > > > Over the past years, project teams also increasingly decoupled > > > individual components from the "coordinated release". More and more > > > components opted for an independent or intermediary-released model, > > > where they can put out releases in the middle of a cycle, making new > > > features available to their users. This increasingly opens up the > > > possibility of a longer "coordinated release" which would still allow > > > development teams to follow "release early, release often" best > > > practices. All that recent evolution means it is (again) time to > > > reconsider if the 6-month cadence is what serves our community best, and > > > in particular if a longer release cadence would not suit us better. > > > > > > The release management team position on the debate > > > -------------------------------------------------- > > > > > > While releasing less often would definitely reduce the load on the > > > release management team, most of the team work being automated, we do > > > not think it should be a major factor in motivating the decision. We > > > should not adjust the cadence too often though, as there is a one-time > > > cost in switching our processes. 
In terms of impact, we expect that a > > > switch to a longer cycle will encourage more project teams to adopt a > > > "with-intermediary" release model (rather than the traditional "with-rc" > > > single release per cycle), which may lead to abandoning the latter, > > > hence simplifying our processes. Longer cycles might also discourage > > > people to commit to PTL or release liaison work. We'd probably need to > > > manage expectations there, and encourage more frequent switches (or > > > create alternate models). > > > > > > If the decision is made to switch to a longer cycle, the release > > > management team recommends to switch to one year directly. That would > > > avoid changing it again anytime soon, and synchronizing on a calendar > > > year is much simpler to follow and communicate. We also recommend > > > announcing the change well in advance. We currently have an opportunity > > > of making the switch when we reach the end of the release naming > > > alphabet, which would also greatly simplify the communications around > > > the change. > > > > > > Finally, it is worth mentioning the impact on the stable branch work. > > > Releasing less often would likely impact the number of stable branches > > > that we keep on maintaining, so that we do not go too much in the past > > > (and hit unmaintained distributions or long-gone dependencies). We > > > currently maintain releases for 18 months before they switch to extended > > > maintenance, which results in between 3 and 4 releases being maintained > > > at the same time. We'd recommend switching to maintaining one-year > > > releases for 24 months, which would result in between 2 and 3 releases > > > being maintained at the same time. Such a change would lead to longer > > > maintenance for our users while reducing backporting work for our > > > developers. > > > > Thanks for the write up Thierry. > > > > I wonder what are the thoughts of the community of having LTS + normal > > releases so that we can have the power of both? I guess that is > > essentially what we have with EM, but I guess we could introduce a way to > > ensure that operators can just upgrade LTS to LTS. > > if we were to introduce LTS releases we would have to agree on what they were as > a community and we would need to support rolling upgrades between LTS versions > > that would require all distributed projects like nova to ensure that LTS-to-LTS > rpc and db compatibility is maintained instead of the current N+1 guarantees > we have today. Not only that but also cross-project communication, like e.g. nova <-> neutron, needs to work fine between such LTS releases. > > i know that would make some downstream happy as perhaps we could align our FFU > support with the LTS cadence but i would not hold my breath on that. > > as a developer i would personally prefer to have a shorter cycle upstream with > upgrades supported across more than n+1, e.g. release every 2 months but keep > rolling upgrade compatibility for at least 12 months or something like that. > the with-intermediary release lifecycle can enable that while still allowing > us to have a longer or shorter planning horizon depending on the project and > its velocity. > > > It can complicate things a bit from a CI and project management side, > > but I think it > > could solve the problem for both sides that want new features + > > those who want > > stability?
> > it might but i suspect that it will still not align with distros > canonical has a new lts every 2 years and redhat has a new release every > 18~months or so based on every 3rd release > > the lts idea i think has merit but we likely would have to maintain at least > 2 lts releases in parallel to make it work. > > so something like 1 lts release a year maintained for 2 years with normal > releases every 6 months that are only maintained for 6 months. each project > would keep rolling upgrade compatibility ideally between lts releases rather > than n+1 as a new minimum. the implication of this is that we would want to > have grenade jobs testing latest lts to master upgrade compatibility in > addition to n to n+1 where those differ. > > > -- > > > Thierry Carrez (ttx) > > > On behalf of the OpenStack Release Management team -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: 
From gmann at ghanshyammann.com Sun Nov 7 19:11:28 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 07 Nov 2021 13:11:28 -0600 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> Message-ID: <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> ---- On Fri, 05 Nov 2021 09:26:13 -0500 Thierry Carrez wrote ---- > Hi everyone, > > The (long) document below reflects the current position of the release > management team on a popular question: should the OpenStack release > cadence be changed? Please note that we only address the release > management / stable branch management facet of the problem. There are > other dimensions to take into account (governance, feature deprecation, > supported distros...) to get a complete view of the debate. > > Introduction > ------------ > > The subject of how often OpenStack should be released has been regularly > debated in the OpenStack community. OpenStack started with a 3-month > release cycle, then switched to 6-month release cycle starting with > Diablo. It is often thought of a release management decision, but it is > actually a much larger topic: a release cadence is a trade-off between > pressure to release more often and pressure to release less often, > coming in from a lot of different stakeholders. In OpenStack, it is > ultimately a Technical Committee decision. But that decision is informed > by the position of a number of stakeholders. This document gives > historical context and describes the current release management team > position. > > The current trade-off > --------------------- > > The main pressure to release more often is to make features available to > users faster. Developers get a faster feedback loop, hardware vendors > ensure software is compatible with their latest products, and users get > exciting new features. "Release early, release often" is a best practice > in our industry -- we should generally aim at releasing as often as > possible. > > But that is counterbalanced by pressure to release less often. From a > development perspective, each release cycle comes with some process > overhead. On the integrators side, a new release means packaging and > validation work. On the users side, it means pressure to upgrade.
To > justify that cost, there needs to be enough user-visible benefit (like > new features) in a given release. Thanks Thierry for the detailed write up. At the same time, a shorter release which leads to upgrade-often pressure but it will have fewer number of changes/features, so make the upgrade easy and longer-release model will have more changes/features that will make upgrade more complex. > > For the last 10 years for OpenStack, that balance has been around six > months. Six months let us accumulate enough new development that it was > worth upgrading to / integrating the new version, while giving enough > time to actually do the work. It also aligned well with Foundation > events cadence, allowing to synchronize in-person developer meetings > date with start of cycles. > > What changed > ------------ > > The major recent change affecting this trade-off is that the pace of new > development in OpenStack slowed down. The rhythm of changes was divided > by 3 between 2015 and 2021, reflecting that OpenStack is now a mature > and stable solution, where accessing the latest features is no longer a > major driver. That reduces some of the pressure for releasing more > often. At the same time, we have more users every day, with larger and > larger deployments, and keeping those clusters constantly up to date is > an operational challenge. That increases the pressure to release less > often. In essence, OpenStack is becoming much more like a LTS > distribution than a web browser -- something users like moving slow. > > Over the past years, project teams also increasingly decoupled > individual components from the "coordinated release". More and more > components opted for an independent or intermediary-released model, > where they can put out releases in the middle of a cycle, making new > features available to their users. This increasingly opens up the > possibility of a longer "coordinated release" which would still allow > development teams to follow "release early, release often" best > practices. All that recent evolution means it is (again) time to > reconsider if the 6-month cadence is what serves our community best, and > in particular if a longer release cadence would not suit us better. > > The release management team position on the debate > -------------------------------------------------- > > While releasing less often would definitely reduce the load on the > release management team, most of the team work being automated, we do > not think it should be a major factor in motivating the decision. We > should not adjust the cadence too often though, as there is a one-time > cost in switching our processes. In terms of impact, we expect that a > switch to a longer cycle will encourage more project teams to adopt a > "with-intermediary" release model (rather than the traditional "with-rc" > single release per cycle), which may lead to abandoning the latter, > hence simplifying our processes. Longer cycles might also discourage > people to commit to PTL or release liaison work. We'd probably need to > manage expectations there, and encourage more frequent switches (or > create alternate models). > > If the decision is made to switch to a longer cycle, the release > management team recommends to switch to one year directly. That would > avoid changing it again anytime soon, and synchronizing on a calendar > year is much simpler to follow and communicate. We also recommend > announcing the change well in advance. 
We currently have an opportunity > of making the switch when we reach the end of the release naming > alphabet, which would also greatly simplify the communications around > the change. > > Finally, it is worth mentioning the impact on the stable branch work. > Releasing less often would likely impact the number of stable branches > that we keep on maintaining, so that we do not go too much in the past > (and hit unmaintained distributions or long-gone dependencies). We > currently maintain releases for 18 months before they switch to extended > maintenance, which results in between 3 and 4 releases being maintained > at the same time. We'd recommend switching to maintaining one-year > releases for 24 months, which would result in between 2 and 3 releases > being maintained at the same time. Such a change would lead to longer > maintenance for our users while reducing backporting work for our > developers. Yeah, if we switch to one-year release model then definitely we need to change the stable support policy. For example, do we need an extended maintenance phase if we support a release for 24 months? and if we keep the EM phase too, then important thing to note is that EM phase is the almost same amount of work upstream developers are spending now a days in terms of testing or backports (even though we have the agreement of reducing the effort for EM stables when needed, but I do not see that is happening, and we end up doing the same amount of maintenance there as we do for supported stables). As the yearly release model extend the stable support window and with our current situation of stable team shrinking, it is an open question for us whether we as a community will be able to support the new stable release window or not? Another point we need to consider is how it will impact the contribution support from the companies and volunteer contributors (we might not have many volunteer contributors now, so we can ignore it, but let's consider companies' support). For example, the foundation membership contract does not have contribution requirements, so companies' contribution support is always a volunteer or based on their customer needs. In that case, we need to think about how we can keep that without any impact. For example, change the foundation membership requirement or get companies' feedback if it does not impact their contribution support policy. -gmann > > -- > Thierry Carrez (ttx) > On behalf of the OpenStack Release Management team > > From katonalala at gmail.com Mon Nov 8 08:47:08 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 8 Nov 2021 09:47:08 +0100 Subject: [neutron] Neutron Team meeting - vote on time of the meeting Message-ID: Hi Neutrinos As we discussed during the PTG, let's start a poll to see which time fits best the need of the team for the Neutron Team meeting. I prepared a doodle, please check which time slot would be best for you: https://doodle.com/poll/m973q9eyag8k385w?utm_source=poll&utm_medium=link The dates for the doodle now for next week, but please ignore that. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcafarel at redhat.com Mon Nov 8 09:16:07 2021 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 8 Nov 2021 10:16:07 +0100 Subject: [neutron] Bug deputy report (week starting on 2021-11-01) Message-ID: Hey neutrinos, Now that we are past the biannual DST calendar mess, it is also time for a new bug deputy rotation! 
Here is the list of bugs reported for last week, kudos to Rodolfo for both reporting most of them and also having a fix for them! All bugs have assignees, almost have patches and some are just waiting for CI Critical * "neutron-tempest-plugin-scenario-ovn" job timing out frequently - https://bugs.launchpad.net/neutron/+bug/1949557 Change merged increasing timeout by ralonsoh: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/816438 * "openstack-tox-py38" CI job is timing out frequently in gate - https://bugs.launchpad.net/neutron/+bug/1949704 Timeout change by ralonsoh in https://review.opendev.org/c/openstack/neutron/+/816631 * [CI] "neutron-ovs-rally-task" job failing due to an authentication problem - https://bugs.launchpad.net/neutron/+bug/1949945 Change to create keystone endpoint merged by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/816792 High * [OVN] Stateless SG Extension Support - https://bugs.launchpad.net/neutron/+bug/1949451 Extension missing in the OVN supported extensions list, fix by Ihar https://review.opendev.org/c/openstack/neutron/+/816612 * "openstack-tox-py39" is timing out frequently - https://bugs.launchpad.net/neutron/+bug/1949476 Similar fix as py38 in https://review.opendev.org/c/openstack/neutron/+/816631 Medium * [OVS][QoS] Dataplane enforcement is limited to minimum bandwidth egress rule - https://bugs.launchpad.net/neutron/+bug/1949607 Patch in progress by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/816530 * FIP ports count into quota as they get a project_id set - https://bugs.launchpad.net/neutron/+bug/1949767 Change in train bulk port binding, suggested fix https://review.opendev.org/c/openstack/neutron/+/816722 * [OVN] OVN mech driver does not map new segments - https://bugs.launchpad.net/neutron/+bug/1949967 WIP patch by ralonsoh https://review.opendev.org/c/openstack/neutron/+/816856 Low * [OVN] Check OVN support for stateless NAT - https://bugs.launchpad.net/neutron/+bug/1949494 Patch in progress by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/816376 * [OVN] Check OVN support virtual port type - https://bugs.launchpad.net/neutron/+bug/1949496 Patch in progress by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/816383 * direct-physical port creation fails with QoS minimum bandwidth rule - https://bugs.launchpad.net/neutron/+bug/1949877 Assigned to gibi -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdulko at redhat.com Mon Nov 8 09:40:02 2021 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Mon, 08 Nov 2021 10:40:02 +0100 Subject: [kuryr] Using kuryr-kubernetes CNI without neutron agent(s)? In-Reply-To: <9F45DC15-94D8-4112-B0A2-8C749E64F493@uchicago.edu> References: <9FF2989B-69FA-494B-B60A-B066E5BF13DA@uchicago.edu> <1043CEB9-7386-4842-8912-9DE021DB9BD0@uchicago.edu> <9F45DC15-94D8-4112-B0A2-8C749E64F493@uchicago.edu> Message-ID: <8bf45e0a40568dd5c07e1a15dd1955b6665d056a.camel@redhat.com> On Fri, 2021-11-05 at 22:54 +0000, Jason Anderson wrote: > Hi Micha?, > > I continue to appreciate the information you are providing. I?ve been > doing some more research into the landscape of systems and had a few > follow-up questions. I?ve also left some clarifying remarks if you are > interested. 
> > I?m currently evaluating OVN, haven?t used it before and there?s a bit > of a learning curve ;) However it seems like it may solve a good part > of the problem by removing RabbitMQ and reducing the privileges of the > edge host w.r.t. network config. > > Now I?m looking at kuryr-kubernetes. > > 1. What is the difference between kuryr and kuryr-kubernetes? I have > used kuryr-libnetwork before, in conjunction with kuryr-server (which I > think is provided via the main kuryr project?). I am using Kolla > Ansible so was spared some of the details on installation. I understand > kuryr-libnetwork is basically ?kuryr for Docker? while kuryr-kubernetes > is ?kuryr for K8s?, but that leaves me confused about what exactly the > kuryr repo is. In openstack/kuryr we have kuryr.lib module, which is hosting a few things shared by kuryr-libnetwork and kuryr-kubernetes. Nothing to worry about really. ;) > 2. A current idea is to have several edge ?compute? nodes that will run > a lightweight k8s kubelet such as k3s. OVN will provide networking to > the edge nodes, controlled from the central site. I would then place > kuryr-k8s-controller on the central site and kuryr-cni-daemon on all > the edge nodes. My question is: could users create their own Neutron > networks (w/ their own Geneve segment) and launch pods connected on > that network, and have those pods effectively be isolated from other > pods in the topology? As in, can k8s be told that pod A should launch > on network A?, and pod B on network B?? Or is there an assumption that > from Neutron?s perspective all pods are always on a single Neutron > network? Ha, that might be a bit tough one. Basically you can easily set Kuryr to create separate subnets for each of the K8s namespaces, but then you'd need to rely on NetworkPolicies to isolate traffic between namespaces which might not exactly fit your multitenant model. The best way to implement whatever you need might be to write your own custom subnet driver [1] that would choose the subnet e.g. based on a pod or namespace annotation. If there's a clear use case behind it, I think we can include it into the upstream code too. [1] https://opendev.org/openstack/kuryr-kubernetes/src/branch/master/kuryr_kubernetes/controller/drivers > Cheers, and thanks! > /Jason > > > On Oct 27, 2021, at 12:03 PM, Micha? Dulko wrote: > > > > Hm, so a mixed OpenStack-K8s edge setup, where edge sites are > > Kubernetes deployments? We've took a look at some edge use cases with > > Kuryr and one problem people see is that if an edge site becomes > > disconnected from the main side, Kuryr will not allow creation of new > > Pods and Services as it needs connection to Neutron and Octavia APIs > > for that. If that's not a problem had you gave a thought into running > > distributed compute nodes [1] as edge sites and then Kubernetes on > > top > > of them? This architecture should be doable with Kuryr (probably with > > minor changes). > > Sort of! I work in research infrastructure and we are building an > IoT/edge testbed for computer science researchers who wish to do > research in edge computing. It?s a bit mad science-y. We are buying and > configuring relatively high-powered edge devices such as Raspberry Pis > and Jetson Nanos and making them available for experimentation at a > variety of sites. 
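Going back to Michal's per-namespace subnet suggestion above, here is a minimal sketch of the kuryr.conf knobs involved. The section and option names are recalled from the kuryr-kubernetes namespace subnet driver documentation and may differ between releases, so treat them as assumptions to verify against your version:

/etc/kuryr/kuryr.conf (on the kuryr-controller side)
[kubernetes]
pod_subnets_driver = namespace
[namespace_subnet]
pod_subnet_pool = <UUID of a Neutron subnet pool to carve per-namespace subnets from>
pod_router = <UUID of the Neutron router to attach those subnets to>

With something like this, each new Kubernetes namespace gets its own Neutron subnet, and isolation between namespaces is then enforced on top of that with NetworkPolicies, as noted above.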
Separately, the platform supports any owner of a > supported device to have it managed by the testbed (i.e., they can use > our interfaces to launch containers on it and connect it logically to > other devices / resources in the cloud.) > > Distributed compute node looks a bit too heavy for this highly dynamic > use-case, but thank you for sharing. > > Anyways, one might ask why Neutron at all. I am hopeful we can get some > interesting properties such as network isolation and the ability to > bridge traffic from containers across other layer 2 links such as those > provided by?AL2S. > > > > OVN may help if it can remove the need for RabbitMQ, which is > > > probably the > > > most difficult aspect to remove from OpenStack?s > > > dependencies/assumptions, > > > yet also one of the most pernicious from a security angle, as an > > > untrusted > > > worker node can easily corrupt the control plane. > > > > It's just Kuryr which needs access to the credentials, so possibly > > you > > should be able to isolate them, but I get the point, containers are > > worse at isolation than VMs. > > I?m less worried about the mechanism for isolation on the host and more > the amount of privileged information the host must keep secure, and the > impact of that information being compromised. Because our experimental > target system involves container engines maintained externally to the > core site, the risk of compromise on the edge is high. I am searching > for an architecture that greatly limits the blast radius of such a > compromise. Currently if we use standard Neutron networking + Kuryr, we > must give RabbitMQ credentials and others to the container engines on > the edge, which papers such > as?http://seclab.cs.sunysb.edu/seclab/pubs/asiaccs16.pdf?have > documented as a trivial escalation path. > > For this reason, narrowing the scope of what state the edge hosts can > influence on the core site is paramount. > > > > > > Re: admin creds, maybe it is possible to carefully craft a role > > > that only works > > > for some Neutron operations and put that on the worker nodes. I > > > will explore. > > > > I think those settings [2] is what would require highest Neutron > > permissions in baremetal case. > > Thanks ? so it will need to create and delete ports. This may be > acceptable; without some additional API proxy layer for the edge hosts, > a malicious edge host could create bogus ports and delete good ones, > but that is a much smaller level of impact. I think we could create a > role that only allowed such operations and generate per-host > credentials. > > > [1]? > > https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/distributed_compute_node.html > > [2]? > > https://opendev.org/openstack/kuryr-kubernetes/src/branch/master/kuryr_kubernetes/controller/drivers/neutron_vif.py#L125-L127 > > > > > Cheers! > > > > [1] > > > > https://docs.openstack.org/kuryr-kubernetes/latest/nested_vlan_mode.html > > > > > > > > Thanks, > > > > Micha? > > > > > > > > > Thanks! 
> > > > > Jason Anderson > > > > > > > > > > --- > > > > > > > > > > Chameleon DevOps Lead > > > > > Department of Computer Science, University of Chicago > > > > > Mathematics and Computer Science, Argonne National Laboratory > From katonalala at gmail.com Mon Nov 8 10:48:21 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 8 Nov 2021 11:48:21 +0100 Subject: [neutron] Neutron drivers Meeting - vote on time of the meeting Message-ID: Hi Neutrinos, As we discussed during the PTG, let's start a poll to see which time fits best the need of the team for the Neutron drivers meeting. I prepared a doodle, please check which time slot would be best for you: https://doodle.com/poll/6vdugxp7g54smdv6?utm_source=poll&utm_medium=link The dates for the doodle now for next week, but please ignore that. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From pbasaras at gmail.com Mon Nov 8 11:02:50 2021 From: pbasaras at gmail.com (Pavlos Basaras) Date: Mon, 8 Nov 2021 13:02:50 +0200 Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available" In-Reply-To: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com> References: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com> Message-ID: Hello, you are right, that fixed the problem. In order to solve the problem i revisited https://docs.openstack.org/placement/ussuri/install/verify.html I executed: openstack resource provider list Then removed the host that i use for containers, restarted the zun-compute at the host and works perfectly. Thank you very much for your input. One more thing, I don't see at the horizon dashboard the tab for the containers (i just see the nova compute related tab). Is there any additional configuration for this? btw, from https://docs.openstack.org/placement/ussuri/install/verify.html, i used pip3 install osc-placement (instead of pip install..) all the best Pavlos. On Sat, Nov 6, 2021 at 10:24 AM Hongbin Lu wrote: > Hi, > > It looks zun-compute wants to create a resource provider in placement, but > placement return a 409 response. I would suggest you to check placement's > logs. My best guess is the resource provider with the same name is already > created so placement returned 409. If this is a case, simply remove those > resources and restart zun-compute service should resolve the problem. > > Best regards, > Hongbin > > > > > > > At 2021-11-05 22:43:38, "Pavlos Basaras" wrote: > > Hello, > > I have an Openstack cluster, with basic services and functionality working > based on ussuri release. > > I am trying to install the Zun service to be able to deploy containers, > following > > [controller] -- > https://docs.openstack.org/zun/ussuri/install/controller-install.html > and > [compute] -- > https://docs.openstack.org/zun/ussuri/install/compute-install.html > > I used the git branch based on ussuri for all components. > > I veryfined kuryr-libnetwork operation issuing from the compute node > # docker network create --driver kuryr --ipam-driver kuryr --subnet > 10.10.0.0/16 --gateway=10.10.0.1 test_net > > and seeing the network created successfully, etc. > > I am not very sure about the zun.conf file. > What is the "endpoint_type = internalURL" parameter? > Do I need to change internalURL? 
> > > From sudo systemctl status zun-compute i see: > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, > *args, **kw) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task return > attempt.get(self._wrap_exception) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task six.reraise(self.value[0], > self.value[1], self.value[2]) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/lib/python3/dist-packages/six.py", line 703, in reraise > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task raise value > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), > attempt_number, False) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", > line 350, in _update_to_placement > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task context, node_rp_uuid, > name=compute_node.hostname) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", > line 846, in get_provider_tree_and_ensure_r > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task > parent_provider_uuid=parent_provider_uuid) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", > line 628, in _ensure_resource_provider > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task > parent_provider_uuid=parent_provider_uuid) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", > line 514, in _create_resource_provider > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task global_request_id=context.global_id) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", > line 225, in post > Nov 05 14:34:58 
compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task headers=headers, logger=LOG) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task return self.request(url, 'POST', > **kwargs) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in > request > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task return self.session.request(url, > method, **kwargs) > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task File > "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in > request > Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task raise exceptions.from_response(resp, > method, url) > *Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 > ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: > Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded)* > > What is this problem? any advice? > I used the default configuration values ([keystone_auth] and > [keystone_authtoken]) values based on the configuration from the above > links. > > > Aslo from the controller > > > *openstack appcontainer service list* > +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ > | Id | Host | Binary | State | Disabled | Disabled Reason | > Updated At | Availability Zone | > > +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ > | 1 | compute5 | zun-compute | up | False | None | > 2021-11-05T14:39:01.000000 | nova | > > +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ > > > *openstack appcontainer host show compute5* > +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | Field | Value > > | > > +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | uuid | ee3a5b44-8ffa-463e-939d-0c61868a596f > > | > | links | [{'href': ' > http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', > 'rel': 'self'}, {'href': ' > http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', > 'rel': 'bookmark'}] | > | hostname | compute5 > > | > | mem_total | 7975 > > | > | mem_used | 0 > > | > | total_containers | 1 > > | > | cpus | 10 > > | > | cpu_used | 0.0 > > | > | architecture | x86_64 > > | > | os_type | linux > > | > | os | Ubuntu 18.04.6 LTS > > | > | kernel_version | 4.15.0-161-generic > > | > | labels | {} > > | > | disk_total | 63 > > | > | disk_used | 0 > > | > | disk_quota_supported | False > > | > | runtimes | ['io.containerd.runc.v2', > 'io.containerd.runtime.v1.linux', 'runc'] > > | > | enable_cpu_pinning | 
False > > | > > +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > > > seems to work fine. > > However when i issue e.g., openstack appcontainer run --name container > --net network=$NET_ID cirros ping 8.8.8.8 > > i get the error: | status_reason | There are not enough hosts > available. > > Any ideas? > > One final thing is that I did see in the Horizon dashboard the > container tab, to be able to deploy containers from horizon. Is there an > extra configuration for this? > > sorry for the long mail. > > best, > Pavlos > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From iurygregory at gmail.com Mon Nov 8 14:18:24 2021 From: iurygregory at gmail.com (Iury Gregory) Date: Mon, 8 Nov 2021 15:18:24 +0100 Subject: =?UTF-8?Q?Re=3A_=5Bironic=5D_Proposing_Aija_Jaunt=C4=93va_for_sushy=2Dcore?= In-Reply-To: References: Message-ID: I've added Aija to the sushy-core group =) Congratulations! Em ter., 2 de nov. de 2021 ?s 08:51, Iury Gregory escreveu: > Hello everyone! > > I would like to propose Aija Jaunt?va (irc: ajya) to be added to the > sushy-core group. > Aija has been in the ironic community for a long time, she has a lot of > knowledge about redfish and is always providing good reviews. > > ironic-cores please vote with +/- 1. > > -- > > > *Att[]'sIury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the ironic-core and puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -- *Att[]'sIury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the ironic-core and puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashlee at openstack.org Mon Nov 8 15:53:22 2021 From: ashlee at openstack.org (Ashlee Ferguson) Date: Mon, 8 Nov 2021 09:53:22 -0600 Subject: OpenInfra Live - November 11 at 9am CT Message-ID: Hi everyone, This week?s OpenInfra Live episode is brought to you by the OpenStack Community. The 2021 User Survey shows that the footprint of OpenStack clouds grew by 66% over the last year, totaling over 25 million cores in production. This increase has been contributed by organizations of all sizes around the world. During this episode of OpenInfra Live, we are going to talk to operators from Kakao, LINE, Schwarz IT, NeCTAR and T-Systems about what?s causing this OpenStack growth at their organization. Episode: OpenStack Is Alive: Explosive Growth Among Production Deployments Date and time: November 11 at 9am CT (1500 UTC) You can watch us live on: YouTube: https://www.youtube.com/watch?v=RhMJO82lDxc LinkedIn: https://www.linkedin.com/feed/update/urn:li:ugcPost:6862068514526756864 Facebook: https://www.facebook.com/104139126308032/posts/4493822407339660/ WeChat: recording will be posted on OpenStack WeChat after the live stream Speakers: Paul Coddington (ARDC Nectar) Reedip Banerjee (LINE) Yushiro Furukawa (LINE) Andrew Kong (Kakao) Nils Magnus (T-Systems) Adrian Seiffert (Schwarz) Marvin Titus (Schwarz) Carmel Walsh (ARDC Nectar) Have an idea for a future episode? Share it now at ideas.openinfra.live . 
Register now for OpenInfra Live: Keynotes, a special edition of OpenInfra Live on November 17-18th starting at 1500 UTC: https://openinfralivekeynotes.eventbrite.com/ Thanks, Ashlee OpenInfra Foundation Community & Events ashlee at openinfra.dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon Nov 8 15:58:00 2021 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 8 Nov 2021 16:58:00 +0100 Subject: [largescale-sig] Next meeting: Nov 10th, 15utc Message-ID: Hi everyone, The Large Scale SIG meeting is back this Wednesday in #openstack-operators on OFTC IRC, at 15UTC. It will be chaired by Belmiro Moreira. You can doublecheck how that time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20211110T15 Feel free to add topics to our agenda at: https://etherpad.openstack.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From thierry at openstack.org Mon Nov 8 17:40:42 2021 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 8 Nov 2021 18:40:42 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> Message-ID: Ghanshyam Mann wrote: > [...] > Thanks Thierry for the detailed write up. > > At the same time, a shorter release which leads to upgrade-often pressure but > it will have fewer number of changes/features, so make the upgrade easy and > longer-release model will have more changes/features that will make upgrade more > complex. I think that was true a few years ago, but I'm not convinced that still holds. We currently have a third of the changes volume we had back in 2015, so a one-year release in 2022 would contain far less changes than a 6-month release from 2015. Also, thanks to our testing and our focus on stability, the pain linked to the amount of breaking changes in a release is now negligible compared to the basic pain of going through a 1M-core deployment and upgrading the various pieces... every 6 months. I've heard of multiple users claiming it takes them close to 6 months to upgrade their massive deployments to a new version. So when they are done, they have to start again. -- Thierry Carrez (ttx) From gmann at ghanshyammann.com Mon Nov 8 18:35:36 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 08 Nov 2021 12:35:36 -0600 Subject: [all][tc] Technical Committee next weekly meeting on Nov 11th at 1500 UTC Message-ID: <17d00d5940e.c605c228290918.1450543438871901915@ghanshyammann.com> Hello Everyone, Technical Committee's next weekly meeting is scheduled for Nov 11th at 1500 UTC. If you would like to add topics for discussion, please add them to the below wiki page by Wednesday, Nov 10th, at 2100 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting -gmann From juliaashleykreger at gmail.com Mon Nov 8 19:43:18 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 8 Nov 2021 12:43:18 -0700 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> Message-ID: On Mon, Nov 8, 2021 at 10:44 AM Thierry Carrez wrote: > > Ghanshyam Mann wrote: > > [...] > > Thanks Thierry for the detailed write up. 
> > > > At the same time, a shorter release which leads to upgrade-often pressure but > > it will have fewer number of changes/features, so make the upgrade easy and > > longer-release model will have more changes/features that will make upgrade more > > complex. > > I think that was true a few years ago, but I'm not convinced that still > holds. We currently have a third of the changes volume we had back in > 2015, so a one-year release in 2022 would contain far less changes than > a 6-month release from 2015. I concur. Also, in 2015, we were still very much in a "move fast" mode of operation as a community. > Also, thanks to our testing and our focus on stability, the pain linked > to the amount of breaking changes in a release is now negligible > compared to the basic pain of going through a 1M-core deployment and > upgrading the various pieces... every 6 months. I've heard of multiple > users claiming it takes them close to 6 months to upgrade their massive > deployments to a new version. So when they are done, they have to start > again. > > -- > Thierry Carrez (ttx) > I've been hearing the exact same messaging from larger operators as well as operators in environments where they are concerned about managing risk for at least the past two years. These operators have indicated it is not uncommon for the upgrade projects which consume, test, certify for production, and deploy to production take *at least* six months to execute. At the same time, they are shy of being the ones to also "find all of the bugs", and so the project doesn't actually start until well after the new coordinated release has occurred. Quickly they become yet another version behind with this pattern. I suspect it is really easy for us as a CI focused community to think that six months is plenty of time to roll out a fully updated deployment which has been fully tested in every possible way. Except, these operators are often trying to do just that on physical hardware, with updated firmware and operatings systems bringing in new variables with every single change which may ripple up the entire stack. These operators then have to apply the lessons they have previously learned once they have worked through all of the variables. In some cases this may involve aspects such as benchmarking, to ensure they don't need to make additional changes which need to be factored into their deployment, sending them back to the start of their testing. All while thinking of phrases like "business/mission critical". I guess this means I'm in support of revising the release cycle. At the same time, I think it would be wise for us to see if we can learn from these operators the pain points they experience, the process they leverage, and ultimately see if there are opportunities to spread knowledge or potentially tooling. Or maybe even get them to contribute their patches upstream. Not that all of these issues are easily solved with any level of code, but sometimes they can include contextual disconnects and resolving those are just as important as shipping a release, IMHO. -Julia From abraden at verisign.com Mon Nov 8 19:49:29 2021 From: abraden at verisign.com (Braden, Albert) Date: Mon, 8 Nov 2021 19:49:29 +0000 Subject: Adjutant needs contributors (and a PTL) to survive! 
In-Reply-To: <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> Message-ID: <9cabb3cb32a7441697f58933df72b514@verisign.com> I didn't have any luck contacting Adrian. Does anyone know where the storyboard is that he mentions in his email? -----Original Message----- From: Braden, Albert Sent: Monday, November 1, 2021 12:36 PM To: adriant at catalystcloud.nz; openstack-discuss at lists.openstack.org Subject: RE: [EXTERNAL] Adjutant needs contributors (and a PTL) to survive! Hi Adrian, I don't think I'm qualified to be PTL but I'm willing to help, and I've asked for permission. We aren't using Adjutant at this time because we're on Train and I learned at my last contract that running Adjutant on Train is a hassle, but I hope to start using it after we get to Ussuri. Has anyone else volunteered? -----Original Message----- From: Adrian Turjak Sent: Wednesday, October 27, 2021 1:41 AM To: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] Adjutant needs contributors (and a PTL) to survive! Caution: This email originated from outside the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe. Hello fellow OpenStackers! I'm moving on to a different opportunity and my new role will not involve OpenStack, and there sadly isn't anyone at Catalystcloud who will be able to take over project responsibilities for Adjutant any time soon (not that I've been very onto it lately). As such Adjutant needs people to take over, and lead it going forward. I believe the codebase is in a reasonably good position for others to pick up, and I plan to go through and document a few more of my ideas for where it should go in storyboard so some of those plans exist somewhere should people want to pick up from where I left off before going fairly silent upstream. Plus if people want/need to they can reach out to me or add me to code review and chances are I'll comment/review because I do care about the project. Or I may contract some time to it. There are a few clouds running Adjutant, and people who have previously expressed interest in using it, so if you still are, the project isn't in a bad place at all. The code is stable, and the last few major refactors have cleaned up much of my biggest pain points with it. Best of luck! - adriant From fungi at yuggoth.org Mon Nov 8 20:24:05 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 8 Nov 2021 20:24:05 +0000 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <9cabb3cb32a7441697f58933df72b514@verisign.com> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> <9cabb3cb32a7441697f58933df72b514@verisign.com> Message-ID: <20211108202404.f45g57hnt5aa3db4@yuggoth.org> On 2021-11-08 19:49:29 +0000 (+0000), Braden, Albert wrote: > I didn't have any luck contacting Adrian. Does anyone know where > the storyboard is that he mentions in his email? [...] There's a project group for Adjutant here: https://storyboard.openstack.org/#!/project_group/adjutant -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From hojat.gazestani1 at gmail.com Mon Nov 8 19:13:25 2021 From: hojat.gazestani1 at gmail.com (Hojii GZNI) Date: Mon, 8 Nov 2021 22:43:25 +0330 Subject: Opendaylight integration Openstack Message-ID: Hi everyone I am trying to use Opendaylight as an Openstack SDN controller but have this error for "mechanism_dirver". Xena Ubuntu 20.04 Opendaylight 8 /etc/neutron/plugins/ml2/ml2_conf.ini mechanism_drivers = opendaylight Neutron server log: CRITICAL neutron.plugins.ml2.managers [-] The following mechanism drivers were not found: {'opendaylight'} All configuration is in my Github[1] Regards, Hojat. [1]: https://github.com/hojat-gazestani/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Nov 8 21:48:54 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 08 Nov 2021 21:48:54 +0000 Subject: Opendaylight integration Openstack In-Reply-To: References: Message-ID: <74cf8bc2059798f0d9489fb7b19a997255481223.camel@redhat.com> On Mon, 2021-11-08 at 22:43 +0330, Hojii GZNI wrote: > Hi everyone > > I am trying to use Opendaylight as an Openstack SDN controller but have > this error for "mechanism_dirver". > > Xena > Ubuntu 20.04 > Opendaylight 8 > > > /etc/neutron/plugins/ml2/ml2_conf.ini > > mechanism_drivers = opendaylight i dont know which release you are using but the orginal odl driver was remvoed some time ago and replace by the v2 dirver https://opendev.org/openstack/networking-odl/src/branch/master/setup.cfg#L52 so you should be padding opendaylight_v2 as shoe in the comment https://opendev.org/openstack/networking-odl/src/branch/master/setup.cfg#L44-L47 that is likely the cause of the error > > Neutron server log: > CRITICAL neutron.plugins.ml2.managers [-] The following mechanism > drivers were not found: {'opendaylight'} > > All configuration is in my Github[1] > > Regards, > > Hojat. > > [1]: https://github.com/hojat-gazestani/openstack From arnaud.morin at gmail.com Mon Nov 8 22:57:54 2021 From: arnaud.morin at gmail.com (Arnaud) Date: Mon, 08 Nov 2021 23:57:54 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> Message-ID: <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Hey, I'd like to add my 2 cents. It's hard to upgrade a region, so when it comes to upgrade multiples regions, it's even harder. Some operators also have their own downstream patchs / extensions / drivers which make the upgrade process more complex, so it take more time (for all reasons already given in the thread, need to update the CI, the tools, the doc, the people, etc). One more thing is about consistency, when you have to manage multiple regions, it's easier if all of them are pretty identical. Human operation are always the same, and can eventually be automated. This leads to keep going on with a fixed version of OpenStack to run the business. When scaling, you (we) always chose security and consistency. Also, Julia mentioned something true about contribution from operators. It's difficult for them for multiple reasons: - pushing upstream is a process, which need to be taken into account when working on an internal fix. - it's usually quicker to push downstream because it's needed. 
When it comes to upstream, it's challenged by the developers (and it's good), so it take time and can be discouraging. - operators are not running master, but a stable release. Bugs on stables could be fixed differently than on master, which could also be discouraging. - writing unit tests is a job, some tech operators are not necessarily developers, so this could also be a challenge. All of these to say that helping people which are proposing a patch is a good thing. And as far as I can see, upstream developers are helping most of the time, and we should keep and encourage such behavior IMHO. Finally, I would also vote for less releases or LTS releases (but it looks heavier to have this). I think this would help keeping up to date with stables and propose more patches from operators. Cheers, Arnaud. Le 8 novembre 2021 20:43:18 GMT+01:00, Julia Kreger a ?crit?: >On Mon, Nov 8, 2021 at 10:44 AM Thierry Carrez wrote: >> >> Ghanshyam Mann wrote: >> > [...] >> > Thanks Thierry for the detailed write up. >> > >> > At the same time, a shorter release which leads to upgrade-often pressure but >> > it will have fewer number of changes/features, so make the upgrade easy and >> > longer-release model will have more changes/features that will make upgrade more >> > complex. >> >> I think that was true a few years ago, but I'm not convinced that still >> holds. We currently have a third of the changes volume we had back in >> 2015, so a one-year release in 2022 would contain far less changes than >> a 6-month release from 2015. > >I concur. Also, in 2015, we were still very much in a "move fast" mode >of operation as a community. > >> Also, thanks to our testing and our focus on stability, the pain linked >> to the amount of breaking changes in a release is now negligible >> compared to the basic pain of going through a 1M-core deployment and >> upgrading the various pieces... every 6 months. I've heard of multiple >> users claiming it takes them close to 6 months to upgrade their massive >> deployments to a new version. So when they are done, they have to start >> again. >> >> -- >> Thierry Carrez (ttx) >> > >I've been hearing the exact same messaging from larger operators as >well as operators in environments where they are concerned about >managing risk for at least the past two years. These operators have >indicated it is not uncommon for the upgrade projects which consume, >test, certify for production, and deploy to production take *at least* >six months to execute. At the same time, they are shy of being the >ones to also "find all of the bugs", and so the project doesn't >actually start until well after the new coordinated release has >occurred. Quickly they become yet another version behind with this >pattern. > >I suspect it is really easy for us as a CI focused community to think >that six months is plenty of time to roll out a fully updated >deployment which has been fully tested in every possible way. Except, >these operators are often trying to do just that on physical hardware, >with updated firmware and operatings systems bringing in new variables >with every single change which may ripple up the entire stack. These >operators then have to apply the lessons they have previously learned >once they have worked through all of the variables. In some cases this >may involve aspects such as benchmarking, to ensure they don't need to >make additional changes which need to be factored into their >deployment, sending them back to the start of their testing. 
All while >thinking of phrases like "business/mission critical". > >I guess this means I'm in support of revising the release cycle. At >the same time, I think it would be wise for us to see if we can learn >from these operators the pain points they experience, the process they >leverage, and ultimately see if there are opportunities to spread >knowledge or potentially tooling. Or maybe even get them to contribute >their patches upstream. Not that all of these issues are easily solved >with any level of code, but sometimes they can include contextual >disconnects and resolving those are just as important as shipping a >release, IMHO. > >-Julia > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gagehugo at gmail.com Tue Nov 9 00:58:47 2021 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 8 Nov 2021 18:58:47 -0600 Subject: [openstack-helm] No meeting tomorrow Message-ID: Hey team, Since there are no agenda items [0] for the IRC meeting tomorrow, the meeting is cancelled. Our next meeting will be November 16th. Thanks [0] https://etherpad.opendev.org/p/openstack-helm-weekly-meeting -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew at etc.gen.nz Tue Nov 9 10:54:36 2021 From: andrew at etc.gen.nz (Andrew Ruthven) Date: Tue, 09 Nov 2021 23:54:36 +1300 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <9cabb3cb32a7441697f58933df72b514@verisign.com> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> <9cabb3cb32a7441697f58933df72b514@verisign.com> Message-ID: <1e4045dd6edf572a8a32ca639056bc06a8782fe8.camel@etc.gen.nz> On Mon, 2021-11-08 at 19:49 +0000, Braden, Albert wrote: > I didn't have any luck contacting Adrian. Does anyone know where the > storyboard is that he mentions in his email? I'll check in with Adrian to see if he has heard from anyone. Cheers, Andrew Catalyst Cloud --? Andrew Ruthven, Wellington, New Zealand andrew at etc.gen.nz | Catalyst Cloud: | This space intentionally left blank https://catalystcloud.nz | -------------- next part -------------- An HTML attachment was scrubbed... URL: From moreira.belmiro.email.lists at gmail.com Tue Nov 9 13:50:31 2021 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Tue, 9 Nov 2021 14:50:31 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Message-ID: Hi, It's time again to discuss the release cycle... Just considering the number of times that lately we have been discussing the release cycle we should acknowledge that we really have a problem or at least that we have very different opinions in the community and we should discuss it openly. Thanks Thierry to bring the topic again. Looking into the last user survey we see that 23% of the deployments are running the last two releases and then we have a long... long... tail with older releases. Honestly, I have mixed feelings about it! As an operator I relate more with having a LTS release and give the possibility to upgrade between LTS releases. But having the possibility to upgrade every 6 months is also very interesting for the small and fast moving projects. Maybe an 1 year release cycle would provide the mid term here. 
In our cloud infrastructure we run different releases, from Stein to Victoria. There are projects that we can easily upgrade (and we do it!) and other projects that are much more complicated (because feature deprecations, Operating System dependencies, internal patches, or simply because is too risky considering the current workloads). For those we need definitely more than 6 months for the upgrade. If again we don't reach a consensus to change the release cycle at least we should continue to work in improving the upgrade experience (and don't let me wrong... the upgrade experience has been improved tremendously over the years). There are small things that change in the projects (most of them are good refactors) but can be a big headache for upgrades. Let me enumerate some: DB schema changes usually translates into offline upgrades, configuration changes (options that move to different configuration groups without bringing anything new, change defaults, policy changes), architecture changes (new projects that are now mandatory), ... In my opinion if we reduce those or at least are more aware of the challenges that they impose to operators, we will make upgrades easier and hopefully see deployments move much faster whatever is the release cycle. cheers, Belmiro On Tue, Nov 9, 2021 at 12:04 AM Arnaud wrote: > Hey, > I'd like to add my 2 cents. > > It's hard to upgrade a region, so when it comes to upgrade multiples > regions, it's even harder. > > Some operators also have their own downstream patchs / extensions / > drivers which make the upgrade process more complex, so it take more time > (for all reasons already given in the thread, need to update the CI, the > tools, the doc, the people, etc). > > One more thing is about consistency, when you have to manage multiple > regions, it's easier if all of them are pretty identical. Human operation > are always the same, and can eventually be automated. > This leads to keep going on with a fixed version of OpenStack to run the > business. > When scaling, you (we) always chose security and consistency. > > Also, Julia mentioned something true about contribution from operators. > It's difficult for them for multiple reasons: > - pushing upstream is a process, which need to be taken into account when > working on an internal fix. > - it's usually quicker to push downstream because it's needed. When it > comes to upstream, it's challenged by the developers (and it's good), so it > take time and can be discouraging. > - operators are not running master, but a stable release. Bugs on stables > could be fixed differently than on master, which could also be discouraging. > - writing unit tests is a job, some tech operators are not necessarily > developers, so this could also be a challenge. > > All of these to say that helping people which are proposing a patch is a > good thing. And as far as I can see, upstream developers are helping most > of the time, and we should keep and encourage such behavior IMHO. > > Finally, I would also vote for less releases or LTS releases (but it looks > heavier to have this). I think this would help keeping up to date with > stables and propose more patches from operators. > > Cheers, > Arnaud. > > > Le 8 novembre 2021 20:43:18 GMT+01:00, Julia Kreger < > juliaashleykreger at gmail.com> a ?crit : >> >> On Mon, Nov 8, 2021 at 10:44 AM Thierry Carrez wrote: >> >>> >>> Ghanshyam Mann wrote: >>> >>>> [...] >>>> Thanks Thierry for the detailed write up. 
>>>> >>>> At the same time, a shorter release which leads to upgrade-often pressure but >>>> it will have fewer number of changes/features, so make the upgrade easy and >>>> longer-release model will have more changes/features that will make upgrade more >>>> complex. >>>> >>> >>> I think that was true a few years ago, but I'm not convinced that still >>> holds. We currently have a third of the changes volume we had back in >>> 2015, so a one-year release in 2022 would contain far less changes than >>> a 6-month release from 2015. >>> >> >> I concur. Also, in 2015, we were still very much in a "move fast" mode >> of operation as a community. >> >> Also, thanks to our testing and our focus on stability, the pain linked >>> to the amount of breaking changes in a release is now negligible >>> compared to the basic pain of going through a 1M-core deployment and >>> upgrading the various pieces... every 6 months. I've heard of multiple >>> users claiming it takes them close to 6 months to upgrade their massive >>> deployments to a new version. So when they are done, they have to start >>> again. >>> >>> -- >>> Thierry Carrez (ttx) >>> >>> >> I've been hearing the exact same messaging from larger operators as >> well as operators in environments where they are concerned about >> managing risk for at least the past two years. These operators have >> indicated it is not uncommon for the upgrade projects which consume, >> test, certify for production, and deploy to production take *at least* >> six months to execute. At the same time, they are shy of being the >> ones to also "find all of the bugs", and so the project doesn't >> actually start until well after the new coordinated release has >> occurred. Quickly they become yet another version behind with this >> pattern. >> >> I suspect it is really easy for us as a CI focused community to think >> that six months is plenty of time to roll out a fully updated >> deployment which has been fully tested in every possible way. Except, >> these operators are often trying to do just that on physical hardware, >> with updated firmware and operatings systems bringing in new variables >> with every single change which may ripple up the entire stack. These >> operators then have to apply the lessons they have previously learned >> once they have worked through all of the variables. In some cases this >> may involve aspects such as benchmarking, to ensure they don't need to >> make additional changes which need to be factored into their >> deployment, sending them back to the start of their testing. All while >> thinking of phrases like "business/mission critical". >> >> I guess this means I'm in support of revising the release cycle. At >> the same time, I think it would be wise for us to see if we can learn >> from these operators the pain points they experience, the process they >> leverage, and ultimately see if there are opportunities to spread >> knowledge or potentially tooling. Or maybe even get them to contribute >> their patches upstream. Not that all of these issues are easily solved >> with any level of code, but sometimes they can include contextual >> disconnects and resolving those are just as important as shipping a >> release, IMHO. >> >> -Julia >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From juliaashleykreger at gmail.com Tue Nov 9 14:45:38 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 9 Nov 2021 07:45:38 -0700 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Message-ID: On Tue, Nov 9, 2021 at 6:50 AM Belmiro Moreira wrote: > > Hi, > It's time again to discuss the release cycle... > Just considering the number of times that lately we have been > discussing the release cycle we should acknowledge that we really > have a problem or at least that we have very different opinions in > the community and we should discuss it openly. > > Thanks Thierry to bring the topic again. > > Looking into the last user survey we see that 23% of the deployments > are running the last two releases and then we have a long... long... > tail with older releases. > > Honestly, I have mixed feelings about it! > > As an operator I relate more with having a LTS release and give the > possibility to upgrade between LTS releases. But having the possibility > to upgrade every 6 months is also very interesting for the small and fast > moving projects. > > Maybe an 1 year release cycle would provide the mid term here. > > In our cloud infrastructure we run different releases, from Stein to Victoria. > There are projects that we can easily upgrade (and we do it!) and other projects > that are much more complicated (because feature deprecations, Operating System > dependencies, internal patches, or simply because is too risky considering the > current workloads). > For those we need definitely more than 6 months for the upgrade. > > If again we don't reach a consensus to change the release cycle at least we > should continue to work in improving the upgrade experience (and don't let me wrong... > the upgrade experience has been improved tremendously over the years). > > There are small things that change in the projects (most of them are good refactors) > but can be a big headache for upgrades. > Let me enumerate some: DB schema changes usually translates into offline upgrades, > configuration changes (options that move to different configuration groups without > bringing anything new, change defaults, policy changes), architecture changes > (new projects that are now mandatory), ... > This is the kind of contextual reminder that needs to come up frequently. Is there any chance of conveying how long the outages are with a deployment size in your experience, with your level of risk tolerance. Same goes for human/operational impact of working through aspects like configuration options changing/moving, policy changes, architectural changes, new projects being mandatory? My hope is that we convey some sense of "what it really takes" to help provide context in which contributors making changes understand how, at least at a high level, their changes may impact others. > In my opinion if we reduce those or at least are more aware of the challenges > that they impose to operators, we will make upgrades easier and hopefully see > deployments move much faster whatever is the release cycle. > > cheers, > Belmiro > > On Tue, Nov 9, 2021 at 12:04 AM Arnaud wrote: >> >> Hey, >> I'd like to add my 2 cents. >> >> It's hard to upgrade a region, so when it comes to upgrade multiples regions, it's even harder. 
>> >> Some operators also have their own downstream patchs / extensions / drivers which make the upgrade process more complex, so it take more time (for all reasons already given in the thread, need to update the CI, the tools, the doc, the people, etc). >> >> One more thing is about consistency, when you have to manage multiple regions, it's easier if all of them are pretty identical. Human operation are always the same, and can eventually be automated. >> This leads to keep going on with a fixed version of OpenStack to run the business. >> When scaling, you (we) always chose security and consistency. >> >> Also, Julia mentioned something true about contribution from operators. It's difficult for them for multiple reasons: >> - pushing upstream is a process, which need to be taken into account when working on an internal fix. >> - it's usually quicker to push downstream because it's needed. When it comes to upstream, it's challenged by the developers (and it's good), so it take time and can be discouraging. >> - operators are not running master, but a stable release. Bugs on stables could be fixed differently than on master, which could also be discouraging. >> - writing unit tests is a job, some tech operators are not necessarily developers, so this could also be a challenge. >> >> All of these to say that helping people which are proposing a patch is a good thing. And as far as I can see, upstream developers are helping most of the time, and we should keep and encourage such behavior IMHO. >> >> Finally, I would also vote for less releases or LTS releases (but it looks heavier to have this). I think this would help keeping up to date with stables and propose more patches from operators. >> >> Cheers, >> Arnaud. >> >> >> Le 8 novembre 2021 20:43:18 GMT+01:00, Julia Kreger a ?crit : >>> >>> On Mon, Nov 8, 2021 at 10:44 AM Thierry Carrez wrote: >>>> >>>> >>>> Ghanshyam Mann wrote: >>>>> >>>>> [...] >>>>> Thanks Thierry for the detailed write up. >>>>> >>>>> At the same time, a shorter release which leads to upgrade-often pressure but >>>>> it will have fewer number of changes/features, so make the upgrade easy and >>>>> longer-release model will have more changes/features that will make upgrade more >>>>> complex. >>>> >>>> >>>> I think that was true a few years ago, but I'm not convinced that still >>>> holds. We currently have a third of the changes volume we had back in >>>> 2015, so a one-year release in 2022 would contain far less changes than >>>> a 6-month release from 2015. >>> >>> >>> I concur. Also, in 2015, we were still very much in a "move fast" mode >>> of operation as a community. >>> >>>> Also, thanks to our testing and our focus on stability, the pain linked >>>> to the amount of breaking changes in a release is now negligible >>>> compared to the basic pain of going through a 1M-core deployment and >>>> upgrading the various pieces... every 6 months. I've heard of multiple >>>> users claiming it takes them close to 6 months to upgrade their massive >>>> deployments to a new version. So when they are done, they have to start >>>> again. >>>> >>>> -- >>>> Thierry Carrez (ttx) >>>> >>> >>> I've been hearing the exact same messaging from larger operators as >>> well as operators in environments where they are concerned about >>> managing risk for at least the past two years. These operators have >>> indicated it is not uncommon for the upgrade projects which consume, >>> test, certify for production, and deploy to production take *at least* >>> six months to execute. 
At the same time, they are shy of being the >>> ones to also "find all of the bugs", and so the project doesn't >>> actually start until well after the new coordinated release has >>> occurred. Quickly they become yet another version behind with this >>> pattern. >>> >>> I suspect it is really easy for us as a CI focused community to think >>> that six months is plenty of time to roll out a fully updated >>> deployment which has been fully tested in every possible way. Except, >>> these operators are often trying to do just that on physical hardware, >>> with updated firmware and operatings systems bringing in new variables >>> with every single change which may ripple up the entire stack. These >>> operators then have to apply the lessons they have previously learned >>> once they have worked through all of the variables. In some cases this >>> may involve aspects such as benchmarking, to ensure they don't need to >>> make additional changes which need to be factored into their >>> deployment, sending them back to the start of their testing. All while >>> thinking of phrases like "business/mission critical". >>> >>> I guess this means I'm in support of revising the release cycle. At >>> the same time, I think it would be wise for us to see if we can learn >>> from these operators the pain points they experience, the process they >>> leverage, and ultimately see if there are opportunities to spread >>> knowledge or potentially tooling. Or maybe even get them to contribute >>> their patches upstream. Not that all of these issues are easily solved >>> with any level of code, but sometimes they can include contextual >>> disconnects and resolving those are just as important as shipping a >>> release, IMHO. >>> >>> -Julia >>> From mkopec at redhat.com Tue Nov 9 14:55:32 2021 From: mkopec at redhat.com (Martin Kopec) Date: Tue, 9 Nov 2021 15:55:32 +0100 Subject: [qa] Moving office hour one hour later Message-ID: Hi everyone, we have decided to move the weekly office hour (held on Tuesdays) one later to 15:00 UTC effective November 16th [1]. It was discussed during our last office hour [2]. [1] https://review.opendev.org/c/opendev/irc-meetings/+/817224 [2] https://meetings.opendev.org/meetings/qa/2021/qa.2021-11-09-14.00.log.html#l-74 Regards, -- Martin Kopec Senior Software Quality Engineer Red Hat EMEA -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Tue Nov 9 15:17:37 2021 From: dms at danplanet.com (Dan Smith) Date: Tue, 09 Nov 2021 07:17:37 -0800 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> (Arnaud's message of "Mon, 08 Nov 2021 23:57:54 +0100") References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Message-ID: > - it's usually quicker to push downstream because it's needed. When it > comes to upstream, it's challenged by the developers (and it's good), > so it take time and can be discouraging. I'm sure many operators push downstream first, and then chuck a patch into upstream gerrit in hopes of it landing upstream so they don't have to maintain it long-term. Do you think the possibility of it not landing for a year (if they make it in the first one) or two (if it goes into the next one) is a disincentive to pushing upstream? I would think it might push it past the event horizon making downstream patches more of a constant. 
> - writing unit tests is a job, some tech operators are not necessarily > developers, so this could also be a challenge. Yep, and my experience is that this sort of "picking up the pieces" of good fixes that need help from another developer happens mostly at the end of the release, post-FF in a lot of cases. This is the time when the pressure of the pending release is finally on and we get around to this sort of task. Expanding the release window increases the number of these things collected per cycle, and delays them being in a release by a long time. I know, we should just "do better" for the earlier parts of the cycle, but realistically that won't happen :) --Dan From zigo at debian.org Tue Nov 9 18:07:39 2021 From: zigo at debian.org (Thomas Goirand) Date: Tue, 9 Nov 2021 19:07:39 +0100 Subject: [all][tc] Relmgt team position on release cadence In-Reply-To: References: <40f6f51b-6903-1afe-7318-0689b2af482d@openstack.org> <17cfbd00b82.1268d0068215518.8021398674853406097@ghanshyammann.com> <83EBFE80-193C-47F4-91B5-65BEE27509E6@gmail.com> Message-ID: On 11/9/21 4:17 PM, Dan Smith wrote: >> - it's usually quicker to push downstream because it's needed. When it >> comes to upstream, it's challenged by the developers (and it's good), >> so it take time and can be discouraging. > > I'm sure many operators push downstream first, and then chuck a patch > into upstream gerrit in hopes of it landing upstream so they don't have > to maintain it long-term. Do you think the possibility of it not landing > for a year (if they make it in the first one) or two (if it goes into > the next one) is a disincentive to pushing upstream? Don't ask your colleagues upstream about what we do! :) With my Debian package maintainer hat on: I will continue to send patch upstream whenever I can. > Yep, and my experience is that this sort of "picking up the pieces" of > good fixes that need help from another developer happens mostly at the > end of the release, post-FF in a lot of cases. This is the time when the > pressure of the pending release is finally on and we get around to this > sort of task. It used to be that I was told to add unit tests, open a bug and close it, etc. and if I was not doing it, the patch would just stay open forever. This was the early days of OpenStack... >From packager view, that's not what I experienced. Mostly, upstream OpenStack people are nice, and understand that we (package maintainers) just jump from one package to another, and can't afford more than 15 minutes per package upgrade (considering upgrading to Xena meant upgrading 220 packages...). I've seen numerous times upstream projects taking over one of my patch, finishing the work (sometimes adding unit tests) and make the patch land (sometimes, even backport it to earlier releases). I don't think switching to a 1 year release cycle will change anything regarding distro <-> upstream relationship. Hopefully, OpenStack people will continue to be awesome and nice to work with... :) Cheers, Thomas Goirand (zigo) From dmeng at uvic.ca Mon Nov 8 21:37:56 2021 From: dmeng at uvic.ca (dmeng) Date: Mon, 08 Nov 2021 13:37:56 -0800 Subject: [sdk]: Get server fault message Message-ID: <135e55fafeebd0c6e8328ea2f4da4713@uvic.ca> Hello there, Hope everything is going well. I'm wondering if there is any method that could show the error message of a server whose status is "ERROR"? Like from openstack cli, "openstack server show server_name", if the server is in "ERROR" status, this will return a field "fault" with a message shows the error. 
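(For illustration, a minimal openstacksdk sketch of this lookup; whether the "fault" field is mapped onto the Server object depends on the SDK release, so the sketch falls back to reading the raw compute API response, which does carry the field for servers in ERROR state. The cloud name is a placeholder.)

import openstack

conn = openstack.connect(cloud='mycloud')   # placeholder clouds.yaml entry

server = conn.compute.find_server('server_name')
fault = server.to_dict().get('fault')       # may be None if this SDK release does not map it

if not fault:
    # fall back to the raw API: GET /servers/{id} includes 'fault' for ERROR servers
    raw = conn.compute.get('/servers/{id}'.format(id=server.id)).json()
    fault = raw['server'].get('fault')

print(fault)   # e.g. {'code': 500, 'created': '...', 'message': '...'}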
I tried the compute service get_server and find_server, but neither of them show the error messages. Thanks and have a great day! Catherine -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyliu0592 at hotmail.com Tue Nov 9 23:50:57 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 9 Nov 2021 23:50:57 +0000 Subject: [keystone] failed auth takes a while Message-ID: Hi, I am using Keystone 17.0.0 with Ussuri. When issue a request by openstack cli (eg. network list) with wrong password, it takes 1+ minute to get auth failure 401 back. After correct the password, for the first request, it still takes 1+ minute to return success. After that, all following request is as fast as usual, around 1s. "openstack user password set" also takes that much time to return success. Any clues? Thanks! Tony From emilien at redhat.com Wed Nov 10 01:36:00 2021 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 9 Nov 2021 20:36:00 -0500 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley wrote: [...] This was based on experiences trying to work with the Kata > community, and the "experiment" referenced in that mailing list post > eventually concluded with the removal of remaining Kata project > configuration when https://review.opendev.org/744687 merged > approximately 15 months ago. > ack I haven't seen much pushback from moving to Gerrit, but pretty much all feedback I got was from folks who worked (is working) on OpenStack, so a bit biased in my opinion (myself included). Beside that, if we would move to opendev, I want to see some incentives in our roadmap, not just "move our project here because it's cool". Some ideas: * Consider it as a subproject from OpenStack SDK? Or part of a SIG? * CI coverage for API regression testing (e.g. gophercloud/acceptance/compute running in Nova CI) * Getting more exposure of the project and potentially more contributors * Consolidate the best practices in general, for contributions to the project, getting started, dev environments, improving CI jobs (current jobs use OpenLab zuul, with a fork of zuul jobs). Is there any concern would we have to discuss? -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Wed Nov 10 01:48:41 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 9 Nov 2021 18:48:41 -0700 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: On Tue, Nov 9, 2021 at 6:40 PM Emilien Macchi wrote: > > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley wrote: > [...] > >> This was based on experiences trying to work with the Kata >> community, and the "experiment" referenced in that mailing list post >> eventually concluded with the removal of remaining Kata project >> configuration when https://review.opendev.org/744687 merged >> approximately 15 months ago. > > > ack > > I haven't seen much pushback from moving to Gerrit, but pretty much all feedback I got was from folks who worked (is working) on OpenStack, so a bit biased in my opinion (myself included). > Beside that, if we would move to opendev, I want to see some incentives in our roadmap, not just "move our project here because it's cool". > > Some ideas: > * Consider it as a subproject from OpenStack SDK? Or part of a SIG? 
My $0.01 opinion is to move the OpenStack SDK "project" to be a generic "SDK" project, in which gophercloud could live. Mainly for a point of contact perspective, but I think "whatever works" may be best in the end. > * CI coverage for API regression testing (e.g. gophercloud/acceptance/compute running in Nova CI) I strongly suspect it wouldn't be hard to convince ironic contributors to do something similar for Ironic's CI. > * Getting more exposure of the project and potentially more contributors > * Consolidate the best practices in general, for contributions to the project, getting started, dev environments, improving CI jobs (current jobs use OpenLab zuul, with a fork of zuul jobs). On a plus side, the official SDK list could be updated... This would be kind of epic, actually. > > Is there any concern would we have to discuss? > -- > Emilien Macchi From tkajinam at redhat.com Wed Nov 10 02:32:11 2021 From: tkajinam at redhat.com (Takashi Kajinami) Date: Wed, 10 Nov 2021 11:32:11 +0900 Subject: [puppet] Propose retiring puppet-senlin In-Reply-To: References: Message-ID: Thanks Tobias for your feedback. Because I've heard no objections for a week while there is a positive response, I've proposed the patches to retire the project[1]. [1] https://review.opendev.org/q/topic:retire-puppet-senlin On Wed, Nov 3, 2021 at 8:09 PM Tobias Urdin wrote: > +1 for retiring > > Best regards > Tobias > > On 3 Nov 2021, at 11:49, Takashi Kajinami wrote: > > Hello, > > > I remember I raised a similar discussion recently[1] but > we need the same for a different module. > > puppet-selin was introduced back in 2018, but the module has had > only the portion made by cookiecutter and has no capability to manage > fundamental resources yet. > Because we haven't seen any interest in creating implementations > to support even basic usage, I'll propose retiring this module. > > I'll be open for any feedback for a while, and will propose a series > of patches for retirement if no concern is raised here for one week. > > Thank you, > Takashi > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2021-September/024687.html > [2] https://opendev.org/openstack/puppet-senlin > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjoen at dds.nl Wed Nov 10 08:00:44 2021 From: tjoen at dds.nl (tjoen) Date: Wed, 10 Nov 2021 09:00:44 +0100 Subject: [sdk]: Get server fault message In-Reply-To: <135e55fafeebd0c6e8328ea2f4da4713@uvic.ca> References: <135e55fafeebd0c6e8328ea2f4da4713@uvic.ca> Message-ID: On 11/8/21 22:37, dmeng wrote: > I'm wondering if there is any method that could show the error message > of a server whose status is "ERROR"? Like from openstack cli, "openstack journalctl From iurygregory at gmail.com Wed Nov 10 08:12:47 2021 From: iurygregory at gmail.com (Iury Gregory) Date: Wed, 10 Nov 2021 09:12:47 +0100 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: Em qua., 10 de nov. de 2021 ?s 02:49, Julia Kreger < juliaashleykreger at gmail.com> escreveu: > On Tue, Nov 9, 2021 at 6:40 PM Emilien Macchi wrote: > > > > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley > wrote: > > [...] > > > >> This was based on experiences trying to work with the Kata > >> community, and the "experiment" referenced in that mailing list post > >> eventually concluded with the removal of remaining Kata project > >> configuration when https://review.opendev.org/744687 merged > >> approximately 15 months ago. 
> > > > > > ack > > > > I haven't seen much pushback from moving to Gerrit, but pretty much all > feedback I got was from folks who worked (is working) on OpenStack, so a > bit biased in my opinion (myself included). > > Beside that, if we would move to opendev, I want to see some incentives > in our roadmap, not just "move our project here because it's cool". > > > > Some ideas: > > * Consider it as a subproject from OpenStack SDK? Or part of a SIG? > > My $0.01 opinion is to move the OpenStack SDK "project" to be a > generic "SDK" project, in which gophercloud could live. Mainly for a > point of contact perspective, but I think "whatever works" may be best > in the end. > I agree with Julia's comment here. > * CI coverage for API regression testing (e.g. > gophercloud/acceptance/compute running in Nova CI) > > I strongly suspect it wouldn't be hard to convince ironic contributors > to do something similar for Ironic's CI. > ++ I will be more than happy to help with this since I've contributed to gophercloud, and I've also helped them to add a job that runs Ironic so we could run acceptance tests. > > * Getting more exposure of the project and potentially more contributors > > * Consolidate the best practices in general, for contributions to the > project, getting started, dev environments, improving CI jobs (current jobs > use OpenLab zuul, with a fork of zuul jobs). > > On a plus side, the official SDK list could be updated... This would > be kind of epic, actually. > ++ I think this would be very good. > > > > > Is there any concern would we have to discuss? > > -- > > Emilien Macchi > > -- *Att[]'sIury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the ironic-core and puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From nabeel.tariq at rapidcompute.com Wed Nov 10 08:50:34 2021 From: nabeel.tariq at rapidcompute.com (nabeel.tariq at rapidcompute.com) Date: Wed, 10 Nov 2021 13:50:34 +0500 Subject: OVN with SSL Message-ID: <000401d7d610$081b59f0$18520dd0$@rapidcompute.com> Hi, We have configured OVN with SSL which is working while using SSL from third party. When we implement self-signed certificate, it shows certificate verification failed in logs. ERROR LOG: 2021-11-10 10:11:14.703 88854 ERROR neutron.service OpenSSL.SSL.Error: [('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')] By using openssl verify command selfsigned server is showing verified. openssl verify -verbose -CAfile .pem
.crt -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 20946 bytes Desc: not available URL: From mdulko at redhat.com Wed Nov 10 09:43:54 2021 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Wed, 10 Nov 2021 10:43:54 +0100 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: On Tue, 2021-11-09 at 20:36 -0500, Emilien Macchi wrote: > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley > wrote: > [...] > ? > > This was based on experiences trying to work with the Kata > > community, and the "experiment" referenced in that mailing list > > post > > eventually concluded with the removal of remaining Kata project > > configuration when https://review.opendev.org/744687 merged > > approximately 15 months ago. > > > > > ack > > I haven't seen much pushback from moving to Gerrit, but pretty much > all feedback I got was from folks who worked (is working) on > OpenStack, so a bit biased in my?opinion (myself included). > Beside that, if we would move to opendev, I want to see some > incentives in our roadmap, not just "move our project here because > it's cool". > > Some ideas: > * Consider it as a subproject from OpenStack SDK? Or part of a SIG? > * CI coverage for API regression testing (e.g. > gophercloud/acceptance/compute running in Nova CI) > * Getting more exposure of the project and potentially more > contributors > * Consolidate the best practices in general, for contributions to the > project, getting started, dev environments, improving CI jobs > (current jobs use OpenLab zuul, with a fork of zuul jobs). > > Is there any concern would we have to discuss? Besides that with DevStack and its stable branches you have an easy way to test Gophercloud against various OpenStack versions. From senrique at redhat.com Wed Nov 10 13:11:23 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 10 Nov 2021 10:11:23 -0300 Subject: [cinder] Bug deputy report for week of 11-10-2021 Message-ID: This is a bug report from 10-03-2021 to 11-10-2021. Agenda: https://etherpad.opendev.org/p/cinder-bug-squad-meeting ----------------------------------------------------------------------------------------- Medium - https://bugs.launchpad.net/cinder/+bug/1950291 tempest-integrated-compute-centos-8-stream fails with version conflict in boto. Unassigned. Incomplete - https://bugs.launchpad.net/cinder/+bug/1950134 InstanceLocality filter use results in AttributeError: 'Client' object has no attribute 'list_extensions'. Unassigned. Invalid - https://bugs.launchpad.net/cinder/+bug/1950128 NFS backend initialisation fails because an immutable directory exists. Unassigned. -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Wed Nov 10 15:00:50 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 09:00:50 -0600 Subject: [all][tc] Continuing the RBAC PTG discussion In-Reply-To: <17ce6ca4f3c.11a0d934542096.8078141879504981729@ghanshyammann.com> References: <17cc3068765.11a9586ce162263.9148179786177865922@ghanshyammann.com> <17cdc3e5964.e42274e1372366.1164114504252831335@ghanshyammann.com> <17ce6ca4f3c.11a0d934542096.8078141879504981729@ghanshyammann.com> Message-ID: <17d0a5daab0.f08965b2428404.1434605436511830260@ghanshyammann.com> We have started this meeting, you can join @ https://meet.google.com/uue-adpp-xsm -gmann ---- On Wed, 03 Nov 2021 12:13:10 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > We figured out a lot of things in today call and for other open queries or goal stuff, > we will continue the discussion next week on Wed Nov 10th, 15:00 - 16:00 UTC. > > Below is the link to join the call: > > https://meet.google.com/uue-adpp-xsm > > -gmann > > ---- On Mon, 01 Nov 2021 11:04:06 -0500 Ghanshyam Mann wrote ---- > > ---- On Wed, 27 Oct 2021 13:32:37 -0500 Ghanshyam Mann wrote ---- > > > Hello Everyone, > > > > > > As decided in PTG, we will continue the RBAC discussion from where we left in PTG. We will have a video > > > call next week based on the availability of most of the interested members. > > > > > > Please vote your available time in below doodle vote by Thursday (or Friday morning central time). > > > > > > - https://doodle.com/poll/6xicntb9tu657nz7 > > > > As per doodle voting, I have schedule it on Nov 3rd Wed, 15:00 - 16:00 UTC. > > > > Below is the link to join the call: > > > > https://meet.google.com/uue-adpp-xsm > > > > We will be taking notes in this etherpad https://etherpad.opendev.org/p/policy-popup-yoga-ptg > > > > -gmann > > > > > > > > NOTE: this is not specific to TC or people working in RBAC work but more to wider community to > > > get feedback and finalize the direction (like what we did in PTG session). > > > > > > Meanwhile, feel free to review the lance's updated proposal for community-wide goal > > > - https://review.opendev.org/c/openstack/governance/+/815158 > > > > > > -gmann > > > > > > > > > > > > From moreira.belmiro.email.lists at gmail.com Wed Nov 10 15:48:17 2021 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Wed, 10 Nov 2021 16:48:17 +0100 Subject: [largescale-sig] Next meeting: Nov 10th, 15utc In-Reply-To: References: Message-ID: Hi, we held our meeting today. We discussed the topic of our next "Large Scale OpenStack" episode on OpenInfra.Live, which should happen on Dec 9. The episode will be around tricks and tools that large deployments use for day to day ops. You can read the meeting logs at: https://meetings.opendev.org/meetings/large_scale_sig/2021/large_scale_sig.2021-11-10-15.00.html Our next IRC meeting will be Nov 24, at 1500utc on #openstack-operators on OFTC. best regards, Belmiro On Mon, Nov 8, 2021 at 5:04 PM Thierry Carrez wrote: > Hi everyone, > > The Large Scale SIG meeting is back this Wednesday in > #openstack-operators on OFTC IRC, at 15UTC. It will be chaired by > Belmiro Moreira. > > You can doublecheck how that time translates locally at: > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20211110T15 > > Feel free to add topics to our agenda at: > https://etherpad.openstack.org/p/large-scale-sig-meeting > > Regards, > > -- > Thierry Carrez > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From elod.illes at est.tech Wed Nov 10 18:40:08 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Wed, 10 Nov 2021 19:40:08 +0100 Subject: [cinder][kolla][OSA][release] Xena cycle-trailing release deadline Message-ID: <95058549-f349-024e-37b2-97a1a9aa62fc@est.tech> Hello teams with deliverables following the cycle-trailing release model! This is just a reminder to wrap up your trailing deliverables for Xena. A few cycles ago the deadline for cycle-trailing projects was extended to give more time. The deadline for Xena is *16 December, 2021* [1]. If things are ready sooner than that though, all the better for our downstream consumers. For reference, the following cycle-trailing deliverables will need final releases at some point until the above deadline: cinderlib kayobe kolla-ansible kolla openstack-ansible-roles openstack-ansible Thanks! El?d Ill?s irc: elodilles [1] https://releases.openstack.org/yoga/schedule.html#y-cycle-trail From gmann at ghanshyammann.com Wed Nov 10 19:03:36 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 13:03:36 -0600 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> Message-ID: <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> ---- On Tue, 09 Nov 2021 19:36:00 -0600 Emilien Macchi wrote ---- > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley wrote: > [...] > This was based on experiences trying to work with the Kata > community, and the "experiment" referenced in that mailing list post > eventually concluded with the removal of remaining Kata project > configuration when https://review.opendev.org/744687 merged > approximately 15 months ago. > > ack > I haven't seen much pushback from moving to Gerrit, but pretty much all feedback I got was from folks who worked (is working) on OpenStack, so a bit biased in my opinion (myself included).Beside that, if we would move to opendev, I want to see some incentives in our roadmap, not just "move our project here because it's cool". > Some ideas:* Consider it as a subproject from OpenStack SDK? Or part of a SIG?* CI coverage for API regression testing (e.g. gophercloud/acceptance/compute running in Nova CI)* Getting more exposure of the project and potentially more contributors* Consolidate the best practices in general, for contributions to the project, getting started, dev environments, improving CI jobs (current jobs use OpenLab zuul, with a fork of zuul jobs). > Is there any concern would we have to discuss?-- +1, Thanks Emilien for putting the roadmap which is more clear to understand the logn term benefits. Looks good to me, especially CI part is cool to have from API testing perspective and to know where we break things (we run client jobs in many projects CI so it should not be something special we need to do) >* Consider it as a subproject from OpenStack SDK? Or part of a SIG? Just to be more clear on this. Does this mean, once we setup the things in opendev then we can migrate it under openstack/ namespace under OpenStack SDK umbrella? or you mean keep it in opendev with non-openstack namespace but collaborative effort with SDK team. 
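(A hypothetical sketch of what such a job could look like once the repo lives in opendev; the job name, playbook path and variables here are made up, only the devstack parent job and the general Zuul job syntax are existing conventions.)

- job:
    name: gophercloud-acceptance-devstack
    parent: devstack
    description: |
      Deploy a devstack cloud and run the gophercloud acceptance tests
      against it, the same way SDK/client functional jobs gate other projects.
    required-projects:
      - opendev.org/openstack/devstack
    run: playbooks/acceptance/run.yaml
    vars:
      devstack_services:
        tempest: false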
-gmann > Emilien Macchi > From radoslaw.piliszek at gmail.com Wed Nov 10 19:18:31 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 10 Nov 2021 20:18:31 +0100 Subject: [cinder][kolla][OSA][release] Xena cycle-trailing release deadline In-Reply-To: <95058549-f349-024e-37b2-97a1a9aa62fc@est.tech> References: <95058549-f349-024e-37b2-97a1a9aa62fc@est.tech> Message-ID: Thanks, El?d, for the reminder. Unless something unexpected comes up, Kolla projects release next week around Nov 16. -yoctozepto On Wed, 10 Nov 2021 at 19:41, El?d Ill?s wrote: > > Hello teams with deliverables following the cycle-trailing release model! > > This is just a reminder to wrap up your trailing deliverables for Xena. > A few cycles ago the deadline for cycle-trailing projects was extended > to give more time. The deadline for Xena is *16 December, 2021* [1]. > > If things are ready sooner than that though, all the better for our > downstream consumers. > > For reference, the following cycle-trailing deliverables will need > final releases at some point until the above deadline: > > cinderlib > kayobe > kolla-ansible > kolla > openstack-ansible-roles > openstack-ansible > > Thanks! > > El?d Ill?s > irc: elodilles > > [1] https://releases.openstack.org/yoga/schedule.html#y-cycle-trail > > > > From gmann at ghanshyammann.com Wed Nov 10 23:56:08 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 17:56:08 -0600 Subject: [all][tc] Technical Committee next weekly meeting on Nov 11th at 1500 UTC In-Reply-To: <17d00d5940e.c605c228290918.1450543438871901915@ghanshyammann.com> References: <17d00d5940e.c605c228290918.1450543438871901915@ghanshyammann.com> Message-ID: <17d0c47bee0.12ba6e056450219.3324779231364314039@ghanshyammann.com> Hello Everyone, Below is the agenda for Tomorrow's TC IRC meeting schedule at 1500 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting == Agenda for tomorrow's TC meeting == * Roll call * Follow up on past action items * Gate health check * Updates on community-wide goal ** Decoupling goal from release cycle *** https://review.opendev.org/c/openstack/governance/+/816387 ** RBAC goal rework *** https://review.opendev.org/c/openstack/governance/+/815158 ** Proposed community goal for FIPS compatibility and compliance *** https://review.opendev.org/c/openstack/governance/+/816587 * Adjutant need PTLs and maintainers ** http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html * Pain Point targeting ** https://etherpad.opendev.org/p/pain-point-elimination * Open Reviews ** https://review.opendev.org/q/projects:openstack/governance+is:open -gmann ---- On Mon, 08 Nov 2021 12:35:36 -0600 Ghanshyam Mann wrote ---- > Hello Everyone, > > Technical Committee's next weekly meeting is scheduled for Nov 11th at 1500 UTC. > > If you would like to add topics for discussion, please add them to the below wiki page by > Wednesday, Nov 10th, at 2100 UTC. > > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > -gmann > > From gmann at ghanshyammann.com Thu Nov 11 00:41:21 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 18:41:21 -0600 Subject: [all] Stable Core Team Process changed Message-ID: <17d0c712625.1134a0c9e450546.8685029299925787442@ghanshyammann.com> Hello Everyone, We discussed this in many PTG (Shanghai and in Xena) to decentralize the stable branch process and make it more distributed towards projects. 
The technical Committee has merged the resolution on this and I thought of putting it on ML in case you have not read it. With the new process, the individual project teams can manage their own stable core team in the same way that the regular core team. Also, project teams will be empowered to create/enforce polices that best meet the needs of that project. Existing stable policies stay valid and will serve the purpose as it is doing currently. And if you have any questions or seeking guidance regarding stable policies, you can reach out to the existing "Extended Maintenance" SIG which will be renamed to "Stable Maintenance". [1] https://governance.openstack.org/tc/resolutions/20210923-stable-core-team.html -gmann From gmann at ghanshyammann.com Thu Nov 11 02:30:52 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 10 Nov 2021 20:30:52 -0600 Subject: [all] Anyone use or would like to maintain openstack/training-labs repo In-Reply-To: <17cc7e09369.ddfa2076225205.6942386596020382900@ghanshyammann.com> References: <17cc7e09369.ddfa2076225205.6942386596020382900@ghanshyammann.com> Message-ID: <17d0cd56981.d71e48a7451122.1077594163909534877@ghanshyammann.com> ---- On Thu, 28 Oct 2021 12:09:16 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > During the TC weekly meeting and PTG discussion to merge the 'Technical Writing' SIG into TC, we found > that openstack/training-labs is no maintained now a days. Even we do not know who use this repo for training. > > I have checked that upstream institute training or CoA are not using this repo in their training. If you are using > it for your training please reply to this email pr ping us on #openstack-tc IRC OFTC channel otherwise we will start > the retirement process. As there is no response on maintaining it, I have started the retirement proposal - https://review.opendev.org/c/openstack/governance/+/817511 -gmann > > - https://opendev.org/openstack/training-labs > > -gmann > > > From pbasaras at gmail.com Thu Nov 11 12:05:50 2021 From: pbasaras at gmail.com (Pavlos Basaras) Date: Thu, 11 Nov 2021 14:05:50 +0200 Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available" In-Reply-To: References: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com> Message-ID: Hello, no worries, million thanks for the support and the link for the UI. One more thing to point out is that in ( https://docs.openstack.org/zun/xena/install/compute-install.html) at step 9, the curl command is fixed to amd64 architecture --since i have an aarch64 system i fetched the appropriate version from https://github.com/containernetworking/plugins/releases/, and worked fine. best, Pavlos. On Thu, Nov 11, 2021 at 9:00 AM Hongbin Lu wrote: > Sorry for late reply. I missed this email. > > This is the Zun UI installation guide: > https://docs.openstack.org/zun-ui/latest/install/index.html#manual-installation . > About your suggest of pip3 installation, I will add a note to clarify that. > Thanks for pointing it out. > > Best regards, > Hongbin > > > > > > > At 2021-11-08 19:02:50, "Pavlos Basaras" wrote: > > Hello, > > you are right, that fixed the problem. > > In order to solve the problem i revisited > https://docs.openstack.org/placement/ussuri/install/verify.html > I executed: openstack resource provider list > Then removed the host that i use for containers, restarted the zun-compute > at the host and works perfectly. > Thank you very much for your input. 
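> For reference, the rough sequence on the controller was (the UUID below is
> only a placeholder for the stale provider entry, and the service name is
> the one created by the Zun install guide):
>
>   openstack resource provider list
>   openstack resource provider delete <stale-provider-uuid>
>   sudo systemctl restart zun-compute
>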
> > One more thing, I don't see at the horizon dashboard the tab for the > containers (i just see the nova compute related tab). Is there any > additional configuration for this? > > btw, from https://docs.openstack.org/placement/ussuri/install/verify.html, > i used pip3 install osc-placement (instead of pip install..) > > all the best > Pavlos. > > On Sat, Nov 6, 2021 at 10:24 AM Hongbin Lu wrote: > >> Hi, >> >> It looks zun-compute wants to create a resource provider in placement, >> but placement return a 409 response. I would suggest you to check >> placement's logs. My best guess is the resource provider with the same name >> is already created so placement returned 409. If this is a case, simply >> remove those resources and restart zun-compute service should resolve the >> problem. >> >> Best regards, >> Hongbin >> >> >> >> >> >> >> At 2021-11-05 22:43:38, "Pavlos Basaras" wrote: >> >> Hello, >> >> I have an Openstack cluster, with basic services and >> functionality working based on ussuri release. >> >> I am trying to install the Zun service to be able to deploy containers, >> following >> >> [controller] -- >> https://docs.openstack.org/zun/ussuri/install/controller-install.html >> and >> [compute] -- >> https://docs.openstack.org/zun/ussuri/install/compute-install.html >> >> I used the git branch based on ussuri for all components. >> >> I veryfined kuryr-libnetwork operation issuing from the compute node >> # docker network create --driver kuryr --ipam-driver kuryr --subnet >> 10.10.0.0/16 --gateway=10.10.0.1 test_net >> >> and seeing the network created successfully, etc. >> >> I am not very sure about the zun.conf file. >> What is the "endpoint_type = internalURL" parameter? >> Do I need to change internalURL? >> >> >> From sudo systemctl status zun-compute i see: >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, >> *args, **kw) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task return >> attempt.get(self._wrap_exception) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task six.reraise(self.value[0], >> self.value[1], self.value[2]) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/lib/python3/dist-packages/six.py", line 703, in reraise >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task raise value >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), >> attempt_number, 
False) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", >> line 350, in _update_to_placement >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task context, node_rp_uuid, >> name=compute_node.hostname) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", >> line 846, in get_provider_tree_and_ensure_r >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task >> parent_provider_uuid=parent_provider_uuid) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", >> line 628, in _ensure_resource_provider >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task >> parent_provider_uuid=parent_provider_uuid) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", >> line 514, in _create_resource_provider >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task global_request_id=context.global_id) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", >> line 225, in post >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task headers=headers, logger=LOG) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task return self.request(url, 'POST', >> **kwargs) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in >> request >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task return self.session.request(url, >> method, **kwargs) >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task File >> "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in >> request >> Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task raise exceptions.from_response(resp, >> method, url) >> *Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 >> ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: >> Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded)* >> >> What is this problem? any advice? >> I used the default configuration values ([keystone_auth] and >> [keystone_authtoken]) values based on the configuration from the above >> links. 
>> >> >> Aslo from the controller >> >> >> *openstack appcontainer service list* >> +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ >> | Id | Host | Binary | State | Disabled | Disabled Reason | >> Updated At | Availability Zone | >> >> +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ >> | 1 | compute5 | zun-compute | up | False | None | >> 2021-11-05T14:39:01.000000 | nova | >> >> +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ >> >> >> *openstack appcontainer host show compute5* >> +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ >> | Field | Value >> >> | >> >> +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ >> | uuid | ee3a5b44-8ffa-463e-939d-0c61868a596f >> >> | >> | links | [{'href': ' >> http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', >> 'rel': 'self'}, {'href': ' >> http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', >> 'rel': 'bookmark'}] | >> | hostname | compute5 >> >> | >> | mem_total | 7975 >> >> | >> | mem_used | 0 >> >> | >> | total_containers | 1 >> >> | >> | cpus | 10 >> >> | >> | cpu_used | 0.0 >> >> | >> | architecture | x86_64 >> >> | >> | os_type | linux >> >> | >> | os | Ubuntu 18.04.6 LTS >> >> | >> | kernel_version | 4.15.0-161-generic >> >> | >> | labels | {} >> >> | >> | disk_total | 63 >> >> | >> | disk_used | 0 >> >> | >> | disk_quota_supported | False >> >> | >> | runtimes | ['io.containerd.runc.v2', >> 'io.containerd.runtime.v1.linux', 'runc'] >> >> | >> | enable_cpu_pinning | False >> >> | >> >> +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ >> >> >> seems to work fine. >> >> However when i issue e.g., openstack appcontainer run --name container >> --net network=$NET_ID cirros ping 8.8.8.8 >> >> i get the error: | status_reason | There are not enough hosts >> available. >> >> Any ideas? >> >> One final thing is that I did see in the Horizon dashboard the >> container tab, to be able to deploy containers from horizon. Is there an >> extra configuration for this? >> >> sorry for the long mail. >> >> best, >> Pavlos >> >> >> >> >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kira034 at 163.com Thu Nov 11 07:00:04 2021 From: kira034 at 163.com (Hongbin Lu) Date: Thu, 11 Nov 2021 15:00:04 +0800 (CST) Subject: [Ussuri] [Zun] Container deployment problem with Zun "There are not enough hosts available" In-Reply-To: References: <656bfc53.18f5.17cf4592522.Coremail.kira034@163.com> Message-ID: Sorry for late reply. I missed this email. This is the Zun UI installation guide: https://docs.openstack.org/zun-ui/latest/install/index.html#manual-installation . About your suggest of pip3 installation, I will add a note to clarify that. Thanks for pointing it out. 
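In short, the manual steps in that guide are roughly the following (going from memory, so please follow the linked doc for the exact paths and file names on your install):

  pip3 install zun-ui
  cp /path/to/zun-ui/zun_ui/enabled/* \
     /path/to/horizon/openstack_dashboard/local/enabled/
  # then collect/compress the static assets and restart the horizon web
  # server (apache2/httpd) so the Container panel shows up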
Best regards, Hongbin At 2021-11-08 19:02:50, "Pavlos Basaras" wrote: Hello, you are right, that fixed the problem. In order to solve the problem i revisited https://docs.openstack.org/placement/ussuri/install/verify.html I executed: openstack resource provider list Then removed the host that i use for containers, restarted the zun-compute at the host and works perfectly. Thank you very much for your input. One more thing, I don't see at the horizon dashboard the tab for the containers (i just see the nova compute related tab). Is there any additional configuration for this? btw, from https://docs.openstack.org/placement/ussuri/install/verify.html, i used pip3 install osc-placement (instead of pip install..) all the best Pavlos. On Sat, Nov 6, 2021 at 10:24 AM Hongbin Lu wrote: Hi, It looks zun-compute wants to create a resource provider in placement, but placement return a 409 response. I would suggest you to check placement's logs. My best guess is the resource provider with the same name is already created so placement returned 409. If this is a case, simply remove those resources and restart zun-compute service should resolve the problem. Best regards, Hongbin At 2021-11-05 22:43:38, "Pavlos Basaras" wrote: Hello, I have an Openstack cluster, with basic services and functionality working based on ussuri release. I am trying to install the Zun service to be able to deploy containers, following [controller] -- https://docs.openstack.org/zun/ussuri/install/controller-install.html and [compute] -- https://docs.openstack.org/zun/ussuri/install/compute-install.html I used the git branch based on ussuri for all components. I veryfined kuryr-libnetwork operation issuing from the compute node # docker network create --driver kuryr --ipam-driver kuryr --subnet 10.10.0.0/16 --gateway=10.10.0.1 test_net and seeing the network created successfully, etc. I am not very sure about the zun.conf file. What is the "endpoint_type = internalURL" parameter? Do I need to change internalURL? 
From sudo systemctl status zun-compute i see: Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return Retrying(*dargs, **dkw).call(f, *args, **kw) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return attempt.get(self._wrap_exception) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task six.reraise(self.value[0], self.value[1], self.value[2]) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise value Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task attempt = Attempt(fn(*args, **kwargs), attempt_number, False) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/compute/compute_node_tracker.py", line 350, in _update_to_placement Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task context, node_rp_uuid, name=compute_node.hostname) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 846, in get_provider_tree_and_ensure_r Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 628, in _ensure_resource_provider Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task parent_provider_uuid=parent_provider_uuid) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 514, in _create_resource_provider Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task global_request_id=context.global_id) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/local/lib/python3.6/dist-packages/zun/scheduler/client/report.py", line 225, in post Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task headers=headers, logger=LOG) Nov 05 
14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 392, in post Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.request(url, 'POST', **kwargs) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 248, in request Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task return self.session.request(url, method, **kwargs) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 968, in request Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task raise exceptions.from_response(resp, method, url) Nov 05 14:34:58 compute5 zun-compute[8962]: 2021-11-05 14:34:58.828 8962 ERROR oslo_service.periodic_task keystoneauth1.exceptions.http.Conflict: Conflict (HTTP 409) (Request-ID: req-9a158c41-a485-4937-99e7-e38cdce7fded) What is this problem? any advice? I used the default configuration values ([keystone_auth] and [keystone_authtoken]) values based on the configuration from the above links. Aslo from the controller openstack appcontainer service list +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ | Id | Host | Binary | State | Disabled | Disabled Reason | Updated At | Availability Zone | +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ | 1 | compute5 | zun-compute | up | False | None | 2021-11-05T14:39:01.000000 | nova | +----+----------+-------------+-------+----------+-----------------+----------------------------+-------------------+ openstack appcontainer host show compute5 +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Field | Value | +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | uuid | ee3a5b44-8ffa-463e-939d-0c61868a596f | | links | [{'href': 'http://controller:9517/v1/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'self'}, {'href': 'http://controller:9517/hosts/ee3a5b44-8ffa-463e-939d-0c61868a596f', 'rel': 'bookmark'}] | | hostname | compute5 | | mem_total | 7975 | | mem_used | 0 | | total_containers | 1 | | cpus | 10 | | cpu_used | 0.0 | | architecture | x86_64 | | os_type | linux | | os | Ubuntu 18.04.6 LTS | | kernel_version | 4.15.0-161-generic | | labels | {} | | disk_total | 63 | | disk_used | 0 | | disk_quota_supported | False | | runtimes | ['io.containerd.runc.v2', 'io.containerd.runtime.v1.linux', 'runc'] | | enable_cpu_pinning | False | +----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ seems to work fine. 
However when i issue e.g., openstack appcontainer run --name container --net network=$NET_ID cirros ping 8.8.8.8 i get the error: | status_reason | There are not enough hosts available. Any ideas? One final thing is that I did see in the Horizon dashboard the container tab, to be able to deploy containers from horizon. Is there an extra configuration for this? sorry for the long mail. best, Pavlos -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Thu Nov 11 15:34:34 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 11 Nov 2021 16:34:34 +0100 Subject: [neutron] Drivers meeting agenda - 12.11.2021 Message-ID: Hi Neutron Drivers, The agenda for tomorrow's drivers meeting is at [1]. We have 1 RFE to discuss: * https://bugs.launchpad.net/neutron/+bug/1950454 : [RFE] GW IP and FIP QoS to inherit from network [1] https://wiki.openstack.org/wiki/Meetings/NeutronDrivers#Agenda See you at the meeting tomorrow. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From DHilsbos at performair.com Thu Nov 11 16:25:13 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Thu, 11 Nov 2021 16:25:13 +0000 Subject: [ops][keystone][cinder][swift][neutron][nova] Rename Availability Zone Message-ID: <0670B960225633449A24709C291A525251D4538F@COM03.performair.local> All; We recently decided to change our naming convention. As such I'd like to rename our current availability zone. The configuration files are the obvious places to do so. Is there anything I need to change in the database? Properties of images, volumes, servers, etc.? Should I just give this up as a bad deal? Maybe rebuild the cluster from scratch? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From gmann at ghanshyammann.com Thu Nov 11 16:40:52 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 11 Nov 2021 10:40:52 -0600 Subject: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <1e4045dd6edf572a8a32ca639056bc06a8782fe8.camel@etc.gen.nz> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> <9cabb3cb32a7441697f58933df72b514@verisign.com> <1e4045dd6edf572a8a32ca639056bc06a8782fe8.camel@etc.gen.nz> Message-ID: <17d0fdf9cef.bfbf002c504513.1457558466289860904@ghanshyammann.com> ---- On Tue, 09 Nov 2021 04:54:36 -0600 Andrew Ruthven wrote ---- > On Mon, 2021-11-08 at 19:49 +0000, Braden, Albert wrote:I didn't have any luck contacting Adrian. Does anyone know where the storyboard is that he mentions in his email? > I'll check in with Adrian to see if he has heard from anyone. > Cheers,AndrewCatalyst Cloud-- Andrew Ruthven, Wellington, New Zealandandrew at etc.gen.nz |Catalyst Cloud: | This space intentionally left blank https://catalystcloud.nz | @Andrew not sure but please let us know if someone from Catalyst is planning to maintain it? We are still waiting for volunteers to lead/maintain this project, if you are interested please reply here or ping us on #openstack-tc IRC channel. -gmann From james.slagle at gmail.com Thu Nov 11 18:26:41 2021 From: james.slagle at gmail.com (James Slagle) Date: Thu, 11 Nov 2021 13:26:41 -0500 Subject: [TripleO] directord to opendev Message-ID: Hello TripleO'ers, We are proposing to move the directord repo[1] from github to opendev as a top level organization. 
Moving to the opendev[2] hosting will make it easier to integrate with the existing tripleo-core team, our existing CI, and developer processes. The benefits seem to outweigh leaving it on github. At the same time, we want to encourage external contribution from outside TripleO, so we felt that making it its own organization under opendev would be better suited to that than placing it under the openstack organization. If there are concerns, please raise them. Thank you! [1] https://github.com/Directord/ [2] https://opendev.org/ -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.slagle at gmail.com Thu Nov 11 18:41:59 2021 From: james.slagle at gmail.com (James Slagle) Date: Thu, 11 Nov 2021 13:41:59 -0500 Subject: [TripleO] directord to opendev In-Reply-To: References: Message-ID: On Thu, Nov 11, 2021 at 1:26 PM James Slagle wrote: > Hello TripleO'ers, > > We are proposing to move the directord repo[1] from github to opendev as a > top level organization. Moving to the opendev[2] hosting will make it > easier to integrate with the existing tripleo-core team, our existing CI, > and developer processes. The benefits seem to outweigh leaving it on github. > > At the same time, we want to encourage external contribution from outside > TripleO, so we felt that making it its own organization under opendev would > be better suited to that than placing it under the openstack organization. > > If there are concerns, please raise them. Thank you! > > [1] https://github.com/Directord/ > [2] https://opendev.org/ > Note that this would also include putting task-core under the directord org. -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Nov 11 19:00:31 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Nov 2021 20:00:31 +0100 Subject: [Openstack][cinder] scheduler filters Message-ID: Hello All, I read that the capacity filter is the default for the cinder scheduler, so, if I understood correctly, a volume is placed on the backend where more space is available. Since my two backends are on storage with the same features, I wonder if I must specify a default storage backend in cinder.conf or not. Must I create a cinder volume without a volume type and let the scheduler evaluate where there is more space available? Thanks Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Nov 11 19:23:03 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Nov 2021 20:23:03 +0100 Subject: [Openstack][cinder] scheduler filters In-Reply-To: References: Message-ID: Hello again, probably I must use the same backend name for both, with a volume type associated to it, and the scheduler will use the backend with more space available? Ignazio On Thu 11 Nov 2021, 20:00 Ignazio Cassano wrote: > Hello All, > I read that the capacity filter is the default for the cinder scheduler, so, if I > understood correctly, a volume is placed on the backend where more space is > available. > Since my two backends are on storage with the same features, I wonder if I > must specify a default storage backend in cinder.conf or not. > Must I create a cinder volume without a volume type and let the scheduler > evaluate where there is more space available? > Thanks > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... 
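To make the cinder multi-backend question above concrete, a minimal sketch (backend section names, driver, and type name are only placeholders) would be to give both backend sections the same volume_backend_name in cinder.conf and tie a single volume type to it; the default capacity weigher then favours whichever backend reports more free space:

  [DEFAULT]
  enabled_backends = backend-a,backend-b

  [backend-a]
  volume_backend_name = shared-tier
  volume_driver = <your driver>

  [backend-b]
  volume_backend_name = shared-tier
  volume_driver = <your driver>

  $ openstack volume type create shared-tier
  $ openstack volume type set --property volume_backend_name=shared-tier shared-tier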
URL: From DHilsbos at performair.com Thu Nov 11 22:59:18 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Thu, 11 Nov 2021 22:59:18 +0000 Subject: [ops][nova] Error trying to migrate Message-ID: <0670B960225633449A24709C291A525251D45985@COM03.performair.local> All; I'm running into a strange issue when I try to migrate a server from one host to another. The only messages I'm seeing are in nova-conductor.log: 2021-11-11 15:16:49.118 1887742 WARNING nova.scheduler.utils [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Failed to compute_task_migrate_server: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b 2021-11-11 15:16:49.122 1887742 WARNING nova.scheduler.utils [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Setting instance to STOPPED state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Exception during message handling: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server Traceback (most recent call last): 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 433, in get 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._queues[msg_id].get(block=True, timeout=timeout) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 322, in get 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return waiter.wait() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 141, in wait 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return get_hub().switch() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self.greenlet.switch() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server queue.Empty 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred: 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server Traceback (most recent call last): 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch 
2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 241, in inner 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return func(*args, **kwargs) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 99, in wrapper 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return fn(self, context, *args, **kwargs) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/utils.py", line 1434, in decorated_function 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 303, in migrate_server 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server host_list) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 397, in _cold_migrate 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server else 'cold migrate', instance=instance) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.force_reraise() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server raise value 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 354, in _cold_migrate 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server task.execute() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 26, in wrap 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.rollback(ex) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.force_reraise() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server raise value 2021-11-11 
15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 23, in wrap 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return original(self) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 40, in execute 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._execute() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/migrate.py", line 384, in _execute 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server selection = self._schedule() 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/migrate.py", line 434, in _schedule 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return_objects=True, return_alternates=True) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/client/query.py", line 42, in select_destinations 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server instance_uuids, return_objects, return_alternates) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/rpcapi.py", line 160, in select_destinations 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return cctxt.call(ctxt, 'select_destinations', **msg_args) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 179, in call 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=self.transport_options) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 128, in _send 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=transport_options) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 682, in send 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=transport_options) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 670, in _send 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server call_monitor_timeout) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in wait 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server message = self.waiters.get(msg_id, timeout=timeout) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 437, in get 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server 'to message ID %s' % msg_id) 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b No messages are generated in nova-schedule, or any of the nova-compute logs on the hosts. Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. 
DHilsbos at PerformAir.com www.PerformAir.com From mnaser at vexxhost.com Fri Nov 12 03:13:28 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 11 Nov 2021 22:13:28 -0500 Subject: [ops][nova] Error trying to migrate In-Reply-To: <0670B960225633449A24709C291A525251D45985@COM03.performair.local> References: <0670B960225633449A24709C291A525251D45985@COM03.performair.local> Message-ID: It sounds like you've got an issue with your RabbitMQ infrastructure that's causing messages not to show up. I suggest focusing your troubleshooting there. I've seen these issues resolved by simply rebuilding the RabbitMQ cluster. On Thu, Nov 11, 2021 at 6:07 PM wrote: > > All; > > I'm running into a strange issue when I try to migrate a server from one host to another. > > The only messages I'm seeing are in nova-conductor.log: > 2021-11-11 15:16:49.118 1887742 WARNING nova.scheduler.utils [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Failed to compute_task_migrate_server: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b > 2021-11-11 15:16:49.122 1887742 WARNING nova.scheduler.utils [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Setting instance to STOPPED state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server [req-d648901b-d000-47ec-9497-bcc51586c381 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Exception during message handling: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server Traceback (most recent call last): > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 433, in get > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._queues[msg_id].get(block=True, timeout=timeout) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 322, in get > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return waiter.wait() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/queue.py", line 141, in wait > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return get_hub().switch() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self.greenlet.switch() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server queue.Empty > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred: > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server Traceback (most recent call last): > 
2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 241, in inner > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return func(*args, **kwargs) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 99, in wrapper > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return fn(self, context, *args, **kwargs) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/compute/utils.py", line 1434, in decorated_function > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 303, in migrate_server > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server host_list) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 397, in _cold_migrate > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server else 'cold migrate', instance=instance) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.force_reraise() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server raise value > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 354, in _cold_migrate > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server task.execute() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 26, in wrap > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.rollback(ex) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server self.force_reraise() > 
2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server raise value > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 23, in wrap > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return original(self) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/base.py", line 40, in execute > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return self._execute() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/migrate.py", line 384, in _execute > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server selection = self._schedule() > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/conductor/tasks/migrate.py", line 434, in _schedule > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return_objects=True, return_alternates=True) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/client/query.py", line 42, in select_destinations > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server instance_uuids, return_objects, return_alternates) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/scheduler/rpcapi.py", line 160, in select_destinations > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server return cctxt.call(ctxt, 'select_destinations', **msg_args) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 179, in call > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=self.transport_options) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 128, in _send > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=transport_options) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 682, in send > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server transport_options=transport_options) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 670, in _send > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server call_monitor_timeout) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in wait > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server message = self.waiters.get(msg_id, timeout=timeout) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 437, in get > 
2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server 'to message ID %s' % msg_id) > 2021-11-11 15:16:49.291 1887742 ERROR oslo_messaging.rpc.server oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 237323d0824a43588f6394e5a505248b > > No messages are generated in nova-schedule, or any of the nova-compute logs on the hosts. > > Thank you, > > Dominic L. Hilsbos, MBA > Vice President - Information Technology > Perform Air International Inc. > DHilsbos at PerformAir.com > www.PerformAir.com > > > -- Mohammed Naser VEXXHOST, Inc. From arnaud.morin at gmail.com Fri Nov 12 08:51:46 2021 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Fri, 12 Nov 2021 08:51:46 +0000 Subject: [neutron][large-scale][ops] openflow rules tools In-Reply-To: <6212551.lOV4Wx5bFT@p1> References: <6212551.lOV4Wx5bFT@p1> Message-ID: Hey all, we have been working on this subject recently and we pushed this: https://review.opendev.org/c/openstack/osops/+/817715 Feel free to comment +tag [ops][large-scale] On 11.10.21 - 22:40, Slawek Kaplonski wrote: > Hi, > > For OVN with have small tool ml2ovn-trace: https://docs.openstack.org/neutron/ > latest/ovn/ml2ovn_trace.html in the neutron repo https://docs.openstack.org/ > neutron/latest/ovn/ml2ovn_trace.html but that will not be helpful for ML2/OVS > at all. > > On poniedzia?ek, 11 pa?dziernika 2021 20:05:40 CEST Arnaud wrote: > > That would be awesome! > > > > We also built a tool which is looking for openflow rules related to a tap > > interface, but since we upgraded and enabled security rules in ovs, the tool > > isn't working anymore. > > Yes, for ML2/OVS with ovs firewall driver it is really painful to debug all > those OF rules. > > > > > So before rewriting everything from scratch, I was wondering if the > community > > was also dealing with the same issue. > > If You will have anything like that, please share with community :) > > > > > So I am glad to here from you! > > Let me know :) > > Cheers > > > > Le 11 octobre 2021 17:52:52 GMT+02:00, Laurent Dumont > a ?crit : > > >Also interested in this. Reading rules in dump-flows is an absolute pain. > > >In an ideal world, I would have never have to. > > > > > >We some stuff on our side that I'll see if I can share. > > > > > >On Mon, Oct 11, 2021 at 9:41 AM Arnaud Morin > wrote: > > >> Hello, > > >> > > >> When using native ovs in neutron, we endup with a lot of openflow rules > > >> on ovs side. > > >> > > >> Debugging it with regular ovs-ofctl --color dump-flows is kind of > > >> painful. > > >> > > >> Is there any tool that the community is using to manage that? > > >> > > >> Thanks in advance! > > >> > > >> Arnaud. > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat From zigo at debian.org Fri Nov 12 11:42:06 2021 From: zigo at debian.org (Thomas Goirand) Date: Fri, 12 Nov 2021 12:42:06 +0100 Subject: [ops][keystone][cinder][swift][neutron][nova] Rename Availability Zone In-Reply-To: <0670B960225633449A24709C291A525251D4538F@COM03.performair.local> References: <0670B960225633449A24709C291A525251D4538F@COM03.performair.local> Message-ID: <55338d20-aa80-0f74-e663-b6e859d58fc0@debian.org> On 11/11/21 5:25 PM, DHilsbos at performair.com wrote: > All; > > We recently decided to change our naming convention. As such I'd like to rename our current availability zone. The configuration files are the obvious places to do so. Is there anything I need to change in the database? Properties of images, volumes, servers, etc.? 
> > Should I just give this up as a bad deal? Maybe rebuild the cluster from scratch? > > Thank you, Hi, You don't need to rebuild your cluster from scratch. Just be aware that instances cannot change availability zone, unless you really go deep into the nova database (and change the nova spec there). I'm not sure about existing volumes. Unless I'm mistaken, there's no notion of AZ for images. FYI, the nova availability zones are controlled through the compute aggregate API. The cinder one comes from cinder.conf on each volume node. Cheers, Thomas Goirand (zigo) From elod.illes at est.tech Fri Nov 12 14:46:19 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Fri, 12 Nov 2021 15:46:19 +0100 Subject: [release] Release countdown for week R-19, Nov 15-19 Message-ID: <2b213596-6530-e3d8-52b7-19038a553a52@est.tech> Development Focus ----------------- The Yoga-1 milestone is next week, on November 18, 2021! Project team plans for the Yoga cycle should now be solidified. General Information ------------------- Libraries need to be released at least once per milestone period. Next week, the release team will propose releases for any library which had changes but has not been otherwise released since the Xena release. PTLs or release liaisons, please watch for these and give a +1 to acknowledge them. If there is some reason to hold off on a release, let us know that as well, by posting a -1. If we do not hear anything at all by the end of the week, we will assume things are OK to proceed. NB: If one of your libraries is still releasing 0.x versions, start thinking about when it will be appropriate to do a 1.0 version. The version number does signal the state, real or perceived, of the library, so we strongly encourage going to a full major version once things are in a good and usable state. Upcoming Deadlines & Dates -------------------------- Yoga-1 milestone: November 18, 2021 Yoga final release: March 30, 2022 Előd Illés irc: elodilles -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Fri Nov 12 15:26:25 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Fri, 12 Nov 2021 16:26:25 +0100 Subject: [openstack-ansible] release job failure for ansible-collections-openstack Message-ID: Hi Openstack-Ansible team! This mail is just to inform you that there was a release job failure [1] yesterday and the job could not be re-run, as part of the job finished successfully in the 1st run (so the 2nd attempt failed [2]). Could you please review if everything is OK with the release? Thanks, Előd (elodilles @ #openstack-release) [1] http://lists.openstack.org/pipermail/release-job-failures/2021-November/001576.html [2] http://lists.openstack.org/pipermail/release-job-failures/2021-November/001577.html From dpawlik at redhat.com Fri Nov 12 16:14:08 2021 From: dpawlik at redhat.com (Daniel Pawlik) Date: Fri, 12 Nov 2021 17:14:08 +0100 Subject: [all] Openstack CI Log Processing project Message-ID: Hello Everyone, As part of moving the Opendev Elasticsearch to an Openstack Elasticsearch service (in the future), a new repository has been created in the Openstack project: ci-log-processing. The new repository will be used to store configuration related to the Opensearch service and all tools required to process logs from the Zuul CI system into Opensearch. By moving to the new Elasticsearch system, we would like to take this opportunity to use a new service to replace the legacy submit-logstash-jobs system [1][2]. 
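The repository itself is hosted on opendev, so pulling it down for a first look follows the usual workflow (a sketch; git-review is the standard tool for fetching or submitting changes):

  git clone https://opendev.org/openstack/ci-log-processing
  cd ci-log-processing
  git review -d <change-number>   # fetch an open change locally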
I would like to ask for volunteers to review changes in the Openstack ci-log-processing repository? [3] Please reply to this ML or ping me on #openstack-infra IRC channel so that we can plan to expand the core member list of this repo. Dan [1] https://opendev.org/opendev/base-jobs/src/branch/master/roles/submit-logstash-jobs [2] https://docs.opendev.org/opendev/system-config/latest/logstash.html [3] https://review.opendev.org/q/project:openstack%252Fci-log-processing+status:open -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Fri Nov 12 12:01:32 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 12 Nov 2021 13:01:32 +0100 Subject: [neutron] Averaga number of rechecks in Neutron comparing to other projects Message-ID: <3257902.aeNJFYEL58@p1> Hi neutrinos, As we discussed during the last PTG, I spent some time today to get average number of rechecks which we need to do in the last PS of the change before it's merged. In theory that number should be close to 0 as patch should be merged with first CI run when it's approved by reviewers :) All data are only from the master branch. I didn't check the same for stable branches. File with graph and raw data in csv format are in the attachments. Basically my conclusion is that Neutron's CI is really bad in that. We have to recheck many, many times before patches will be merged. We really need to think about how to improve that in Neutron. -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: Average_number_of_rechecks_before_patch_was_merged_in_various_projects-2021.csv Type: text/csv Size: 1352 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Average_number_of_rechecks_in_various_projects_before_patch_was_merged.png Type: image/png Size: 45823 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From gmann at ghanshyammann.com Fri Nov 12 17:22:11 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 12 Nov 2021 11:22:11 -0600 Subject: [all][tc][policy] RBAC discussions: policy popup team meeting time/place change Message-ID: <17d152bcd80.12992a51e572168.3448628568417682192@ghanshyammann.com> Hello Everyone, As you might aware that we had post-PTG discussion on new secure RBAC[1] and figured out a lot of things but still lot of things pending to figure out :). We are at least in good shape on "what to target in Yoga cycle" (this proposal - https://review.opendev.org/c/openstack/governance/+/815158) We will use policy popup team biweekly meeting to continue the discussion on open questions and with video call on meetpad. Details of meeting: https://wiki.openstack.org/wiki/Consistent_and_Secure_Default_Policies_Popup_Team#Meeting Ical: https://meetings.opendev.org/#Secure_Default_Policies_Popup-Team_Meeting Next meeting is on 18th Nov Thursday at 18:00 UTC and then biweekly. 
[1] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025619.html -gmann From gmann at ghanshyammann.com Fri Nov 12 17:58:18 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 12 Nov 2021 11:58:18 -0600 Subject: [all][tc] What's happening in Technical Committee: summary 12th Nov, 21: Reading: 10 min Message-ID: <17d154cdc55.10db9d53f573599.1949859880627332898@ghanshyammann.com> Hello Everyone, Here is this week's summary of the Technical Committee activities. 1. TC Meetings: ============ * TC this week IRC meeting held on Nov 11th Thursday. * Most of the meeting discussions are summarized below (Completed or in-progress activities section). Meeting full logs are available @ https://meetings.opendev.org/meetings/tc/2021/tc.2021-11-11-15.00.log.html * Next week's meeting is canceled due to OpenInfra Keynotes and we will continue weekly meeting on 25th Nov, Thursday 15:00 UTC, feel free the topic on agenda[1] by Nov 24th. 2. What we completed this week: ========================= * Changes in stable core team process[2] * Decouple the community-wide goals from cycle release[3] (I will write up summary on this once we have RBAC goal merged as an example) * Merged 'Technical Writing' SIG into TC[4] 3. Activities In progress: ================== TC Tracker for Yoga cycle ------------------------------ * This etherpad includes the Yoga cycle targets/working items for TC[5]. Open Reviews ----------------- * 12 open reviews for ongoing activities[6]. Review volunteer required for Openstack CI Log Processing project -------------------------------------------------------------------------------- dpawlik sent the email on openstack-discuss[7] about seeking help to review on new Openstack CI Log Processing project. Please respond there if you can help at some extent even not as full time. Remove office hours in favor of weekly meetings ----------------------------------------------------------- We decided to remove the TC in-active office hours[8] in favor of weekly meetings which are serving the purpose of office hours. RBAC discussion: continuing from PTG ---------------------------------------------- We had a discussion on Wed 10th too and agreed on what all we can target for Yoga cycle. Complete notes are in this etherpad[9]. Please review the goal rework[10]. Discussion on open items will be continued in policy popup team meeting[11] Community-wide goal updates ------------------------------------ * With the continuing the discussion on RBAC, we are re-working on the RBAC goal, please wait until we finalize the implementation[8] * There is one more goal proposal for 'FIPS compatibility and compliance'[12]. Adjutant need maintainers and PTLs ------------------------------------------- No volunteer to lead/maintain the Adjutant project, I have sent another reminder to the email[13]. New project 'Skyline' proposal ------------------------------------ * We discussed it in TC PTG and there are few open points about python packaging, repos, and plugins plan which we are discussion on ML. * Still waiting from skyline team to work on the above points[14]. Updating the Yoga testing runtime ---------------------------------------- * As centos stream 9 is released, I have updated the Yoga testing runtime[15] with: 1. Add Debian 11 as tested distro 2. Change centos stream 8 -> centos stream 9 3. Bump lowest python version to test to 3.8 and highest to python 3.9 TC tags analysis ------------------- * TC agreed to remove the framework and it is communicated in ML[16]. 
Project updates ------------------- * Rename ?Extended Maintenance? SIG to the ?Stable Maintenance?[17] * Retire training-labs repo[18] * Retire puppet-senlin[19] * Add ProxySQL repository for OpenStack-Ansible[20] * Retire js-openstack-lib [21] 4. How to contact the TC: ==================== If you would like to discuss or give feedback to TC, you can reach out to us in multiple ways: 1. Email: you can send the email with tag [tc] on openstack-discuss ML[22]. 2. Weekly meeting: The Technical Committee conduct a weekly meeting every Thursday 15 UTC [23] 3. Ping us using 'tc-members' nickname on #openstack-tc IRC channel. [1] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [2] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025741.html [3] https://review.opendev.org/c/openstack/governance/+/816387 [4] https://review.opendev.org/c/openstack/governance/+/815869 [5] https://etherpad.opendev.org/p/tc-yoga-tracker [6] https://review.opendev.org/q/projects:openstack/governance+status:open [7] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025758.html [8] https://review.opendev.org/c/openstack/governance/+/817493 [9] https://etherpad.opendev.org/p/policy-popup-yoga-ptg [10] https://review.opendev.org/c/openstack/governance/+/815158 [11] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025760.html [12] https://review.opendev.org/c/openstack/governance/+/816587 [13] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025555.html [14] https://review.opendev.org/c/openstack/governance/+/814037 [15] https://review.opendev.org/c/openstack/governance/+/815851 [16] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025571.html [17] https://review.opendev.org/c/openstack/governance-sigs/+/817499 [18] https://review.opendev.org/c/openstack/governance/+/817511 [19] https://review.opendev.org/c/openstack/governance/+/817329 [20] https://review.opendev.org/c/openstack/governance/+/817245 [21] https://review.opendev.org/c/openstack/governance/+/807163 [22] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss [23] http://eavesdrop.openstack.org/#Technical_Committee_Meeting -gmann From radoslaw.piliszek at gmail.com Fri Nov 12 18:32:17 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 12 Nov 2021 19:32:17 +0100 Subject: [openstack-ansible] release job failure for ansible-collections-openstack In-Reply-To: References: Message-ID: Please note ansible-collections-openstack is part of Ansible SIG, not OpenStack Ansible. [1] [1] https://opendev.org/openstack/governance/src/commit/bf1b5848934ab209dea2255c22ca0f177719db3b/reference/sigs-repos.yaml -yoctozepto On Fri, 12 Nov 2021 at 16:27, El?d Ill?s wrote: > > Hi Openstack-Ansible team! > > This mail is just to inform you that there was a release job failure [1] > yesterday and the job could not be re-run as part of the job was > finished successfully in the 1st run (so the 2nd attempt failed [2]). > > Could you please review if everything is OK with the release? 
> > Thanks, > > El?d (elodilles @ #openstack-release) > > [1] > http://lists.openstack.org/pipermail/release-job-failures/2021-November/001576.html > [2] > http://lists.openstack.org/pipermail/release-job-failures/2021-November/001577.html > > > From fungi at yuggoth.org Fri Nov 12 18:36:37 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 12 Nov 2021 18:36:37 +0000 Subject: [openstack-ansible] release job failure for ansible-collections-openstack In-Reply-To: References: Message-ID: <20211112183636.dx5rgj3467royvgc@yuggoth.org> On 2021-11-12 19:32:17 +0100 (+0100), Rados?aw Piliszek wrote: > Please note ansible-collections-openstack is part of Ansible SIG, > not OpenStack Ansible. [...] Interesting, I thought the Release Management team explicitly avoided handling releases for SIG repos, focusing solely on project team deliverables. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From james.slagle at gmail.com Fri Nov 12 20:56:17 2021 From: james.slagle at gmail.com (James Slagle) Date: Fri, 12 Nov 2021 15:56:17 -0500 Subject: [TripleO] Core team cleanup In-Reply-To: References: Message-ID: Thanks for the replies everyone. I have made these changes in gerrit. On Sat, Nov 6, 2021 at 11:28 AM Dmitry Tantsur wrote: > > > On Wed, Nov 3, 2021 at 5:57 PM James Slagle > wrote: > >> Hello, I took a look at our core team, "tripleo-core" in gerrit. We have >> a few individuals who I feel have moved on from TripleO in their focus. I >> looked at the reviews from stackalytics.io for the last 180 days[1]. >> >> These individuals have less than 6 reviews, which is about 1 review a >> month: >> Bob Fournier >> Dan Sneddon >> Dmitry Tantsur >> > > +1. yeah, sorry for that. I have been trying to keep an eye on TripleO > things, but with my new OpenShift responsibilities it's pretty much > impossible. I guess it's the same for Bob. > > I'm still available for questions and reviews if someone needs me. > > Dmitry > > >> Ji?? Str?nsk? >> Juan Antonio Osorio Robles >> Marius Cornea >> >> These individuals have publicly expressed that they are moving on from >> TripleO: >> Michele Baldessari >> wes hayutin >> >> I'd like to propose we remove these folks from our core team, while >> thanking them for their contributions. I'll also note that I'd still value >> +1/-1 from these folks with a lot of significance, and encourage them to >> review their areas of expertise! >> >> If anyone on the list plans to start reviewing in TripleO again, then I >> also think we can postpone the removal for the time being and re-evaluate >> later. Please let me know if that's the case. >> >> Please reply and let me know any agreements or concerns with this change. >> >> Thank you! >> >> [1] >> https://www.stackalytics.io/report/contribution?module=tripleo-group&project_type=openstack&days=180 >> >> -- >> -- James Slagle >> -- >> > > > -- > Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael > O'Neill > -- -- James Slagle -- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From noonedeadpunk at ya.ru Sat Nov 13 06:59:15 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Sat, 13 Nov 2021 08:59:15 +0200 Subject: [openstack-ansible] release job failure for ansible-collections-openstack In-Reply-To: References: Message-ID: <272321636786145@mail.yandex.ru> An HTML attachment was scrubbed... URL: From rlandy at redhat.com Sun Nov 14 23:53:23 2021 From: rlandy at redhat.com (Ronelle Landy) Date: Sun, 14 Nov 2021 18:53:23 -0500 Subject: [all] Openstack CI Log Processing project In-Reply-To: References: Message-ID: On Fri, Nov 12, 2021 at 11:19 AM Daniel Pawlik wrote: > Hello Everyone, > > By moving the Opendev Elasticsearch to Openstack Elasticsearch service (in > the future), > a new repository was created in the Openstack project: ci-log-processing. > The new repository will be used to store configuration related to the > Opensearch > service and all tools required to process logs from Zuul CI system to > Opensearch. > > By moving to the new Elasticsearch system, we would like to take this > opportunity to use a new service to replace the legacy > submit-logstash-jobs system [1][2]. > > > I would like to ask for volunteers to review changes in the Openstack > ci-log-processing repository? [3] > Please reply to this ML or ping me on #openstack-infra IRC channel so that > we can plan to > expand the core member list of this repo. > Any of the TripleO Ci cores could help out here. Thanks. > > Dan > > [1] > https://opendev.org/opendev/base-jobs/src/branch/master/roles/submit-logstash-jobs > [2] https://docs.opendev.org/opendev/system-config/latest/logstash.html > [3] > https://review.opendev.org/q/project:openstack%252Fci-log-processing+status:open > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jordanansell at catalystcloud.nz Mon Nov 15 00:53:05 2021 From: jordanansell at catalystcloud.nz (Jordan Ansell) Date: Mon, 15 Nov 2021 13:53:05 +1300 Subject: [barbican][sdk] Barbican's quota API is missing CLI support? Message-ID: <291f1b4e-8765-bc42-7cd3-92ceb8c963c8@catalystcloud.nz> Hello, I was wondering why the quota API for Barbican doesn't have a CLI command in python-barbicanclient? Thanks, Jordan Ansell From skaplons at redhat.com Mon Nov 15 09:28:47 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 15 Nov 2021 10:28:47 +0100 Subject: [neutron] CI meeting time slot In-Reply-To: <13828406.O9o76ZdvQC@p1> References: <13828406.O9o76ZdvQC@p1> Message-ID: <5771601.lOV4Wx5bFT@p1> Hi, On czwartek, 28 pa?dziernika 2021 11:10:18 CET Slawek Kaplonski wrote: > Hi, > > As per PTG discussion, I prepared doodle to check what would be the best time > slot for most of the people. > Doodle is at [1]. Please fill it in if You are interested attending the weekly > Neutron CI meeting. Meeting is on the #openstack-neutron irc channel, but we > are also planning to do it on video from time to time. > > The timeslots in doodle have dates for next week, but please ignore them. It's > just to pick the best time slot for the meeting to use it weekly. Next week > meeting will be for sure still in the current time slot, which is Tuesday > 1500 UTC. > > [1] https://doodle.com/poll/3n2im4ebyxhs45ne?utm_source=poll&utm_medium=link Thx for all who participated in the Doodle. After all it seems that the best slot for all who were interested in that meeting and attend it usually is the existing one. 
So nothing will change there and we will still have Neutron CI meeting on Tuesday at 1500 UTC -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From srelf at ukcloud.com Mon Nov 15 10:21:13 2021 From: srelf at ukcloud.com (Steven Relf) Date: Mon, 15 Nov 2021 10:21:13 +0000 Subject: [Nova] - OS_TYPE - Setting after the fact. Message-ID: Hello List, I have the following situation and would be thankful for any help or suggestions. We provide images to our customers, upon which we set the os_type metadata. We use this to help schedule instances to host aggregates, so we can control where certain OS's land. This works great for new instances, that are created from our images. We have the following problems though. 1. Instances created prior to the introduction of the os_type metadata existing on the image, this metadata is not flowed down, and as such these instances have NULL 2. Instances created from a snapshot do not seem to get the os_type flowed down 3. Instances imported using our migration tool do not end up with the os_type being set either. Currently the only way I can see to set os_type on an instance is to manually (yuk) update the database for each and every instance, which is missing it. Does anyone have any ideas how this can be updated without manually modifying the database. My second thought was to maybe make use of the instance metadata that you can set via the CLI or API, but then I run in to the issue that there is not a nova filter that is able to use instance metadata, and in my brief play with the code, it looks like (user) metadata is not passed in as part of the instance spec dict that is used to schedule instances. I was thinking of writing a custom filter, but I'm not sure how compliant it would be to have a function making a call to try and collect the instance metadata. In summary, I need to solve two things. 1. How do I set os_type on instances in a way that doesn't involve editing the database. 2. Or How can I use another piece of metadata that I can set via the api, which also is exposed to the filter scheduler. Rgds Steve. The future has already arrived. It's just not evenly distributed yet - William Gibson -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Nov 15 11:28:17 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 15 Nov 2021 12:28:17 +0100 Subject: [neutron] Bug deputy report - week of November 8th Message-ID: <11865446.O9o76ZdvQC@p1> Hi, I was bug deputy last week. 
Here is the list of new issues opened in Neutron:

## Critical
* https://bugs.launchpad.net/neutron/+bug/1950275 - openstack-tox-py36-with-neutron-lib-master job is failing since 05.11.2021 - gate failure in periodic job, Lajos is working on that,
* https://bugs.launchpad.net/neutron/+bug/1950346 - [stable/train] neutron-tempest-plugin-designate-scenario-train fails with AttributeError: 'Manager' object has no attribute 'zones_client' - gate failure, in progress, assigned to Lajos,
* https://bugs.launchpad.net/neutron/+bug/1950795 - neutron-tempest-plugin-scenario jobs on stable/rocky and stable/queens are failing with POST_FAILURE every time - assigned to slaweq

## High
* https://bugs.launchpad.net/neutron/+bug/1950273 - Error 500 during log update - unassigned, happened in the CI, ovn related
* https://bugs.launchpad.net/neutron/+bug/1950679 - [ovn] neutron_ovn_db_sync_util hangs on sync_routers_and_rports - Assigned to Daniel Speichert, fix in progress https://review.opendev.org/c/openstack/neutron/+/817637

## Medium
* https://bugs.launchpad.net/neutron/+bug/1950686 - [OVN] dns-nameserver=0.0.0.0 for a subnet isn't treated properly - unassigned, ovn related
* https://bugs.launchpad.net/neutron/+bug/1899207 - [OVN][Docs] admin/config-dns-res.html should be updated for OVN - unassigned, docs bug, ovn related

## Low
* https://bugs.launchpad.net/neutron/+bug/1950662 - [DHCP] Improve RPC server methods - assigned to ralonsoh, fix proposed https://review.opendev.org/c/openstack/neutron/+/816850

## Wishlist (RFEs)
* https://bugs.launchpad.net/neutron/+bug/1950454 - [RFE] GW IP and FIP QoS to inherit from network - assigned to ralonsoh

--
Slawek Kaplonski
Principal Software Engineer
Red Hat
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: This is a digitally signed message part.
URL: 

From rlandy at redhat.com Mon Nov 15 11:28:11 2021
From: rlandy at redhat.com (Ronelle Landy)
Date: Mon, 15 Nov 2021 06:28:11 -0500
Subject: [Triple0] Gate blocker - standalone failures on master and wallaby
Message-ID: 

Hello All,

We have a check/gate blocker on master and wallaby that started on Saturday.
Standalone jobs are failing tempest tests. The related bug is linked below:

https://bugs.launchpad.net/tripleo/+bug/1950916

The networking team is helping debug this. Please don't recheck for now.
We will update this list when we have more info/a fix.

Thank you!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From arxcruz at redhat.com Mon Nov 15 13:33:10 2021 From: arxcruz at redhat.com (Arx Cruz) Date: Mon, 15 Nov 2021 14:33:10 +0100 Subject: [all] Openstack CI Log Processing project In-Reply-To: References: Message-ID: Hi Daniel, I can help you on this :) Kind regards, Arx Cruz On Mon, Nov 15, 2021 at 12:57 AM Ronelle Landy wrote: > > > On Fri, Nov 12, 2021 at 11:19 AM Daniel Pawlik wrote: > >> Hello Everyone, >> >> By moving the Opendev Elasticsearch to Openstack Elasticsearch service >> (in the future), >> a new repository was created in the Openstack project: ci-log-processing. >> The new repository will be used to store configuration related to the >> Opensearch >> service and all tools required to process logs from Zuul CI system to >> Opensearch. >> > >> By moving to the new Elasticsearch system, we would like to take this >> opportunity to use a new service to replace the legacy >> submit-logstash-jobs system [1][2]. >> >> >> I would like to ask for volunteers to review changes in the Openstack >> ci-log-processing repository? [3] >> Please reply to this ML or ping me on #openstack-infra IRC channel so >> that we can plan to >> expand the core member list of this repo. >> > > Any of the TripleO Ci cores could help out here. > > Thanks. > >> >> Dan >> >> [1] >> https://opendev.org/opendev/base-jobs/src/branch/master/roles/submit-logstash-jobs >> [2] https://docs.opendev.org/opendev/system-config/latest/logstash.html >> [3] >> https://review.opendev.org/q/project:openstack%252Fci-log-processing+status:open >> > -- Arx Cruz Software Engineer Red Hat EMEA arxcruz at redhat.com @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From marios at redhat.com Mon Nov 15 14:16:07 2021 From: marios at redhat.com (Marios Andreou) Date: Mon, 15 Nov 2021 16:16:07 +0200 Subject: [all] Openstack CI Log Processing project In-Reply-To: References: Message-ID: On Fri, Nov 12, 2021 at 6:19 PM Daniel Pawlik wrote: > > Hello Everyone, > > By moving the Opendev Elasticsearch to Openstack Elasticsearch service (in the future), > a new repository was created in the Openstack project: ci-log-processing. > The new repository will be used to store configuration related to the Opensearch > service and all tools required to process logs from Zuul CI system to > Opensearch. > > By moving to the new Elasticsearch system, we would like to take this > opportunity to use a new service to replace the legacy submit-logstash-jobs system [1][2]. > > > I would like to ask for volunteers to review changes in the Openstack ci-log-processing repository? [3] > Please reply to this ML or ping me on #openstack-infra IRC channel so that we can plan to > expand the core member list of this repo. > Hi Daniel - count me in if you still need more eyes - adding the repo to my review list regards, marios > Dan > > [1] https://opendev.org/opendev/base-jobs/src/branch/master/roles/submit-logstash-jobs > [2] https://docs.opendev.org/opendev/system-config/latest/logstash.html > [3] https://review.opendev.org/q/project:openstack%252Fci-log-processing+status:open From smooney at redhat.com Mon Nov 15 15:17:37 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 Nov 2021 15:17:37 +0000 Subject: [Nova] - OS_TYPE - Setting after the fact. 
In-Reply-To: References: Message-ID: <9eefc93efd057a857802fd8dda7898245399b29e.camel@redhat.com> On Mon, 2021-11-15 at 10:21 +0000, Steven Relf wrote: > Hello List, > > I have the following situation and would be thankful for any help or suggestions. > > We provide images to our customers, upon which we set the os_type metadata. We use this to help schedule instances to host aggregates, so we can control where certain OS's land. This works great for new instances, that are created from our images. > > We have the following problems though. > > > 1. Instances created prior to the introduction of the os_type metadata existing on the image, this metadata is not flowed down, and as such these instances have NULL This is the expected behaivor. we snapshot/copy the image metadata at the time the instnace was created into the instance_system_metadata table to ensure the change to the image after the fact do not affect existing vms. > 2. Instances created from a snapshot do not seem to get the os_type flowed down > 3. Instances imported using our migration tool do not end up with the os_type being set either. likely because the meataddata was not set on the image or volume before the vm was created. > > Currently the only way I can see to set os_type on an instance is to manually (yuk) update the database for each and every instance, which is missing it. os_type on the instance is not used anymore and should not be set at all https://github.com/openstack/nova/blob/master/nova/objects/instance.py#L170 the os_type filed in the instance tabel was replaced with the os_type filed in the image_metatdata https://github.com/openstack/nova/blob/master/nova/objects/image_meta.py#L552-L555 which is stored in teh instance_system_metadata table in the db. > > Does anyone have any ideas how this can be updated without manually modifying the database. > > My second thought was to maybe make use of the instance metadata that you can set via the CLI or API, but then I run in to the issue that there is not a nova filter that is able to use instance metadata, and in my brief play with the code, it looks like (user) metadata is not passed in as part of the instance spec dict that is used to schedule instances. we do have nova manage command that allow operators to set some fo the image metadata that coudl be extended to allow any image metadta to be set however the operator would then be resposible for ensureing that the chagne to the metadata do not invalide the placment of the current vm or rectifying it if it would. The only way i see to do this via the api would be a resize to same flavor which would update the flavor and image metadta and find a new host to move the vm to that is vaild for the new requiremetns. we had discussed added a new recreate api for this usecase previously but the feedback was to just extend resize. > > I was thinking of writing a custom filter, but I'm not sure how compliant it would be to have a function making a call to try and collect the instance metadata. > > In summary, I need to solve two things. > > > 1. How do I set os_type on instances in a way that doesn't involve editing the database. the only way to do this today would be a rebuild, a nova manage command would be inline with what we have done for machine_type. allowing the image metadata to be change via an api is likely not approriate unless its a new instance action like recreate which would use the updated image metadta and move the instance(optionally to the same host). 
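for reference, the machine_type work linked just below added nova-manage sub-commands roughly like this (recalled from the nova-manage docs, so double check the exact options there):

  nova-manage libvirt list_unset_machine_type --cell-uuid <cell-uuid>
  nova-manage libvirt update_machine_type <instance-uuid> <machine-type>

an os_type equivalent of those commands is only hypothetical today; it would follow the same pattern and update the stored image metadata for each affected instance (the image_os_type key in instance_system_metadata, since image properties are kept there with an image_ prefix).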
https://github.com/openstack/nova/commit/c70cde057d20bb2c05a133e52b9ec560bd792698 for now i think that is the best approch to take. ensure that all image have the correct os_type set perhaps using a glance import plugin to also update update user submitted images then via an sql script or a new nova manage command update all existing images. > 2. Or How can I use another piece of metadata that I can set via the api, which also is exposed to the filter scheduler. you wont be able to use any metadata in the flavor or image since both are cahced at instance create. you might be able to use server metadata or a server tag. there is no intree filetr however that would work. in tree filters are not allwo to make api calls or RPC calls and should avoid making db queires in the filter. the request spec does not currently contaien any instnace tag or server metadata https://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L60-L116 as such there is no efficnet way to implement this as a custom filter. > > Rgds > Steve. > > The future has already arrived. It's just not evenly distributed yet - William Gibson > From srelf at ukcloud.com Mon Nov 15 15:35:19 2021 From: srelf at ukcloud.com (Steven Relf) Date: Mon, 15 Nov 2021 15:35:19 +0000 Subject: [Nova] - OS_TYPE - Setting after the fact. In-Reply-To: <9eefc93efd057a857802fd8dda7898245399b29e.camel@redhat.com> References: <9eefc93efd057a857802fd8dda7898245399b29e.camel@redhat.com> Message-ID: Hey Sean, Thanks for responding. It sounds like a nova-manage command is the way forward, to avoid editing the DB directly, the operator would then need to ensure no breaches of aggregate requirements. Ill have a look at the nova-mange command you have referenced. We do now have the os_type set on all our provided images, but this doesn?t stop a customer uploading an image without it set. I guess that?s where a glance image plugin would come in, can you point me at some documentation for me to have a read around. I don?t like the idea of having to resize, as these aren?t our instances, as we are operating a public cloud. On an aside, out of curiosity, why was the decision to not cascade these types of changes made, is it simply to provide immutability to instances? Rgds Steve. From emilien at redhat.com Mon Nov 15 15:39:56 2021 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 15 Nov 2021 10:39:56 -0500 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: On Wed, Nov 10, 2021 at 2:06 PM Ghanshyam Mann wrote: > ---- On Tue, 09 Nov 2021 19:36:00 -0600 Emilien Macchi < > emilien at redhat.com> wrote ---- > > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley > wrote: > > [...] > > This was based on experiences trying to work with the Kata > > community, and the "experiment" referenced in that mailing list post > > eventually concluded with the removal of remaining Kata project > > configuration when https://review.opendev.org/744687 merged > > approximately 15 months ago. > > > > ack > > I haven't seen much pushback from moving to Gerrit, but pretty much all > feedback I got was from folks who worked (is working) on OpenStack, so a > bit biased in my opinion (myself included).Beside that, if we would move to > opendev, I want to see some incentives in our roadmap, not just "move our > project here because it's cool". 
> > Some ideas:
> > * Consider it as a subproject from OpenStack SDK? Or part of a SIG?
> > * CI coverage for API regression testing (e.g. gophercloud/acceptance/compute running in Nova CI)
> > * Getting more exposure of the project and potentially more contributors
> > * Consolidate the best practices in general, for contributions to the project, getting started, dev environments, improving CI jobs (current jobs use OpenLab zuul, with a fork of zuul jobs).
> >
> > Is there any concern would we have to discuss?
> > --
>
> +1, Thanks Emilien for putting the roadmap which makes the long term benefits more clear to understand.
> > Looks good to me, especially CI part is cool to have from API testing perspective and to know where > we break things (we run client jobs in many projects CI so it should not be something special we need to do) > > >* Consider it as a subproject from OpenStack SDK? Or part of a SIG? > > Just to be more clear on this. Does this mean, once we setup the things in opendev then we can migrate it under > openstack/ namespace under OpenStack SDK umbrella? or you mean keep it in opendev with non-openstack > namespace but collaborative effort with SDK team. > > I think we would move the project under opendev with a non openstack namespace, and of course collaborate with everyone. Thanks for the clarification. +1, sounds like a good plan. -gmann > > -- > Emilien Macchi > From sbauza at redhat.com Mon Nov 15 16:29:41 2021 From: sbauza at redhat.com (Sylvain Bauza) Date: Mon, 15 Nov 2021 17:29:41 +0100 Subject: [nova][placement] Spec review day on Nov 16th In-Reply-To: References: Message-ID: On Fri, Nov 5, 2021 at 4:17 PM Sylvain Bauza wrote: > As agreed on our last Nova meeting [1], please sharpen your pen and > prepare your specs ahead of time as we'll have a spec review day on Nov 16th > Reminder: the idea of a spec review day is to ensure that contributors and > reviewers are available on the same day for prioritizing Gerrit comments > and IRC discussions about specs in order to facilitate and accelerate the > reviewing of open specs. > If you care about some fancy new feature, please make sure your spec is > ready for review on time and you are somehow joinable so reviewers can ping > you, or you are able to quickly reply on their comments and ideally propose > a new revision if needed. > Nova cores, I appreciate your dedication about specs on this particular > day. > > Kind reminder that tomorrow will be our Spec review day for the Yoga release. Please prepare your specs if you want us to review them, and we also appreciate any review people could make. The more contributors are reviewing our features, the better Nova release will be :-) -Sylvain -Sylvain > > [1] > https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-02-16.00.log.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Nov 15 16:53:47 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 Nov 2021 16:53:47 +0000 Subject: [Nova] - OS_TYPE - Setting after the fact. In-Reply-To: References: <9eefc93efd057a857802fd8dda7898245399b29e.camel@redhat.com> Message-ID: On Mon, 2021-11-15 at 15:35 +0000, Steven Relf wrote: > Hey Sean, > > Thanks for responding. > > It sounds like a nova-manage command is the way forward, to avoid editing the DB directly, the operator would then need to ensure no breaches of aggregate requirements. Ill have a look at the nova-mange command you have referenced. > > We do now have the os_type set on all our provided images, but this doesn?t stop a customer uploading an image without it set. I guess that?s where a glance image plugin would come in, can you point me at some documentation for me to have a read around. > > I don?t like the idea of having to resize, as these aren?t our instances, as we are operating a public cloud. 
hi yes so there is an image property inject plugin but that is mainly to inject static metadata https://docs.openstack.org/glance/latest/admin/interoperable-image-import.html#the-image-property-injection-plugin you would porably need to modify that to use libguestfs to inspect the image and then add the os_type based on what it finds. > > On an aside, out of curiosity, why was the decision to not cascade these types of changes made, is it simply to provide immutability to instances? immutablity is one reason and maintaining the validity fo its current placment. adding cpu pinnign extra spec for example would invalidate the placment of any vm that did not have it set also via the flavor. you could add traits requests as another exampel that would make some vm invlaide for there current host. so in generall modifying the image extra specs on a runing instnace can invalidate the current placment of the vm due to chagne in the behavior of fliters or alter the guest abi which might break worklaods. that is why we said this would have to be an operation that you oppted into (proppagation of update extra specs) rahter then an operation that happened by default. it become espically problematic for image with shared/ public or comunity visiblity as a minor chagne to adress a supprot request from one customer might impact other customers. so to be safe we making it immuntable for the lifetime of the the instance unless you change the image via a rebuild in the case of image properties or resize in the case of flavor extra specs. > > Rgds > Steve. > From kennelson11 at gmail.com Mon Nov 15 16:58:34 2021 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 15 Nov 2021 08:58:34 -0800 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: Is there a reason why you don't want it to be under the openstack namespace? -Kendall On Mon, Nov 15, 2021 at 7:40 AM Emilien Macchi wrote: > > > On Wed, Nov 10, 2021 at 2:06 PM Ghanshyam Mann > wrote: > >> ---- On Tue, 09 Nov 2021 19:36:00 -0600 Emilien Macchi < >> emilien at redhat.com> wrote ---- >> > On Wed, Nov 3, 2021 at 12:43 PM Jeremy Stanley >> wrote: >> > [...] >> > This was based on experiences trying to work with the Kata >> > community, and the "experiment" referenced in that mailing list post >> > eventually concluded with the removal of remaining Kata project >> > configuration when https://review.opendev.org/744687 merged >> > approximately 15 months ago. >> > >> > ack >> > I haven't seen much pushback from moving to Gerrit, but pretty much >> all feedback I got was from folks who worked (is working) on OpenStack, so >> a bit biased in my opinion (myself included).Beside that, if we would move >> to opendev, I want to see some incentives in our roadmap, not just "move >> our project here because it's cool". >> > Some ideas:* Consider it as a subproject from OpenStack SDK? Or part >> of a SIG?* CI coverage for API regression testing (e.g. >> gophercloud/acceptance/compute running in Nova CI)* Getting more exposure >> of the project and potentially more contributors* Consolidate the best >> practices in general, for contributions to the project, getting started, >> dev environments, improving CI jobs (current jobs use OpenLab zuul, with a >> fork of zuul jobs). 
>> > Is there any concern would we have to discuss?-- >> >> >> +1, Thanks Emilien for putting the roadmap which is more clear to >> understand the logn term benefits. >> >> Looks good to me, especially CI part is cool to have from API testing >> perspective and to know where >> we break things (we run client jobs in many projects CI so it should not >> be something special we need to do) >> >> >* Consider it as a subproject from OpenStack SDK? Or part of a SIG? >> >> Just to be more clear on this. Does this mean, once we setup the things >> in opendev then we can migrate it under >> openstack/ namespace under OpenStack SDK umbrella? or you mean keep it in >> opendev with non-openstack >> namespace but collaborative effort with SDK team. >> > > I think we would move the project under opendev with a non openstack > namespace, and of course collaborate with everyone. > > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From senrique at redhat.com Mon Nov 15 17:34:06 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Mon, 15 Nov 2021 14:34:06 -0300 Subject: [Openstack][cinder] scheduler filters In-Reply-To: References: Message-ID: Greetings, > probably I must use the same backend name for both and a cinder type associated to it and the scheduler will use the backend with more space available ? I'm not familiar with your deployment but there's a example in the documentation that I think It may help you: In a multiple-storage back-end configuration, each back end has a name ( volume_backend_name). Several back ends can have the same name. In that case, the scheduler properly decides which back end the volume has to be created in. i.e [1] In this configuration, lvmdriver-1 and lvmdriver-2 have the same volume_backend_name. If a volume creation requests the LVM back end name, the scheduler uses the capacity filter scheduler to choose the most suitable driver, which is either lvmdriver-1 or lvmdriver-2. The capacity filter scheduler is enabled by default. The next section provides more information. In addition, this example presents a lvmdriver-3 back end. Cheers, Sofia [1] https://docs.openstack.org/cinder/xena/admin/blockstorage-multi-backend.html On Thu, Nov 11, 2021 at 4:25 PM Ignazio Cassano wrote: > Hello again, probably I must use the same backend name for both and a > cinder type associated to it and the scheduler will use the backend with > more space available ? > Ignazio > > Il Gio 11 Nov 2021, 20:00 Ignazio Cassano ha > scritto: > >> Hello All, >> I read that capacity filters for cinder is the default, so, if I >> understood well, a volume is placed on the backend where more space is >> available. >> Since my two backends are on storage with same features, I wonder if I >> must specify a default storage backend in cinder.conf or not. >> Must I create a cinder volume without cinder type and scheduler evaluate >> where there is more space available? >> Thanks >> Ignazio >> >> >> -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Nov 15 17:39:44 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 15 Nov 2021 18:39:44 +0100 Subject: [Openstack][cinder] scheduler filters In-Reply-To: References: Message-ID: Many thanks , Sofia. It is exactly what I want to test. 
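For reference, a minimal cinder.conf sketch of that layout, following the multi-backend example in the docs Sofia linked (the section names, driver and volume group values below are only placeholders from the documentation example, not real backends):

  [DEFAULT]
  enabled_backends = lvmdriver-1,lvmdriver-2

  [lvmdriver-1]
  volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
  volume_group = cinder-volumes-1
  volume_backend_name = LVM

  [lvmdriver-2]
  volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
  volume_group = cinder-volumes-2
  volume_backend_name = LVM

with a single volume type pointing at the shared backend name:

  openstack volume type create lvm
  openstack volume type set --property volume_backend_name=LVM lvm

so the capacity filter can pick whichever of the two backends has more free space when a volume of that type is created.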
Ignazio Il Lun 15 Nov 2021, 18:34 Sofia Enriquez ha scritto: > Greetings, > > probably I must use the same backend name for both and a cinder type > associated to it and the scheduler will use the backend with more space > available ? > I'm not familiar with your deployment but there's a example in the > documentation that I think It may help you: > > In a multiple-storage back-end configuration, each back end has a name ( > volume_backend_name). Several back ends can have the same name. In that > case, the scheduler properly decides which back end the volume has to be > created in. i.e [1] In this configuration, lvmdriver-1 and lvmdriver-2 > have the same volume_backend_name. If a volume creation requests the LVM > back end name, the scheduler uses the capacity filter scheduler to choose > the most suitable driver, which is either lvmdriver-1 or lvmdriver-2. The > capacity filter scheduler is enabled by default. The next section provides > more information. In addition, this example presents a lvmdriver-3 back > end. > > Cheers, > Sofia > [1] > https://docs.openstack.org/cinder/xena/admin/blockstorage-multi-backend.html > > On Thu, Nov 11, 2021 at 4:25 PM Ignazio Cassano > wrote: > >> Hello again, probably I must use the same backend name for both and a >> cinder type associated to it and the scheduler will use the backend with >> more space available ? >> Ignazio >> >> Il Gio 11 Nov 2021, 20:00 Ignazio Cassano ha >> scritto: >> >>> Hello All, >>> I read that capacity filters for cinder is the default, so, if I >>> understood well, a volume is placed on the backend where more space is >>> available. >>> Since my two backends are on storage with same features, I wonder if I >>> must specify a default storage backend in cinder.conf or not. >>> Must I create a cinder volume without cinder type and scheduler evaluate >>> where there is more space available? >>> Thanks >>> Ignazio >>> >>> >>> > > -- > > Sof?a Enriquez > > she/her > > Software Engineer > > Red Hat PnT > > IRC: @enriquetaso > @RedHat Red Hat > Red Hat > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faisal.sheikh at rapidcompute.com Mon Nov 15 16:11:39 2021 From: faisal.sheikh at rapidcompute.com (Muhammad Faisal) Date: Mon, 15 Nov 2021 21:11:39 +0500 Subject: Error while getting network agents Message-ID: <007a01d7da3b$78f1d800$6ad58800$@rapidcompute.com> Hi, While executing openstack network agent list we are getting below mentioned error. We have run "ovn-sbctl chassis-del 650be87c-b581-467a-b523-ce454e753780" command on controller node. 
OS: Ubuntu 20 Openstack version: Wallaby Number of controller/network node: 1 (172.16.30.46) Number of compute node: 2 (172.16.30.1, 172.16.30.3) OVN Version: 21.09.0 /var/log/ovn/ovn-northd.log: 2021-11-15T15:29:22.184Z|00053|northd|WARN|Dropped 7 log messages in last 127 seconds (most recently, 120 seconds ago) due to excessive rate 2021-11-15T15:29:22.184Z|00054|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 2021-11-15T15:31:11.516Z|00055|northd|WARN|Dropped 14 log messages in last 110 seconds (most recently, 52 seconds ago) due to excessive rate 2021-11-15T15:31:11.516Z|00056|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 2021-11-15T15:35:25.738Z|00057|northd|WARN|Dropped 6 log messages in last 254 seconds (most recently, 243 seconds ago) due to excessive rate 2021-11-15T15:35:25.738Z|00058|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 2021-11-15T15:44:34.663Z|00059|northd|WARN|Dropped 12 log messages in last 549 seconds (most recently, 512 seconds ago) due to excessive rate 2021-11-15T15:44:34.663Z|00060|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 2021-11-15T15:47:39.947Z|00061|northd|WARN|Dropped 2 log messages in last 185 seconds (most recently, 185 seconds ago) due to excessive rate 2021-11-15T15:47:39.948Z|00062|northd|WARN|Chassis does not exist for Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 /var/log/neutron/neutron-server.log: 2021-11-15 20:55:24.025 678149 DEBUG futurist.periodics [req-df627767-441c-4b46-8487-91604cd3033a - - - - -] Submitting periodic callback 'neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.maintenance.HashRingHealt hCheckPeriodics.touch_hash_ring_nodes' _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:641 2021-11-15 20:55:30.531 678146 DEBUG neutron.wsgi [-] (678146) accepted ('172.16.30.46', 49782) server /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 2021-11-15 20:55:30.893 678146 DEBUG ovsdbapp.backend.ovs_idl.transaction [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Running txn n=1 command(idx=0): CheckLivenessCommand() do_commit /usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:90 2021-11-15 20:55:30.925 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: NB_Global) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.927 678147 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:61 2021-11-15 20:55:30.930 678147 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:61 2021-11-15 20:55:30.931 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node 
8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: NB_Global) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.934 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: NB_Global) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb 7eb59a51503f48ddbc936c40990e2177 - default default] index failed: No details.: AttributeError: 'Chassis_Private' object has no attribute 'hostname' 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource Traceback (most recent call last): 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/api/v2/resource.py", line 98, in resource 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource result = method(request=request, **args) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource setattr(e, '_RETRY_EXCEEDED', True) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in __exit__ 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource self.force_reraise() 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise self.value 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return f(*args, **kwargs) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource ectxt.value = e.inner_exc 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in __exit__ 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource self.force_reraise() 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise self.value 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return f(*args, **kwargs) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource LOG.debug("Retry wrapper got retriable exception: %s", e) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in __exit__ 
2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource self.force_reraise() 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in force_reraise 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise self.value 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return f(*dup_args, **dup_kwargs) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 369, in index 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return self._items(request, True, parent_id) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 304, in _items 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource obj_list = obj_getter(request.context, **kwargs) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ mech_driver.py", line 1118, in fn 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return op(results, new_method(*args, _driver=self, **kwargs)) 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ mech_driver.py", line 1182, in get_agents 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource agent_dict = agent.as_dict() 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/agent/neutro n_agent.py", line 59, in as_dict 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 'host': self.chassis.hostname, 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource AttributeError: 'Chassis_Private' object has no attribute 'hostname' 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 2021-11-15 20:55:30.937 678146 INFO neutron.wsgi [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb 7eb59a51503f48ddbc936c40990e2177 - default default] 172.16.30.46 "GET /v2.0/agents HTTP/1.1" status: 500 len: 368 time: 0.4032691 2021-11-15 20:55:30.938 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row 2bd437c4-9173-4d1b-ad01-d27cf0a11c8a (table: SB_Global) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:30.939 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:61 2021-11-15 20:55:30.939 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: Chassis_Private) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 
20:55:30.940 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:61 2021-11-15 20:55:30.940 678146 DEBUG neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: Chassis_Private) notify /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/o vsdb/ovsdb_monitor.py:666 2021-11-15 20:55:37.389 678146 INFO neutron.wsgi [req-85f09a48-f50f-4776-b110-5d07dd1dc39e 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/ports?device_id=21999c50-740c-4a26-970b-8e07ba37b5ac&fields=binding%3A host_id&fields=binding%3Avif_type HTTP/1.1" status: 200 len: 271 time: 0.0680673 2021-11-15 20:55:37.476 678146 DEBUG neutron.pecan_wsgi.hooks.policy_enforcement [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by policy engine: ['standard_attr_id'] _exclude_attributes_by_policy /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.p y:255 2021-11-15 20:55:37.477 678146 INFO neutron.wsgi [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/ports?tenant_id=7eb59a51503f48ddbc936c40990e2177&device_id=21999c50-74 0c-4a26-970b-8e07ba37b5ac HTTP/1.1" status: 200 len: 1175 time: 0.0565202 2021-11-15 20:55:37.618 678146 DEBUG neutron.pecan_wsgi.hooks.policy_enforcement [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by policy engine: ['standard_attr_id', 'vlan_transparent'] _exclude_attributes_by_policy /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.p y:255 2021-11-15 20:55:37.619 678146 INFO neutron.wsgi [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/networks?id=ed4ee148-2562-41ed-93ac-7341948ac4dc HTTP/1.1" status: 200 len: 904 time: 0.1111536 2021-11-15 20:55:37.674 678146 INFO neutron.wsgi [req-71885d0c-4aa2-4377-a014-33768c9160ae 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/floatingips?fixed_ip_address=192.168.100.199&port_id=f9c17ae5-c07b-441 1-a0be-551686e6ba8e HTTP/1.1" status: 200 len: 217 time: 0.0442567 2021-11-15 20:55:37.727 678146 DEBUG neutron.pecan_wsgi.hooks.policy_enforcement [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by policy engine: ['standard_attr_id', 'shared'] _exclude_attributes_by_policy /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.p y:255 2021-11-15 20:55:37.731 678146 INFO neutron.wsgi [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/subnets?id=74e3b294-5fcc-4685-8bc4-5200be3af09e HTTP/1.1" status: 200 len: 858 time: 
0.0468309 2021-11-15 20:55:37.774 678146 INFO neutron.wsgi [req-7a57b86f-75bf-4381-9447-0b1d7262aae8 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/ports?network_id=ed4ee148-2562-41ed-93ac-7341948ac4dc&device_owner=net work%3Adhcp HTTP/1.1" status: 200 len: 210 time: 0.0335791 2021-11-15 20:55:37.891 678146 INFO neutron.wsgi [req-7cf3c09b-b5ee-4372-84c2-d738abd1a47a 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=segments HTTP/1.1" status: 200 len: 212 time: 0.1059954 2021-11-15 20:55:37.991 678146 INFO neutron.wsgi [req-3ed1edbd-9f33-4b4e-89c8-a2291f23e8ef 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=provider%3Aphysic al_network&fields=provider%3Anetwork_type HTTP/1.1" status: 200 len: 277 time: 0.0903363 2021-11-15 20:55:49.083 678146 INFO neutron.wsgi [req-0328379e-d381-48be-a209-a94acc297baa 9888a6e6da764fd28a72b1a7b25b8967 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.1 "GET /v2.0/ports?device_id=82e90d49-bbc4-4c2c-8ae1-7a4bcf84bb7d&fields=binding%3A host_id&fields=binding%3Avif_type HTTP/1.1" status: 200 len: 271 time: 0.2884536 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 33431 bytes Desc: not available URL: From gmann at ghanshyammann.com Mon Nov 15 18:54:03 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 15 Nov 2021 12:54:03 -0600 Subject: [all][tc] Canceling Technical Committee 18th Nov weekly meeting Message-ID: <17d24f2fab5.e49f0c5f688029.678012827550598310@ghanshyammann.com> Hello Everyone, Due to OpenInfra Keynotes happening on 17-18 Nov, this week (18th Nov) TC meeting is cancelled. -gmann From abraden at verisign.com Mon Nov 15 19:19:59 2021 From: abraden at verisign.com (Braden, Albert) Date: Mon, 15 Nov 2021 19:19:59 +0000 Subject: [BULK] Re: Adjutant needs contributors (and a PTL) to survive! In-Reply-To: <17d0fdf9cef.bfbf002c504513.1457558466289860904@ghanshyammann.com> References: <3026c411-688b-c773-8577-e8eed40b995a@catalystcloud.nz> <2bd77f1e8c7542888b7e0e1a14931a41@verisign.com> <9cabb3cb32a7441697f58933df72b514@verisign.com> <1e4045dd6edf572a8a32ca639056bc06a8782fe8.camel@etc.gen.nz> <17d0fdf9cef.bfbf002c504513.1457558466289860904@ghanshyammann.com> Message-ID: <47a580a722ba4e4fbf5a91d6e044447a@verisign.com> I heard back from Adrian, and he showed me where to find everything. I'm still waiting for permission to work on Adjutant. -----Original Message----- From: Ghanshyam Mann Sent: Thursday, November 11, 2021 11:41 AM To: Andrew Ruthven Cc: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] [BULK] Re: Adjutant needs contributors (and a PTL) to survive! Caution: This email originated from outside the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe. ---- On Tue, 09 Nov 2021 04:54:36 -0600 Andrew Ruthven wrote ---- > On Mon, 2021-11-08 at 19:49 +0000, Braden, Albert wrote:I didn't have any luck contacting Adrian. Does anyone know where the storyboard is that he mentions in his email? > I'll check in with Adrian to see if he has heard from anyone. 
> Cheers,AndrewCatalyst Cloud-- Andrew Ruthven, Wellington, New Zealandandrew at etc.gen.nz |Catalyst Cloud: | This space intentionally left blank https://catalystcloud.nz | @Andrew not sure but please let us know if someone from Catalyst is planning to maintain it? We are still waiting for volunteers to lead/maintain this project, if you are interested please reply here or ping us on #openstack-tc IRC channel. -gmann From emilien at redhat.com Tue Nov 16 00:32:15 2021 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 15 Nov 2021 19:32:15 -0500 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: Hey Kendall, On Mon, Nov 15, 2021 at 11:59 AM Kendall Nelson wrote: > Is there a reason why you don't want it to be under the openstack > namespace? > The only reason that comes to my mind is not technical at all. I (not saying we, since we haven't reached consensus yet) think that we want the project in its own organization, rather than under openstack. We want to encourage external contributions from outside of OpenStack, therefore opendev would probably suit better than openstack. This is open for discussion of course, but as I see it going, these are my personal thoughts. Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From gagehugo at gmail.com Tue Nov 16 00:34:35 2021 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 15 Nov 2021 18:34:35 -0600 Subject: [openstack-helm] No meeting tomorrow Message-ID: Hey team, Since there are no agenda items [0] for the IRC meeting tomorrow, the meeting is cancelled. Our next meeting will be November 23rd. Thanks [0] https://etherpad.opendev.org/p/openstack-helm-weekly-meeting -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Nov 16 08:32:14 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 16 Nov 2021 09:32:14 +0100 Subject: [neutron] CI meeting - agenda for 16.11.2021 Message-ID: <7988921.T7Z3S40VBb@p1> Hi, Just quick reminder for those interested in the Neutron CI. Today at 1500 UTC we will have our weekly meeting. Agenda for the meeting is in the etherpad: [1]. This week's meeting will be also on the meetpad [2]. See You all there :) [1] https://etherpad.opendev.org/p/neutron-ci-meetings [2] https://meetpad.opendev.org/neutron-ci-meetings -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From swogatpradhan22 at gmail.com Tue Nov 16 10:13:02 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Tue, 16 Nov 2021 15:43:02 +0530 Subject: [Openstack-victoria] [IRONIC] error in network interface | baremetal node validate Message-ID: Hi, I am currently trying to setup openstack ironic using driver IPMI. I followed the official docs of openstack for setting everything up. When i run openstack baremetal node validate $NODE_UUID, i am getting the following error: * Unexpected exception, traceback saved into log by ironic conductor service that is running on controller: 'ServiceTokenAuthWrapper' object has no attribute '_discovery_cache' * in the network interface in command output. 
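For reference, this is roughly what the check looks like from the CLI (a minimal sketch of the command I run; the UUID is just my node, the same one that appears in the conductor log further down):

    $ openstack baremetal node validate 7b902c2a-7897-4cc4-aec6-e93abfce4adf
    # each driver interface is listed with a result and reason; the "network"
    # interface comes back False with the reason quoted above:
    #   Unexpected exception, traceback saved into log by ironic conductor
    #   service that is running on controller: 'ServiceTokenAuthWrapper'
    #   object has no attribute '_discovery_cache'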
When i check the ironic conductor logs i see the following messages: > Can anyone suggest a solution or a way forward. With regards Swogat Pradhan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager [req-d2401a0c-b1e6-42a7-9576-fdf7755d2cb2 e3a04390d9a34062be0478e52404d3d2 559f7d28e7354bd398fb70074de53312 - default default] Unexpected exception occurred while validating network driver interface for driver ipmi: 'ServiceTokenAuthWrapper' object has no attribute '_discovery_cache' on node 7b902c2a-7897-4cc4-aec6-e93abfce4adf.: AttributeError: 'ServiceTokenAuthWrapper' object has no attribute '_discovery_cache' 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager Traceback (most recent call last): 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/conductor/manager.py", line 1958, in validate_driver_interfaces 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager iface.validate(task) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/drivers/modules/network/neutron.py", line 62, in validate 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager self.get_cleaning_network_uuid(task) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/common/neutron.py", line 933, in get_cleaning_network_uuid 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager return validate_network( 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/common/neutron.py", line 689, in validate_network 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager client = get_client(context=context) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/ironic/common/neutron.py", line 79, in get_client 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager return conn.global_request(context.global_id).network 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/openstack/service_description.py", line 87, in __get__ 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager proxy = self._make_proxy(instance) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/openstack/service_description.py", line 223, in _make_proxy 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager data = proxy_obj.get_endpoint_data() 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 312, in get_endpoint_data 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager return self.session.get_endpoint_data(auth or self.auth, **kwargs) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 1250, in get_endpoint_data 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager return auth.get_endpoint_data(self, **kwargs) 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager File "/usr/lib/python3/dist-packages/keystoneauth1/plugin.py", line 132, in get_endpoint_data 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager session, cache=self._discovery_cache, 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager AttributeError: 'ServiceTokenAuthWrapper' object has no attribute 
'_discovery_cache' 2021-11-16 10:03:02.692 13959 ERROR ironic.conductor.manager From mnasiadka at gmail.com Tue Nov 16 12:49:35 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Tue, 16 Nov 2021 13:49:35 +0100 Subject: [kolla] Proposing Michal Arbet as Kolla core Message-ID: Hi, I would like to propose adding Michal Arbet (kevko) to the kolla-core and kolla-ansible-core groups. Michal did some great work on ProxySQL, is a consistent maintainer of Debian related images and has provided some helpful reviews. Cores - please reply +1/-1 before the end of Friday 26th November. Thanks, Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: From sshnaidm at redhat.com Tue Nov 16 14:12:06 2021 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Tue, 16 Nov 2021 16:12:06 +0200 Subject: [openstack-ansible] release job failure for ansible-collections-openstack In-Reply-To: <272321636786145@mail.yandex.ru> References: <272321636786145@mail.yandex.ru> Message-ID: Hi, thanks for heads up. The first job failed because of the Zuul bug [1] and I didn't expect the job to run the second time, so I just released it manually to Galaxy. When the job tried the second time, it failed because it tried to duplicate the manual release. So I think it's all good now. [1] https://review.opendev.org/c/zuul/zuul/+/817298 On Sat, Nov 13, 2021 at 9:05 AM Dmitriy Rabotyagov wrote: > + sshnaidm@ > - ??? > > Hi! > > It's indeed OpenStack-Ansible SIG repo. > But from what I can see - release is published in Ansible Galaxy which is > the main thing that this job does. So for me things look good, but I guess > the only person who for sure can verify that is Sagi Shnaidman. > > Being able to tag repos is crucial for this sig since it's the only way to > provide Ansible Collections in Ansible Galaxy. > > 12.11.2021, 17:33, "El?d Ill?s" : > > Hi Openstack-Ansible team! > > This mail is just to inform you that there was a release job failure [1] > yesterday and the job could not be re-run as part of the job was > finished successfully in the 1st run (so the 2nd attempt failed [2]). > > Could you please review if everything is OK with the release? > > Thanks, > > El?d (elodilles @ #openstack-release) > > [1] > > http://lists.openstack.org/pipermail/release-job-failures/2021-November/001576.html > [2] > > http://lists.openstack.org/pipermail/release-job-failures/2021-November/001577.html > > > > -- > Kind Regards, > Dmitriy Rabotyagov > > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue Nov 16 15:06:00 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 16 Nov 2021 15:06:00 +0000 Subject: [openstack-ansible] release job failure for ansible-collections-openstack In-Reply-To: References: <272321636786145@mail.yandex.ru> Message-ID: <20211116150600.rtuvpyhpgcuxr2u7@yuggoth.org> On 2021-11-16 16:12:06 +0200 (+0200), Sagi Shnaidman wrote: [...] > I didn't expect the job to run the second time, so I just released > it manually to Galaxy. [...] Aha, thanks, that explains it. El?d had asked me to reenqueue it once we got the zuul.tag variable back, but I didn't realize the version had been manually uploaded in the meantime. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rlandy at redhat.com Tue Nov 16 15:58:29 2021 From: rlandy at redhat.com (Ronelle Landy) Date: Tue, 16 Nov 2021 10:58:29 -0500 Subject: [Triple0] Gate blocker - standalone failures on master and wallaby In-Reply-To: References: Message-ID: On Mon, Nov 15, 2021 at 6:52 AM Ronelle Landy wrote: > > tripleo-ci-centos-8-containers-multinode, and the wallaby job, are also > impacted. > > On Mon, Nov 15, 2021 at 6:28 AM Ronelle Landy wrote: > >> Hello All, >> >> We have a check/gate blocker on master and wallaby that started on >> Saturday. >> Standalone jobs are failing tempest tests. The related bug is linked >> below: >> >> https://bugs.launchpad.net/tripleo/+bug/1950916 >> >> The networking team is helping debug this. Please don't recheck for now. >> We will update this list when we have more info/a fix. >> > Thanks to Yatin, we have a temporary fix merged to unblock check/gate: Merged openstack/tripleo-quickstart master: Exclude libvirt/qemu from AppStream repo https://review.opendev.org/c/openstack/tripleo-quickstart/+/818043 Please recheck patches that were impacted by these failures. The compute team is still working on a more complete fix here. > >> Thank you! >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.souppart at gmail.com Tue Nov 16 08:21:26 2021 From: alex.souppart at gmail.com (alex souppart) Date: Tue, 16 Nov 2021 09:21:26 +0100 Subject: error "haproxy[]: proxy horizon has no server available!" when internal tls is activated In-Reply-To: References: Message-ID: Hello, I try to deploy an overcloud openstack in victoria version. My configuration to deploy is : openstack overcloud deploy --templates -r /home/stack/templates/roles_data.yaml \ -n /home/stack/network_data.yaml \ -e /home/stack/templates/scheduler_hints_env.yaml \ -e /home/stack/templates/network-isolation.yaml \ -e /home/stack/templates/os-net-config-mapping.yaml \ -e /home/stack/templates/node-info.yaml \ -e /home/stack/containers-prepare-parameter.yaml \ -e /home/stack/templates/host-map.yaml \ -e /home/stack/templates/ips-from-pool-all.yaml \ -e /home/stack/templates/network-environment.yaml \ -e /home/stack/templates/net-multiple-nics-vlans.yaml \ -e /home/stack/templates/ceph-ansible-external.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/enable-internal-tls.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/services/haproxy-internal-tls-certmonger.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-everywhere-endpoints-dns.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml \ -e /home/stack/templates/tls-parameters.yaml \ -e /home/stack/templates/inject-trust-anchor.yaml \ The generated configuration of horizon httpd contains SSLVerifyClient. But Haproxy fails to check server available, because haproxy does not send a client certificate when check attempt. 
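To illustrate the failure (a rough sketch of how the check can be reproduced by hand; the backend host and port are placeholders, and I am assuming the haproxy pem below bundles both the certificate and key, as haproxy expects):

    # what the haproxy health check effectively does today: TLS with server
    # verification only, no client certificate -> rejected by the backend
    # httpd vhost that has SSLVerifyClient enabled
    curl --cacert /etc/ipa/ca.crt https://BACKEND_HOST:BACKEND_PORT/

    # presenting the haproxy internal_api certificate should let the same
    # request through
    curl --cacert /etc/ipa/ca.crt \
         --cert /etc/pki/tls/certs/haproxy/overcloud-haproxy-internal_api.pem \
         https://BACKEND_HOST:BACKEND_PORT/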
the generated configuration of haproxy backend is : server host1 ip_host1:5000 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost host1 server host2 ip_host2:5000 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost host2 server host3 ip_host3:5000 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost host3 if i try adding manualy "crt /etc/pki/tls/certs/haproxy/overcloud-haproxy-internal_api.pem" in server configuration in haproxy.conf, horizon/dashboard works via haproxy. But i'm not sure that's the right way. Did I forget an environment file in deploy configuration ? Thank you in advance for your assistance with this. Best regards Souppart Alexandre -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Tue Nov 16 09:42:03 2021 From: katonalala at gmail.com (Lajos Katona) Date: Tue, 16 Nov 2021 10:42:03 +0100 Subject: Error while getting network agents In-Reply-To: <007a01d7da3b$78f1d800$6ad58800$@rapidcompute.com> References: <007a01d7da3b$78f1d800$6ad58800$@rapidcompute.com> Message-ID: Hi Muhammad, >From Wallaby you should be able to delete agent through Neutron API: https://review.opendev.org/c/openstack/neutron/+/752795 To tell the truth I don't know what happens if you execute DELETE /v2.0/agents/{agent_id} after you have deleted the chassis in sbctl. Regards Lajos Katona (lajoskatona) Muhammad Faisal ezt ?rta (id?pont: 2021. nov. 15., H, 19:05): > *Hi,* > > While executing openstack network agent list we are getting below > mentioned error. We have run ?ovn-sbctl chassis-del > 650be87c-b581-467a-b523-ce454e753780? command on controller node. > > *OS:* Ubuntu 20 > *Openstack version:* Wallaby > *Number of controller/network node:* 1 (172.16.30.46) > > *Number of compute node:* 2 (172.16.30.1, 172.16.30.3) > *OVN Version:* 21.09.0 > > /var/log/ovn/ovn-northd.log: > > 2021-11-15T15:29:22.184Z|00053|northd|WARN|Dropped 7 log messages in last > 127 seconds (most recently, 120 seconds ago) due to excessive rate > > 2021-11-15T15:29:22.184Z|00054|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > 2021-11-15T15:31:11.516Z|00055|northd|WARN|Dropped 14 log messages in last > 110 seconds (most recently, 52 seconds ago) due to excessive rate > > 2021-11-15T15:31:11.516Z|00056|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > 2021-11-15T15:35:25.738Z|00057|northd|WARN|Dropped 6 log messages in last > 254 seconds (most recently, 243 seconds ago) due to excessive rate > > 2021-11-15T15:35:25.738Z|00058|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > 2021-11-15T15:44:34.663Z|00059|northd|WARN|Dropped 12 log messages in last > 549 seconds (most recently, 512 seconds ago) due to excessive rate > > 2021-11-15T15:44:34.663Z|00060|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > 2021-11-15T15:47:39.947Z|00061|northd|WARN|Dropped 2 log messages in last > 185 seconds (most recently, 185 seconds ago) due to excessive rate > > 2021-11-15T15:47:39.948Z|00062|northd|WARN|Chassis does not exist for > Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 > > > /var/log/neutron/neutron-server.log: > 2021-11-15 20:55:24.025 678149 DEBUG futurist.periodics > [req-df627767-441c-4b46-8487-91604cd3033a - 
- - - -] Submitting periodic > callback > 'neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.maintenance.HashRingHealthCheckPeriodics.touch_hash_ring_nodes' > _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:641 > > 2021-11-15 20:55:30.531 678146 DEBUG neutron.wsgi [-] (678146) accepted > ('172.16.30.46', 49782) server > /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 > > 2021-11-15 20:55:30.893 678146 DEBUG ovsdbapp.backend.ovs_idl.transaction > [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Running txn n=1 > command(idx=0): CheckLivenessCommand() do_commit > /usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:90 > > 2021-11-15 20:55:30.925 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: > NB_Global) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.927 678147 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] > ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 > > 2021-11-15 20:55:30.930 678147 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] > ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None > matches > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 > > 2021-11-15 20:55:30.931 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: > NB_Global) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.934 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: > NB_Global) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb > 7eb59a51503f48ddbc936c40990e2177 - default default] index failed: No > details.: AttributeError: 'Chassis_Private' object has no attribute > 'hostname' > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource Traceback > (most recent call last): > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/api/v2/resource.py", line 98, in > resource > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource result = > method(request=request, **args) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > setattr(e, '_RETRY_EXCEEDED', True) > > 2021-11-15 20:55:30.936 678146 
ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in > __exit__ > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > self.force_reraise() > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in > force_reraise > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise > self.value > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return > f(*args, **kwargs) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > ectxt.value = e.inner_exc > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in > __exit__ > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > self.force_reraise() > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in > force_reraise > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise > self.value > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return > f(*args, **kwargs) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > LOG.debug("Retry wrapper got retriable exception: %s", e) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in > __exit__ > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > self.force_reraise() > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in > force_reraise > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise > self.value > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return > f(*dup_args, **dup_kwargs) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 369, in index > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return > self._items(request, True, parent_id) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 304, in _items > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource obj_list > = obj_getter(request.context, **kwargs) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", > line 1118, in fn > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return > op(results, new_method(*args, _driver=self, **kwargs)) > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 
File > "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", > line 1182, in get_agents > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > agent_dict = agent.as_dict() > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File > "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", > line 59, in as_dict > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 'host': > self.chassis.hostname, > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > AttributeError: 'Chassis_Private' object has no attribute 'hostname' > > 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource > > 2021-11-15 20:55:30.937 678146 INFO neutron.wsgi > [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb > 7eb59a51503f48ddbc936c40990e2177 - default default] 172.16.30.46 "GET > /v2.0/agents HTTP/1.1" status: 500 len: 368 time: 0.4032691 > > 2021-11-15 20:55:30.938 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row 2bd437c4-9173-4d1b-ad01-d27cf0a11c8a (table: > SB_Global) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.939 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] ChassisAgentWriteEvent > : Matched Chassis_Private, update, None None matches > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 > > 2021-11-15 20:55:30.939 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: > Chassis_Private) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:30.940 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] > ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None > matches > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 > > 2021-11-15 20:55:30.940 678146 DEBUG > neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor > [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node > 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling > event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: > Chassis_Private) notify > /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 > > 2021-11-15 20:55:37.389 678146 INFO neutron.wsgi > [req-85f09a48-f50f-4776-b110-5d07dd1dc39e 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/ports?device_id=21999c50-740c-4a26-970b-8e07ba37b5ac&fields=binding%3Ahost_id&fields=binding%3Avif_type > HTTP/1.1" status: 200 len: 271 time: 0.0680673 > > 2021-11-15 20:55:37.476 678146 DEBUG > neutron.pecan_wsgi.hooks.policy_enforcement > [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 > 
3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by > policy engine: ['standard_attr_id'] _exclude_attributes_by_policy > /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 > > 2021-11-15 20:55:37.477 678146 INFO neutron.wsgi > [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/ports?tenant_id=7eb59a51503f48ddbc936c40990e2177&device_id=21999c50-740c-4a26-970b-8e07ba37b5ac > HTTP/1.1" status: 200 len: 1175 time: 0.0565202 > > 2021-11-15 20:55:37.618 678146 DEBUG > neutron.pecan_wsgi.hooks.policy_enforcement > [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by > policy engine: ['standard_attr_id', 'vlan_transparent'] > _exclude_attributes_by_policy > /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 > > 2021-11-15 20:55:37.619 678146 INFO neutron.wsgi > [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/networks?id=ed4ee148-2562-41ed-93ac-7341948ac4dc HTTP/1.1" status: > 200 len: 904 time: 0.1111536 > > 2021-11-15 20:55:37.674 678146 INFO neutron.wsgi > [req-71885d0c-4aa2-4377-a014-33768c9160ae 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/floatingips?fixed_ip_address=192.168.100.199&port_id=f9c17ae5-c07b-4411-a0be-551686e6ba8e > HTTP/1.1" status: 200 len: 217 time: 0.0442567 > > 2021-11-15 20:55:37.727 678146 DEBUG > neutron.pecan_wsgi.hooks.policy_enforcement > [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by > policy engine: ['standard_attr_id', 'shared'] _exclude_attributes_by_policy > /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 > > 2021-11-15 20:55:37.731 678146 INFO neutron.wsgi > [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/subnets?id=74e3b294-5fcc-4685-8bc4-5200be3af09e HTTP/1.1" status: > 200 len: 858 time: 0.0468309 > > 2021-11-15 20:55:37.774 678146 INFO neutron.wsgi > [req-7a57b86f-75bf-4381-9447-0b1d7262aae8 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/ports?network_id=ed4ee148-2562-41ed-93ac-7341948ac4dc&device_owner=network%3Adhcp > HTTP/1.1" status: 200 len: 210 time: 0.0335791 > > 2021-11-15 20:55:37.891 678146 INFO neutron.wsgi > [req-7cf3c09b-b5ee-4372-84c2-d738abd1a47a 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=segments > HTTP/1.1" status: 200 len: 212 time: 0.1059954 > > 2021-11-15 20:55:37.991 678146 INFO neutron.wsgi > [req-3ed1edbd-9f33-4b4e-89c8-a2291f23e8ef 9888a6e6da764fd28a72b1a7b25b8967 > 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET > /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=provider%3Aphysical_network&fields=provider%3Anetwork_type > HTTP/1.1" status: 200 len: 277 time: 0.0903363 > > 2021-11-15 20:55:49.083 678146 INFO neutron.wsgi > [req-0328379e-d381-48be-a209-a94acc297baa 9888a6e6da764fd28a72b1a7b25b8967 > 
3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.1 "GET > /v2.0/ports?device_id=82e90d49-bbc4-4c2c-8ae1-7a4bcf84bb7d&fields=binding%3Ahost_id&fields=binding%3Avif_type > HTTP/1.1" status: 200 len: 271 time: 0.2884536 > > [image: Email Signature] > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 33431 bytes Desc: not available URL: From cboylan at sapwetik.org Tue Nov 16 16:58:44 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 16 Nov 2021 08:58:44 -0800 Subject: =?UTF-8?Q?[all][refstack][neutron][kolla][ironic][heat][trove][senlin][b?= =?UTF-8?Q?arbican][manila]_Fixing_Zuul_Config_Errors?= Message-ID: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> Hello, The OpenStack tenant in Zuul currently has 134 config errors. You can find these errors at https://zuul.opendev.org/t/openstack/config-errors or by clicking the blue bell icon in the top right of https://zuul.opendev.org/t/openstack/status. The vast majority of these errors appear related to project renames that have been requested of OpenDev or project retirements. Can you please look into fixing these as they can be an attractive nuisance when debugging Zuul problems (they also indicate that a number of your jobs are probably not working). Project renames creating issues: * openstack/python-tempestconf -> osf/python-tempestconf -> openinfra/python-tempestconf * openstack/refstack -> osf/refstack -> openinfra/refstack * x/tap-as-a-service -> openstack/tap-as-a-service * openstack/networking-l2gw -> x/networking-l2gw Project retirements creating issues: * openstack/neutron-lbaas * recordsansible/ara Projects whose configs have errors: * openinfra/python-tempestconf * openstack/heat * openstack/ironic * openstack/kolla-ansible * openstack/kuryr-kubernetes * openstack/murano-apps * openstack/networking-midonet * openstack/networking-odl * openstack/neutron * openstack/neutron-fwaas * openstack/python-troveclient * openstack/senlin * openstack/tap-as-a-service * openstack/zaqar * x/vmware-nsx * openinfra/openstackid * openstack/barbican * openstack/cookbook-openstack-application-catalog * openstack/heat-dashboard * openstack/manila-ui * openstack/python-manilaclient Let us know if we can help decipher any errors, Clark From radoslaw.piliszek at gmail.com Tue Nov 16 17:06:59 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 16 Nov 2021 18:06:59 +0100 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> Message-ID: @ Kolla folks It seems to affect kolla-ansible in stable/rocky and stable/stein branches due to ansible_python_interpreter being set. We either need to backport https://review.opendev.org/c/openstack/kolla-ansible/+/798205 or drop affected CI config in those. They are EM so it might be fine going either way as CI might be broken by now anyhow. -yoctozepto On Tue, 16 Nov 2021 at 17:59, Clark Boylan wrote: > > Hello, > > The OpenStack tenant in Zuul currently has 134 config errors. You can find these errors at https://zuul.opendev.org/t/openstack/config-errors or by clicking the blue bell icon in the top right of https://zuul.opendev.org/t/openstack/status. 
The vast majority of these errors appear related to project renames that have been requested of OpenDev or project retirements. Can you please look into fixing these as they can be an attractive nuisance when debugging Zuul problems (they also indicate that a number of your jobs are probably not working). > > Project renames creating issues: > > * openstack/python-tempestconf -> osf/python-tempestconf -> openinfra/python-tempestconf > * openstack/refstack -> osf/refstack -> openinfra/refstack > * x/tap-as-a-service -> openstack/tap-as-a-service > * openstack/networking-l2gw -> x/networking-l2gw > > Project retirements creating issues: > > * openstack/neutron-lbaas > * recordsansible/ara > > Projects whose configs have errors: > > * openinfra/python-tempestconf > * openstack/heat > * openstack/ironic > * openstack/kolla-ansible > * openstack/kuryr-kubernetes > * openstack/murano-apps > * openstack/networking-midonet > * openstack/networking-odl > * openstack/neutron > * openstack/neutron-fwaas > * openstack/python-troveclient > * openstack/senlin > * openstack/tap-as-a-service > * openstack/zaqar > * x/vmware-nsx > * openinfra/openstackid > * openstack/barbican > * openstack/cookbook-openstack-application-catalog > * openstack/heat-dashboard > * openstack/manila-ui > * openstack/python-manilaclient > > Let us know if we can help decipher any errors, > Clark > From ralonsoh at redhat.com Tue Nov 16 17:26:31 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 16 Nov 2021 18:26:31 +0100 Subject: Error while getting network agents In-Reply-To: References: <007a01d7da3b$78f1d800$6ad58800$@rapidcompute.com> Message-ID: Hello Muhammad: If the ovn-controller service is enabled in the host (in this case your controller), this service should create again the Chassis register. If the ovn-controller is not enabled, then it is possible to reproduce this issue. I could reproduce it doing those steps: - Killing the ovn-controller service. Not stopping gracefully (that will delete the "chassis" and the "chassis_private" registers) but sending the kill signal. - Deleting the "chassis" register. That will leave the "chassis_private" register with chassis=[]. However, we don't attend "chassis" events. That means the "AgentCache" won't be updated. - Restart the Neutron server. That will force the OVB SB retrieval with the "chassis_private" register without any chassis. - List the agents --> that will trigger the reported error. In any case, remember that the OVN agent deletion is only for clean-up purposes. If by any circumstance the ovn-controller in a host does not stop gracefully, it will leave "chassis" and "chassis_private" registers undeleted. To properly delete those registers, you should: - Delete both from OVN SB. - Then delete the agents from Neutron, using the CLI: "openstack network agent delete ". I'll open a bug to consider this very specific scenario. Regards. On Tue, Nov 16, 2021 at 5:22 PM Lajos Katona wrote: > Hi Muhammad, > From Wallaby you should be able to delete agent through Neutron API: > https://review.opendev.org/c/openstack/neutron/+/752795 > > To tell the truth I don't know what happens if you execute > DELETE /v2.0/agents/{agent_id} after you have deleted the chassis in sbctl. > > Regards > Lajos Katona (lajoskatona) > > Muhammad Faisal ezt ?rta (id?pont: 2021. > nov. 15., H, 19:05): > >> *Hi,* >> >> While executing openstack network agent list we are getting below >> mentioned error. We have run ?ovn-sbctl chassis-del >> 650be87c-b581-467a-b523-ce454e753780? 
command on controller node. >> >> *OS:* Ubuntu 20 >> *Openstack version:* Wallaby >> *Number of controller/network node:* 1 (172.16.30.46) >> >> *Number of compute node:* 2 (172.16.30.1, 172.16.30.3) >> *OVN Version:* 21.09.0 >> >> /var/log/ovn/ovn-northd.log: >> >> 2021-11-15T15:29:22.184Z|00053|northd|WARN|Dropped 7 log messages in last >> 127 seconds (most recently, 120 seconds ago) due to excessive rate >> >> 2021-11-15T15:29:22.184Z|00054|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> 2021-11-15T15:31:11.516Z|00055|northd|WARN|Dropped 14 log messages in >> last 110 seconds (most recently, 52 seconds ago) due to excessive rate >> >> 2021-11-15T15:31:11.516Z|00056|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> 2021-11-15T15:35:25.738Z|00057|northd|WARN|Dropped 6 log messages in last >> 254 seconds (most recently, 243 seconds ago) due to excessive rate >> >> 2021-11-15T15:35:25.738Z|00058|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> 2021-11-15T15:44:34.663Z|00059|northd|WARN|Dropped 12 log messages in >> last 549 seconds (most recently, 512 seconds ago) due to excessive rate >> >> 2021-11-15T15:44:34.663Z|00060|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> 2021-11-15T15:47:39.947Z|00061|northd|WARN|Dropped 2 log messages in last >> 185 seconds (most recently, 185 seconds ago) due to excessive rate >> >> 2021-11-15T15:47:39.948Z|00062|northd|WARN|Chassis does not exist for >> Chassis_Private record, name: 7d07ed32-996f-48da-9ac1-7f0455ea7ec7 >> >> >> /var/log/neutron/neutron-server.log: >> 2021-11-15 20:55:24.025 678149 DEBUG futurist.periodics >> [req-df627767-441c-4b46-8487-91604cd3033a - - - - -] Submitting periodic >> callback >> 'neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.maintenance.HashRingHealthCheckPeriodics.touch_hash_ring_nodes' >> _process_scheduled /usr/lib/python3/dist-packages/futurist/periodics.py:641 >> >> 2021-11-15 20:55:30.531 678146 DEBUG neutron.wsgi [-] (678146) accepted >> ('172.16.30.46', 49782) server >> /usr/lib/python3/dist-packages/eventlet/wsgi.py:992 >> >> 2021-11-15 20:55:30.893 678146 DEBUG ovsdbapp.backend.ovs_idl.transaction >> [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Running txn n=1 >> command(idx=0): CheckLivenessCommand() do_commit >> /usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:90 >> >> 2021-11-15 20:55:30.925 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: >> NB_Global) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.927 678147 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] >> ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 >> >> 2021-11-15 20:55:30.930 678147 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor [-] >> ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None >> matches >> 
/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 >> >> 2021-11-15 20:55:30.931 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: >> NB_Global) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.934 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-04604530-8f57-428b-8359-e1bd2648d417 - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row b4926d80-2c2a-488f-bd27-49c7191f0c3a (table: >> NB_Global) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb >> 7eb59a51503f48ddbc936c40990e2177 - default default] index failed: No >> details.: AttributeError: 'Chassis_Private' object has no attribute >> 'hostname' >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource Traceback >> (most recent call last): >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/api/v2/resource.py", line 98, in >> resource >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource result = >> method(request=request, **args) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> setattr(e, '_RETRY_EXCEEDED', True) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in >> __exit__ >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> self.force_reraise() >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in >> force_reraise >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise >> self.value >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> f(*args, **kwargs) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> ectxt.value = e.inner_exc >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in >> __exit__ >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> self.force_reraise() >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in >> force_reraise >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise >> self.value >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_db/api.py", line 
142, in wrapper >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> f(*args, **kwargs) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> LOG.debug("Retry wrapper got retriable exception: %s", e) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 227, in >> __exit__ >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> self.force_reraise() >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 200, in >> force_reraise >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource raise >> self.value >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> f(*dup_args, **dup_kwargs) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 369, in index >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> self._items(request, True, parent_id) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/api/v2/base.py", line 304, in _items >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource obj_list >> = obj_getter(request.context, **kwargs) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", >> line 1118, in fn >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource return >> op(results, new_method(*args, _driver=self, **kwargs)) >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py", >> line 1182, in get_agents >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> agent_dict = agent.as_dict() >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource File >> "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/agent/neutron_agent.py", >> line 59, in as_dict >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource 'host': >> self.chassis.hostname, >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> AttributeError: 'Chassis_Private' object has no attribute 'hostname' >> >> 2021-11-15 20:55:30.936 678146 ERROR neutron.api.v2.resource >> >> 2021-11-15 20:55:30.937 678146 INFO neutron.wsgi >> [req-db3a5256-a3de-41a9-a135-710ce024a698 92e064e5e77c4e32b97e6c056a86c5eb >> 7eb59a51503f48ddbc936c40990e2177 - default default] 172.16.30.46 "GET >> /v2.0/agents HTTP/1.1" status: 500 len: 368 time: 0.4032691 >> >> 2021-11-15 20:55:30.938 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row 2bd437c4-9173-4d1b-ad01-d27cf0a11c8a (table: >> SB_Global) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.939 678146 
DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] ChassisAgentWriteEvent >> : Matched Chassis_Private, update, None None matches >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 >> >> 2021-11-15 20:55:30.939 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: >> Chassis_Private) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:30.940 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] >> ChassisMetadataAgentWriteEvent : Matched Chassis_Private, update, None None >> matches >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:61 >> >> 2021-11-15 20:55:30.940 678146 DEBUG >> neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovsdb_monitor >> [req-5efca86e-8945-43c5-a492-76e8ad16ef6e - - - - -] Hash Ring: Node >> 8318cb4d-c002-41f2-a17d-bb33d35d8b99 (host: controller-khi01) handling >> event "update" for row 03ce3887-1007-48fa-9f40-b2fe88246a69 (table: >> Chassis_Private) notify >> /usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py:666 >> >> 2021-11-15 20:55:37.389 678146 INFO neutron.wsgi >> [req-85f09a48-f50f-4776-b110-5d07dd1dc39e 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/ports?device_id=21999c50-740c-4a26-970b-8e07ba37b5ac&fields=binding%3Ahost_id&fields=binding%3Avif_type >> HTTP/1.1" status: 200 len: 271 time: 0.0680673 >> >> 2021-11-15 20:55:37.476 678146 DEBUG >> neutron.pecan_wsgi.hooks.policy_enforcement >> [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by >> policy engine: ['standard_attr_id'] _exclude_attributes_by_policy >> /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 >> >> 2021-11-15 20:55:37.477 678146 INFO neutron.wsgi >> [req-4099412f-d724-48d7-8786-add0c5b47f56 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/ports?tenant_id=7eb59a51503f48ddbc936c40990e2177&device_id=21999c50-740c-4a26-970b-8e07ba37b5ac >> HTTP/1.1" status: 200 len: 1175 time: 0.0565202 >> >> 2021-11-15 20:55:37.618 678146 DEBUG >> neutron.pecan_wsgi.hooks.policy_enforcement >> [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by >> policy engine: ['standard_attr_id', 'vlan_transparent'] >> _exclude_attributes_by_policy >> /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 >> >> 2021-11-15 20:55:37.619 678146 INFO neutron.wsgi >> [req-eebf3e86-206a-45ed-92f3-804007a39cca 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/networks?id=ed4ee148-2562-41ed-93ac-7341948ac4dc HTTP/1.1" status: >> 200 len: 904 time: 0.1111536 >> >> 2021-11-15 20:55:37.674 678146 INFO neutron.wsgi >> [req-71885d0c-4aa2-4377-a014-33768c9160ae 
9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/floatingips?fixed_ip_address=192.168.100.199&port_id=f9c17ae5-c07b-4411-a0be-551686e6ba8e >> HTTP/1.1" status: 200 len: 217 time: 0.0442567 >> >> 2021-11-15 20:55:37.727 678146 DEBUG >> neutron.pecan_wsgi.hooks.policy_enforcement >> [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] Attributes excluded by >> policy engine: ['standard_attr_id', 'shared'] _exclude_attributes_by_policy >> /usr/lib/python3/dist-packages/neutron/pecan_wsgi/hooks/policy_enforcement.py:255 >> >> 2021-11-15 20:55:37.731 678146 INFO neutron.wsgi >> [req-d35a3b5e-82ea-4e80-b1c6-c8b39205a1f7 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/subnets?id=74e3b294-5fcc-4685-8bc4-5200be3af09e HTTP/1.1" status: >> 200 len: 858 time: 0.0468309 >> >> 2021-11-15 20:55:37.774 678146 INFO neutron.wsgi >> [req-7a57b86f-75bf-4381-9447-0b1d7262aae8 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/ports?network_id=ed4ee148-2562-41ed-93ac-7341948ac4dc&device_owner=network%3Adhcp >> HTTP/1.1" status: 200 len: 210 time: 0.0335791 >> >> 2021-11-15 20:55:37.891 678146 INFO neutron.wsgi >> [req-7cf3c09b-b5ee-4372-84c2-d738abd1a47a 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=segments >> HTTP/1.1" status: 200 len: 212 time: 0.1059954 >> >> 2021-11-15 20:55:37.991 678146 INFO neutron.wsgi >> [req-3ed1edbd-9f33-4b4e-89c8-a2291f23e8ef 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.3 "GET >> /v2.0/networks/ed4ee148-2562-41ed-93ac-7341948ac4dc?fields=provider%3Aphysical_network&fields=provider%3Anetwork_type >> HTTP/1.1" status: 200 len: 277 time: 0.0903363 >> >> 2021-11-15 20:55:49.083 678146 INFO neutron.wsgi >> [req-0328379e-d381-48be-a209-a94acc297baa 9888a6e6da764fd28a72b1a7b25b8967 >> 3e076bf2747b40ed8792ebf85d14d719 - default default] 172.16.30.1 "GET >> /v2.0/ports?device_id=82e90d49-bbc4-4c2c-8ae1-7a4bcf84bb7d&fields=binding%3Ahost_id&fields=binding%3Avif_type >> HTTP/1.1" status: 200 len: 271 time: 0.2884536 >> >> [image: Email Signature] >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 33431 bytes Desc: not available URL: From marcin.juszkiewicz at linaro.org Tue Nov 16 17:49:41 2021 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Tue, 16 Nov 2021 18:49:41 +0100 Subject: [kolla] Proposing Michal Arbet as Kolla core In-Reply-To: References: Message-ID: <9747ba17-d3f0-3a35-a3c0-c1fb5317d665@linaro.org> W dniu 16.11.2021 o?13:49, Micha? Nasiadka pisze: > Hi, > > I would like to propose adding Michal Arbet (kevko) to the kolla-core > and kolla-ansible-core groups. > Michal did some great work on ProxySQL, is a consistent maintainer > of?Debian related?images and has provided some helpful > reviews. > > Cores - please reply +1/-1 before the end of Friday 26th November. 
+1 From yasufum.o at gmail.com Tue Nov 16 18:11:44 2021 From: yasufum.o at gmail.com (Yasufumi Ogawa) Date: Wed, 17 Nov 2021 03:11:44 +0900 Subject: [ Tacker ] Passing a shell script/parameters as a file in cloud config In-Reply-To: References: Message-ID: Hi Lokendra, My appologies overlooked your question. We'll confirm the issue soon. Thanks, Yasufumi On 2021/11/01 17:22, Lokendra Rathour wrote: > Hello EveryOne, > Any update on this, please. > > > -Lokendra > > > On Thu, Oct 28, 2021 at 2:43 PM Lokendra Rathour > > wrote: > > Hi, > /In Tacker, while deploying VNFD can we pass a file ( parameter > file) and keep it at a defined path using cloud-config way?/ > > Like in *generic hot template*s, we have the below-mentioned way to > pass a file directly as below: > parameters: > ? foo: > ? ? default: bar > > resources: > > ? the_server: > ? ? type: OS::Nova::Server > ? ? properties: > ? ? ? # flavor, image etc > ? ? ? user_data: > ? ? ? ? str_replace: > ? ? ? ? ? template: {get_file: the_server_boot.sh} > ? ? ? ? ? params: > ? ? ? ? ? ? $FOO: {get_param: foo} > > > *but when using this approach in Tacker BaseHOT it gives an error > saying * > "nstantiation wait failed for vnf > 77693e61-c80e-41e0-af9a-a0f702f3a9a7, error: VNF Create Resource > CREATE failed: resources.obsvrnnu62mb: > resources.CAS_0_group.Property error: > resources.soft_script.properties.config: No content found in the > "files" section for get_file path: Files/scripts/install.py > 2021-10-28 00:46:35.677 3853831 ERROR oslo_messaging.rpc.server > " > do we have a defined way to use the hot capability in TACKER? > > Defined Folder Structure for CSAR: > . > ??? BaseHOT > ??? ??? default > ??? ? ? ??? RIN_vnf_hot.yaml > ??? ? ? ??? nested > ??? ? ? ? ? ??? RIN_0.yaml > ??? ? ? ? ? ??? RIN_1.yaml > ??? Definitions > ??? ??? RIN_df_default.yaml > ??? ??? RIN_top_vnfd.yaml > ??? ??? RIN_types.yaml > ??? ??? etsi_nfv_sol001_common_types.yaml > ??? ??? etsi_nfv_sol001_vnfd_types.yaml > ??? Files > ??? ??? images > ??? ??? scripts > ??? ? ? ??? install.py > ??? Scripts > ??? TOSCA-Metadata > ??? ??? TOSCA.meta > ??? UserData > ??? ??? __init__.py > ??? ??? lcm_user_data.py > > *Objective: * > To pass a file at a defined path on the VDU after the VDU is > instantiated/launched. > > -- > ~ Lokendra > skype: lokendrarathour > > > > > -- > ~ Lokendra > www.inertiaspeaks.com > www.inertiagroups.com > skype: lokendrarathour > > From lokendrarathour at gmail.com Tue Nov 16 18:25:03 2021 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Tue, 16 Nov 2021 23:55:03 +0530 Subject: [ Tacker ] Passing a shell script/parameters as a file in cloud config In-Reply-To: References: Message-ID: Thankyou so much for the input. Will wait for the same. -Lokendra On Tue, 16 Nov 2021, 23:50 Yasufumi Ogawa, wrote: > Hi Lokendra, > > My appologies overlooked your question. We'll confirm the issue soon. > > Thanks, > Yasufumi > > On 2021/11/01 17:22, Lokendra Rathour wrote: > > Hello EveryOne, > > Any update on this, please. 
> > > > > > -Lokendra > > > > > > On Thu, Oct 28, 2021 at 2:43 PM Lokendra Rathour > > > wrote: > > > > Hi, > > /In Tacker, while deploying VNFD can we pass a file ( parameter > > file) and keep it at a defined path using cloud-config way?/ > > > > Like in *generic hot template*s, we have the below-mentioned way to > > pass a file directly as below: > > parameters: > > foo: > > default: bar > > > > resources: > > > > the_server: > > type: OS::Nova::Server > > properties: > > # flavor, image etc > > user_data: > > str_replace: > > template: {get_file: the_server_boot.sh} > > params: > > $FOO: {get_param: foo} > > > > > > *but when using this approach in Tacker BaseHOT it gives an error > > saying * > > "nstantiation wait failed for vnf > > 77693e61-c80e-41e0-af9a-a0f702f3a9a7, error: VNF Create Resource > > CREATE failed: resources.obsvrnnu62mb: > > resources.CAS_0_group.Property error: > > resources.soft_script.properties.config: No content found in the > > "files" section for get_file path: Files/scripts/install.py > > 2021-10-28 00:46:35.677 3853831 ERROR oslo_messaging.rpc.server > > " > > do we have a defined way to use the hot capability in TACKER? > > > > Defined Folder Structure for CSAR: > > . > > ??? BaseHOT > > ? ??? default > > ? ??? RIN_vnf_hot.yaml > > ? ??? nested > > ? ??? RIN_0.yaml > > ? ??? RIN_1.yaml > > ??? Definitions > > ? ??? RIN_df_default.yaml > > ? ??? RIN_top_vnfd.yaml > > ? ??? RIN_types.yaml > > ? ??? etsi_nfv_sol001_common_types.yaml > > ? ??? etsi_nfv_sol001_vnfd_types.yaml > > ??? Files > > ? ??? images > > ? ??? scripts > > ? ??? install.py > > ??? Scripts > > ??? TOSCA-Metadata > > ? ??? TOSCA.meta > > ??? UserData > > ? ??? __init__.py > > ? ??? lcm_user_data.py > > > > *Objective: * > > To pass a file at a defined path on the VDU after the VDU is > > instantiated/launched. > > > > -- > > ~ Lokendra > > skype: lokendrarathour > > > > > > > > > > -- > > ~ Lokendra > > www.inertiaspeaks.com > > www.inertiagroups.com > > skype: lokendrarathour > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Danny.Webb at thehutgroup.com Tue Nov 16 18:33:02 2021 From: Danny.Webb at thehutgroup.com (Danny Webb) Date: Tue, 16 Nov 2021 18:33:02 +0000 Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc In-Reply-To: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> References: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> Message-ID: Hi All, Apologies, I totally missed this meeting. My company interested in taking up the backend QOS implementation (https://review.opendev.org/c/openstack/cinder/+/762794) of rbd driver and moving it towards completion. Would anyone be available to walk me through what's needed to finalise this? I can jump onto IRC in whichever openstack channel is required (bearing in mind I'm in GMT). Cheers, Danny ________________________________ From: Brian Rosmaita Sent: 04 November 2021 19:19 To: openstack-discuss at lists.openstack.org Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc CAUTION: This email originates from outside THG By popular demand (really!), I'm scheduling a RBD driver review festival for next week. It's a community driver, and we've got a backlog of patches: https://review.opendev.org/q/project:openstack/cinder+status:open+file:cinder/volume/drivers/rbd.py If your patch is currently in merge conflict, it would be helpful if you could get conflicts resolved before the festival. 
Also, if you have questions about comments that have been left on your patch, this would be a good time to get them answered. who: Everyone! what: The Cinder Festival of RBD Driver Reviews when: Thursday 11 November 2021 from 1500-1600 UTC where: https://meet.google.com/fsb-qkfc-qun etherpad: https://etherpad.opendev.org/p/cinder-festival-of-driver-reviews (Note that we're trying google meet for this session.) Danny Webb Senior Linux Systems Administrator The Hut Group Tel: Email: Danny.Webb at thehutgroup.com For the purposes of this email, the "company" means The Hut Group Limited, a company registered in England and Wales (company number 6539496) whose registered office is at Fifth Floor, Voyager House, Chicago Avenue, Manchester Airport, M90 3DQ and/or any of its respective subsidiaries. Confidentiality Notice This e-mail is confidential and intended for the use of the named recipient only. If you are not the intended recipient please notify us by telephone immediately on +44(0)1606 811888 or return it to us by e-mail. Please then delete it from your system and note that any use, dissemination, forwarding, printing or copying is strictly prohibited. Any views or opinions are solely those of the author and do not necessarily represent those of the company. Encryptions and Viruses Please note that this e-mail and any attachments have not been encrypted. They may therefore be liable to be compromised. Please also note that it is your responsibility to scan this e-mail and any attachments for viruses. We do not, to the extent permitted by law, accept any liability (whether in contract, negligence or otherwise) for any virus infection and/or external compromise of security and/or confidentiality in relation to transmissions sent by e-mail. Monitoring Activity and use of the company's systems is monitored to secure its effective use and operation and for other lawful business purposes. Communications using these systems will also be monitored and may be recorded to secure effective use and operation and for other lawful business purposes. hgvyjuv -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Tue Nov 16 18:38:01 2021 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 16 Nov 2021 12:38:01 -0600 Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc In-Reply-To: References: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> Message-ID: <33e02f27-fadd-5c07-a697-b66621dbd03d@gmail.com> On 11/16/2021 12:33 PM, Danny Webb wrote: > Hi All, > > Apologies, I totally missed this meeting.? My company interested in > taking up the backend QOS implementation > (https://review.opendev.org/c/openstack/cinder/+/762794) of rbd driver > and moving it towards completion.? Would anyone be available to walk > me through what's needed to finalise this?? I can jump onto IRC in > whichever openstack channel is required (bearing in mind I'm in GMT). Danny, Best approach would be to join the #openstack-cinder channel on oftc.net and ask questions there.? Also, an opportunity to get help would be by bringing this up in the weekly Cinder meeting [1] . Hope this information helps! 
[1] https://wiki.openstack.org/wiki/CinderMeetings > > Cheers, > Danny > ------------------------------------------------------------------------ > *From:* Brian Rosmaita > *Sent:* 04 November 2021 19:19 > *To:* openstack-discuss at lists.openstack.org > > *Subject:* [cinder][rbd] festival of RBD driver reviews 11 november > 1500 utc > CAUTION: This email originates from outside THG > > By popular demand (really!), I'm scheduling a RBD driver review festival > for next week. It's a community driver, and we've got a backlog of > patches: > > https://review.opendev.org/q/project:openstack/cinder+status:open+file:cinder/volume/drivers/rbd.py > > If your patch is currently in merge conflict, it would be helpful if you > could get conflicts resolved before the festival. Also, if you have > questions about comments that have been left on your patch, this would > be a good time to get them answered. > > who: Everyone! > what: The Cinder Festival of RBD Driver Reviews > when: Thursday 11 November 2021 from 1500-1600 UTC > where: https://meet.google.com/fsb-qkfc-qun > etherpad: https://etherpad.opendev.org/p/cinder-festival-of-driver-reviews > > (Note that we're trying google meet for this session.) > > Danny Webb > Senior Linux Systems Administrator > The Hut Group > > Tel: > Email: Danny.Webb at thehutgroup.com > > > For the purposes of this email, the "company" means The Hut Group > Limited, a company registered in England and Wales (company number > 6539496) whose registered office is at Fifth Floor, Voyager House, > Chicago Avenue, Manchester Airport, M90 3DQ and/or any of its > respective subsidiaries. > > *Confidentiality Notice* > This e-mail is confidential and intended for the use of the named > recipient only. If you are not the intended recipient please notify us > by telephone immediately on +44(0)1606 811888 or return it to us by > e-mail. Please then delete it from your system and note that any use, > dissemination, forwarding, printing or copying is strictly prohibited. > Any views or opinions are solely those of the author and do not > necessarily represent those of the company. > > *Encryptions and Viruses* > Please note that this e-mail and any attachments have not been > encrypted. They may therefore be liable to be compromised. Please also > note that it is your responsibility to scan this e-mail and any > attachments for viruses. We do not, to the extent permitted by law, > accept any liability (whether in contract, negligence or otherwise) > for any virus infection and/or external compromise of security and/or > confidentiality in relation to transmissions sent by e-mail. > > *Monitoring* > Activity and use of the company's systems is monitored to secure its > effective use and operation and for other lawful business purposes. > Communications using these systems will also be monitored and may be > recorded to secure effective use and operation and for other lawful > business purposes. > > hgvyjuv -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmeng at uvic.ca Tue Nov 16 17:55:32 2021 From: dmeng at uvic.ca (dmeng) Date: Tue, 16 Nov 2021 09:55:32 -0800 Subject: [sdk]: Check instance error message Message-ID: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> Hello there, Hope everything is going well. I'm wondering if there is any method that could check the error message of an instance whose status is "ERROR"? 
Like from openstack cli, "openstack server show server_name", if the server is in "ERROR" status, this will return a field "fault" with a message shows the error. I tried the compute service get_server and find_server, but neither of them show the error messages of an instance. Thanks and have a great day! Catherine -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Nov 16 19:13:31 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 16 Nov 2021 20:13:31 +0100 Subject: [kolla] Proposing Michal Arbet as Kolla core In-Reply-To: References: Message-ID: +1 On Tue, 16 Nov 2021 at 13:50, Micha? Nasiadka wrote: > > Hi, > > I would like to propose adding Michal Arbet (kevko) to the kolla-core and kolla-ansible-core groups. > Michal did some great work on ProxySQL, is a consistent maintainer of Debian related images and has provided some helpful > reviews. > > Cores - please reply +1/-1 before the end of Friday 26th November. > > Thanks, > Michal From rosmaita.fossdev at gmail.com Tue Nov 16 22:24:39 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 16 Nov 2021 17:24:39 -0500 Subject: [cinder][rbd] festival of RBD driver reviews 11 november 1500 utc In-Reply-To: <33e02f27-fadd-5c07-a697-b66621dbd03d@gmail.com> References: <9e8a69c6-dff6-e320-e618-2f828fc95861@gmail.com> <33e02f27-fadd-5c07-a697-b66621dbd03d@gmail.com> Message-ID: <0cbc3047-71b9-89fc-ff1d-1f51642c65a9@gmail.com> On 11/16/21 1:38 PM, Jay Bryant wrote: > > On 11/16/2021 12:33 PM, Danny Webb wrote: >> Hi All, >> >> Apologies, I totally missed this meeting.? My company interested in >> taking up the backend QOS implementation >> (https://review.opendev.org/c/openstack/cinder/+/762794) of rbd driver >> and moving it towards completion.? Would anyone be available to walk >> me through what's needed to finalise this?? I can jump onto IRC in >> whichever openstack channel is required (bearing in mind I'm in GMT). > > Danny, > > Best approach would be to join the #openstack-cinder channel on oftc.net > and ask questions there.? Also, an opportunity to get help would be by > bringing this up in the weekly Cinder meeting [1] . > > > Hope this information helps! > > > [1] https://wiki.openstack.org/wiki/CinderMeetings > In addition to what Jay said, the history on that patch indicates that the cinder team wanted a spec outlining the design of the feature [2]. You can find info about cinder specs here: https://docs.openstack.org/cinder/latest/contributor/contributing.html#new-feature-planning [2] https://review.opendev.org/c/openstack/cinder/+/762794/3#message-c3cacc61d6a5e7b1229b64e2d72b5b2a2c68404d >> >> Cheers, >> Danny >> ------------------------------------------------------------------------ >> *From:* Brian Rosmaita >> *Sent:* 04 November 2021 19:19 >> *To:* openstack-discuss at lists.openstack.org >> >> *Subject:* [cinder][rbd] festival of RBD driver reviews 11 november >> 1500 utc >> CAUTION: This email originates from outside THG >> >> By popular demand (really!), I'm scheduling a RBD driver review festival >> for next week. It's a community driver, and we've got a backlog of >> patches: >> >> https://review.opendev.org/q/project:openstack/cinder+status:open+file:cinder/volume/drivers/rbd.py >> >> If your patch is currently in merge conflict, it would be helpful if you >> could get conflicts resolved before the festival. 
Also, if you have >> questions about comments that have been left on your patch, this would >> be a good time to get them answered. >> >> who: Everyone! >> what: The Cinder Festival of RBD Driver Reviews >> when: Thursday 11 November 2021 from 1500-1600 UTC >> where: https://meet.google.com/fsb-qkfc-qun >> etherpad: https://etherpad.opendev.org/p/cinder-festival-of-driver-reviews >> >> (Note that we're trying google meet for this session.) >> >> Danny Webb >> Senior Linux Systems Administrator >> The Hut Group >> >> Tel: >> Email: Danny.Webb at thehutgroup.com >> >> >> For the purposes of this email, the "company" means The Hut Group >> Limited, a company registered in England and Wales (company number >> 6539496) whose registered office is at Fifth Floor, Voyager House, >> Chicago Avenue, Manchester Airport, M90 3DQ and/or any of its >> respective subsidiaries. >> >> *Confidentiality Notice* >> This e-mail is confidential and intended for the use of the named >> recipient only. If you are not the intended recipient please notify us >> by telephone immediately on +44(0)1606 811888 or return it to us by >> e-mail. Please then delete it from your system and note that any use, >> dissemination, forwarding, printing or copying is strictly prohibited. >> Any views or opinions are solely those of the author and do not >> necessarily represent those of the company. >> >> *Encryptions and Viruses* >> Please note that this e-mail and any attachments have not been >> encrypted. They may therefore be liable to be compromised. Please also >> note that it is your responsibility to scan this e-mail and any >> attachments for viruses. We do not, to the extent permitted by law, >> accept any liability (whether in contract, negligence or otherwise) >> for any virus infection and/or external compromise of security and/or >> confidentiality in relation to transmissions sent by e-mail. >> >> *Monitoring* >> Activity and use of the company's systems is monitored to secure its >> effective use and operation and for other lawful business purposes. >> Communications using these systems will also be monitored and may be >> recorded to secure effective use and operation and for other lawful >> business purposes. >> >> hgvyjuv From skaplons at redhat.com Wed Nov 17 07:26:14 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 17 Nov 2021 08:26:14 +0100 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> Message-ID: <5517904.DvuYhMxLoT@p1> Hi, I just checked neutron related things there and it seems there are 2 major issues there: 1. move of the tap-as-a-service from x/ to openstack/ namespace (that affects networking-midonet) - I will propose patch for that today. 2. remove of the neutron-lbaas repo (that affects much more than only neutron repos - for that I will try to propose patches this week as well. On wtorek, 16 listopada 2021 17:58:44 CET Clark Boylan wrote: > Hello, > > The OpenStack tenant in Zuul currently has 134 config errors. You can find > these errors at https://zuul.opendev.org/t/openstack/config-errors or by > clicking the blue bell icon in the top right of > https://zuul.opendev.org/t/openstack/status. The vast majority of these > errors appear related to project renames that have been requested of OpenDev > or project retirements. 
Can you please look into fixing these as they can be > an attractive nuisance when debugging Zuul problems (they also indicate that > a number of your jobs are probably not working). > > Project renames creating issues: > > * openstack/python-tempestconf -> osf/python-tempestconf -> > openinfra/python-tempestconf * openstack/refstack -> osf/refstack -> > openinfra/refstack > * x/tap-as-a-service -> openstack/tap-as-a-service > * openstack/networking-l2gw -> x/networking-l2gw > > Project retirements creating issues: > > * openstack/neutron-lbaas > * recordsansible/ara > > Projects whose configs have errors: > > * openinfra/python-tempestconf > * openstack/heat > * openstack/ironic > * openstack/kolla-ansible > * openstack/kuryr-kubernetes > * openstack/murano-apps > * openstack/networking-midonet > * openstack/networking-odl > * openstack/neutron > * openstack/neutron-fwaas > * openstack/python-troveclient > * openstack/senlin > * openstack/tap-as-a-service > * openstack/zaqar > * x/vmware-nsx > * openinfra/openstackid > * openstack/barbican > * openstack/cookbook-openstack-application-catalog > * openstack/heat-dashboard > * openstack/manila-ui > * openstack/python-manilaclient > > Let us know if we can help decipher any errors, > Clark -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From skaplons at redhat.com Wed Nov 17 07:36:08 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 17 Nov 2021 08:36:08 +0100 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5517904.DvuYhMxLoT@p1> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> Message-ID: <4359750.LvFx2qVVIh@p1> Hi, On ?roda, 17 listopada 2021 08:26:14 CET Slawek Kaplonski wrote: > Hi, > > I just checked neutron related things there and it seems there are 2 major > issues there: > 1. move of the tap-as-a-service from x/ to openstack/ namespace (that affects > networking-midonet) - I will propose patch for that today. > 2. remove of the neutron-lbaas repo (that affects much more than only neutron > repos - for that I will try to propose patches this week as well. There are also some missing job definitions in some of the neutron related repos and also issues with missing openstack/networking-l2gw project. I will take a look into all those issues in next days. > > On wtorek, 16 listopada 2021 17:58:44 CET Clark Boylan wrote: > > Hello, > > > > The OpenStack tenant in Zuul currently has 134 config errors. You can find > > these errors at https://zuul.opendev.org/t/openstack/config-errors or by > > clicking the blue bell icon in the top right of > > https://zuul.opendev.org/t/openstack/status. The vast majority of these > > errors appear related to project renames that have been requested of OpenDev > > or project retirements. Can you please look into fixing these as they can be > > an attractive nuisance when debugging Zuul problems (they also indicate that > > a number of your jobs are probably not working). 
> > > > Project renames creating issues: > > * openstack/python-tempestconf -> osf/python-tempestconf -> > > > > openinfra/python-tempestconf * openstack/refstack -> osf/refstack -> > > openinfra/refstack > > > > * x/tap-as-a-service -> openstack/tap-as-a-service > > * openstack/networking-l2gw -> x/networking-l2gw > > > > Project retirements creating issues: > > * openstack/neutron-lbaas > > * recordsansible/ara > > > > Projects whose configs have errors: > > * openinfra/python-tempestconf > > * openstack/heat > > * openstack/ironic > > * openstack/kolla-ansible > > * openstack/kuryr-kubernetes > > * openstack/murano-apps > > * openstack/networking-midonet > > * openstack/networking-odl > > * openstack/neutron > > * openstack/neutron-fwaas > > * openstack/python-troveclient > > * openstack/senlin > > * openstack/tap-as-a-service > > * openstack/zaqar > > * x/vmware-nsx > > * openinfra/openstackid > > * openstack/barbican > > * openstack/cookbook-openstack-application-catalog > > * openstack/heat-dashboard > > * openstack/manila-ui > > * openstack/python-manilaclient > > > > Let us know if we can help decipher any errors, > > Clark -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From tjoen at dds.nl Wed Nov 17 07:49:47 2021 From: tjoen at dds.nl (tjoen) Date: Wed, 17 Nov 2021 08:49:47 +0100 Subject: [sdk]: Check instance error message In-Reply-To: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> References: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> Message-ID: <8910d566-b5b7-de5e-0a98-36cd464e0d1f@dds.nl> On 11/16/21 18:55, dmeng wrote: > I'm wondering if there is any method that could check the error message > of an instance whose status is "ERROR"? Like from openstack cli, > "openstack server show server_name", if the server is in "ERROR" status, > this will return a field "fault" with a message shows the error. I tried > the compute service get_server and find_server, but neither of them show > the error messages of an instance. Haven't I answered this a week ago? Look in the archives From katonalala at gmail.com Wed Nov 17 07:52:47 2021 From: katonalala at gmail.com (Lajos Katona) Date: Wed, 17 Nov 2021 08:52:47 +0100 Subject: [edge][neutron][all] Edge related documentation Message-ID: Hi, During Yoga PTG the Neutron team had a very useful discussion together with some Designate folks, see the etherpad [1]. >From Neutron team's perspective the task is to document how to set up an edge site with AZ, DNS.... etc. This is a cross project effort (even if just do it for Neutron-Designate today), is there a place for such documentation? [1]: https://etherpad.opendev.org/p/octavia-designate-neutron-ptg#L16 -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.vedel at univ-grenoble-alpes.fr Wed Nov 17 07:59:05 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Wed, 17 Nov 2021 08:59:05 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure Message-ID: Hello everyone I have a strange problem and I haven't found the solution yet. Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... 
well almost. I am trying to create a new instance to check general operation. ERROR. Okay, I look in the logs and I see that Cinder is having problems creating volumes, with an error I had never seen before ("TypeError: 'NoneType' object is not iterable"). I dig further and wonder whether the Glance images can no longer be used, even though they are still present (openstack image list is OK). I create an empty volume: it works. I create a volume from an image: it fails. However, I still have my list of ten images in Glance. I create a new image and create a volume from this new image: it works. I create an instance with this new image: OK.
What is the problem? The images present before the "reconfigure" are listed, and visible in Horizon for example, but unusable. Is there a way to fix this, or do we have to re-upload them all?
Thanks in advance for your help if this problem speaks to you.

Franck VEDEL
Dép. Réseaux Informatiques & Télécoms
IUT1 - Univ GRENOBLE Alpes
0476824462
Stages, Alternance, Emploi. http://www.rtgrenoble.fr
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From skaplons at redhat.com  Wed Nov 17 08:13:34 2021
From: skaplons at redhat.com (Slawek Kaplonski)
Date: Wed, 17 Nov 2021 09:13:34 +0100
Subject: [neutron][CI] How to reduce number of rechecks - brainstorming
Message-ID: <2165480.iZASKD2KPV@p1>

Hi,

Recently I spent some time checking how many rechecks we need in Neutron to get a patch merged, and I compared it to some other OpenStack projects (see [1] for details).
TL;DR - the results aren't good for us and I think we really need to do something about that.

Of course the "easiest" thing to say is that we should fix the issues which we are hitting in the CI to make the jobs more stable. But it's not that easy. We have been struggling with those jobs for a very long time. We have a CI-related meeting every week and we are fixing what we can there.
Unfortunately there is still a bunch of issues which we can't fix so far because they are intermittent and hard to reproduce locally, or in some cases the issues aren't really related to Neutron, or there are new bugs which we need to investigate and fix :)
So this is a never-ending battle for us. The problem is that we have to test various backends, drivers, etc., so as a result we have many jobs running on each patch - excluding UT, pep8 and docs jobs we have around 19 jobs in the check queue and 14 jobs in the gate queue.

In the past we made a lot of improvements, e.g. we improved the irrelevant-files lists for jobs to run fewer jobs on some of the patches, together with the QA team we created the "integrated-networking" template to run only Neutron and Nova related scenario tests in the Neutron queues, and we removed and consolidated some of the jobs (there is still one patch in progress for that, but it should just remove around 2 jobs from the check queue). All of those are good improvements, but still not enough to make our CI really stable :/

Because of all of that, I would like to ask the community about any other ideas for how we can improve that. If You have any ideas, please send them in this email thread or reach out to me directly on irc.
We want to discuss them in the next video CI meeting, which will be on November 30th. If You have any idea and would like to join that discussion, You are more than welcome in that meeting of course :)

[1] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025759.html

-- 
Slawek Kaplonski
Principal Software Engineer
Red Hat
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: This is a digitally signed message part.
URL: 

From zigo at debian.org  Wed Nov 17 09:22:51 2021
From: zigo at debian.org (Thomas Goirand)
Date: Wed, 17 Nov 2021 10:22:51 +0100
Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects
Message-ID: 

Hi,

About a year and a half ago, I attempted to add /healthcheck support by default in all projects. For Nova, this resulted in this patch:

https://review.opendev.org/c/openstack/nova/+/724684

For other projects, it's been merged almost everywhere (I'd have to survey all projects to see if that's the case, or if I still have Debian-specific patches somewhere).

For Nova, though, this sparked a discussion in which it was said that the current implementation of /healthcheck wasn't good enough. This resulted in threads about how to do it better.

Unfortunately, this blocked my patch from being merged in Nova.

From my point of view, we have to recognize a failure here. The /healthcheck URL was added in oslo.middleware so one can use it with something like haproxy to verify that the API is up and responds. It was never designed to check, for example, whether nova-api has valid connectivity to MySQL and RabbitMQ. Yes, that would be welcome, but in the meantime, operators must tweak the default file to have a valid, usable /etc/nova/api-paste.ini.

So I am hereby asking the nova team:

Can we please move forward and agree that 1.5 years waiting for such a minor patch is too long, and that such a patch should be approved, prior to having a better healthcheck mechanism? I don't think it's a good idea to ask Nova users to wait potentially more development cycles to have a good-by-default api-paste.ini file.

At the same time, I am wondering: is anyone even working on a better healthcheck system? I haven't heard that anyone is working on this. Though it would be more than welcome. Currently, to check that a daemon is alive and well, operators are stuck with:

- checking with ss if the daemon is correctly connected to a given port
- checking the logs for rabbitmq and mysql errors (with something like filebeat + elastic search and alarming)

Clearly, this doesn't scale. When running many large OpenStack clusters, it is not trivial to have a monitoring system that works and scales. The effort to deploy such a monitoring system is also not trivial at all. So what was discussed at the time for improving the monitoring would be very much welcome, though not only for the API service: something to check the health of the other daemons would be very much welcome too.

I'd very much like to participate in a Yoga effort to improve the current situation, and contribute the best I can, though I'm not sure I'd be the best person to drive this... Is there anyone else willing to work on this?

Hoping this message is helpful,
Cheers,

Thomas Goirand (zigo)

From balazs.gibizer at est.tech  Wed Nov 17 10:18:03 2021
From: balazs.gibizer at est.tech (Balazs Gibizer)
Date: Wed, 17 Nov 2021 11:18:03 +0100
Subject: [neutron][CI] How to reduce number of rechecks - brainstorming
In-Reply-To: <2165480.iZASKD2KPV@p1>
References: <2165480.iZASKD2KPV@p1>
Message-ID: <3MOP2R.O83SZVO0NWN23@est.tech>

On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski wrote:
> Hi,
>
> Recently I spent some time checking how many rechecks we need in Neutron to get a patch merged, and I compared it to some other OpenStack projects (see [1] for details).
> TL;DR - the results aren't good for us and I think we really need to do something about that.

I really like the idea of collecting such stats. Thank you for doing it. I can even imagine making a public dashboard somewhere with this information, as it is a good indication of the health of our projects / testing.

> Of course the "easiest" thing to say is that we should fix the issues which we are hitting in the CI to make the jobs more stable. But it's not that easy. We have been struggling with those jobs for a very long time. We have a CI-related meeting every week and we are fixing what we can there.
> Unfortunately there is still a bunch of issues which we can't fix so far because they are intermittent and hard to reproduce locally, or in some cases the issues aren't really related to Neutron, or there are new bugs which we need to investigate and fix :)

I have a couple of suggestions based on my experience working with CI in nova.

1) we try to open bug reports for intermittent gate failures too and keep them tagged in a list [1], so when a job fails it is easy to check whether the bug is already known.

2) I offer my help here: if you see something in neutron runs that feels non-neutron-specific, then ping me with it. Maybe we are struggling with the same problem too.

3) there was informal discussion before about the possibility to re-run only some jobs with a recheck instead of re-running the whole set. I don't know if this is feasible with Zuul, and I think it only treats the symptom, not the root cause. But still, this could be a direction if all else fails.

Cheers,
gibi

> So this is a never-ending battle for us. The problem is that we have to test various backends, drivers, etc., so as a result we have many jobs running on each patch - excluding UT, pep8 and docs jobs we have around 19 jobs in the check queue and 14 jobs in the gate queue.
>
> In the past we made a lot of improvements, e.g. we improved the irrelevant-files lists for jobs to run fewer jobs on some of the patches, together with the QA team we created the "integrated-networking" template to run only Neutron and Nova related scenario tests in the Neutron queues, and we removed and consolidated some of the jobs (there is still one patch in progress for that, but it should just remove around 2 jobs from the check queue). All of those are good improvements, but still not enough to make our CI really stable :/
>
> Because of all of that, I would like to ask the community about any other ideas for how we can improve that. If You have any ideas, please send them in this email thread or reach out to me directly on irc.
> We want to discuss them in the next video CI meeting, which will be on November 30th. If You have any idea and would like to join that discussion, You are more than welcome in that meeting of course :)
>
> [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025759.html

[1] https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_last_updated&start=0

>
> --
> Slawek Kaplonski
> Principal Software Engineer
> Red Hat

From arnaud.morin at gmail.com  Wed Nov 17 10:54:58 2021
From: arnaud.morin at gmail.com (Arnaud Morin)
Date: Wed, 17 Nov 2021 10:54:58 +0000
Subject: [nova] weird python stacktrace in nova-compute
Message-ID: 

Hey all,

We have some python stacktraces in our nova-compute journalctl logs, which look like this:

$ journalctl -u nova-compute
...
Nov 01 09:04:03 host1234 python3[161354]: Exception in thread tpool_thread_6: Nov 01 09:04:03 host1234 python3[161354]: Traceback (most recent call last): Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/threading.py", line 916, in _bootstrap_inner Nov 01 09:04:03 host1234 python3[161354]: self.run() Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/threading.py", line 864, in run Nov 01 09:04:03 host1234 python3[161354]: self._target(*self._args, **self._kwargs) Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/site-packages/eventlet/tpool.py", line 96, in tworker Nov 01 09:04:03 host1234 python3[161354]: _wsock.sendall(_bytetosend) Nov 01 09:04:03 host1234 python3[161354]: TimeoutError: [Errno 110] Connection timed out Nov 01 09:04:03 host1234 python3[161354]: Traceback (most recent call last): Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/site-packages/eventlet/hubs/poll.py", line 109, in wait Nov 01 09:04:03 host1234 python3[161354]: listener.cb(fileno) Nov 01 09:04:03 host1234 python3[161354]: File "/opt/openstack/nova/lib/python3.6/site-packages/eventlet/tpool.py", line 58, in tpool_trampoline Nov 01 09:04:03 host1234 python3[161354]: assert _c Nov 01 09:04:03 host1234 python3[161354]: AssertionError Nov 01 09:04:03 host1234 python3[161354]: Removing descriptor: 16 ... After this, nova-compute is "stuck" and does nothing more, but still continue to answer on RPC (it is still up in nova services) We am not experts of python threading / eventlet stuff, and we have no idea how to debug this. Our current solution is to restart nova-compute, but it's more a dirty workaround than a real fix. Does it ring a bell to someone in the community? Cheers, Arnaud From pierre at stackhpc.com Wed Nov 17 11:07:58 2021 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 17 Nov 2021 12:07:58 +0100 Subject: [all] git-review broken by git version 2.34.0 Message-ID: Hello, I recently upgraded to the latest git release 2.34.0 and noticed it breaks git-review (output below is with -v): Errors running git rebase -p -i remotes/gerrit/stable/xena fatal: --preserve-merges was replaced by --rebase-merges I submitted a patch [1], but since it would break compatibility with git versions older than 2.18 that don't support the --rebase-merges (-r) option, it may need to be refined before being merged. Cheers, Pierre [1] https://review.opendev.org/c/opendev/git-review/+/818219 From senrique at redhat.com Wed Nov 17 11:12:42 2021 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 17 Nov 2021 08:12:42 -0300 Subject: [cinder] Bug deputy report for week of 11-17-2021 Message-ID: No meeting today :( This is a bug report from 11-10-2021 to 11-17-2021. Agenda: https://etherpad.opendev.org/p/cinder-bug-squad-meeting ----------------------------------------------------------------------------------------- High - https://bugs.launchpad.net/cinder/+bug/1950474 "Xena accept transfer policy breaks volume transfer workflow". Unassigned. Medium - https://bugs.launchpad.net/cinder/+bug/1950291 "tempest-integrated-compute-centos-8-stream fails with version conflict in boto3". Unassigned. - https://bugs.launchpad.net/cinder/+bug/1951163 "Unable to import/manage an encrypted volume". Unassigned. - https://bugs.launchpad.net/cinder/+bug/1951046 "DS8000 driver terminates volume connection when there still has volume attached to instances - Ussuri". Unassigned. 
Cheers, -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Nov 17 13:13:57 2021 From: smooney at redhat.com (Sean Mooney) Date: Wed, 17 Nov 2021 13:13:57 +0000 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: Message-ID: On Wed, 2021-11-17 at 10:22 +0100, Thomas Goirand wrote: > Hi, > > About a year and a half ago, I attempted to add /healthcheck support by > default in all projects. For Nova, this resulted in this patch: > > https://review.opendev.org/c/openstack/nova/+/724684 > > For other projects, it's been merged almost everywhere (I'd have to > survey all project to see if that's the case, or if I still have debian > specific patches somewhere). > > Though for Nova, this sparked a discussion where it's been said that the > current implementation of /healthcheck wasn't good enough. This resulted > in threads about how to better do it. > > Unfortunately, this blocked my patch from being merged in Nova. > > It is my point of view to recognize a failure here. The /healthcheck URL > was added in oslo.middleware so one can use it with something like > haproxy to verify that the API is up, and responds. It was never > designed to check, for example, if nova-api has a valid connectivity to > MySQL and RabbitMQ. Yes, this is welcome, but in the mean time, > operators must tweak the default file to have a valid, useable > /etc/nova/api-paste.ini. > > So I am hereby asking the nova team: > > Can we please move forward and agree that 1.5 years waiting for such a > minor patch is too long, and that such patch should be approved, prior > to having a better healtcheck mechanism? I don't think it's a good idea > to ask Nova users to wait potentially more development cycles to have a > good-by-default api-paste.ini file. i am currently wokring on an alternitive solution for this cycle. i still belive it woudl be incorrect to add teh healtcheck provided by oslo.middelware to nova. we disucssed this at the ptg this cycel and still did nto think it was the correct way to approch this but we did agree to work on adding an alternitive form of health checks this cycle. i fundementally belive bad healthchecks are worse then no helatch checks and the olso midelware provides bad healthchecks. since the /healthcheck denpoint can be added via api-paste.ini manually i dont think we shoudl add it to our default or that packageagre shoudl either. one open question in my draft spec is for the nova api in particaly should we support /healtcheck on the normal api port instead of the dedeicated health check endpoint. > > At the same time, I am wondering: is anyone even working on a better > healthcheck system? I haven't heard that anyone is working on this. yes so i need to push the spec for review ill see if i can do that today or at a minium this week. the tldr is as follows. nova will be extended with 2 addtional options to allow a health checks endpoint to be exposed on a tcp port and/or a unix socket. these heatlth check endpoints will not be authenticated will be disabel by default. all nova binaries (nova-api, nova-schduler, nova-compute, ...) will supprot exposing the endpoint. 
the process will internally update a heathcheck data structure when ever they perform specific operation that can be uses as a proxy for the healt of the binary (db query, rpc ping, request to libvirt) these will be binary specific. The over all health will be summerised with a status enum, exact values to be determind but im working with (OK, DEGRADED, FAULT) for now. in the degraded and fault state there will also be a mesage and likely details filed in the respocne. message would be human readable with detail being the actual content of the health check data structure. i have not decided if i should use http status codes as part of the way to singal the status, my instinct are saying no parsing the json reponce shoudl be simple and if you just need to check the status filed for ok|degreated|falut using a 5XX error code in the degraded of fault case would not be semanticly correct. the current set of usecases i am using to drive the desting of the spec are as follows. Use Cases --------- As a operator i want a simple health-check i can consume to know if a nova process is OK, Degraded or Faulty. As an operator i want this health-check to not impact performance of the service so it can be queried frequently at short intervals. As a deployment tool implementer i want the health check to be local with no dependencies on other hosts or services to function so i can integrate it with service managers such as systemd or container runtime like docker As a packager i would like health-check to not require special client or packages consume them. CURL, socat or netcat should be all that is required to connect to the health check and retrieve the service status. As an operator i would like to be able to use health-check of the nova api and metadata services to manage the membership of endpoints in my load-balancer or reverse proxy automatically. > Though it would be more than welcome. Currently, to check that a daemon > is alive and well, operators are stuck with: > > - checking with ss if the daemon is correctly connected to a given port > - check the logs for rabbitmq and mysql errors (with something like > filebeat + elastic search and alarming) > > Clearly, this doesn't scale. When running many large OpenStack clusters, > it is not trivial to have a monitoring system that works and scales. The > effort to deploy such a monitoring system is also not trivial at all. So > what's been discussed at the time for improving the monitoring would be > very much welcome, though not only for the API service: something to > check the health of other daemons would be very much welcome. > > I'd very much would like to participate in a Yoga effort to improve the > current situation, and contribute the best I can, though I'm not sure > I'd be the best person to drive this... Is there anyone else willing to > work on this? yep i am feel free to ping me on irc: sean-k-mooney incase your wondering but we have talked before. i have not configured my defualt channels since the change to oftc but im alwasy in at least #openstack-nova after discussing this in the nova ptg session the design took a hard right turn from being based on a rpc like protocaol exposed over a unix socket with ovos as the data fromat and active probes to a http based endpoint, avaiable over tcp and or unix socket with json as the responce format and a semi global data stucutre with TTL for the data. as a result i have had to rethink and rework most of the draft spec i had prepared. 
The main point of design that we need to agree on is exactuly how that data stucture is accessed and wehre it is stored. in the orginal desing i proposed there was no need to store any kind of state and or modify existing functions to add healchecks. each nova service manager would just implemant a new healthcheck function that would be pass as a callback to the healtcheck manager which exposed the endpoint. With the new approch we will like add decorators to imporant functions that will update the healthchecks based on if that fucntion complete correctly. if we take the decorator because of how decorators work it can only access module level varables, class method/memeber or the parmaters to the function it is decorating. what that efffectivly means is either the health check manager need to be stored in a module level "global" variable, it need to be a signelton accessable via a class method or it need to be stored in a data stucure that is passed to almost ever funciton speicifcally the context object. i am leaning towards the context object but i need to understand how that will interact with RPC calls so it might end up being a global/singelton which sucks form a unit/fucntional testing perspective but we can make it work via fixtures. hopefully this sould like good news to you but feel free to give feedback. > > Hoping this message is helpful, > Cheers, > > Thomas Goirand (zigo) > From cboylan at sapwetik.org Wed Nov 17 14:48:42 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 17 Nov 2021 06:48:42 -0800 Subject: [all] git-review broken by git version 2.34.0 In-Reply-To: References: Message-ID: On Wed, Nov 17, 2021, at 3:07 AM, Pierre Riteau wrote: > Hello, > > I recently upgraded to the latest git release 2.34.0 and noticed it > breaks git-review (output below is with -v): > > Errors running git rebase -p -i remotes/gerrit/stable/xena > fatal: --preserve-merges was replaced by --rebase-merges > > I submitted a patch [1], but since it would break compatibility with > git versions older than 2.18 that don't support the --rebase-merges > (-r) option, it may need to be refined before being merged. Ubuntu Bionic and CentOS7 both have older git versions than 2.18. If it isn't terrible to continue to support older git that may be a good idea, but we can also suggest those installations pin their git-review to the current version instead. As a side note the man page for git rebase says that --preserve-merges and --interactive are incompatible options and yet they are both set in the failed command above. I wonder what sort of behavior we were getting out of git when this "worked". > > Cheers, > Pierre > > [1] https://review.opendev.org/c/opendev/git-review/+/818219 From cboylan at sapwetik.org Wed Nov 17 15:20:13 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 17 Nov 2021 07:20:13 -0800 Subject: [all] git-review broken by git version 2.34.0 In-Reply-To: References: Message-ID: On Wed, Nov 17, 2021, at 6:48 AM, Clark Boylan wrote: > On Wed, Nov 17, 2021, at 3:07 AM, Pierre Riteau wrote: >> Hello, >> >> I recently upgraded to the latest git release 2.34.0 and noticed it >> breaks git-review (output below is with -v): >> >> Errors running git rebase -p -i remotes/gerrit/stable/xena >> fatal: --preserve-merges was replaced by --rebase-merges >> >> I submitted a patch [1], but since it would break compatibility with >> git versions older than 2.18 that don't support the --rebase-merges >> (-r) option, it may need to be refined before being merged. 
> > Ubuntu Bionic and CentOS7 both have older git versions than 2.18. If it > isn't terrible to continue to support older git that may be a good > idea, but we can also suggest those installations pin their git-review > to the current version instead. I pushed a follwup, https://review.opendev.org/c/opendev/git-review/+/818238, that can be squashed into the parent if this works in testing. It attempts to check the version and set the flag appropriately. Clark From dmendiza at redhat.com Wed Nov 17 15:51:48 2021 From: dmendiza at redhat.com (Douglas Mendizabal) Date: Wed, 17 Nov 2021 09:51:48 -0600 Subject: [barbican] No weekly meeting next week Message-ID: <79495e55-da52-45a3-1d40-0e04fe4683ca@redhat.com> Hi Barbicaneers, I'll be out on PTO next week, so I'm canceling the Barbican weekly meeting for November 23. Meeting will resume the following week on November 30. Thanks, - Douglas Mendiz?bal (redrobot) From cboylan at sapwetik.org Wed Nov 17 15:51:57 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 17 Nov 2021 07:51:57 -0800 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3MOP2R.O83SZVO0NWN23@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> Message-ID: <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: > Snip. I want to respond to a specific suggestion: > 3) there was informal discussion before about a possibility to re-run > only some jobs with a recheck instead for re-running the whole set. I > don't know if this is feasible with Zuul and I think this only treat > the symptom not the root case. But still this could be a direction if > all else fails. > OpenStack has configured its check and gate queues with something we've called "clean check". This refers to the requirement that before an OpenStack project can be gated it must pass check tests first. This policy was instituted because a number of these infrequent but problematic issues were traced back to recheck spamming. Basically changes would show up and were broken. They would fail some percentage of the time. They got rechecked until they finally merged and now their failure rate is added to the whole. This rule was introduced to make it more difficult to get this flakyness into the gate. Locking in test results is in direct opposition to the existing policy and goals. Locking results would make it far more trivial to land such flakyness as you wouldn't need entire sets of jobs to pass before you could land. Instead you could rerun individual jobs until each one passed and then land the result. Potentially introducing significant flakyness with a single merge. Locking results is also not really something that fits well with the speculative gate queues that Zuul runs. Remember that Zuul constructs a future git state and tests that in parallel. Currently the state for OpenStack looks like: A - Nova ^ B - Glance ^ C - Neutron ^ D - Neutron ^ F - Neutron The B glance change is tested as if the A Nova change has already merged and so on down the queue. If we want to keep these speculative states we can't really have humans manually verify a failure can be ignored and retry it. Because we'd be enqueuing job builds at different stages of speculative state. Each job build would be testing a different version of the software. What we could do is implement a retry limit for failing jobs. Zuul could rerun failing jobs X times before giving up and reporting failure (this would require updates to Zuul). 
The problem with this approach is without some oversight it becomes very easy to land changes that make things worse. As a side note Zuul does do retries, but only for detected network errors or when a pre-run playbook fails. The assumption is that network failures are due to the dangers of the Internet, and that pre-run playbooks are small, self contained, unlikely to fail, and when they do fail the failure should be independent of what is being tested. Where does that leave us? I think it is worth considering the original goals of "clean check". We know that rechecking/rerunning only makes these problems worse in the long term. They represent technical debt. One of the reasons we run these tests is to show us when our software is broken. In the case of flaky results we are exposing this technical debt where it impacts the functionality of our software. The longer we avoid fixing these issues the worse it gets, and this is true even with "clean check". Do we as developers find value in knowing the software needs attention before it gets released to users? Do the users find value in running reliable software? In the past we have asserted that "yes, there is value in this", and have invested in tracking, investigating, and fixing these problems even if they happen infrequently. But that does require investment, and active maintenance. Clark From zigo at debian.org Wed Nov 17 16:03:20 2021 From: zigo at debian.org (Thomas Goirand) Date: Wed, 17 Nov 2021 17:03:20 +0100 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: Message-ID: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Hi Sean, thanks for your reply! On 11/17/21 2:13 PM, Sean Mooney wrote: > i am currently wokring on an alternitive solution for this cycle. gr8! > i still belive it woudl be incorrect to add teh healtcheck provided by oslo.middelware to nova. > we disucssed this at the ptg this cycel and still did nto think it was the correct way to approch this > but we did agree to work on adding an alternitive form of health checks this cycle. > i fundementally belive bad healthchecks are worse then no helatch checks and the olso midelware provides bad healthchecks. The current implementation is only useful for plugging haproxy to APIs, nothing more, nothing less. > since the /healthcheck denpoint can be added via api-paste.ini manually i dont think we shoudl add it to our > default or that packageagre shoudl either. Like it or not, the current state of things is: - /healthcheck is activated everywhere (I patched that myself) - The nova package at least in Debian has it activated by default (as this is the only project that refused the patch, I carry it in the package). Also, many operators already use the /healthcheck in production, so you really want to keep it. IMO, your implementation should switch to a different endpoint if you wish to not retain compatibility with the older system. For this reason, I strongly believe that the Nova team should be revising its view from a year and a half, and accept the imperfect currently implemented /healthcheck. This is not mutually exclusive to a better implementation bound on some other URL. > one open question in my draft spec is for the nova api in particaly should we support /healtcheck on the normal api port instead of > the dedeicated health check endpoint. You should absolutely not break backward compatibility!!! 
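For anyone who has not looked at what is actually shipped today: the existing endpoint is nothing more than the oslo.middleware healthcheck application mapped into the paste pipeline. The wiring in api-paste.ini looks roughly like this (a sketch only - the section names and the disable file path below are examples, not the exact packaging):

[composite:osapi_compute]
use = call:nova.api.openstack.urlmap:urlmap_factory
/healthcheck: healthcheck
# existing version/API mappings unchanged

[app:healthcheck]
paste.app_factory = oslo_middleware:Healthcheck.app_factory
backends = disable_by_file
disable_by_file_path = /etc/nova/healthcheck_disable

The disable_by_file backend is the part that matters for maintenance: touch the file and the endpoint starts answering 503 instead of 200, so a load balancer drains the node without the service being stopped.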
> yes so i need to push the spec for review ill see if i can do that today or at a minium this week. > the tldr is as follows. > > nova will be extended with 2 addtional options to allow a health checks endpoint to be exposed on a tcp port > and/or a unix socket. these heatlth check endpoints will not be authenticated will be disabel by default. > all nova binaries (nova-api, nova-schduler, nova-compute, ...) will supprot exposing the endpoint. > > the process will internally update a heathcheck data structure when ever they perform specific operation that > can be uses as a proxy for the healt of the binary (db query, rpc ping, request to libvirt) these will be binary specific. > > The over all health will be summerised with a status enum, exact values to be determind but im working with (OK, DEGRADED, FAULT) > for now. in the degraded and fault state there will also be a mesage and likely details filed in the respocne. > message would be human readable with detail being the actual content of the health check data structure. > > i have not decided if i should use http status codes as part of the way to singal the status, my instinct are saying no > parsing the json reponce shoudl be simple and if you just need to check the status filed for ok|degreated|falut using a 5XX error code > in the degraded of fault case would not be semanticly correct. All you wrote above is great. For the http status codes, please implement it, because it's cheap, and that's how Zabbix (and probably other monitoring systems) works, plus everyone understand them. > Use Cases > --------- > > As a operator i want a simple health-check i can consume to know > if a nova process is OK, Degraded or Faulty. > > As an operator i want this health-check to not impact performance of the > service so it can be queried frequently at short intervals. > > As a deployment tool implementer i want the health check to be local with no > dependencies on other hosts or services to function so i can integrate it with > service managers such as systemd or container runtime like docker > > As a packager i would like health-check to not require special client or > packages consume them. CURL, socat or netcat should be all that is required to > connect to the health check and retrieve the service status. > > As an operator i would like to be able to use health-check of the nova api and > metadata services to manage the membership of endpoints in my load-balancer > or reverse proxy automatically. > > >> Though it would be more than welcome. Currently, to check that a daemon >> is alive and well, operators are stuck with: >> >> - checking with ss if the daemon is correctly connected to a given port >> - check the logs for rabbitmq and mysql errors (with something like >> filebeat + elastic search and alarming) >> >> Clearly, this doesn't scale. When running many large OpenStack clusters, >> it is not trivial to have a monitoring system that works and scales. The >> effort to deploy such a monitoring system is also not trivial at all. So >> what's been discussed at the time for improving the monitoring would be >> very much welcome, though not only for the API service: something to >> check the health of other daemons would be very much welcome. >> >> I'd very much would like to participate in a Yoga effort to improve the >> current situation, and contribute the best I can, though I'm not sure >> I'd be the best person to drive this... Is there anyone else willing to >> work on this? 
> > yep i am feel free to ping me on irc: sean-k-mooney incase your wondering but we have talked before. Yes. Feel free to ping me as well, I'll enjoy contributing were I can (though I know you're more skilled than I do in OpenStack's Python code... I'll still do what I can). > i have not configured my defualt channels since the change to oftc but im alwasy in at least #openstack-nova > after discussing this in the nova ptg session the design took a hard right turn from being based on a rpc like protocaol > exposed over a unix socket with ovos as the data fromat and active probes to a http based endpoint, avaiable over tcp and or unix socket > with json as the responce format and a semi global data stucutre with TTL for the data. > > as a result i have had to rethink and rework most of the draft spec i had prepared. > The main point of design that we need to agree on is exactuly how that data stucture is accessed and wehre it is stored. > > in the orginal desing i proposed there was no need to store any kind of state and or modify existing functions to add healchecks. > each nova service manager would just implemant a new healthcheck function that would be pass as a callback to the healtcheck manager which exposed the endpoint. > > With the new approch we will like add decorators to imporant functions that will update the healthchecks based on if that fucntion complete correctly. > if we take the decorator because of how decorators work it can only access module level varables, class method/memeber or the parmaters to the function it is decorating. > what that efffectivly means is either the health check manager need to be stored in a module level "global" variable, it need to be a signelton accessable via a class method > or it need to be stored in a data stucure that is passed to almost ever funciton speicifcally the context object. > > i am leaning towards the context object but i need to understand how that will interact with RPC calls so it might end up being a global/singelton which sucks form a unit/fucntional testing > perspective but we can make it work via fixtures. > > hopefully this sould like good news to you but feel free to give feedback. I don't like the fact that we're still having the discussion 1.5 years after the proposed patch, and that still delays having Nova following what all the other projects have approved. Again, what you're doing should not be mutually exclusive with adding what already works, and what is already in production. It's been said a year and a half ago, and it's still truth. A year and a half ago, we even discuss the fact it would be a shame if it took more than a year... So can we move forward? Anyways, I'm excited that this goes forward, so thanks again for leading this initiative. Cheers, Thomas Goirand (zigo) From yasufum.o at gmail.com Wed Nov 17 16:26:22 2021 From: yasufum.o at gmail.com (Yasufumi Ogawa) Date: Thu, 18 Nov 2021 01:26:22 +0900 Subject: [tacker] Skip next IRC meeting Message-ID: <5846722b-5675-f949-2947-eecf0e098302@gmail.com> Hi team, Due to my absence from work, I would like to skip the next IRC meeting on November 23. 
Thanks, Yasufumi From rosmaita.fossdev at gmail.com Wed Nov 17 17:10:57 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 17 Nov 2021 12:10:57 -0500 Subject: [cinder] yoga R-17 virtual midcycle on 1 december Message-ID: As decided at today's weekly meeting, the Cinder Yoga R-17 virtual midcycle will be held: DATE: Wednesday 1 December 2021 TIME: 1400-1600 UTC LOCATION: https://bluejeans.com/3228528973 The meeting will be recorded. Please add topics to the midcycle etherpad: https://etherpad.opendev.org/p/cinder-yoga-midcycles cheers, brian From cyril at redhat.com Wed Nov 17 20:11:10 2021 From: cyril at redhat.com (Cyril Roelandt) Date: Wed, 17 Nov 2021 21:11:10 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: Message-ID: Hello, On 2021-11-17 08:59, Franck VEDEL wrote: > Hello everyone > > I have a strange problem and I haven't found the solution yet. > Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. > Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. > > I am trying to create a new instance to check general operation. ERROR. > Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). We'd like to see the logs as well, especially the stacktrace. > I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). > > I create an empty volume: it works. > I am creating a volume from an image: Failed. What commands are you running? What's the output? What's in the logs? > > However, I have my list of ten images in glance. > > I create a new image and create a volume with this new image: it works. > I create an instance with this new image: OK. > > What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. > Is there a way to fix this, or do we have to reinstall them all? What's your configuration? What version of OpenStack are you running? Cyril From mnaser at vexxhost.com Wed Nov 17 20:53:22 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 17 Nov 2021 15:53:22 -0500 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Message-ID: I don't think we rely on /healthcheck -- there's nothing healthy about an API endpoint blindly returning a 200 OK. You might as well just hit / and accept 300 as a code and that's exactly the same behaviour. I support what Sean is bringing up here and I don't think it makes sense to have a noop /healthcheck that always gives a 200 OK...seems a bit useless imho On Wed, Nov 17, 2021 at 11:09 AM Thomas Goirand wrote: > > Hi Sean, thanks for your reply! > > On 11/17/21 2:13 PM, Sean Mooney wrote: > > i am currently wokring on an alternitive solution for this cycle. > > gr8! > > > i still belive it woudl be incorrect to add teh healtcheck provided by oslo.middelware to nova. > > we disucssed this at the ptg this cycel and still did nto think it was the correct way to approch this > > but we did agree to work on adding an alternitive form of health checks this cycle. 
> > i fundementally belive bad healthchecks are worse then no helatch checks and the olso midelware provides bad healthchecks. > > The current implementation is only useful for plugging haproxy to APIs, > nothing more, nothing less. > > > since the /healthcheck denpoint can be added via api-paste.ini manually i dont think we shoudl add it to our > > default or that packageagre shoudl either. > > Like it or not, the current state of things is: > - /healthcheck is activated everywhere (I patched that myself) > - The nova package at least in Debian has it activated by default (as > this is the only project that refused the patch, I carry it in the package). > > Also, many operators already use the /healthcheck in production, so you > really want to keep it. IMO, your implementation should switch to a > different endpoint if you wish to not retain compatibility with the > older system. > > For this reason, I strongly believe that the Nova team should be > revising its view from a year and a half, and accept the imperfect > currently implemented /healthcheck. This is not mutually exclusive to a > better implementation bound on some other URL. > > > one open question in my draft spec is for the nova api in particaly should we support /healtcheck on the normal api port instead of > > the dedeicated health check endpoint. > > You should absolutely not break backward compatibility!!! > > > yes so i need to push the spec for review ill see if i can do that today or at a minium this week. > > the tldr is as follows. > > > > nova will be extended with 2 addtional options to allow a health checks endpoint to be exposed on a tcp port > > and/or a unix socket. these heatlth check endpoints will not be authenticated will be disabel by default. > > all nova binaries (nova-api, nova-schduler, nova-compute, ...) will supprot exposing the endpoint. > > > > the process will internally update a heathcheck data structure when ever they perform specific operation that > > can be uses as a proxy for the healt of the binary (db query, rpc ping, request to libvirt) these will be binary specific. > > > > The over all health will be summerised with a status enum, exact values to be determind but im working with (OK, DEGRADED, FAULT) > > for now. in the degraded and fault state there will also be a mesage and likely details filed in the respocne. > > message would be human readable with detail being the actual content of the health check data structure. > > > > i have not decided if i should use http status codes as part of the way to singal the status, my instinct are saying no > > parsing the json reponce shoudl be simple and if you just need to check the status filed for ok|degreated|falut using a 5XX error code > > in the degraded of fault case would not be semanticly correct. > > All you wrote above is great. For the http status codes, please > implement it, because it's cheap, and that's how Zabbix (and probably > other monitoring systems) works, plus everyone understand them. > > > Use Cases > > --------- > > > > As a operator i want a simple health-check i can consume to know > > if a nova process is OK, Degraded or Faulty. > > > > As an operator i want this health-check to not impact performance of the > > service so it can be queried frequently at short intervals. 
> > > > As a deployment tool implementer i want the health check to be local with no > > dependencies on other hosts or services to function so i can integrate it with > > service managers such as systemd or container runtime like docker > > > > As a packager i would like health-check to not require special client or > > packages consume them. CURL, socat or netcat should be all that is required to > > connect to the health check and retrieve the service status. > > > > As an operator i would like to be able to use health-check of the nova api and > > metadata services to manage the membership of endpoints in my load-balancer > > or reverse proxy automatically. > > > > > >> Though it would be more than welcome. Currently, to check that a daemon > >> is alive and well, operators are stuck with: > >> > >> - checking with ss if the daemon is correctly connected to a given port > >> - check the logs for rabbitmq and mysql errors (with something like > >> filebeat + elastic search and alarming) > >> > >> Clearly, this doesn't scale. When running many large OpenStack clusters, > >> it is not trivial to have a monitoring system that works and scales. The > >> effort to deploy such a monitoring system is also not trivial at all. So > >> what's been discussed at the time for improving the monitoring would be > >> very much welcome, though not only for the API service: something to > >> check the health of other daemons would be very much welcome. > >> > >> I'd very much would like to participate in a Yoga effort to improve the > >> current situation, and contribute the best I can, though I'm not sure > >> I'd be the best person to drive this... Is there anyone else willing to > >> work on this? > > > > yep i am feel free to ping me on irc: sean-k-mooney incase your wondering but we have talked before. > > Yes. Feel free to ping me as well, I'll enjoy contributing were I can > (though I know you're more skilled than I do in OpenStack's Python > code... I'll still do what I can). > > > i have not configured my defualt channels since the change to oftc but im alwasy in at least #openstack-nova > > after discussing this in the nova ptg session the design took a hard right turn from being based on a rpc like protocaol > > exposed over a unix socket with ovos as the data fromat and active probes to a http based endpoint, avaiable over tcp and or unix socket > > with json as the responce format and a semi global data stucutre with TTL for the data. > > > > as a result i have had to rethink and rework most of the draft spec i had prepared. > > The main point of design that we need to agree on is exactuly how that data stucture is accessed and wehre it is stored. > > > > in the orginal desing i proposed there was no need to store any kind of state and or modify existing functions to add healchecks. > > each nova service manager would just implemant a new healthcheck function that would be pass as a callback to the healtcheck manager which exposed the endpoint. > > > > With the new approch we will like add decorators to imporant functions that will update the healthchecks based on if that fucntion complete correctly. > > if we take the decorator because of how decorators work it can only access module level varables, class method/memeber or the parmaters to the function it is decorating. 
> > what that efffectivly means is either the health check manager need to be stored in a module level "global" variable, it need to be a signelton accessable via a class method > > or it need to be stored in a data stucure that is passed to almost ever funciton speicifcally the context object. > > > > i am leaning towards the context object but i need to understand how that will interact with RPC calls so it might end up being a global/singelton which sucks form a unit/fucntional testing > > perspective but we can make it work via fixtures. > > > > hopefully this sould like good news to you but feel free to give feedback. > > I don't like the fact that we're still having the discussion 1.5 years > after the proposed patch, and that still delays having Nova following > what all the other projects have approved. > > Again, what you're doing should not be mutually exclusive with adding > what already works, and what is already in production. It's been said a > year and a half ago, and it's still truth. A year and a half ago, we > even discuss the fact it would be a shame if it took more than a year... > So can we move forward? > > Anyways, I'm excited that this goes forward, so thanks again for leading > this initiative. > > Cheers, > > Thomas Goirand (zigo) > -- Mohammed Naser VEXXHOST, Inc. From dms at danplanet.com Wed Nov 17 21:54:49 2021 From: dms at danplanet.com (Dan Smith) Date: Wed, 17 Nov 2021 13:54:49 -0800 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: (Mohammed Naser's message of "Wed, 17 Nov 2021 15:53:22 -0500") References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Message-ID: > I don't think we rely on /healthcheck -- there's nothing healthy about > an API endpoint blindly returning a 200 OK. > > You might as well just hit / and accept 300 as a code and that's > exactly the same behaviour. I support what Sean is bringing up here > and I don't think it makes sense to have a noop /healthcheck that > always gives a 200 OK...seems a bit useless imho Yup, totally agree. Our previous concerns over a healthcheck that checked all of nova returning too much info to be useful (for something trying to figure out if an individual worker is healthy) apply in reverse to one that returns too little to be useful. I agree, what Sean is working on is the right balance and that we should focus on that. --Dan From franck.vedel at univ-grenoble-alpes.fr Wed Nov 17 22:15:08 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Wed, 17 Nov 2021 23:15:08 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: Message-ID: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Hello and thank you for the help. I was able to move forward on my problem, without finding a satisfactory solution. Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. 
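(A quick way to tell whether an image's data is actually usable, rather than just listed, is to try downloading it, for example:

openstack image save --file /tmp/test.img <image-id>

an image whose backing file has gone missing will fail here, while a freshly uploaded one downloads fine. The file name and image id above are placeholders of course.)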
so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. I don't understand what happened. There is something wrong. Is it normal that after updating the certificates, all instances are turned off? thanks again Franck > Le 17 nov. 2021 ? 21:11, Cyril Roelandt a ?crit : > > Hello, > > > On 2021-11-17 08:59, Franck VEDEL wrote: >> Hello everyone >> >> I have a strange problem and I haven't found the solution yet. >> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. >> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. >> >> I am trying to create a new instance to check general operation. ERROR. >> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). > > We'd like to see the logs as well, especially the stacktrace. > >> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). >> >> I create an empty volume: it works. >> I am creating a volume from an image: Failed. > > What commands are you running? What's the output? What's in the logs? > >> >> However, I have my list of ten images in glance. >> >> I create a new image and create a volume with this new image: it works. >> I create an instance with this new image: OK. >> >> What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. >> Is there a way to fix this, or do we have to reinstall them all? > > What's your configuration? What version of OpenStack are you running? > > > > Cyril > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Nov 17 22:20:41 2021 From: melwittt at gmail.com (melanie witt) Date: Wed, 17 Nov 2021 14:20:41 -0800 Subject: [sdk]: Check instance error message In-Reply-To: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> References: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> Message-ID: On Tue Nov 16 2021 09:55:32 GMT-0800 (Pacific Standard Time), dmeng wrote: > I'm wondering if there is any method that could check the error message > of an instance whose status is "ERROR"? Like from openstack cli, > "openstack server show server_name", if the server is in "ERROR" status, > this will return a field "fault" with a message shows the error. I tried > the compute service get_server and find_server, but neither of them show > the error messages of an instance. Hi, it looks like currently the sdk doesn't have the 'fault' field in the Server model [1] so AFAICT you can't get the fault message out-of-the-box. A patch upstream will be needed to add it. This can be hacked around by, for example: import openstack from openstack.compute.v2 import server from openstack import resource class MyServer(server.Server): fault = resource.Body('fault', type=dict) conn = openstack.connect(cloud='devstack') s = conn.compute._get(MyServer, '9282db95-801f-4f43-90fb-e86d9bfb6785') s.fault {'code': 500, 'created': '2021-09-17T02:23:16Z', 'message': 'No valid host was found. 
'} HTH, -melanie [1] https://docs.openstack.org/openstacksdk/latest/user/model.html#server From zigo at debian.org Wed Nov 17 22:47:45 2021 From: zigo at debian.org (Thomas Goirand) Date: Wed, 17 Nov 2021 23:47:45 +0100 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Message-ID: <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> On 11/17/21 10:54 PM, Dan Smith wrote: >> I don't think we rely on /healthcheck -- there's nothing healthy about >> an API endpoint blindly returning a 200 OK. >> >> You might as well just hit / and accept 300 as a code and that's >> exactly the same behaviour. I support what Sean is bringing up here >> and I don't think it makes sense to have a noop /healthcheck that >> always gives a 200 OK...seems a bit useless imho > > Yup, totally agree. Our previous concerns over a healthcheck that > checked all of nova returning too much info to be useful (for something > trying to figure out if an individual worker is healthy) apply in > reverse to one that returns too little to be useful. > > I agree, what Sean is working on is the right balance and that we should > focus on that. > > --Dan > That's not the only thing it does. It also is capable of being disabled, which is useful for maintenance: one can gracefully remove an API node for removal this way, which one cannot do with the root. Cheers, Thomas Goirand (zigo) From gmann at ghanshyammann.com Thu Nov 18 00:42:35 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 17 Nov 2021 18:42:35 -0600 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> Message-ID: <17d307ecada.cc39a56d853416.7543793211146988220@ghanshyammann.com> ---- On Wed, 17 Nov 2021 15:54:49 -0600 Dan Smith wrote ---- > > I don't think we rely on /healthcheck -- there's nothing healthy about > > an API endpoint blindly returning a 200 OK. > > > > You might as well just hit / and accept 300 as a code and that's > > exactly the same behaviour. I support what Sean is bringing up here > > and I don't think it makes sense to have a noop /healthcheck that > > always gives a 200 OK...seems a bit useless imho > > Yup, totally agree. Our previous concerns over a healthcheck that > checked all of nova returning too much info to be useful (for something > trying to figure out if an individual worker is healthy) apply in > reverse to one that returns too little to be useful. True, we can see the example in this old patch PS1 trying to implement all the Nova_DB_healthcheck, Nova_MQ_healthcheck, Nova_services_healthcheck and end up a lot of info and time-consuming process - https://review.opendev.org/c/openstack/nova/+/731396/1 and then on RPC call success in PS2 - https://review.opendev.org/c/openstack/nova/+/731396/2 I agree on the point that heathchecks should be 'very Confirmed things saying it is healthy' otherwise, it just solves the HA proxy use case and rests all use cases will consider this as bad healthcheck which is the current case of solo middleware. -gmann > > I agree, what Sean is working on is the right balance and that we should > focus on that. 
> > --Dan > > From mnaser at vexxhost.com Thu Nov 18 01:03:02 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 17 Nov 2021 20:03:02 -0500 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> Message-ID: On Wed, Nov 17, 2021 at 5:52 PM Thomas Goirand wrote: > On 11/17/21 10:54 PM, Dan Smith wrote: > >> I don't think we rely on /healthcheck -- there's nothing healthy about > >> an API endpoint blindly returning a 200 OK. > >> > >> You might as well just hit / and accept 300 as a code and that's > >> exactly the same behaviour. I support what Sean is bringing up here > >> and I don't think it makes sense to have a noop /healthcheck that > >> always gives a 200 OK...seems a bit useless imho > > > > Yup, totally agree. Our previous concerns over a healthcheck that > > checked all of nova returning too much info to be useful (for something > > trying to figure out if an individual worker is healthy) apply in > > reverse to one that returns too little to be useful. > > > > I agree, what Sean is working on is the right balance and that we should > > focus on that. > > > > --Dan > > > > That's not the only thing it does. It also is capable of being disabled, > which is useful for maintenance: one can gracefully remove an API node > for removal this way, which one cannot do with the root. > I feel like this should be handled by whatever layer that needs to drain requests for maintenance, otherwise also it might just be the same as turning off the service, no? > Cheers, > > Thomas Goirand (zigo) > > -- Mohammed Naser VEXXHOST, Inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Nov 18 06:23:53 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 18 Nov 2021 07:23:53 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: Hello, i solved using the following variabile in globals.yml: glance_file_datadir_volume=somedir and glance_backend_file="yes' So if the somedir is a nfs mount point, controllers can share images. Remember you have to deploy glance on all controllers. Ignazio Il Mer 17 Nov 2021, 23:17 Franck VEDEL ha scritto: > Hello and thank you for the help. > I was able to move forward on my problem, without finding a satisfactory > solution. > Normally, I have 2 servers with the role [glance] but I noticed that all > my images were on the first server (in / var / lib / docker / volumes / > glance / _data / images) before the reconfigure, none on the second. But > since the reconfiguration, the images are placed on the second, and no > longer on the first. I do not understand why. I haven't changed anything to > the multinode file. > so, to get out of this situation quickly as I need this openstack for the > students, I modified the multinode file and put only one server in [glance] > (I put server 1, the one that had the images before reconfigure), I did a > reconfigure -t glance and now I have my images usable for instances. > I don't understand what happened. There is something wrong. > > Is it normal that after updating the certificates, all instances are > turned off? 
> thanks again > > Franck > > Le 17 nov. 2021 ? 21:11, Cyril Roelandt a ?crit : > > Hello, > > > On 2021-11-17 08:59, Franck VEDEL wrote: > > Hello everyone > > I have a strange problem and I haven't found the solution yet. > Following a certificate update I had to do a "kolla-ansible -t multinode > reconfigure ?. > Well, after several attempts (it is not easy to use certificates with > Kolla-ansible, and from my advice, not documented enough for beginners), I > have my new functional certificates. Perfect ... well almost. > > I am trying to create a new instance to check general operation. ERROR. > Okay, I look in the logs and I see that Cinder is having problems creating > volumes with an error that I never had ("TypeError: 'NoneType' object is > not iterable). > > > We'd like to see the logs as well, especially the stacktrace. > > I dig and then I wonder if it is not the Glance images which cannot be > used, while they are present (openstack image list is OK). > > I create an empty volume: it works. > I am creating a volume from an image: Failed. > > > What commands are you running? What's the output? What's in the logs? > > > However, I have my list of ten images in glance. > > I create a new image and create a volume with this new image: it works. > I create an instance with this new image: OK. > > What is the problem ? The images present before the "reconfigure" are > listed, visible in horizon for example, but unusable. > Is there a way to fix this, or do we have to reinstall them all? > > > What's your configuration? What version of OpenStack are you running? > > > > Cyril > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjoen at dds.nl Thu Nov 18 07:08:34 2021 From: tjoen at dds.nl (tjoen) Date: Thu, 18 Nov 2021 08:08:34 +0100 Subject: [sdk]: Check instance error message In-Reply-To: References: <1abc189e90ecc42225a74a4be8c9ee47@uvic.ca> Message-ID: On 11/17/21 23:20, melanie witt wrote: > On Tue Nov 16 2021 09:55:32 GMT-0800 (Pacific Standard Time), dmeng > wrote: >> I'm wondering if there is any method that could check the error >> message of an instance whose status is "ERROR"? Like from openstack >> cli, "openstack server show server_name", if the server is in "ERROR" >> status, this will return a field "fault" with a message shows the >> error. I tried the compute service get_server and find_server, but >> neither of them show the error messages of an instance. > > This can be hacked around by, for example: > > import openstack > from openstack.compute.v2 import server > from openstack import resource > > > class MyServer(server.Server): > ??? fault = resource.Body('fault', type=dict) > > > conn = openstack.connect(cloud='devstack') > s = conn.compute._get(MyServer, '9282db95-801f-4f43-90fb-e86d9bfb6785') > s.fault > {'code': 500, 'created': '2021-09-17T02:23:16Z', 'message': 'No valid > host was found. 
'} In my (not OP) case the problems were mostly python or sudo errors So journalctl still needed From skaplons at redhat.com Thu Nov 18 07:42:22 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 18 Nov 2021 08:42:22 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3MOP2R.O83SZVO0NWN23@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> Message-ID: <1889698.yKVeVyVuyW@p1> Hi, On ?roda, 17 listopada 2021 11:18:03 CET Balazs Gibizer wrote: > On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski > > wrote: > > Hi, > > > > Recently I spent some time to check how many rechecks we need in > > Neutron to > > get patch merged and I compared it to some other OpenStack projects > > (see [1] > > for details). > > TL;DR - results aren't good for us and I think we really need to do > > something > > with that. > > I really like the idea of collecting such stats. Thank you for doing > it. I can even imagine to make a public dashboard somewhere with this > information as it is a good indication about the health of our projects > / testing. Thx. So far it's just simple script which I run from my terminal to get that data. Nothing else. If You want to use it, it's here https://github.com/ slawqo/tools/tree/master/rechecks > > > Of course "easiest" thing to say is that we should fix issues which > > we are > > hitting in the CI to make jobs more stable. But it's not that easy. > > We are > > struggling with those jobs for very long time. We have CI related > > meeting > > every week and we are fixing what we can there. > > Unfortunately there is still bunch of issues which we can't fix so > > far because > > they are intermittent and hard to reproduce locally or in some cases > > the > > issues aren't realy related to the Neutron or there are new bugs > > which we need > > to investigate and fix :) > > I have couple of suggestion based on my experience working with CI in > nova. > > 1) we try to open bug reports for intermittent gate failures too and > keep them tagged in a list [1] so when a job fail it is easy to check > if the bug is known. Thx. We are trying more or less to do that, but TBH I think that in many cases we didn't open LPs for such issues. I added it to the list of ideas :) > > 2) I offer my help here now that if you see something in neutron runs > that feels non neutron specific then ping me with it. Maybe we are > struggling with the same problem too. Thank a lot. I will for sure ping You when I will see something like that. > > 3) there was informal discussion before about a possibility to re-run > only some jobs with a recheck instead for re-running the whole set. I > don't know if this is feasible with Zuul and I think this only treat > the symptom not the root case. But still this could be a direction if > all else fails. yes, I remember that discussion and I totally understand pros and cons of such solution, but I added it to the list as well. > > Cheers, > gibi > > > So this is never ending battle for us. The problem is that we have > > to test > > various backends, drivers, etc. so as a result we have many jobs > > running on > > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs > > in check > > and 14 jobs in gate queue. > > > > In the past we made a lot of improvements, like e.g. 
we improved > > irrelevant > > files lists for jobs to run less jobs on some of the patches, > > together with QA > > team we did "integrated-networking" template to run only Neutron and > > Nova > > related scenario tests in the Neutron queues, we removed and > > consolidated some > > of the jobs (there is still one patch in progress for that but it > > should just > > remove around 2 jobs from the check queue). All of that are good > > improvements > > but still not enough to make our CI really stable :/ > > > > Because of all of that, I would like to ask community about any other > > ideas > > how we can improve that. If You have any ideas, please send it in > > this email > > thread or reach out to me directly on irc. > > We want to discuss about them in the next video CI meeting which will > > be on > > November 30th. If You would have any idea and would like to join that > > discussion, You are more than welcome in that meeting of course :) > > > > [1] > > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ > > 025759.html > > [1] > https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_las > t_updated&start=0 > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From skaplons at redhat.com Thu Nov 18 07:46:11 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 18 Nov 2021 08:46:11 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> Message-ID: <14494355.tv2OnDr8pf@p1> Hi, Thx Clark for detailed explanation about that :) On ?roda, 17 listopada 2021 16:51:57 CET you wrote: > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: > > Snip. I want to respond to a specific suggestion: > > 3) there was informal discussion before about a possibility to re-run > > only some jobs with a recheck instead for re-running the whole set. I > > don't know if this is feasible with Zuul and I think this only treat > > the symptom not the root case. But still this could be a direction if > > all else fails. > > OpenStack has configured its check and gate queues with something we've called > "clean check". This refers to the requirement that before an OpenStack > project can be gated it must pass check tests first. This policy was > instituted because a number of these infrequent but problematic issues were > traced back to recheck spamming. Basically changes would show up and were > broken. They would fail some percentage of the time. They got rechecked until > they finally merged and now their failure rate is added to the whole. This > rule was introduced to make it more difficult to get this flakyness into the > gate. > > Locking in test results is in direct opposition to the existing policy and > goals. Locking results would make it far more trivial to land such flakyness > as you wouldn't need entire sets of jobs to pass before you could land. > Instead you could rerun individual jobs until each one passed and then land > the result. Potentially introducing significant flakyness with a single > merge. 
> > Locking results is also not really something that fits well with the > speculative gate queues that Zuul runs. Remember that Zuul constructs a > future git state and tests that in parallel. Currently the state for > OpenStack looks like: > > A - Nova > ^ > B - Glance > ^ > C - Neutron > ^ > D - Neutron > ^ > F - Neutron > > The B glance change is tested as if the A Nova change has already merged and > so on down the queue. If we want to keep these speculative states we can't > really have humans manually verify a failure can be ignored and retry it. > Because we'd be enqueuing job builds at different stages of speculative > state. Each job build would be testing a different version of the software. > > What we could do is implement a retry limit for failing jobs. Zuul could rerun > failing jobs X times before giving up and reporting failure (this would > require updates to Zuul). The problem with this approach is without some > oversight it becomes very easy to land changes that make things worse. As a > side note Zuul does do retries, but only for detected network errors or when > a pre-run playbook fails. The assumption is that network failures are due to > the dangers of the Internet, and that pre-run playbooks are small, self > contained, unlikely to fail, and when they do fail the failure should be > independent of what is being tested. > > Where does that leave us? > > I think it is worth considering the original goals of "clean check". We know > that rechecking/rerunning only makes these problems worse in the long term. > They represent technical debt. One of the reasons we run these tests is to > show us when our software is broken. In the case of flaky results we are > exposing this technical debt where it impacts the functionality of our > software. The longer we avoid fixing these issues the worse it gets, and this > is true even with "clean check". I agree with You on that and I would really like to find better/other solution for the Neutron problem than rechecking only broken jobs as I'm pretty sure that this would make things much worst quickly. > > Do we as developers find value in knowing the software needs attention before > it gets released to users? Do the users find value in running reliable > software? In the past we have asserted that "yes, there is value in this", > and have invested in tracking, investigating, and fixing these problems even > if they happen infrequently. But that does require investment, and active > maintenance. > > Clark -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From pshchelokovskyy at mirantis.com Thu Nov 18 11:11:02 2021 From: pshchelokovskyy at mirantis.com (Pavlo Shchelokovskyy) Date: Thu, 18 Nov 2021 13:11:02 +0200 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <4359750.LvFx2qVVIh@p1> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> <4359750.LvFx2qVVIh@p1> Message-ID: Hi Clark, Why is the retirement of openstack/neutron-lbaas being a problem? 
The repo is there and accessible under the same URL, it has (potentially working) stable/pike and stable/queens branches, and was not retired at the time of Pike or Queens, so IMO it is a valid request for testing configuration in the same branches of other projects, openstack/heat in this case. Maybe we should leave some minimal zuul configs in retired projects for zuul to find them? Cheers, On Wed, Nov 17, 2021 at 9:45 AM Slawek Kaplonski wrote: > Hi, > > On ?roda, 17 listopada 2021 08:26:14 CET Slawek Kaplonski wrote: > > Hi, > > > > I just checked neutron related things there and it seems there are 2 > major > > issues there: > > 1. move of the tap-as-a-service from x/ to openstack/ namespace (that > affects > > networking-midonet) - I will propose patch for that today. > > 2. remove of the neutron-lbaas repo (that affects much more than only > neutron > > repos - for that I will try to propose patches this week as well. > > There are also some missing job definitions in some of the neutron related > repos and also issues with missing openstack/networking-l2gw project. I > will > take a look into all those issues in next days. > > > > > On wtorek, 16 listopada 2021 17:58:44 CET Clark Boylan wrote: > > > Hello, > > > > > > The OpenStack tenant in Zuul currently has 134 config errors. You can > find > > > these errors at https://zuul.opendev.org/t/openstack/config-errors or > by > > > clicking the blue bell icon in the top right of > > > https://zuul.opendev.org/t/openstack/status. The vast majority of > these > > > errors appear related to project renames that have been requested of > OpenDev > > > or project retirements. Can you please look into fixing these as they > can > be > > > an attractive nuisance when debugging Zuul problems (they also > indicate > that > > > a number of your jobs are probably not working). > > > > > > Project renames creating issues: > > > * openstack/python-tempestconf -> osf/python-tempestconf -> > > > > > > openinfra/python-tempestconf * openstack/refstack -> osf/refstack -> > > > openinfra/refstack > > > > > > * x/tap-as-a-service -> openstack/tap-as-a-service > > > * openstack/networking-l2gw -> x/networking-l2gw > > > > > > Project retirements creating issues: > > > * openstack/neutron-lbaas > > > * recordsansible/ara > > > > > > Projects whose configs have errors: > > > * openinfra/python-tempestconf > > > * openstack/heat > > > * openstack/ironic > > > * openstack/kolla-ansible > > > * openstack/kuryr-kubernetes > > > * openstack/murano-apps > > > * openstack/networking-midonet > > > * openstack/networking-odl > > > * openstack/neutron > > > * openstack/neutron-fwaas > > > * openstack/python-troveclient > > > * openstack/senlin > > > * openstack/tap-as-a-service > > > * openstack/zaqar > > > * x/vmware-nsx > > > * openinfra/openstackid > > > * openstack/barbican > > > * openstack/cookbook-openstack-application-catalog > > > * openstack/heat-dashboard > > > * openstack/manila-ui > > > * openstack/python-manilaclient > > > > > > Let us know if we can help decipher any errors, > > > Clark > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Dr. Pavlo Shchelokovskyy Principal Software Engineer Mirantis Inc www.mirantis.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From katonalala at gmail.com Thu Nov 18 12:26:44 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 18 Nov 2021 13:26:44 +0100 Subject: [all][sdk][neutronclient][neutron] List of projects that use python-neutronclient as "client" Message-ID: Hi, During the Yoga PTG we discussed the possibility of deprecating python-neutronclient (see [0]). The CLI part of python-neutronclient is already deprecated (see [1]), but we have python bindings for Neutron both in python-neutronclient and in openstacksdk. As python-neutronclient's binding code is widely used as "client" code (i.e.: Heat, Horizon, Nova), but python-openstackclient uses openstacksdk's binding code for "client" this means duplicated work and maintenance for all the bindings. If we have a new API feature the binding code must go both to python-neutronclient to make it available in Heat, and to openstacksdk to have openstackclient support for it. The best would be to have the binding code in openstacksdk and deprecate python-neutronclient, but before we plan anything we would like to have a list of projects that depend on python-neutronclient. We identified a few but for sure with python-neutronclient's long history there are a lot more: * Heat * Horizon * Nova * various Networking projects * ...... Please share Your thoughts about this topic and the projects which use python-neutronclient, it would be really helpful to see how we can move forward. [0]: https://etherpad.opendev.org/p/neutron-yoga-ptg#L372 [1]: https://review.opendev.org/c/openstack/python-neutronclient/+/795475 Regards Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnasiadka at gmail.com Thu Nov 18 12:33:25 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Thu, 18 Nov 2021 13:33:25 +0100 Subject: [kolla] Plan to deprecate binary and unify on single distrubition Message-ID: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Hello Koalas, On the PTG we have discussed two topics: 1) Deprecate and drop binary type of Kolla images 2) Use a common base (single Linux distribution) for Kolla images This is a call for feedback - for people that have not been attending the PTG. What this essentially mean for consumers: 1) In Yoga cycle we will deprecate binary type of Kolla images, and in Z cycle those will be dropped. 2) We are not going to support CentOS Stream 9 (cs9) as a base operating system, and the source type build will rely on CentOS Stream 8 in Z release. 3) Beginning from A release Kolla will build only Debian source images - but Kolla-Ansible will still support deployment of those images on CentOS/Ubuntu/Debian Host operating systems (and Rocky Linux to be added in Yoga to that mix). Justification: The Kolla project team is limited in numbers, therefore supporting current broad mix of operating systems (especially with CentOS Stream 9 ,,on the way??) is a significant maintenance burden. By dropping binary type of images - users would be running more tested images (since Kolla/Kolla-Ansible CI runs source images jobs as voting). In Xena we?ve already changed the default image type Kolla-Ansible uses to source. We also feel that using a unified base OS for Kolla container images is a way to remove some of the maintenance burden (including CI cycles and Request for feedback: If any of those changes is a no go from your perspective - we?d like to hear your opinions. 
Best regards, Michal Nasiadka From balazs.gibizer at est.tech Thu Nov 18 14:39:40 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 18 Nov 2021 15:39:40 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> Message-ID: <4EVR2R.KF069X977ZIK2@est.tech> On Wed, Nov 17 2021 at 07:51:57 AM -0800, Clark Boylan wrote: > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: >> > > Snip. I want to respond to a specific suggestion: > >> 3) there was informal discussion before about a possibility to >> re-run >> only some jobs with a recheck instead for re-running the whole set. >> I >> don't know if this is feasible with Zuul and I think this only treat >> the symptom not the root case. But still this could be a direction >> if >> all else fails. >> > > OpenStack has configured its check and gate queues with something > we've called "clean check". This refers to the requirement that > before an OpenStack project can be gated it must pass check tests > first. This policy was instituted because a number of these > infrequent but problematic issues were traced back to recheck > spamming. Basically changes would show up and were broken. They would > fail some percentage of the time. They got rechecked until they > finally merged and now their failure rate is added to the whole. This > rule was introduced to make it more difficult to get this flakyness > into the gate. > > Locking in test results is in direct opposition to the existing > policy and goals. Locking results would make it far more trivial to > land such flakyness as you wouldn't need entire sets of jobs to pass > before you could land. Instead you could rerun individual jobs until > each one passed and then land the result. Potentially introducing > significant flakyness with a single merge. > > Locking results is also not really something that fits well with the > speculative gate queues that Zuul runs. Remember that Zuul constructs > a future git state and tests that in parallel. Currently the state > for OpenStack looks like: > > A - Nova > ^ > B - Glance > ^ > C - Neutron > ^ > D - Neutron > ^ > F - Neutron > > The B glance change is tested as if the A Nova change has already > merged and so on down the queue. If we want to keep these speculative > states we can't really have humans manually verify a failure can be > ignored and retry it. Because we'd be enqueuing job builds at > different stages of speculative state. Each job build would be > testing a different version of the software. > > What we could do is implement a retry limit for failing jobs. Zuul > could rerun failing jobs X times before giving up and reporting > failure (this would require updates to Zuul). The problem with this > approach is without some oversight it becomes very easy to land > changes that make things worse. As a side note Zuul does do retries, > but only for detected network errors or when a pre-run playbook > fails. The assumption is that network failures are due to the dangers > of the Internet, and that pre-run playbooks are small, self > contained, unlikely to fail, and when they do fail the failure should > be independent of what is being tested. > > Where does that leave us? > > I think it is worth considering the original goals of "clean check". 
> We know that rechecking/rerunning only makes these problems worse in > the long term. They represent technical debt. One of the reasons we > run these tests is to show us when our software is broken. In the > case of flaky results we are exposing this technical debt where it > impacts the functionality of our software. The longer we avoid fixing > these issues the worse it gets, and this is true even with "clean > check". > > Do we as developers find value in knowing the software needs > attention before it gets released to users? Do the users find value > in running reliable software? In the past we have asserted that "yes, > there is value in this", and have invested in tracking, > investigating, and fixing these problems even if they happen > infrequently. But that does require investment, and active > maintenance. Thank you Clark! I agree with your view that the current setup provides us with very valuable information about the health of the software we are developing. I also agree that our primary goal should be to fix the flaky tests instead of hiding the results under any kind of rechecks. Still I'm wondering what we will do if it turns out that the existing developer bandwidth shrunk to the point where we simply not have the capacity for fix these technical debts. What the stable team does on stable branches in Extended Maintenance mode in a similar situation is to simply turn off problematic test jobs. So I guess that is also a valid last resort move. Cheers, gibi > > Clark > From cboylan at sapwetik.org Thu Nov 18 14:40:38 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 18 Nov 2021 06:40:38 -0800 Subject: =?UTF-8?Q?Re:_[all][refstack][neutron][kolla][ironic][heat][trove][senli?= =?UTF-8?Q?n][barbican][manila]_Fixing_Zuul_Config_Errors?= In-Reply-To: References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> <4359750.LvFx2qVVIh@p1> Message-ID: <5fea46fa-f304-48fa-a171-ce1114700379@www.fastmail.com> On Thu, Nov 18, 2021, at 3:11 AM, Pavlo Shchelokovskyy wrote: > Hi Clark, > > Why is the retirement of openstack/neutron-lbaas being a problem? > > The repo is there and accessible under the same URL, it has > (potentially working) stable/pike and stable/queens branches, and was > not retired at the time of Pike or Queens, so IMO it is a valid request > for testing configuration in the same branches of other projects, > openstack/heat in this case. > > Maybe we should leave some minimal zuul configs in retired projects for > zuul to find them? The reason for this is one of the steps for project retirement is to remove the repo from zuul [0]. If the upstream for the project has retired the project I think it is reasonable to remove it from our systems like Zuul. The project isn't retired on a per branch basis. I do not think the Zuul operators should be responsible for keeping retired projects alive in the CI system if their maintainers have stopped maintaining them. Instead we should remove them from the CI system. [0] https://opendev.org/openstack/project-config/commit/3832c8fafc4d4e03c306c21f37b2d39bd7c5bd2b From smooney at redhat.com Thu Nov 18 15:04:52 2021 From: smooney at redhat.com (Sean Mooney) Date: Thu, 18 Nov 2021 15:04:52 +0000 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Message-ID: On Thu, 2021-11-18 at 13:33 +0100, Micha? 
Nasiadka wrote: > Hello Koalas, > > On the PTG we have discussed two topics: > > 1) Deprecate and drop binary type of Kolla images > 2) Use a common base (single Linux distribution) for Kolla images > > This is a call for feedback - for people that have not been attending the PTG. > > What this essentially mean for consumers: > > 1) In Yoga cycle we will deprecate binary type of Kolla images, and in Z cycle those will be dropped. > 2) We are not going to support CentOS Stream 9 (cs9) as a base operating system, and the source type build will rely on CentOS Stream 8 in Z release. > 3) Beginning from A release Kolla will build only Debian source images - but Kolla-Ansible will still support deployment of those images on CentOS/Ubuntu/Debian Host operating systems (and Rocky Linux to be added in Yoga to that mix). > > Justification: > The Kolla project team is limited in numbers, therefore supporting current broad mix of operating systems (especially with CentOS Stream 9 "on the way") is a significant maintenance burden. > By dropping binary type of images - users would be running more tested images (since Kolla/Kolla-Ansible CI runs source images jobs as voting). > In Xena we've already changed the default image type Kolla-Ansible uses to source. > We also feel that using a unified base OS for Kolla container images is a way to remove some of the maintenance burden (including CI cycles and > > Request for feedback: > If any of those changes is a no go from your perspective - we'd like to hear your opinions.

I only have reason to use kolla-ansible on small personal clouds at home, so either way this will have limited direct effect on me, but I just wanted to give some thoughts: Kolla has been my favourite way to deploy OpenStack for a very long time. One of the reasons I liked it was the fact that it was distro independent, simple to use and configure, and that it supported both source and binary installs. I have almost always defaulted to Ubuntu source, although in the rare cases where I used CentOS I always used CentOS binary. I almost never used binary installs on Debian-based distros, and on RPM-based distros I never really used source; I'm not sure if I'm the only one that did that. It's not because I trusted the RPM packages more, by the way; it just seemed to be what was tested more when I chatted to the Kolla devs on IRC, so I avoided Ubuntu binary and CentOS source as a result. With that in mind, Debian source is not really controversial to me, but I had one question on it: will the support for the other distros be kept in the Kolla images but not extended, or will it be dropped? I assume the plan is to remove the templating for the other distros in A, based on point 3 above. The only other thing I wanted to point out is that while I have had some success getting Ubuntu source images to run on a CentOS host in the past, it will be tricky if Kolla ever wants to support SELinux/AppArmor; that was the main barrier I faced, but there can be others. Specifically, OVS and libvirt can be somewhat picky about the kernel on which they run. Most of the OpenStack services will likely not care that the container OS does not match the host, but some of the "system" dependencies like libvirt/OVS might. A way to address that would be to support using external images for those services from Docker Hub/Quay, e.g. use the official upstream mariadb image, or libvirt, or RabbitMQ, if that is in fact a problem.
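To make that concrete, here is a hand-wavy sketch of the kind of override I mean; the exact variable name should be checked against the kolla-ansible role defaults, so treat it as illustrative only rather than the documented interface:

    # globals.yml - hypothetical override pointing one service at an upstream
    # Docker Hub image instead of a kolla-built one
    mariadb_image_full: "docker.io/library/mariadb:10.6"

Whether the same trick is workable for libvirt/OVS, which care much more about the host kernel they run against, is exactly the open question.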
Anyway, it's totally understandable that if you do not have contributors who are able to support the other distros you would remove that support, especially with the move away from using Kolla images in TripleO of late, and presumably a reduction in Red Hat contributions to keep CentOS support alive. The change, while sad to see purely from a project health point of view, would not be enough to prevent me personally from using or recommending Kolla and Kolla-Ansible; if you had proposed the opposite, CentOS binary only, that would be much more concerning to me. There are some advantages to source installs that you are preserving in this change, and I am glad they will not be lost. One last thought: if only one distro will be supported for building images in the future, with only source installs supported, it might be worth considering whether Alpine, or the slim Alpine/Debian Python 3 images, should be re-evaluated as the base for an even lighter set of containers rather than the full base OS image. I am kind of assuming that at some point the non-Python, non-OpenStack containers would be consumed from the official upstream images rather than Kolla continuing to maintain them, per the above suggestion, so this may not be applicable now and would really likely be an A or post-A thing.

> > Best regards, > Michal Nasiadka > From gmann at ghanshyammann.com Thu Nov 18 15:12:01 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 18 Nov 2021 09:12:01 -0600 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <1889698.yKVeVyVuyW@p1> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <1889698.yKVeVyVuyW@p1> Message-ID: <17d339ac9bd.107860943903455.71431419368509180@ghanshyammann.com> ---- On Thu, 18 Nov 2021 01:42:22 -0600 Slawek Kaplonski wrote ---- > Hi, > > On środa, 17 listopada 2021 11:18:03 CET Balazs Gibizer wrote: > > On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski > > > > wrote: > > > Hi, > > > > > > Recently I spent some time to check how many rechecks we need in > > > Neutron to > > > get patch merged and I compared it to some other OpenStack projects > > > (see [1] > > > for details). > > > TL;DR - results aren't good for us and I think we really need to do > > > something > > > with that. > > > > I really like the idea of collecting such stats. Thank you for doing > > it. I can even imagine to make a public dashboard somewhere with this > > information as it is a good indication about the health of our projects > > / testing. > > Thx. So far it's just simple script which I run from my terminal to get that > data. Nothing else. If You want to use it, it's here https://github.com/ > slawqo/tools/tree/master/rechecks > > > > > > Of course "easiest" thing to say is that we should fix issues which > > > we are > > > hitting in the CI to make jobs more stable. But it's not that easy. > > > We are > > > struggling with those jobs for very long time. We have CI related > > > meeting > > > every week and we are fixing what we can there. > > > Unfortunately there is still bunch of issues which we can't fix so > > > far because > > > they are intermittent and hard to reproduce locally or in some cases > > > the > > > issues aren't realy related to the Neutron or there are new bugs > > > which we need > > > to investigate and fix :) > > > > I have couple of suggestion based on my experience working with CI in > > nova. > > > > 1) we try to open bug reports for intermittent gate failures too and > > keep them tagged in a list [1] so when a job fail it is easy to check > > if the bug is known. > > Thx.
We are trying more or less to do that, but TBH I think that in many cases > we didn't open LPs for such issues. > I added it to the list of ideas :) +1, I think opening bugs is the best way to get the project notified and also track the issue. I like the Slawek script to collect the recheck per project and that is something we can use in TC tracking the gate health in the weekly meeting and see which project is having more recheck, Recheck does not mean that project has the issue but at least we will encourage members to open bug on corresponding projects. -gmann > > > > > 2) I offer my help here now that if you see something in neutron runs > > that feels non neutron specific then ping me with it. Maybe we are > > struggling with the same problem too. > > Thank a lot. I will for sure ping You when I will see something like that. > > > > > 3) there was informal discussion before about a possibility to re-run > > only some jobs with a recheck instead for re-running the whole set. I > > don't know if this is feasible with Zuul and I think this only treat > > the symptom not the root case. But still this could be a direction if > > all else fails. > > yes, I remember that discussion and I totally understand pros and cons of such > solution, but I added it to the list as well. > > > > > Cheers, > > gibi > > > > > So this is never ending battle for us. The problem is that we have > > > to test > > > various backends, drivers, etc. so as a result we have many jobs > > > running on > > > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs > > > in check > > > and 14 jobs in gate queue. > > > > > > In the past we made a lot of improvements, like e.g. we improved > > > irrelevant > > > files lists for jobs to run less jobs on some of the patches, > > > together with QA > > > team we did "integrated-networking" template to run only Neutron and > > > Nova > > > related scenario tests in the Neutron queues, we removed and > > > consolidated some > > > of the jobs (there is still one patch in progress for that but it > > > should just > > > remove around 2 jobs from the check queue). All of that are good > > > improvements > > > but still not enough to make our CI really stable :/ > > > > > > Because of all of that, I would like to ask community about any other > > > ideas > > > how we can improve that. If You have any ideas, please send it in > > > this email > > > thread or reach out to me directly on irc. > > > We want to discuss about them in the next video CI meeting which will > > > be on > > > November 30th. 
If You would have any idea and would like to join that > > > discussion, You are more than welcome in that meeting of course :) > > > > > > [1] > > > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ > > > 025759.html > > > > [1] > > https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_las > > t_updated&start=0 > > > -- > > > Slawek Kaplonski > > > Principal Software Engineer > > > Red Hat > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat From smooney at redhat.com Thu Nov 18 15:19:33 2021 From: smooney at redhat.com (Sean Mooney) Date: Thu, 18 Nov 2021 15:19:33 +0000 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <4EVR2R.KF069X977ZIK2@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> Message-ID: On Thu, 2021-11-18 at 15:39 +0100, Balazs Gibizer wrote: > > On Wed, Nov 17 2021 at 07:51:57 AM -0800, Clark Boylan > wrote: > > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: > > > > > > > Snip. I want to respond to a specific suggestion: > > > > > 3) there was informal discussion before about a possibility to > > > re-run > > > only some jobs with a recheck instead for re-running the whole set. > > > I > > > don't know if this is feasible with Zuul and I think this only treat > > > the symptom not the root case. But still this could be a direction > > > if > > > all else fails. > > > > > > > OpenStack has configured its check and gate queues with something > > we've called "clean check". This refers to the requirement that > > before an OpenStack project can be gated it must pass check tests > > first. This policy was instituted because a number of these > > infrequent but problematic issues were traced back to recheck > > spamming. Basically changes would show up and were broken. They would > > fail some percentage of the time. They got rechecked until they > > finally merged and now their failure rate is added to the whole. This > > rule was introduced to make it more difficult to get this flakyness > > into the gate. > > > > Locking in test results is in direct opposition to the existing > > policy and goals. Locking results would make it far more trivial to > > land such flakyness as you wouldn't need entire sets of jobs to pass > > before you could land. Instead you could rerun individual jobs until > > each one passed and then land the result. Potentially introducing > > significant flakyness with a single merge. > > > > Locking results is also not really something that fits well with the > > speculative gate queues that Zuul runs. Remember that Zuul constructs > > a future git state and tests that in parallel. Currently the state > > for OpenStack looks like: > > > > A - Nova > > ^ > > B - Glance > > ^ > > C - Neutron > > ^ > > D - Neutron > > ^ > > F - Neutron > > > > The B glance change is tested as if the A Nova change has already > > merged and so on down the queue. If we want to keep these speculative > > states we can't really have humans manually verify a failure can be > > ignored and retry it. Because we'd be enqueuing job builds at > > different stages of speculative state. Each job build would be > > testing a different version of the software. > > > > What we could do is implement a retry limit for failing jobs. Zuul > > could rerun failing jobs X times before giving up and reporting > > failure (this would require updates to Zuul). 
The problem with this > > approach is without some oversight it becomes very easy to land > > changes that make things worse. As a side note Zuul does do retries, > > but only for detected network errors or when a pre-run playbook > > fails. The assumption is that network failures are due to the dangers > > of the Internet, and that pre-run playbooks are small, self > > contained, unlikely to fail, and when they do fail the failure should > > be independent of what is being tested. > > > > Where does that leave us? > > > > I think it is worth considering the original goals of "clean check". > > We know that rechecking/rerunning only makes these problems worse in > > the long term. They represent technical debt. One of the reasons we > > run these tests is to show us when our software is broken. In the > > case of flaky results we are exposing this technical debt where it > > impacts the functionality of our software. The longer we avoid fixing > > these issues the worse it gets, and this is true even with "clean > > check". > > > > Do we as developers find value in knowing the software needs > > attention before it gets released to users? Do the users find value > > in running reliable software? In the past we have asserted that "yes, > > there is value in this", and have invested in tracking, > > investigating, and fixing these problems even if they happen > > infrequently. But that does require investment, and active > > maintenance. > > Thank you Clark! I agree with your view that the current setup provides > us with very valuable information about the health of the software we > are developing. I also agree that our primary goal should be to fix the > flaky tests instead of hiding the results under any kind of rechecks. > > Still I'm wondering what we will do if it turns out that the existing > developer bandwidth shrunk to the point where we simply not have the > capacity for fix these technical debts. What the stable team does on > stable branches in Extended Maintenance mode in a similar situation is > to simply turn off problematic test jobs. So I guess that is also a > valid last resort move.

One option is to "trust" the core team more and grant them the explicit right to workflow +2 and force-merge a patch. Trust is in quotes because it is not really about trusting that the core teams can restrain themselves from blindly merging broken code; it is more that right now we entrust Zuul to be the final gatekeeper of our repos. When there is a known broken gate failure and we are trying to land a specific patch to, say, nova to fix or unblock the neutron gate, and we can see that the neutron DNM patch that depends on this nova fix passed, then we could entrust the core team, in this specific case, to override Zuul. I would expect this capability to be used very sparingly, but we do have some intermittent failures that we can tell are unrelated to the patch, like the current issue with volume attach/detach that results in kernel panics in the guest. If that is the only failure and all the other tests passed in gate, I think it would be reasonable for the neutron team to approve a neutron patch that modifies security groups, for example; it is very clearly an unrelated failure. That might be an alternative to the recheck we have now, and by reserving it for the core team it limits the scope for abusing it. I do think that the original goals of green check are good, so really I would be suggesting this as an option for when check passed and we get an intermittent failure in gate that we would override.
this would not adress the issue in check but it would make itermitent failure in gate much less painful. > > Cheers, > gibi > > > > > > > > Clark > > > > > From balazs.gibizer at est.tech Thu Nov 18 15:30:37 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 18 Nov 2021 16:30:37 +0100 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing Message-ID: <1RXR2R.FUJGJYZK1M7N3@est.tech> Hi, The centos 8 stream job is failing in 100% of the cases with mirror issues [2]. You probably need to hold you recheck until it is resolved. cheers, gibi [1] https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream [2] https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 From cboylan at sapwetik.org Thu Nov 18 15:33:48 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 18 Nov 2021 07:33:48 -0800 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <4EVR2R.KF069X977ZIK2@est.tech> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> Message-ID: <376b1cc3-5f3f-4e5f-8e4b-97312b033dc2@www.fastmail.com> On Thu, Nov 18, 2021, at 6:39 AM, Balazs Gibizer wrote: > On Wed, Nov 17 2021 at 07:51:57 AM -0800, Clark Boylan > wrote: >> On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: >>> >> >> Snip. I want to respond to a specific suggestion: >> >>> 3) there was informal discussion before about a possibility to >>> re-run >>> only some jobs with a recheck instead for re-running the whole set. >>> I >>> don't know if this is feasible with Zuul and I think this only treat >>> the symptom not the root case. But still this could be a direction >>> if >>> all else fails. >>> >> >> OpenStack has configured its check and gate queues with something >> we've called "clean check". This refers to the requirement that >> before an OpenStack project can be gated it must pass check tests >> first. This policy was instituted because a number of these >> infrequent but problematic issues were traced back to recheck >> spamming. Basically changes would show up and were broken. They would >> fail some percentage of the time. They got rechecked until they >> finally merged and now their failure rate is added to the whole. This >> rule was introduced to make it more difficult to get this flakyness >> into the gate. >> >> Locking in test results is in direct opposition to the existing >> policy and goals. Locking results would make it far more trivial to >> land such flakyness as you wouldn't need entire sets of jobs to pass >> before you could land. Instead you could rerun individual jobs until >> each one passed and then land the result. Potentially introducing >> significant flakyness with a single merge. >> >> Locking results is also not really something that fits well with the >> speculative gate queues that Zuul runs. Remember that Zuul constructs >> a future git state and tests that in parallel. Currently the state >> for OpenStack looks like: >> >> A - Nova >> ^ >> B - Glance >> ^ >> C - Neutron >> ^ >> D - Neutron >> ^ >> F - Neutron >> >> The B glance change is tested as if the A Nova change has already >> merged and so on down the queue. If we want to keep these speculative >> states we can't really have humans manually verify a failure can be >> ignored and retry it. Because we'd be enqueuing job builds at >> different stages of speculative state. 
Each job build would be >> testing a different version of the software. >> >> What we could do is implement a retry limit for failing jobs. Zuul >> could rerun failing jobs X times before giving up and reporting >> failure (this would require updates to Zuul). The problem with this >> approach is without some oversight it becomes very easy to land >> changes that make things worse. As a side note Zuul does do retries, >> but only for detected network errors or when a pre-run playbook >> fails. The assumption is that network failures are due to the dangers >> of the Internet, and that pre-run playbooks are small, self >> contained, unlikely to fail, and when they do fail the failure should >> be independent of what is being tested. >> >> Where does that leave us? >> >> I think it is worth considering the original goals of "clean check". >> We know that rechecking/rerunning only makes these problems worse in >> the long term. They represent technical debt. One of the reasons we >> run these tests is to show us when our software is broken. In the >> case of flaky results we are exposing this technical debt where it >> impacts the functionality of our software. The longer we avoid fixing >> these issues the worse it gets, and this is true even with "clean >> check". >> >> Do we as developers find value in knowing the software needs >> attention before it gets released to users? Do the users find value >> in running reliable software? In the past we have asserted that "yes, >> there is value in this", and have invested in tracking, >> investigating, and fixing these problems even if they happen >> infrequently. But that does require investment, and active >> maintenance. > > Thank you Clark! I agree with your view that the current setup provides > us with very valuable information about the health of the software we > are developing. I also agree that our primary goal should be to fix the > flaky tests instead of hiding the results under any kind of rechecks. > > Still I'm wondering what we will do if it turns out that the existing > developer bandwidth shrunk to the point where we simply not have the > capacity for fix these technical debts. What the stable team does on > stable branches in Extended Maintenance mode in a similar situation is > to simply turn off problematic test jobs. So I guess that is also a > valid last resort move. Absolutely reduce scope if necessary. We run a huge assortment of jobs because we've added support for the kitchen sink to OpenStack. If we can't continue to reliably test those features then it should be completely valid to remove testing and probably deprecate and remove the features as well. Historically we've done this for things like postgresql support so this isn't a new problem. > > Cheers, > gibi From cboylan at sapwetik.org Thu Nov 18 15:46:00 2021 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 18 Nov 2021 07:46:00 -0800 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> Message-ID: <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> On Thu, Nov 18, 2021, at 7:19 AM, Sean Mooney wrote: > On Thu, 2021-11-18 at 15:39 +0100, Balazs Gibizer wrote: >> >> On Wed, Nov 17 2021 at 07:51:57 AM -0800, Clark Boylan >> wrote: >> > On Wed, Nov 17, 2021, at 2:18 AM, Balazs Gibizer wrote: >> > > >> > >> > Snip. 
I want to respond to a specific suggestion: >> > >> > > 3) there was informal discussion before about a possibility to >> > > re-run >> > > only some jobs with a recheck instead for re-running the whole set. >> > > I >> > > don't know if this is feasible with Zuul and I think this only treat >> > > the symptom not the root case. But still this could be a direction >> > > if >> > > all else fails. >> > > >> > >> > OpenStack has configured its check and gate queues with something >> > we've called "clean check". This refers to the requirement that >> > before an OpenStack project can be gated it must pass check tests >> > first. This policy was instituted because a number of these >> > infrequent but problematic issues were traced back to recheck >> > spamming. Basically changes would show up and were broken. They would >> > fail some percentage of the time. They got rechecked until they >> > finally merged and now their failure rate is added to the whole. This >> > rule was introduced to make it more difficult to get this flakyness >> > into the gate. >> > >> > Locking in test results is in direct opposition to the existing >> > policy and goals. Locking results would make it far more trivial to >> > land such flakyness as you wouldn't need entire sets of jobs to pass >> > before you could land. Instead you could rerun individual jobs until >> > each one passed and then land the result. Potentially introducing >> > significant flakyness with a single merge. >> > >> > Locking results is also not really something that fits well with the >> > speculative gate queues that Zuul runs. Remember that Zuul constructs >> > a future git state and tests that in parallel. Currently the state >> > for OpenStack looks like: >> > >> > A - Nova >> > ^ >> > B - Glance >> > ^ >> > C - Neutron >> > ^ >> > D - Neutron >> > ^ >> > F - Neutron >> > >> > The B glance change is tested as if the A Nova change has already >> > merged and so on down the queue. If we want to keep these speculative >> > states we can't really have humans manually verify a failure can be >> > ignored and retry it. Because we'd be enqueuing job builds at >> > different stages of speculative state. Each job build would be >> > testing a different version of the software. >> > >> > What we could do is implement a retry limit for failing jobs. Zuul >> > could rerun failing jobs X times before giving up and reporting >> > failure (this would require updates to Zuul). The problem with this >> > approach is without some oversight it becomes very easy to land >> > changes that make things worse. As a side note Zuul does do retries, >> > but only for detected network errors or when a pre-run playbook >> > fails. The assumption is that network failures are due to the dangers >> > of the Internet, and that pre-run playbooks are small, self >> > contained, unlikely to fail, and when they do fail the failure should >> > be independent of what is being tested. >> > >> > Where does that leave us? >> > >> > I think it is worth considering the original goals of "clean check". >> > We know that rechecking/rerunning only makes these problems worse in >> > the long term. They represent technical debt. One of the reasons we >> > run these tests is to show us when our software is broken. In the >> > case of flaky results we are exposing this technical debt where it >> > impacts the functionality of our software. The longer we avoid fixing >> > these issues the worse it gets, and this is true even with "clean >> > check". 
>> > >> > Do we as developers find value in knowing the software needs >> > attention before it gets released to users? Do the users find value >> > in running reliable software? In the past we have asserted that "yes, >> > there is value in this", and have invested in tracking, >> > investigating, and fixing these problems even if they happen >> > infrequently. But that does require investment, and active >> > maintenance. >> >> Thank you Clark! I agree with your view that the current setup provides >> us with very valuable information about the health of the software we >> are developing. I also agree that our primary goal should be to fix the >> flaky tests instead of hiding the results under any kind of rechecks. >> >> Still I'm wondering what we will do if it turns out that the existing >> developer bandwidth shrunk to the point where we simply not have the >> capacity for fix these technical debts. What the stable team does on >> stable branches in Extended Maintenance mode in a similar situation is >> to simply turn off problematic test jobs. So I guess that is also a >> valid last resort move. > > one option is to "trust" the core team more and grant them explict > rigth to workflow +2 and force merge a patch. > > trust is in quotes because its not really about trusting that the core > teams can restrain themselve form blindly merging > broken code but more a case of right now we entrust zuul to be the > final gate keeper of our repo. > > When there are known broken gate failure and we are trying to land > specific patch to say nova to fix or unblock the nuetron > gate and we can see the neutron DNM patch that depens on this nova fix > passsed then we could entrust the core team in this specific case > to override zuul. We do already give you this option via the removal of tests that are invalid/flaky/not useful. I do worry that if we give a complete end around the CI system it will be quickly abused. We stopped requiring a bug on rechecks because we quickly realized that no one was actually debugging the failure and identifying the underlying issue. Instead they would just recheck with an arbitrary or completely wrong bug identified. I expect similar would happen here. And the end result would be that CI would simply get more flaky and unreliable for the next change. If instead we fix or remove the flaky tests/jobs we'll end up with a system that is more reliable for the next change. > > i would expect this capablity to be used very spareinly but we do have > some intermitent failures that happen that we can tell? > are unrelated to the patch like the curernt issue with volumne > attach/detach that result in kernel panics in the guest. > if that is the only failure and all other test passed in gate i think > it woudl be reasonable for a the neutron team to approve a neutron patch > that modifies security groups for example. its very clearly an > unrealted failure. As noted above, it would also be reasonable to stop running tests that cannot function. We do need to be careful that we don't remove tests and never fix the underlying issues though. We should also remember that if we have these problems in CI there is a high chance that our users will have these problems in production later (we've helped more than one of the infra donor clouds identify bugs straight out of elastic-recheck information in the past so this does happen). > > that might be an alternivie to the recheck we have now and by resreving > that for the core team it limits the scope for abusing this. 
> > i do think that the orginal goes of green check are good so really i > would be suggesting this as an option for when check passed and we get > an intermient > failure in gate that we woudl override. > > this would not adress the issue in check but it would make itermitent > failure in gate much less painful. > > I tried to make this point in my previous email, but I think we are still fumbling around it. If we provide mechanisms to end around flaky CI instead of fixing flaky CI the end result will be flakier CI. I'm not convinced that we'll be happier with any mechanism that doesn't remove the -1 from happening in the first place. Instead the problems will accelerate and eventually we'll be unable to rely on CI for anything useful. From zigo at debian.org Thu Nov 18 15:50:58 2021 From: zigo at debian.org (Thomas Goirand) Date: Thu, 18 Nov 2021 16:50:58 +0100 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> Message-ID: On 11/18/21 2:03 AM, Mohammed Naser wrote: > > > On Wed, Nov 17, 2021 at 5:52 PM Thomas Goirand > wrote: > > On 11/17/21 10:54 PM, Dan Smith wrote: > >> I don't think we rely on /healthcheck -- there's nothing healthy > about > >> an API endpoint blindly returning a 200 OK. > >> > >> You might as well just hit / and accept 300 as a code and that's > >> exactly the same behaviour.? I support what Sean is bringing up here > >> and I don't think it makes sense to have a noop /healthcheck that > >> always gives a 200 OK...seems a bit useless imho > > > > Yup, totally agree. Our previous concerns over a healthcheck that > > checked all of nova returning too much info to be useful (for > something > > trying to figure out if an individual worker is healthy) apply in > > reverse to one that returns too little to be useful. > > > > I agree, what Sean is working on is the right balance and that we > should > > focus on that. > > > > --Dan > > > > That's not the only thing it does. It also is capable of being disabled, > which is useful for maintenance: one can gracefully remove an API node > for removal this way, which one cannot do with the root. > > > I feel like this should be handled by whatever layer that needs to drain > requests for maintenance, otherwise also it might just be the same as > turning off the service, no? It's not the same. If you just turn off the service, there well may be some requests attempted to the API before it's seen as down. The idea here, is to declare the API as down, so that haproxy can remove it from the pool *before* the service is really turned off. That's what the oslo.middleware disable file helps doing, which the root url cannot do. Cheers, Thomas Goirand (zigo) From dms at danplanet.com Thu Nov 18 16:15:06 2021 From: dms at danplanet.com (Dan Smith) Date: Thu, 18 Nov 2021 08:15:06 -0800 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> (Clark Boylan's message of "Thu, 18 Nov 2021 07:46:00 -0800") References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> Message-ID: > We do already give you this option via the removal of tests that are > invalid/flaky/not useful. 
I do worry that if we give a complete end > around the CI system it will be quickly abused. Absolutely agree, humans are not good at making these decisions. Despite "trust" in the core team, and even using a less-loaded word than "abuse," I really don't think that even allowing the option to override flaky tests by force merge is the right solution (at all). > I tried to make this point in my previous email, but I think we are > still fumbling around it. If we provide mechanisms to end around flaky > CI instead of fixing flaky CI the end result will be flakier CI. I'm > not convinced that we'll be happier with any mechanism that doesn't > remove the -1 from happening in the first place. Instead the problems > will accelerate and eventually we'll be unable to rely on CI for > anything useful. Agreed. Either the tests are useful or they aren't. Even if they're not very reliable, they might be useful in causing pain because they continue to highlight flaky behavior until it gets fixed. --Dan From fungi at yuggoth.org Thu Nov 18 16:24:31 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 18 Nov 2021 16:24:31 +0000 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> Message-ID: <20211118162207.nu4ktuikasoxyaq2@yuggoth.org> On 2021-11-18 08:15:06 -0800 (-0800), Dan Smith wrote: [...] > Absolutely agree, humans are not good at making these decisions. > Despite "trust" in the core team, and even using a less-loaded > word than "abuse," I really don't think that even allowing the > option to override flaky tests by force merge is the right > solution (at all). [...] Just about any time we Gerrit admins have decided to bypass testing to merge some change (and to be clear, we really don't like to if we can avoid it), we introduce a new test-breaking bug we then need to troubleshoot and fix. It's a humbling reminder that even though you may feel absolutely sure something's safe to merge without passing tests, you're probably wrong. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Thu Nov 18 16:45:17 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 18 Nov 2021 10:45:17 -0600 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: <1RXR2R.FUJGJYZK1M7N3@est.tech> References: <1RXR2R.FUJGJYZK1M7N3@est.tech> Message-ID: <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> ---- On Thu, 18 Nov 2021 09:30:37 -0600 Balazs Gibizer wrote ---- > Hi, > > The centos 8 stream job is failing in 100% of the cases with mirror > issues [2]. You probably need to hold you recheck until it is resolved. did we make it voting too early? In devstack centos8-stream is still non-voting [1] which I think we were waiting for its stability first (or may be yoctozepto knows). 
[1] https://github.com/openstack/devstack/blob/487057de80df936f96f0b7364f4abfc8a7561d55/.zuul.yaml#L620 -gmann > > cheers, > gibi > > [1] > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream > [2] > https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 > > > > From balazs.gibizer at est.tech Thu Nov 18 16:47:08 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 18 Nov 2021 17:47:08 +0100 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> References: <1RXR2R.FUJGJYZK1M7N3@est.tech> <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> Message-ID: On Thu, Nov 18 2021 at 10:45:17 AM -0600, Ghanshyam Mann wrote: > ---- On Thu, 18 Nov 2021 09:30:37 -0600 Balazs Gibizer > wrote ---- > > Hi, > > > > The centos 8 stream job is failing in 100% of the cases with mirror > > issues [2]. You probably need to hold you recheck until it is > resolved. > > did we make it voting too early? In devstack centos8-stream is still > non-voting [1] which I think we were waiting for its stability first > (or > may be yoctozepto knows). OK I made a mistake it is only voting in nova as far as I see[1]. But there was now a green run[1] so maybe the problem is resolved. cheers, gibi [1] https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream&voting=1 > > [1] > https://github.com/openstack/devstack/blob/487057de80df936f96f0b7364f4abfc8a7561d55/.zuul.yaml#L620 > > -gmann > > > > cheers, > > gibi > > > > [1] > > > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream > > [2] > > > https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 > > > > > > > > From gmann at ghanshyammann.com Thu Nov 18 16:47:21 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 18 Nov 2021 10:47:21 -0600 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: <20211118162207.nu4ktuikasoxyaq2@yuggoth.org> References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com> <4EVR2R.KF069X977ZIK2@est.tech> <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com> <20211118162207.nu4ktuikasoxyaq2@yuggoth.org> Message-ID: <17d33f20e01.11ee7cdc8911479.3734135720231660826@ghanshyammann.com> ---- On Thu, 18 Nov 2021 10:24:31 -0600 Jeremy Stanley wrote ---- > On 2021-11-18 08:15:06 -0800 (-0800), Dan Smith wrote: > [...] > > Absolutely agree, humans are not good at making these decisions. > > Despite "trust" in the core team, and even using a less-loaded > > word than "abuse," I really don't think that even allowing the > > option to override flaky tests by force merge is the right > > solution (at all). > [...] > > Just about any time we Gerrit admins have decided to bypass testing > to merge some change (and to be clear, we really don't like to if we > can avoid it), we introduce a new test-breaking bug we then need to > troubleshoot and fix. It's a humbling reminder that even though you > may feel absolutely sure something's safe to merge without passing > tests, you're probably wrong. Indeed. I too agree here and it can lead to the situation that 'hey my patch was all good can you just +W this' which can end up more unstable tests/code. 
-gmann > -- > Jeremy Stanley > From gmann at ghanshyammann.com Thu Nov 18 16:49:05 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 18 Nov 2021 10:49:05 -0600 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: References: <1RXR2R.FUJGJYZK1M7N3@est.tech> <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> Message-ID: <17d33f3a647.c62909f7911582.7656301661228191981@ghanshyammann.com> ---- On Thu, 18 Nov 2021 10:47:08 -0600 Balazs Gibizer wrote ---- > > > On Thu, Nov 18 2021 at 10:45:17 AM -0600, Ghanshyam Mann > wrote: > > ---- On Thu, 18 Nov 2021 09:30:37 -0600 Balazs Gibizer > > wrote ---- > > > Hi, > > > > > > The centos 8 stream job is failing in 100% of the cases with mirror > > > issues [2]. You probably need to hold you recheck until it is > > resolved. > > > > did we make it voting too early? In devstack centos8-stream is still > > non-voting [1] which I think we were waiting for its stability first > > (or > > may be yoctozepto knows). > > OK I made a mistake it is only voting in nova as far as I see[1]. But > there was now a green run[1] so maybe the problem is resolved. +1, as in yoga testing runtime we are moving to centos9-stream. let me try the devstack job also voting so that any concrete job like ' tempest-integrated-compute-centos-8-stream' can be consider as voting and we capture the issue in devstack first before it break on project side. -gmann > > cheers, > gibi > > [1] > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream&voting=1 > > > > > [1] > > https://github.com/openstack/devstack/blob/487057de80df936f96f0b7364f4abfc8a7561d55/.zuul.yaml#L620 > > > > -gmann > > > > > > cheers, > > > gibi > > > > > > [1] > > > > > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream > > > [2] > > > > > https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 > > > > > > > > > > > > > > > From balazs.gibizer at est.tech Thu Nov 18 18:39:14 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 18 Nov 2021 19:39:14 +0100 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: <17d33f3a647.c62909f7911582.7656301661228191981@ghanshyammann.com> References: <1RXR2R.FUJGJYZK1M7N3@est.tech> <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> <17d33f3a647.c62909f7911582.7656301661228191981@ghanshyammann.com> Message-ID: Since then we had two green runs so I think the mirror issue has been resolved. cheers, gibi On Thu, Nov 18 2021 at 10:49:05 AM -0600, Ghanshyam Mann wrote: > ---- On Thu, 18 Nov 2021 10:47:08 -0600 Balazs Gibizer > wrote ---- > > > > > > On Thu, Nov 18 2021 at 10:45:17 AM -0600, Ghanshyam Mann > > wrote: > > > ---- On Thu, 18 Nov 2021 09:30:37 -0600 Balazs Gibizer > > > wrote ---- > > > > Hi, > > > > > > > > The centos 8 stream job is failing in 100% of the cases with > mirror > > > > issues [2]. You probably need to hold you recheck until it is > > > resolved. > > > > > > did we make it voting too early? In devstack centos8-stream is > still > > > non-voting [1] which I think we were waiting for its stability > first > > > (or > > > may be yoctozepto knows). > > > > OK I made a mistake it is only voting in nova as far as I see[1]. > But > > there was now a green run[1] so maybe the problem is resolved. > > +1, as in yoga testing runtime we are moving to centos9-stream. 
let me > try the devstack job also voting so that any concrete job like > ' tempest-integrated-compute-centos-8-stream' can be consider as > voting > and we capture the issue in devstack first before it break on project > side. > > -gmann > > > > > cheers, > > gibi > > > > [1] > > > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream&voting=1 > > > > > > > > [1] > > > > https://github.com/openstack/devstack/blob/487057de80df936f96f0b7364f4abfc8a7561d55/.zuul.yaml#L620 > > > > > > -gmann > > > > > > > > cheers, > > > > gibi > > > > > > > > [1] > > > > > > > > https://zuul.opendev.org/t/openstack/builds?job_name=tempest-integrated-compute-centos-8-stream > > > > [2] > > > > > > > > https://zuul.opendev.org/t/openstack/build/67dc088859d1435c9a5097e7469e5c12/log/job-output.txt#581-591 > > > > > > > > > > > > > > > > > > > > > > From fungi at yuggoth.org Thu Nov 18 18:45:53 2021 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 18 Nov 2021 18:45:53 +0000 Subject: [all] tempest-integrated-compute-centos-8-stream job is failing In-Reply-To: References: <1RXR2R.FUJGJYZK1M7N3@est.tech> <17d33f029a2.10ffde70c911325.6334468229609379685@ghanshyammann.com> <17d33f3a647.c62909f7911582.7656301661228191981@ghanshyammann.com> Message-ID: <20211118184553.ha6eqxq744vsc6mo@yuggoth.org> On 2021-11-18 19:39:14 +0100 (+0100), Balazs Gibizer wrote: > Since then we had two green runs so I think the mirror issue has been > resolved. [...] Yes, apparently mirror.centos.org was broken. We reported the problem to Red Hat staff, and it was shortly addressed. Our next mirror sync after that solved it across our CI mirror network and CentOS Stream 8 based jobs stopped failing. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From franck.vedel at univ-grenoble-alpes.fr Thu Nov 18 18:58:07 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Thu, 18 Nov 2021 19:58:07 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: <63143C21-F494-4544-86F3-22792BF062E6@univ-grenoble-alpes.fr> ok ... I got it ... and I think I was doing things wrong. Okay, so I have another question. My cinder storage is on an iscsi bay. I have 3 servers, S1, S2, S2. Compute is on S1, S2, S3. Controller is on S1 and S2. Storage is on S3. I have Glance on S1. Building an instance from an image is too long, so you have to make a volume first. If I put the images on the iSCSI bay, I mount a directory in the S1 file system, will the images build faster? Much faster ? Is this a good idea or not? Thank you again for your help and your experience Franck > Le 18 nov. 2021 ? 07:23, Ignazio Cassano a ?crit : > > Hello, i solved using the following variabile in globals.yml: > glance_file_datadir_volume=somedir > and glance_backend_file="yes' > > So if the somedir is a nfs mount point, controllers can share images. Remember you have to deploy glance on all controllers. > Ignazio > > Il Mer 17 Nov 2021, 23:17 Franck VEDEL > ha scritto: > Hello and thank you for the help. > I was able to move forward on my problem, without finding a satisfactory solution. 
> Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. > so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. > I don't understand what happened. There is something wrong. > > Is it normal that after updating the certificates, all instances are turned off? > thanks again > > Franck > >> Le 17 nov. 2021 ? 21:11, Cyril Roelandt > a ?crit : >> >> Hello, >> >> >> On 2021-11-17 08:59, Franck VEDEL wrote: >>> Hello everyone >>> >>> I have a strange problem and I haven't found the solution yet. >>> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. >>> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. >>> >>> I am trying to create a new instance to check general operation. ERROR. >>> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). >> >> We'd like to see the logs as well, especially the stacktrace. >> >>> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). >>> >>> I create an empty volume: it works. >>> I am creating a volume from an image: Failed. >> >> What commands are you running? What's the output? What's in the logs? >> >>> >>> However, I have my list of ten images in glance. >>> >>> I create a new image and create a volume with this new image: it works. >>> I create an instance with this new image: OK. >>> >>> What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. >>> Is there a way to fix this, or do we have to reinstall them all? >> >> What's your configuration? What version of OpenStack are you running? >> >> >> >> Cyril >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Thu Nov 18 19:19:51 2021 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 18 Nov 2021 20:19:51 +0100 Subject: [neutron] Drivers meeting - Friday 18.11.2021 - cancelled Message-ID: Hi Neutron Drivers! Due to the lack of agenda, let's cancel tomorrow's drivers meeting. See You on the meeting next week. Lajos Katona (lajoskatona) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Nov 18 19:34:39 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 18 Nov 2021 20:34:39 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: <63143C21-F494-4544-86F3-22792BF062E6@univ-grenoble-alpes.fr> References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> <63143C21-F494-4544-86F3-22792BF062E6@univ-grenoble-alpes.fr> Message-ID: I did not understand very well how your infrastructure is done. 
Generally speaking, I prefer to have 3 controllers , n computer nodes and external storage. I think using iscsi images must be downloaded and converted from qcow2 to raw format and it can takes a long time. In this case I used image cache. Probably when you create a volume from image you can see a download phase. If you use image cache the download is executed only the first time a volume from that image is created. Sorry for my bad english. Take a look at https://docs.openstack.org/cinder/latest/admin/blockstorage-image-volume-cache.html#:~:text=Image%2DVolume%20cache%C2%B6,end%20can%20clone%20a%20volume . Ignazio Il Gio 18 Nov 2021, 19:58 Franck VEDEL ha scritto: > ok ... I got it ... and I think I was doing things wrong. > Okay, so I have another question. > My cinder storage is on an iscsi bay. > I have 3 servers, S1, S2, S2. > Compute is on S1, S2, S3. > Controller is on S1 and S2. > Storage is on S3. > I have Glance on S1. Building an instance from an image is too long, so > you have to make a volume first. > If I put the images on the iSCSI bay, I mount a directory in the S1 file > system, will the images build faster? Much faster ? > Is this a good idea or not? > > Thank you again for your help and your experience > > > Franck > > Le 18 nov. 2021 ? 07:23, Ignazio Cassano a > ?crit : > > Hello, i solved using the following variabile in globals.yml: > glance_file_datadir_volume=somedir > and glance_backend_file="yes' > > So if the somedir is a nfs mount point, controllers can share images. > Remember you have to deploy glance on all controllers. > Ignazio > > Il Mer 17 Nov 2021, 23:17 Franck VEDEL < > franck.vedel at univ-grenoble-alpes.fr> ha scritto: > >> Hello and thank you for the help. >> I was able to move forward on my problem, without finding a satisfactory >> solution. >> Normally, I have 2 servers with the role [glance] but I noticed that all >> my images were on the first server (in / var / lib / docker / volumes / >> glance / _data / images) before the reconfigure, none on the second. But >> since the reconfiguration, the images are placed on the second, and no >> longer on the first. I do not understand why. I haven't changed anything to >> the multinode file. >> so, to get out of this situation quickly as I need this openstack for the >> students, I modified the multinode file and put only one server in [glance] >> (I put server 1, the one that had the images before reconfigure), I did a >> reconfigure -t glance and now I have my images usable for instances. >> I don't understand what happened. There is something wrong. >> >> Is it normal that after updating the certificates, all instances are >> turned off? >> thanks again >> >> Franck >> >> Le 17 nov. 2021 ? 21:11, Cyril Roelandt a ?crit : >> >> Hello, >> >> >> On 2021-11-17 08:59, Franck VEDEL wrote: >> >> Hello everyone >> >> I have a strange problem and I haven't found the solution yet. >> Following a certificate update I had to do a "kolla-ansible -t multinode >> reconfigure ?. >> Well, after several attempts (it is not easy to use certificates with >> Kolla-ansible, and from my advice, not documented enough for beginners), I >> have my new functional certificates. Perfect ... well almost. >> >> I am trying to create a new instance to check general operation. ERROR. >> Okay, I look in the logs and I see that Cinder is having problems >> creating volumes with an error that I never had ("TypeError: 'NoneType' >> object is not iterable). >> >> >> We'd like to see the logs as well, especially the stacktrace. 
>> >> I dig and then I wonder if it is not the Glance images which cannot be
>> used, while they are present (openstack image list is OK).
>>
>> I create an empty volume: it works.
>> I am creating a volume from an image: Failed.
>>
>> What commands are you running? What's the output? What's in the logs?
>>
>>
>> However, I have my list of ten images in glance.
>>
>> I create a new image and create a volume with this new image: it works.
>> I create an instance with this new image: OK.
>>
>> What is the problem ? The images present before the "reconfigure" are
>> listed, visible in horizon for example, but unusable.
>> Is there a way to fix this, or do we have to reinstall them all?
>>
>>
>> What's your configuration? What version of OpenStack are you running?
>>
>>
>>
>> Cyril
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From smooney at redhat.com  Thu Nov 18 20:57:55 2021
From: smooney at redhat.com (Sean Mooney)
Date: Thu, 18 Nov 2021 20:57:55 +0000
Subject: [neutron][CI] How to reduce number of rechecks - brainstorming
In-Reply-To: <20211118162207.nu4ktuikasoxyaq2@yuggoth.org>
References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech>
 <3e835881-29d8-4e33-9ef1-36171f9dd0a3@www.fastmail.com>
 <4EVR2R.KF069X977ZIK2@est.tech>
 <0d9a0bda-dee4-4d8c-8f82-1503a6b4cab3@www.fastmail.com>
 <20211118162207.nu4ktuikasoxyaq2@yuggoth.org>
Message-ID:

On Thu, 2021-11-18 at 16:24 +0000, Jeremy Stanley wrote:
> On 2021-11-18 08:15:06 -0800 (-0800), Dan Smith wrote:
> [...]
> > Absolutely agree, humans are not good at making these decisions.
> > Despite "trust" in the core team, and even using a less-loaded
> > word than "abuse," I really don't think that even allowing the
> > option to override flaky tests by force merge is the right
> > solution (at all).
> [...]
>
> Just about any time we Gerrit admins have decided to bypass testing
> to merge some change (and to be clear, we really don't like to if we
> can avoid it), we introduce a new test-breaking bug we then need to
> troubleshoot and fix. It's a humbling reminder that even though you
> may feel absolutely sure something's safe to merge without passing
> tests, you're probably wrong.

Well, the example I gave is a failure in the interaction between nova and cinder failing in the neutron gate. There is no way the neutron patch under review could cause that failure to happen, and I chose a specific example of the intermittent failures we have with the compute volume detach tests, where it looks like the bug is actually the tempest test.
https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_67c/810915/2/gate/nova-live-migration/67c89da/testr_results.html
It appears that, for some reason, attaching a cinder volume and live migrating the VM while the kernel/OS in the VM is still booting up can result in a kernel panic.
This has been an ongoing battle to solve for many weeks. There is no way that a change in a neutron, glance or keystone patch could have caused the guest kernel to crash.
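As an aside, the general mitigation is to not touch the guest until it is actually reachable. In tempest terms that roughly means enabling SSH validation, e.g. something along these lines in tempest.conf (illustrative only; exact options depend on your tempest version and job setup):

    [validation]
    run_validation = True
    connect_method = floating

With validation enabled, the reworked live migration tests can check that the guest is up over SSH before the volume attach and live migration steps run.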
https://bugs.launchpad.net/nova/+bug/1950310 and https://bugs.launchpad.net/nova/+bug/1939108 are two of the related bugs.
If they are running tempest.api.compute.admin.test_live_migration.LiveMigrationTest* in any of their jobs, however, they could have been impacted by this.
Lee Yarwood has started implementing a very old tempest spec
https://specs.openstack.org/openstack/qa-specs/specs/tempest/implemented/ssh-auth-strategy.html
for this, and we think that will fix the test failure: https://review.opendev.org/c/openstack/tempest/+/817772/2
I suspect we have many other cases in tempest today where we have intermittent failures caused by the guest OS not being ready before we do operations on the guest, beyond the current volume attach/detach issues.

I did not suggest allowing the CI to be overridden because I think that is generally a good idea (it's not), but sometimes there are failures that we are actively trying to fix but have not found a solution for, for months.
I'm pretty sure this live migration test prevented patches to the ironic virt driver from landing not so long ago, requiring several retries. The ironic virt driver obviously does not support live migration and the change was not touching any other part of nova, so the failure was unrelated.
https://review.opendev.org/c/openstack/nova/+/799327 is the change I was thinking of; the master version needed 3 rechecks and the backport https://review.opendev.org/c/openstack/nova/+/799772 needed 6 more.
That may have actually been caused by https://bugs.launchpad.net/nova/+bug/1931702, which is another bug for a similar kernel panic, but I would not be surprised if it was actually the same root cause.

I think that point was lost in my original message. The point I was trying to make is that sometimes the failure is not about the code under review; it is because the test is wrong. We should fix the test, but it can be very frustrating if you recheck something 3-4 times, where it passes in check and fails in gate, for something you know is unrelated, yet you don't want to disable the test because you don't want to lose coverage for something that only fails a small fraction of the time.

regards
sean.

From rlandy at redhat.com  Thu Nov 18 23:52:46 2021
From: rlandy at redhat.com (Ronelle Landy)
Date: Thu, 18 Nov 2021 18:52:46 -0500
Subject: [neutron][CI] How to reduce number of rechecks - brainstorming
In-Reply-To: <3MOP2R.O83SZVO0NWN23@est.tech>
References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech>
Message-ID:

On Wed, Nov 17, 2021 at 5:22 AM Balazs Gibizer wrote:

>
>
> On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski
> wrote:
> > Hi,
> >
> > Recently I spent some time to check how many rechecks we need in
> > Neutron to
> > get patch merged and I compared it to some other OpenStack projects
> > (see [1]
> > for details).
> > TL;DR - results aren't good for us and I think we really need to do
> > something
> > with that.
>
> I really like the idea of collecting such stats. Thank you for doing
> it. I can even imagine to make a public dashboard somewhere with this
> information as it is a good indication about the health of our projects
> / testing.
>
> >
> > Of course "easiest" thing to say is that we should fix issues which
> > we are
> > hitting in the CI to make jobs more stable. But it's not that easy.
> > We are
> > struggling with those jobs for very long time. We have CI related
> > meeting
> > every week and we are fixing what we can there.
> > Unfortunately there is still bunch of issues which we can't fix so > > far because > > they are intermittent and hard to reproduce locally or in some cases > > the > > issues aren't realy related to the Neutron or there are new bugs > > which we need > > to investigate and fix :) > > > I have couple of suggestion based on my experience working with CI in > nova. > We've struggled with unstable tests in TripleO as well. Here are some things we tried and implemented: 1. Created job dependencies so we only ran check tests once we knew we had the resources we needed (example we had pulled containers successfully) 2. Moved some testing to third party where we have easier control of the environment (note that third party cannot stop a change merging) 3. Used dependency pipelines to pre-qualify some dependencies ahead of letting them run wild on our check jobs 4. Requested testproject runs of changes in a less busy environment before running a full set of tests in a public zuul 5. Used a skiplist to keep track of tech debt and skip known failures that we could temporarily ignore to keep CI moving along if we're waiting on an external fix. > > 1) we try to open bug reports for intermittent gate failures too and > keep them tagged in a list [1] so when a job fail it is easy to check > if the bug is known. > > 2) I offer my help here now that if you see something in neutron runs > that feels non neutron specific then ping me with it. Maybe we are > struggling with the same problem too. > > 3) there was informal discussion before about a possibility to re-run > only some jobs with a recheck instead for re-running the whole set. I > don't know if this is feasible with Zuul and I think this only treat > the symptom not the root case. But still this could be a direction if > all else fails. > > Cheers, > gibi > > > So this is never ending battle for us. The problem is that we have > > to test > > various backends, drivers, etc. so as a result we have many jobs > > running on > > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs > > in check > > and 14 jobs in gate queue. > > > > In the past we made a lot of improvements, like e.g. we improved > > irrelevant > > files lists for jobs to run less jobs on some of the patches, > > together with QA > > team we did "integrated-networking" template to run only Neutron and > > Nova > > related scenario tests in the Neutron queues, we removed and > > consolidated some > > of the jobs (there is still one patch in progress for that but it > > should just > > remove around 2 jobs from the check queue). All of that are good > > improvements > > but still not enough to make our CI really stable :/ > > > > Because of all of that, I would like to ask community about any other > > ideas > > how we can improve that. If You have any ideas, please send it in > > this email > > thread or reach out to me directly on irc. > > We want to discuss about them in the next video CI meeting which will > > be on > > November 30th. If You would have any idea and would like to join that > > discussion, You are more than welcome in that meeting of course :) > > > > [1] > > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ > > 025759.html > > > [1] > > https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_last_updated&start=0 > > > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gregory.orange at pawsey.org.au Fri Nov 19 00:33:01 2021 From: gregory.orange at pawsey.org.au (Gregory Orange) Date: Fri, 19 Nov 2021 08:33:01 +0800 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Message-ID: <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au> On 18/11/21 8:33 pm, Micha? Nasiadka wrote: > 1) Deprecate and drop binary type of Kolla images My globals.yaml has #kolla_install_type: "binary" So will this mean it needs to switch to 'source' from Yoga onward, and the containers will have to be built some how? Can you point me to something to read on that? > 2) Use a common base (single Linux distribution) for Kolla images I'm not experienced enough with Kolla to know whether this will affect me, so I will roll with it and figure it out as we go. Thanks, Greg. From franck.vedel at univ-grenoble-alpes.fr Fri Nov 19 07:41:21 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Fri, 19 Nov 2021 08:41:21 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> <63143C21-F494-4544-86F3-22792BF062E6@univ-grenoble-alpes.fr> Message-ID: hello ignacio and thank you for all this information. I also think that a structure with 3 servers may not be built properly, once again, arriving on such a project, without help (human help, because we find documents, documentations to be taken in order, with many different directions, choose the right OS, don't run into a bug (vpnaas for me), do tests, etc.). You have to make choices in order to move forward. I agree that I probably didn't do things the best way. And I regret it. Thank you for this help on how the images work. Yes, in my case the images can be used after the "download" because they are in Qcow2. I will change this, I did not understand it. It is clear that if a professional came to see my Openstack, they would tell me what is wrong, what I need to change, but hey, in the end, it still works a bit. Thanks Ingnacio, really. Franck > Le 18 nov. 2021 ? 20:34, Ignazio Cassano a ?crit : > > I did not understand very well how your infrastructure is done. > Generally speaking, I prefer to have 3 controllers , n computer nodes and external storage. > I think using iscsi images must be downloaded and converted from qcow2 to raw format and it can takes a long time. In this case I used image cache. Probably when you create a volume from image you can see a download phase. If you use image cache the download is executed only the first time a volume from that image is created. > Sorry for my bad english. > Take a look at > https://docs.openstack.org/cinder/latest/admin/blockstorage-image-volume-cache.html#:~:text=Image%2DVolume%20cache%C2%B6,end%20can%20clone%20a%20volume . > Ignazio > > > Il Gio 18 Nov 2021, 19:58 Franck VEDEL > ha scritto: > ok ... I got it ... and I think I was doing things wrong. > Okay, so I have another question. > My cinder storage is on an iscsi bay. > I have 3 servers, S1, S2, S2. > Compute is on S1, S2, S3. > Controller is on S1 and S2. > Storage is on S3. > I have Glance on S1. Building an instance from an image is too long, so you have to make a volume first. > If I put the images on the iSCSI bay, I mount a directory in the S1 file system, will the images build faster? Much faster ? > Is this a good idea or not? 
> > Thank you again for your help and your experience > > > Franck > >> Le 18 nov. 2021 ? 07:23, Ignazio Cassano > a ?crit : >> >> Hello, i solved using the following variabile in globals.yml: >> glance_file_datadir_volume=somedir >> and glance_backend_file="yes' >> >> So if the somedir is a nfs mount point, controllers can share images. Remember you have to deploy glance on all controllers. >> Ignazio >> >> Il Mer 17 Nov 2021, 23:17 Franck VEDEL > ha scritto: >> Hello and thank you for the help. >> I was able to move forward on my problem, without finding a satisfactory solution. >> Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. >> so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. >> I don't understand what happened. There is something wrong. >> >> Is it normal that after updating the certificates, all instances are turned off? >> thanks again >> >> Franck >> >>> Le 17 nov. 2021 ? 21:11, Cyril Roelandt > a ?crit : >>> >>> Hello, >>> >>> >>> On 2021-11-17 08:59, Franck VEDEL wrote: >>>> Hello everyone >>>> >>>> I have a strange problem and I haven't found the solution yet. >>>> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. >>>> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. >>>> >>>> I am trying to create a new instance to check general operation. ERROR. >>>> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). >>> >>> We'd like to see the logs as well, especially the stacktrace. >>> >>>> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). >>>> >>>> I create an empty volume: it works. >>>> I am creating a volume from an image: Failed. >>> >>> What commands are you running? What's the output? What's in the logs? >>> >>>> >>>> However, I have my list of ten images in glance. >>>> >>>> I create a new image and create a volume with this new image: it works. >>>> I create an instance with this new image: OK. >>>> >>>> What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. >>>> Is there a way to fix this, or do we have to reinstall them all? >>> >>> What's your configuration? What version of OpenStack are you running? >>> >>> >>> >>> Cyril >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mnasiadka at gmail.com Fri Nov 19 08:18:31 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Fri, 19 Nov 2021 09:18:31 +0100 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Message-ID: <756EB466-CBAF-43C6-8360-4B5B0F0FE873@gmail.com> Hi Sean, > On 18 Nov 2021, at 16:04, Sean Mooney wrote: > > On Thu, 2021-11-18 at 13:33 +0100, Micha? Nasiadka wrote: >> Hello Koalas, >> >> On the PTG we have discussed two topics: >> >> 1) Deprecate and drop binary type of Kolla images >> 2) Use a common base (single Linux distribution) for Kolla images >> >> This is a call for feedback - for people that have not been attending the PTG. >> >> What this essentially mean for consumers: >> >> 1) In Yoga cycle we will deprecate binary type of Kolla images, and in Z cycle those will be dropped. >> 2) We are not going to support CentOS Stream 9 (cs9) as a base operating system, and the source type build will rely on CentOS Stream 8 in Z release. >> 3) Beginning from A release Kolla will build only Debian source images - but Kolla-Ansible will still support deployment of those images on CentOS/Ubuntu/Debian Host operating systems (and Rocky Linux to be added in Yoga to that mix). >> >> Justification: >> The Kolla project team is limited in numbers, therefore supporting current broad mix of operating systems (especially with CentOS Stream 9 ,,on the way??) is a significant maintenance burden. >> By dropping binary type of images - users would be running more tested images (since Kolla/Kolla-Ansible CI runs source images jobs as voting). >> In Xena we?ve already changed the default image type Kolla-Ansible uses to source. >> We also feel that using a unified base OS for Kolla container images is a way to remove some of the maintenance burden (including CI cycles and >> >> Request for feedback: >> If any of those changes is a no go from your perspective - we?d like to hear your opinions. > > i only have reason to use kolla-ansible on small personal clouds at home so either way this will have limited direct affect on me but just wanted to give some toughts: > kolla has been my favorit way to deploy openstack for a very long time, one of the reason i liked it was the fact that it was distro independent, simple ot use/configure and it support source and > binary installs. i have almost always defautl to ubuntu source although in the rare case si used centos i always used centos binary. > > i almost never used binary installs on debian based disto and on rpm based distos never really used soruce, im not sure if im the only one that did that. > its not because i trusted the rpm package more by the way it just seam to be waht was tested more when i cahged to kolla devs on irc so i avoid ubuntu bindary and centos source as a result. > > with that in mind debian source is not really contoversial to me, i had one quetion on that however. > will the support for the other distos be kept in the kolla images but not extended or will it be dropped. i asume the plan is to remvoe the templating for other distos in A based on point 3 above. > Yes, currently that?s the plan - to remove templating for other distros and entries for them in kolla-build code. > the only other thing i wanted to point out is that while i have had some succes gettign ubuntu soruce image to run on centos host in the past it will be tricky if kolla every want to supprot > selinx/apparmor. 
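(As a rough illustration of what that leaves: building a couple of source-type images would be something along the lines of

    kolla-build --base debian --type source keystone nova

Exact flags and supported base values depend on the Kolla release you are on, so treat this only as a sketch of the remaining build path.)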
that was the main barrier i faced but there can be other. speficialy ovs and libvirt can be somewhat picky about the kernel on which they run. most of the openstack service will > likely not care that the contaienr os does not match the host but some of the "system" depenciy like libvirt/ovs might. a way to address taht would be to supprot using external images for those servie > form dockerhub/quay e.g. use the offical upstream mariadb image or libvirt or rabbit if that is infact a porblem. > We?ve been discussing about moving OVS and Libvirt deployment to be deployed on OS level (not in containers) - if that will be required. > anyway its totally understandable that if you do not have contirbutor that are able to support the other distors that you would remove the supprot. > espicaly with the move away form using kolla image in ooo of late and presumable a reduction in redhat contibutions to keep centos supprot alive. > > anyway the chagne while sad to see just form a proejct health point of vew are sad would not be enough to prevent me personally form using or recommendign kolla and kolla-ansibel. > if you had prorpsed the opistte of centos binary only that would be much more concerning to me. there are some advantages to source installs that you are preservging in this change > that i am glad will not be lost. > That?s good to see. > one last tought is that if only one disto will be supproted for building images in the future with only soruce installs supproted, it might be worth considering if alpine or the alpine/debian-lite > python 3 iamge should be revaluated as the base for an even lighter set of contianers rather then the base os image. i am kind fo assumeing that at some poitn the non python noopenstack contaienr > would be used form the offial images at some point rahter then kolla contining to maintain them with the above suggestion. So this may not be applicable now and really would likely be an A or post A > thing. > That was in the PTG notes, but I don?t think we have considered a separate base OS for python-based Kolla images (and a separate for those that need it or use official Docker Hub/Quay.io images for those services). Thanks for your input! > >> >> Best regards, >> Michal Nasiadka -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnasiadka at gmail.com Fri Nov 19 08:19:33 2021 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Fri, 19 Nov 2021 09:19:33 +0100 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au> References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au> Message-ID: Hi Gregory, > On 19 Nov 2021, at 01:33, Gregory Orange wrote: > > On 18/11/21 8:33 pm, Micha? Nasiadka wrote: >> 1) Deprecate and drop binary type of Kolla images > > My globals.yaml has > #kolla_install_type: "binary" > > So will this mean it needs to switch to 'source' from Yoga onward, and the containers will have to be built some how? Can you point me to something to read on that? > If that?s commented out, that means with Xena you will get source images deployed (during an upgrade or on a fresh install). >> 2) Use a common base (single Linux distribution) for Kolla images > > I'm not experienced enough with Kolla to know whether this will affect me, so I will roll with it and figure it out as we go. > > Thanks, > Greg. 
Thanks, Michal From ignaziocassano at gmail.com Fri Nov 19 09:50:18 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 19 Nov 2021 10:50:18 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: Hello Franck, glance is not deployed on all nodes at default. I got the same problem In my case I have 3 controllers. I created an nfs share on a storage server where to store images. Before deploying glance, I create /var/lib/glance/images on the 3 controllers and I mount the nfs share. This is my fstab on the 3 controllers: 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime In my globals.yml I have: glance_file_datadir_volume: "/var/lib/glance" glance_backend_file: "yes" This means images are on /var/lib/glance and since it is a nfs share all my 3 controlles can share images. Then you must deploy. To be sure the glance container is started on all controllers, since I have 3 controllers, I deployed 3 times changing the order in the inventory. First time: [control] A B C Second time: [control] B C A Third time: [control] C B A Or you can deploy glance 3 times using -t glance and -l As far as the instance stopped, I got I bug with a version of kolla. https://bugs.launchpad.net/kolla-ansible/+bug/1941706 Now is corrected and with kolla 12.2.0 it works. Ignazio Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL < franck.vedel at univ-grenoble-alpes.fr> ha scritto: > Hello and thank you for the help. > I was able to move forward on my problem, without finding a satisfactory > solution. > Normally, I have 2 servers with the role [glance] but I noticed that all > my images were on the first server (in / var / lib / docker / volumes / > glance / _data / images) before the reconfigure, none on the second. But > since the reconfiguration, the images are placed on the second, and no > longer on the first. I do not understand why. I haven't changed anything to > the multinode file. > so, to get out of this situation quickly as I need this openstack for the > students, I modified the multinode file and put only one server in [glance] > (I put server 1, the one that had the images before reconfigure), I did a > reconfigure -t glance and now I have my images usable for instances. > I don't understand what happened. There is something wrong. > > Is it normal that after updating the certificates, all instances are > turned off? > thanks again > > Franck > > Le 17 nov. 2021 ? 21:11, Cyril Roelandt a ?crit : > > Hello, > > > On 2021-11-17 08:59, Franck VEDEL wrote: > > Hello everyone > > I have a strange problem and I haven't found the solution yet. > Following a certificate update I had to do a "kolla-ansible -t multinode > reconfigure ?. > Well, after several attempts (it is not easy to use certificates with > Kolla-ansible, and from my advice, not documented enough for beginners), I > have my new functional certificates. Perfect ... well almost. > > I am trying to create a new instance to check general operation. ERROR. > Okay, I look in the logs and I see that Cinder is having problems creating > volumes with an error that I never had ("TypeError: 'NoneType' object is > not iterable). > > > We'd like to see the logs as well, especially the stacktrace. 
> > I dig and then I wonder if it is not the Glance images which cannot be > used, while they are present (openstack image list is OK). > > I create an empty volume: it works. > I am creating a volume from an image: Failed. > > > What commands are you running? What's the output? What's in the logs? > > > However, I have my list of ten images in glance. > > I create a new image and create a volume with this new image: it works. > I create an instance with this new image: OK. > > What is the problem ? The images present before the "reconfigure" are > listed, visible in horizon for example, but unusable. > Is there a way to fix this, or do we have to reinstall them all? > > > What's your configuration? What version of OpenStack are you running? > > > > Cyril > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Fri Nov 19 11:03:36 2021 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 19 Nov 2021 12:03:36 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: If one sets glance_file_datadir_volume to non-default, then glance-api gets deployed on all hosts. -yoctozepto On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano wrote: > > Hello Franck, glance is not deployed on all nodes at default. > I got the same problem > In my case I have 3 controllers. > I created an nfs share on a storage server where to store images. > Before deploying glance, I create /var/lib/glance/images on the 3 controllers and I mount the nfs share. > This is my fstab on the 3 controllers: > > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime > > In my globals.yml I have: > glance_file_datadir_volume: "/var/lib/glance" > glance_backend_file: "yes" > > This means images are on /var/lib/glance and since it is a nfs share all my 3 controlles can share images. > Then you must deploy. > To be sure the glance container is started on all controllers, since I have 3 controllers, I deployed 3 times changing the order in the inventory. > First time: > [control] > A > B > C > > Second time: > [control] > B > C > A > > Third time: > [control] > C > B > A > > Or you can deploy glance 3 times using -t glance and -l > > As far as the instance stopped, I got I bug with a version of kolla. > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 > Now is corrected and with kolla 12.2.0 it works. > Ignazio > > > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL ha scritto: >> >> Hello and thank you for the help. >> I was able to move forward on my problem, without finding a satisfactory solution. >> Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. >> so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. >> I don't understand what happened. There is something wrong. 
>> >> Is it normal that after updating the certificates, all instances are turned off? >> thanks again >> >> Franck >> >> Le 17 nov. 2021 ? 21:11, Cyril Roelandt a ?crit : >> >> Hello, >> >> >> On 2021-11-17 08:59, Franck VEDEL wrote: >> >> Hello everyone >> >> I have a strange problem and I haven't found the solution yet. >> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. >> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. >> >> I am trying to create a new instance to check general operation. ERROR. >> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). >> >> >> We'd like to see the logs as well, especially the stacktrace. >> >> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). >> >> I create an empty volume: it works. >> I am creating a volume from an image: Failed. >> >> >> What commands are you running? What's the output? What's in the logs? >> >> >> However, I have my list of ten images in glance. >> >> I create a new image and create a volume with this new image: it works. >> I create an instance with this new image: OK. >> >> What is the problem ? The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. >> Is there a way to fix this, or do we have to reinstall them all? >> >> >> What's your configuration? What version of OpenStack are you running? >> >> >> >> Cyril >> >> From gregory.orange at pawsey.org.au Fri Nov 19 11:27:56 2021 From: gregory.orange at pawsey.org.au (Gregory Orange) Date: Fri, 19 Nov 2021 19:27:56 +0800 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au> Message-ID: <402bf683-8c9d-64f9-16c8-aacb4169a7b6@pawsey.org.au> On 19/11/21 4:19 pm, Micha? Nasiadka wrote: >> My globals.yaml has >> #kolla_install_type: "binary" >> >> So will this mean it needs to switch to 'source' from Yoga onward, and the containers will have to be built some how? Can you point me to something to read on that? >> > If that?s commented out, that means with Xena you will get source images deployed (during an upgrade or on a fresh install). So am I going to need to familiarise myself with some build process, such as https://docs.openstack.org/kolla/train/admin/image-building.html ? From swogatpradhan22 at gmail.com Fri Nov 19 11:41:58 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Fri, 19 Nov 2021 17:11:58 +0530 Subject: [Openstack-victoria] [IRONIC] error in network interface | baremetal node validate In-Reply-To: References: Message-ID: can someone please suggest a way forward for this issue?? On Tue, Nov 16, 2021 at 3:43 PM Swogat Pradhan wrote: > Hi, > I am currently trying to setup openstack ironic using driver IPMI. > I followed the official docs of openstack for setting everything up. 
>
> When I run openstack baremetal node validate $NODE_UUID, I am getting the
> following error:
>
> * Unexpected exception, traceback saved into log by ironic conductor
> service that is running on controller: 'ServiceTokenAuthWrapper' object has
> no attribute '_discovery_cache' *
> in the network interface in command output.
>
> When I check the ironic conductor logs I see the following messages:
>
>
>
>
> Can anyone suggest a solution or a way forward.
>
> With regards
> Swogat Pradhan
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From smooney at redhat.com  Fri Nov 19 12:02:01 2021
From: smooney at redhat.com (Sean Mooney)
Date: Fri, 19 Nov 2021 12:02:01 +0000
Subject: [kolla] Plan to deprecate binary and unify on single distrubition
In-Reply-To: <402bf683-8c9d-64f9-16c8-aacb4169a7b6@pawsey.org.au>
References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com>
 <1e3b2745-5b78-fe2b-3113-b3253eb4c0a5@pawsey.org.au>
 <402bf683-8c9d-64f9-16c8-aacb4169a7b6@pawsey.org.au>
Message-ID: <2ca211d9b9429b1358810b671cdc29246b0a7e4d.camel@redhat.com>

On Fri, 2021-11-19 at 19:27 +0800, Gregory Orange wrote:
> On 19/11/21 4:19 pm, Michał Nasiadka wrote:
> > > My globals.yaml has
> > > #kolla_install_type: "binary"
> > >
> > > So will this mean it needs to switch to 'source' from Yoga onward, and the containers will have to be built some how? Can you point me to something to read on that?
> > >
> > If that's commented out, that means with Xena you will get source images deployed (during an upgrade or on a fresh install).
>
> So am I going to need to familiarise myself with some build process, such as https://docs.openstack.org/kolla/train/admin/image-building.html ?

No, the source images are also published to the Docker Hub/quay registries, so if you are not building them today and are pulling them from there it will work the same.
If you are building images today, there is no real difference from a command-line point of view between source and binary.
The main difference is that the source images install all the deps from pip and the services from the official tarballs into a Python virtual env within the container image.
So unless you have explicitly set kolla_install_type: binary, or the distro, you should not need to change anything in your globals.yml, ideally.
>
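For reference, the globals.yml knobs involved are roughly these (illustrative values only, not a recommendation; check the defaults shipped with your release):

    # explicitly select source images and a base distro
    kolla_install_type: "source"
    kolla_base_distro: "ubuntu"

If they are left commented out on Xena or later you already get source images, as noted earlier in the thread.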
> > This is my fstab on the 3 controllers: > > > > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs > rw,user=glance,soft,intr,noatime,nodiratime > > > > In my globals.yml I have: > > glance_file_datadir_volume: "/var/lib/glance" > > glance_backend_file: "yes" > > > > This means images are on /var/lib/glance and since it is a nfs share all > my 3 controlles can share images. > > Then you must deploy. > > To be sure the glance container is started on all controllers, since I > have 3 controllers, I deployed 3 times changing the order in the inventory. > > First time: > > [control] > > A > > B > > C > > > > Second time: > > [control] > > B > > C > > A > > > > Third time: > > [control] > > C > > B > > A > > > > Or you can deploy glance 3 times using -t glance and -l > > > > As far as the instance stopped, I got I bug with a version of kolla. > > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 > > Now is corrected and with kolla 12.2.0 it works. > > Ignazio > > > > > > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL < > franck.vedel at univ-grenoble-alpes.fr> ha scritto: > >> > >> Hello and thank you for the help. > >> I was able to move forward on my problem, without finding a > satisfactory solution. > >> Normally, I have 2 servers with the role [glance] but I noticed that > all my images were on the first server (in / var / lib / docker / volumes / > glance / _data / images) before the reconfigure, none on the second. But > since the reconfiguration, the images are placed on the second, and no > longer on the first. I do not understand why. I haven't changed anything to > the multinode file. > >> so, to get out of this situation quickly as I need this openstack for > the students, I modified the multinode file and put only one server in > [glance] (I put server 1, the one that had the images before reconfigure), > I did a reconfigure -t glance and now I have my images usable for instances. > >> I don't understand what happened. There is something wrong. > >> > >> Is it normal that after updating the certificates, all instances are > turned off? > >> thanks again > >> > >> Franck > >> > >> Le 17 nov. 2021 ? 21:11, Cyril Roelandt a ?crit : > >> > >> Hello, > >> > >> > >> On 2021-11-17 08:59, Franck VEDEL wrote: > >> > >> Hello everyone > >> > >> I have a strange problem and I haven't found the solution yet. > >> Following a certificate update I had to do a "kolla-ansible -t > multinode reconfigure ?. > >> Well, after several attempts (it is not easy to use certificates with > Kolla-ansible, and from my advice, not documented enough for beginners), I > have my new functional certificates. Perfect ... well almost. > >> > >> I am trying to create a new instance to check general operation. ERROR. > >> Okay, I look in the logs and I see that Cinder is having problems > creating volumes with an error that I never had ("TypeError: 'NoneType' > object is not iterable). > >> > >> > >> We'd like to see the logs as well, especially the stacktrace. > >> > >> I dig and then I wonder if it is not the Glance images which cannot be > used, while they are present (openstack image list is OK). > >> > >> I create an empty volume: it works. > >> I am creating a volume from an image: Failed. > >> > >> > >> What commands are you running? What's the output? What's in the logs? > >> > >> > >> However, I have my list of ten images in glance. > >> > >> I create a new image and create a volume with this new image: it works. > >> I create an instance with this new image: OK. 
> >> > >> What is the problem ? The images present before the "reconfigure" are > listed, visible in horizon for example, but unusable. > >> Is there a way to fix this, or do we have to reinstall them all? > >> > >> > >> What's your configuration? What version of OpenStack are you running? > >> > >> > >> > >> Cyril > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Fri Nov 19 15:00:45 2021 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Fri, 19 Nov 2021 16:00:45 +0100 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> Message-ID: On Tue, Nov 16, 2021 at 6:05 PM Clark Boylan wrote: > Hello, > > The OpenStack tenant in Zuul currently has 134 config errors. You can find > these errors at https://zuul.opendev.org/t/openstack/config-errors or by > clicking the blue bell icon in the top right of > https://zuul.opendev.org/t/openstack/status. The vast majority of these > errors appear related to project renames that have been requested of > OpenDev or project retirements. Can you please look into fixing these as > they can be an attractive nuisance when debugging Zuul problems (they also > indicate that a number of your jobs are probably not working). > > Project renames creating issues: > > * openstack/python-tempestconf -> osf/python-tempestconf -> > openinfra/python-tempestconf > * openstack/refstack -> osf/refstack -> openinfra/refstack > * x/tap-as-a-service -> openstack/tap-as-a-service > * openstack/networking-l2gw -> x/networking-l2gw > > Project retirements creating issues: > > * openstack/neutron-lbaas > * recordsansible/ara > > Projects whose configs have errors: > > * openinfra/python-tempestconf > * openstack/heat > * openstack/ironic > Ironic failures are missing jobs on Queens and Rocky. I'd avoid touching these branches unless our problems are breaking someone. Dmitry > * openstack/kolla-ansible > * openstack/kuryr-kubernetes > * openstack/murano-apps > * openstack/networking-midonet > * openstack/networking-odl > * openstack/neutron > * openstack/neutron-fwaas > * openstack/python-troveclient > * openstack/senlin > * openstack/tap-as-a-service > * openstack/zaqar > * x/vmware-nsx > * openinfra/openstackid > * openstack/barbican > * openstack/cookbook-openstack-application-catalog > * openstack/heat-dashboard > * openstack/manila-ui > * openstack/python-manilaclient > > Let us know if we can help decipher any errors, > Clark > > -- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.com Fri Nov 19 15:22:08 2021 From: tobias.urdin at binero.com (Tobias Urdin) Date: Fri, 19 Nov 2021 15:22:08 +0000 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Message-ID: <19643763-8235-45B3-91A8-B8A0839E31F6@binero.com> Interesting. This is probably a little bit off-topic but I find it very interesting that a majority of all the big OpenStack clouds out there is running on containers based and a lot of them using ?LOKI? 
that was talked about so much in the OpenInfra Live Keynotes. What I don?t understand is that, with all these limited resources, there is no joint effort in the OpenStack ecosystem to solve the container deliverables issue and then just have all deployments/tooling use the same. Maybe that what they are doing though, using Kolla images? but then, wouldn?t they contribute more and the below not be a problem *makes me wonder* Sorry for off-topic loud thinking. Best regards Tobias > On 18 Nov 2021, at 13:33, Micha? Nasiadka wrote: > > Hello Koalas, > > On the PTG we have discussed two topics: > > 1) Deprecate and drop binary type of Kolla images > 2) Use a common base (single Linux distribution) for Kolla images > > This is a call for feedback - for people that have not been attending the PTG. > > What this essentially mean for consumers: > > 1) In Yoga cycle we will deprecate binary type of Kolla images, and in Z cycle those will be dropped. > 2) We are not going to support CentOS Stream 9 (cs9) as a base operating system, and the source type build will rely on CentOS Stream 8 in Z release. > 3) Beginning from A release Kolla will build only Debian source images - but Kolla-Ansible will still support deployment of those images on CentOS/Ubuntu/Debian Host operating systems (and Rocky Linux to be added in Yoga to that mix). > > Justification: > The Kolla project team is limited in numbers, therefore supporting current broad mix of operating systems (especially with CentOS Stream 9 ,,on the way??) is a significant maintenance burden. > By dropping binary type of images - users would be running more tested images (since Kolla/Kolla-Ansible CI runs source images jobs as voting). > In Xena we?ve already changed the default image type Kolla-Ansible uses to source. > We also feel that using a unified base OS for Kolla container images is a way to remove some of the maintenance burden (including CI cycles and > > Request for feedback: > If any of those changes is a no go from your perspective - we?d like to hear your opinions. > > Best regards, > Michal Nasiadka From marcin.juszkiewicz at linaro.org Fri Nov 19 15:24:35 2021 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Fri, 19 Nov 2021 16:24:35 +0100 Subject: [kolla] Plan to deprecate binary and unify on single distrubition In-Reply-To: References: <7B0C2125-975E-437C-B332-9E88F725D248@gmail.com> Message-ID: W dniu 18.11.2021 o 16:04, Sean Mooney pisze: > one last tought is that if only one disto will be supproted for > building images in the future with only soruce installs supproted, it > might be worth considering if alpine or the alpine/debian-lite python > 3 iamge should be revaluated as the base for an even lighter set of > contianers rather then the base os image. Using one distribution which covers all images makes life simple in Kolla. Alpine may not cover all our needs as some software still requires things present in glibc, missing in musl libc. And while openstack-base image probably could be built using one of python lite containers there are images derived from it which install additional distro packages. So instead of taking packages from one distribution (Debian) we would take from Debian or Alpine. Which could get out of sync too easily. 
From tobias.urdin at binero.com Fri Nov 19 15:35:55 2021 From: tobias.urdin at binero.com (Tobias Urdin) Date: Fri, 19 Nov 2021 15:35:55 +0000 Subject: [nova][all] Adding /healthcheck support in Nova, and better healthcheck in every projects In-Reply-To: References: <4882cd19-4f6d-c0a7-15cd-a98e73a217ad@debian.org> <290155a7-ad8a-ab60-8cf3-58e2642e8f38@debian.org> Message-ID: <9E3B75D2-EE5C-47F0-968F-B8A1285B44B1@binero.com> As Mohammed said, you can actually do the exact same in haproxy by setting the server in the backend to drain which would be the same just the opposite way around. That is ?set server / state drain? over haproxy admin socket. I really welcome Sean?s proposal on a real healthcheck framework that would actually tell you that something is not working instead of trying to find for example RabbitMQ connection issues from logs, it really is a pain. I wouldn?t want to have an ?real? healthcheck that does all these things exposed on public API though and think Sean?s proposal is correct and does not break backward capability since oslo.healthcheck middleware will still be there. Best regards Tobias > On 18 Nov 2021, at 16:50, Thomas Goirand wrote: > > On 11/18/21 2:03 AM, Mohammed Naser wrote: >> >> >> On Wed, Nov 17, 2021 at 5:52 PM Thomas Goirand > > wrote: >> >> On 11/17/21 10:54 PM, Dan Smith wrote: >>>> I don't think we rely on /healthcheck -- there's nothing healthy >> about >>>> an API endpoint blindly returning a 200 OK. >>>> >>>> You might as well just hit / and accept 300 as a code and that's >>>> exactly the same behaviour. I support what Sean is bringing up here >>>> and I don't think it makes sense to have a noop /healthcheck that >>>> always gives a 200 OK...seems a bit useless imho >>> >>> Yup, totally agree. Our previous concerns over a healthcheck that >>> checked all of nova returning too much info to be useful (for >> something >>> trying to figure out if an individual worker is healthy) apply in >>> reverse to one that returns too little to be useful. >>> >>> I agree, what Sean is working on is the right balance and that we >> should >>> focus on that. >>> >>> --Dan >>> >> >> That's not the only thing it does. It also is capable of being disabled, >> which is useful for maintenance: one can gracefully remove an API node >> for removal this way, which one cannot do with the root. >> >> >> I feel like this should be handled by whatever layer that needs to drain >> requests for maintenance, otherwise also it might just be the same as >> turning off the service, no? > > It's not the same. > > If you just turn off the service, there well may be some requests > attempted to the API before it's seen as down. The idea here, is to > declare the API as down, so that haproxy can remove it from the pool > *before* the service is really turned off. > > That's what the oslo.middleware disable file helps doing, which the root > url cannot do. > > Cheers, > > Thomas Goirand (zigo) > From DHilsbos at performair.com Fri Nov 19 16:29:00 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Fri, 19 Nov 2021 16:29:00 +0000 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Message-ID: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> All; I feel like I've dealt with this issue before, but I can't find any records of it. I've been swapping out the compute nodes in my cluster for newer and better hardware. We also decided to abandon CentOS. All the differences mean that we haven't been able to do live migrations. 
I now have 2 servers with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like to get live migration working again. I configured passwordless ssh access between the servers for the nova users to get cold migration working. I have also configured passwordless ssh for the root users in accordance with [1]. When I try to do a live migration, the origin server generates this error, in the nova-compute log: 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed At one point, I came across a tutorial on configuring live-migration for libvirt, which included a bunch of user configuration. I don't remember having to do that before, but is that what I need to be looking for? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com 1: https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations From gauurav.sabharwal at in.ibm.com Fri Nov 19 03:40:02 2021 From: gauurav.sabharwal at in.ibm.com (Gauurav Sabharwal1) Date: Fri, 19 Nov 2021 09:10:02 +0530 Subject: [cinder] : SAN migration In-Reply-To: References: Message-ID: Hi Sumit, No response. Regards Gauurav Sabharwal IBM India Pvt. Ltd. IBM towers Ground floor, Block -A , Plot number 26, Sector 62, Noida Gautam budhnagar UP-201307. Email:gauurav.sabharwal at in.ibm.com Mobile No.: +91-9910159277 From: "Sumit Marwah" To: "Gauurav Sabharwal1" Cc: openstack-discuss at lists.openstack.org, "Pravin P Kudav" Date: 19-11-2021 07:55 Subject: [EXTERNAL] Re: [cinder] : SAN migration Hi?Gauurav,? Did you receive any response from OpenStack?? Thanks,? Sumit Marwah Technology & Sales Director, APJ | Brocade Broadcom Mobile: +6596452240 1 Yishun Ave 7?| ?Singapore sumit.marwah at broadcom.com ??| broadcom.com ?????????? Hi?Gauurav, Did you receive any response from OpenStack? Thanks, Sumit Marwah Technology & Sales Director, APJ | Brocade Broadcom Mobile: +6596452240 1 Yishun Ave 7?| ?Singapore sumit.marwah at broadcom.com ??| ??broadcom.com On Wed, Nov 3, 2021 at 12:50 PM Gauurav Sabharwal1 < gauurav.sabharwal at in.ibm.com> wrote: Hi Experts , I need some expert advise of one of the scenario, I have multiple isolated OpenStack cluster running with train & rocky edition. Each OpenStack cluster environment have it's own isolated infrastructure of SAN ?( CISCO fabric ) & Storage ?( HP, EMC & IBM). Now company planning to refresh their SAN infrastructure. By procuring new Brocade SAN switches. But there are some migration relevant challenges we have. As we understand under one cinder instance only one typer of FC zone manager is supported . Currently ?customer configured & managing CISCO . Is it possible to configure two different vendor FC Zone manager under one cinder instance. Migration of SAN zoning is supposedly going to be happen offline way from OpenStack point of view. We will be migrating all ports of each existing cisco fabric to Brocade with zone configuration using brocade CLI. ? ?Our main concern is that after migration How ?CINDER DB update new zone info & path via Brocade SAN. Regards Gauurav Sabharwal IBM India Pvt. Ltd. 
IBM towers Ground floor, Block -A , Plot number 26, Sector 62, Noida Gautam budhnagar UP-201307. Email:gauurav.sabharwal at in.ibm.com Mobile No.: +91-9910159277 This electronic communication and the information and any files transmitted with it, or attached to it, are confidential and are intended solely for the use of the individual or entity to whom it is addressed and may contain information that is confidential, legally privileged, protected by privacy laws, or otherwise restricted from disclosure to anyone else. If you are not the intended recipient or the person responsible for delivering the e-mail to the intended recipient, you are hereby notified that any use, copying, distributing, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited. If you received this e-mail in error, please return the e-mail to the sender, delete it from your computer, and destroy any printed copy of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From sumit.marwah at broadcom.com Fri Nov 19 02:24:37 2021 From: sumit.marwah at broadcom.com (Sumit Marwah) Date: Fri, 19 Nov 2021 10:24:37 +0800 Subject: [cinder] : SAN migration In-Reply-To: References: Message-ID: Hi Gauurav, Did you receive any response from OpenStack? Thanks, Sumit Marwah Technology & Sales Director, APJ | Brocade Broadcom Mobile: +6596452240 1 Yishun Ave 7 | Singapore sumit.marwah at broadcom.com | broadcom.com On Wed, Nov 3, 2021 at 12:50 PM Gauurav Sabharwal1 < gauurav.sabharwal at in.ibm.com> wrote: > Hi Experts , > > I need some expert advise of one of the scenario, I have multiple isolated > OpenStack cluster running with train & rocky edition. Each OpenStack > cluster environment have it's own isolated infrastructure of SAN ( CISCO > fabric ) & Storage ( HP, EMC & IBM). > > Now company planning to refresh their SAN infrastructure. By procuring new > Brocade SAN switches. But there are some migration relevant challenges we > have. > > 1. As we understand under one cinder instance only one typer of FC > zone manager is supported . Currently customer configured & managing CISCO > .* Is it possible to configure two different vendor FC Zone manager > under one cinder instance.* > 2. Migration of SAN zoning is supposedly going to be happen offline > way from OpenStack point of view. We will be migrating all ports of each > existing cisco fabric to Brocade with zone configuration using brocade CLI. > *Our main concern is that after migration How CINDER DB update new > zone info & path via Brocade SAN.* > > > > Regards > Gauurav Sabharwal > IBM India Pvt. Ltd. > IBM towers > Ground floor, Block -A , Plot number 26, > Sector 62, Noida > Gautam budhnagar UP-201307. > Email:gauurav.sabharwal at in.ibm.com > Mobile No.: +91-9910159277 > > -- This electronic communication and the information and any files transmitted with it, or attached to it, are confidential and are intended solely for the use of the individual or entity to whom it is addressed and may contain information that is confidential, legally privileged, protected by privacy laws, or otherwise restricted from disclosure to anyone else. 
If you are not the intended recipient or the person responsible for delivering the e-mail to the intended recipient, you are hereby notified that any use, copying, distributing, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited. If you received this e-mail in error, please return the e-mail to the sender, delete it from your computer, and destroy any printed copy of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Nov 19 17:44:52 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 19 Nov 2021 11:44:52 -0600 Subject: [all][tc] Yoga testing runtime update. Action needed if py3.9 job failing for your project Message-ID: <17d394d1377.da5099ba980738.5329502106648098442@ghanshyammann.com> Hello Everyone. As discussed in TC PTG, we have updated the testing runtime for the Yoga cycle [1]. Changes are: * Add Debian 11 as the tested distro * Change centos stream 8 -> centos stream 9 * Bump lowest python version to test to 3.8 and highest to python 3.9 ** This removes python 3.6 from testing. I pushed the job template update[2] which will make py3.9 unit test job voting (which is non-voting currently). I do not see any projects failing consistently on py3.9[3] but still, I will keep it as -W until early next week (23rd Nov). If any project needs time to fix the failing py3.9 job, please do it before 23rd Nov. [1] https://governance.openstack.org/tc/reference/runtimes/yoga.html [2] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/818609 [3] https://zuul.openstack.org/builds?job_name=openstack-tox-py39&branch=master&result=FAILURE -gmann From mthode at mthode.org Fri Nov 19 17:56:50 2021 From: mthode at mthode.org (Matthew Thode) Date: Fri, 19 Nov 2021 11:56:50 -0600 Subject: [requirements][keystone] pysaml2 and oslo.policy not able to be updated due to keystone test failures Message-ID: <20211119175650.b4rb3nwsv7u34q5k@mthode.org> more or less as the title states the following reviews show the failures pysaml2: https://review.opendev.org/818612 oslo.policy: https://review.opendev.org/815820 -- Matthew Thode From kennelson11 at gmail.com Fri Nov 19 18:10:33 2021 From: kennelson11 at gmail.com (Kendall Nelson) Date: Fri, 19 Nov 2021 10:10:33 -0800 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: Fair enough. Was just curious if there was some technical reason. I think it would make more sense for the SDKs to live together, personally, but I can also see how having it live inside OpenStack can be daunting for a new, external contributor. -Kendall On Mon, Nov 15, 2021 at 4:32 PM Emilien Macchi wrote: > Hey Kendall, > > On Mon, Nov 15, 2021 at 11:59 AM Kendall Nelson > wrote: > >> Is there a reason why you don't want it to be under the openstack >> namespace? >> > > The only reason that comes to my mind is not technical at all. > I (not saying we, since we haven't reached consensus yet) think that we > want the project in its own organization, rather than under openstack. We > want to encourage external contributions from outside of OpenStack, > therefore opendev would probably suit better than openstack. > > This is open for discussion of course, but as I see it going, these are my > personal thoughts. > > Thanks, > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From emilien at redhat.com Fri Nov 19 18:55:20 2021 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 19 Nov 2021 13:55:20 -0500 Subject: [sdk][tc] A new home for Gophercloud In-Reply-To: References: <20211103164054.72y2c7qk2b7ogxn2@yuggoth.org> <17d0b3beed8.f1f93452444025.7625989527428519853@ghanshyammann.com> Message-ID: On Fri, Nov 19, 2021 at 1:11 PM Kendall Nelson wrote: > Fair enough. Was just curious if there was some technical reason. > > I think it would make more sense for the SDKs to live together, > personally, but I can also see how having it live inside OpenStack can be > daunting for a new, external contributor. > If it was me only, Gophercloud would move to opendev right now, I can only think about its benefits by my experience with the community and our amazing tools / workflows. But because I'm biased and the decision is not up to me only, I'm trying to see if this decision would be well welcomed. So far the feedback from non-OpenStack contributors was (in a nutshell, and roughly summarized): "We like the Github ecosystem, don't know much about Gerrit but if this gets too complicated I'll give up my PRs. However I agree we need to make CI better". So this is where we are... In an ideal world we would keep Github for issues & PRs, and use Opendev Infra, but I understand this isn't possible. For now, we're doing nothing except trying to stabilize our CI until we finally make a decision whether we move or not. I hope this email explains well enough why we haven't made any move yet. Emilien > -Kendall > > On Mon, Nov 15, 2021 at 4:32 PM Emilien Macchi wrote: > >> Hey Kendall, >> >> On Mon, Nov 15, 2021 at 11:59 AM Kendall Nelson >> wrote: >> >>> Is there a reason why you don't want it to be under the openstack >>> namespace? >>> >> >> The only reason that comes to my mind is not technical at all. >> I (not saying we, since we haven't reached consensus yet) think that we >> want the project in its own organization, rather than under openstack. We >> want to encourage external contributions from outside of OpenStack, >> therefore opendev would probably suit better than openstack. >> >> This is open for discussion of course, but as I see it going, these are >> my personal thoughts. >> >> Thanks, >> -- >> Emilien Macchi >> > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Fri Nov 19 19:48:26 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 19 Nov 2021 13:48:26 -0600 Subject: [requirements][keystone] pysaml2 and oslo.policy not able to be updated due to keystone test failures In-Reply-To: <20211119175650.b4rb3nwsv7u34q5k@mthode.org> References: <20211119175650.b4rb3nwsv7u34q5k@mthode.org> Message-ID: <17d39be3432.fcce414f983958.4576894677399317541@ghanshyammann.com> ---- On Fri, 19 Nov 2021 11:56:50 -0600 Matthew Thode wrote ---- > more or less as the title states the following reviews show the failures > > pysaml2: https://review.opendev.org/818612 > oslo.policy: https://review.opendev.org/815820 oslo policy failing in keystone is due to a change in warning text, I have pushed the fix- https://review.opendev.org/c/openstack/keystone/+/818624 -gmann > > -- > Matthew Thode > > From dmendiza at redhat.com Fri Nov 19 20:32:42 2021 From: dmendiza at redhat.com (Douglas Mendizabal) Date: Fri, 19 Nov 2021 14:32:42 -0600 Subject: [keystone] No weekly meeting next week Message-ID: Hi Keystone friends, I'll be out on PTO next week, so I'm canceling the Keystone weekly meeting for November 23. Meetings will resume the following week on November 30. Thanks, - Douglas Mendiz?bal (redrobot) From franck.vedel at univ-grenoble-alpes.fr Fri Nov 19 20:56:09 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Fri, 19 Nov 2021 21:56:09 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: Hello, thanks a lot , you help me to understand a lot of things. in particular that I have a lot of modifications to make to have an operational openstack and with good performance. If my iscsi bay is attached to S3 (I have S1, S2 and S3), I have to put glance on S3 with a mount in the filesystem of S3, and enable the cache. My images are in qcow2. suddenly I do not know if I modify them. Finally, and I don't know if this is the best solution, to make images that work well, I go through virtualbox, then from VDI to RAW (then from RAW to QCOW2 but it was a big mistake if I well understood). For example, I am having trouble with an opnsense image if I create the iinstance from iso and Horizon. If I go through virtualbox on another computer, then copy the files, the image is OK. Weird ?. Ah, I forgot, I didn't realize that order was important in a module [module]. Really not easy to handle all of this. Anyway thank you for your help. I will check the docs again and try to change this next week. Franck > Le 19 nov. 2021 ? 14:50, Ignazio Cassano a ?crit : > > Franck, this help you a lot. > Thanks Radoslaw > Ignazio > > Il giorno ven 19 nov 2021 alle ore 12:03 Rados?aw Piliszek > ha scritto: > If one sets glance_file_datadir_volume to non-default, then glance-api > gets deployed on all hosts. > > -yoctozepto > > On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano > wrote: > > > > Hello Franck, glance is not deployed on all nodes at default. > > I got the same problem > > In my case I have 3 controllers. > > I created an nfs share on a storage server where to store images. > > Before deploying glance, I create /var/lib/glance/images on the 3 controllers and I mount the nfs share. 
> > This is my fstab on the 3 controllers: > > > > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime > > > > In my globals.yml I have: > > glance_file_datadir_volume: "/var/lib/glance" > > glance_backend_file: "yes" > > > > This means images are on /var/lib/glance and since it is a nfs share all my 3 controlles can share images. > > Then you must deploy. > > To be sure the glance container is started on all controllers, since I have 3 controllers, I deployed 3 times changing the order in the inventory. > > First time: > > [control] > > A > > B > > C > > > > Second time: > > [control] > > B > > C > > A > > > > Third time: > > [control] > > C > > B > > A > > > > Or you can deploy glance 3 times using -t glance and -l > > > > As far as the instance stopped, I got I bug with a version of kolla. > > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 > > Now is corrected and with kolla 12.2.0 it works. > > Ignazio > > > > > > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL > ha scritto: > >> > >> Hello and thank you for the help. > >> I was able to move forward on my problem, without finding a satisfactory solution. > >> Normally, I have 2 servers with the role [glance] but I noticed that all my images were on the first server (in / var / lib / docker / volumes / glance / _data / images) before the reconfigure, none on the second. But since the reconfiguration, the images are placed on the second, and no longer on the first. I do not understand why. I haven't changed anything to the multinode file. > >> so, to get out of this situation quickly as I need this openstack for the students, I modified the multinode file and put only one server in [glance] (I put server 1, the one that had the images before reconfigure), I did a reconfigure -t glance and now I have my images usable for instances. > >> I don't understand what happened. There is something wrong. > >> > >> Is it normal that after updating the certificates, all instances are turned off? > >> thanks again > >> > >> Franck > >> > >> Le 17 nov. 2021 ? 21:11, Cyril Roelandt > a ?crit : > >> > >> Hello, > >> > >> > >> On 2021-11-17 08:59, Franck VEDEL wrote: > >> > >> Hello everyone > >> > >> I have a strange problem and I haven't found the solution yet. > >> Following a certificate update I had to do a "kolla-ansible -t multinode reconfigure ?. > >> Well, after several attempts (it is not easy to use certificates with Kolla-ansible, and from my advice, not documented enough for beginners), I have my new functional certificates. Perfect ... well almost. > >> > >> I am trying to create a new instance to check general operation. ERROR. > >> Okay, I look in the logs and I see that Cinder is having problems creating volumes with an error that I never had ("TypeError: 'NoneType' object is not iterable). > >> > >> > >> We'd like to see the logs as well, especially the stacktrace. > >> > >> I dig and then I wonder if it is not the Glance images which cannot be used, while they are present (openstack image list is OK). > >> > >> I create an empty volume: it works. > >> I am creating a volume from an image: Failed. > >> > >> > >> What commands are you running? What's the output? What's in the logs? > >> > >> > >> However, I have my list of ten images in glance. > >> > >> I create a new image and create a volume with this new image: it works. > >> I create an instance with this new image: OK. > >> > >> What is the problem ? 
The images present before the "reconfigure" are listed, visible in horizon for example, but unusable. > >> Is there a way to fix this, or do we have to reinstall them all? > >> > >> > >> What's your configuration? What version of OpenStack are you running? > >> > >> > >> > >> Cyril > >> > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Nov 19 20:59:45 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 19 Nov 2021 14:59:45 -0600 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <5fea46fa-f304-48fa-a171-ce1114700379@www.fastmail.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> <4359750.LvFx2qVVIh@p1> <5fea46fa-f304-48fa-a171-ce1114700379@www.fastmail.com> Message-ID: <17d39ff7f1e.db45f09a985367.5739424682253609439@ghanshyammann.com> ---- On Thu, 18 Nov 2021 08:40:38 -0600 Clark Boylan wrote ---- > On Thu, Nov 18, 2021, at 3:11 AM, Pavlo Shchelokovskyy wrote: > > Hi Clark, > > > > Why is the retirement of openstack/neutron-lbaas being a problem? > > > > The repo is there and accessible under the same URL, it has > > (potentially working) stable/pike and stable/queens branches, and was > > not retired at the time of Pike or Queens, so IMO it is a valid request > > for testing configuration in the same branches of other projects, > > openstack/heat in this case. > > > > Maybe we should leave some minimal zuul configs in retired projects for > > zuul to find them? > > The reason for this is one of the steps for project retirement is to remove the repo from zuul [0]. If the upstream for the project has retired the project I think it is reasonable to remove it from our systems like Zuul. The project isn't retired on a per branch basis. > > I do not think the Zuul operators should be responsible for keeping retired projects alive in the CI system if their maintainers have stopped maintaining them. Instead we should remove them from the CI system. I agree, in case of deprecation we take care of stable branch and in retirement case, it means it is gone completely so usage of retired repo has to be completely cleanup from everywhere (master or stable). -gmann > > [0] https://opendev.org/openstack/project-config/commit/3832c8fafc4d4e03c306c21f37b2d39bd7c5bd2b > > From gmann at ghanshyammann.com Fri Nov 19 21:23:20 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 19 Nov 2021 15:23:20 -0600 Subject: [all][tc] What's happening in Technical Committee: summary 19th Nov, 21: Reading: 10 min Message-ID: <17d3a151547.cb44b8eb985741.7343034494975169538@ghanshyammann.com> Hello Everyone, Here is this week's summary of the Technical Committee activities. 1. TC Meetings: ============ * We cancelled this week's meeting due to Open Infra Keynotes at the same time. * Next week's meeting is on IRC on 25th Nov, Thursday 15:00 UTC, feel free the topic on the agenda[1] by Nov 24th. 2. What we completed this week: ========================= * Updated Yoga testing runtime [2] * Retire puppet-senlin[3] * Removed TC office hours in favour of weekly meetings [4] 3. Activities In progress: ================== TC Tracker for Yoga cycle ------------------------------ * This etherpad includes the Yoga cycle targets/working items for TC[5]. Open Reviews ----------------- * 8 open reviews for ongoing activities[6]. 
Fixing Zuul config error
--------------------------
Clark sent an email tagging the projects that need action on Zuul config errors in their repositories[7]. Please fix those config errors: they are not causing failures now, but they will fail once changes land in those repos, or at least they are not testing what we think they are testing. A few projects are fixing them, but we still need to clean them up completely.

RBAC discussion: continuing from PTG
----------------------------------------------
We had another meeting on Wednesday and worked through the open questions and the schedule. Meeting notes are in this etherpad[8]. The final design and schedule are up in the goal rework patch[9]. Please review it and provide feedback from your project's perspective.

Community-wide goal updates
------------------------------------
* The RBAC goal is pretty much in good shape now, feel free to review it[9]
* There is one more goal proposal, for 'FIPS compatibility and compliance', which needs feedback from the community as well as from the TC[10].
* The TC is working to prepare the final goal(s) soon, please wait for some more time.

Adjutant needs maintainers and PTLs
-------------------------------------------
We received a response from Braden, Albert on the ML, so let's wait on the final call to help here[11].

New project 'Skyline' proposal
------------------------------------
* We discussed it in the TC PTG; we are OK with having Skyline as an official project, but we have a few open queries on Gerrit and are still waiting for the Skyline team to respond[12].

TC tags analysis
-------------------
* The TC agreed to remove the framework, and this has been communicated on the ML[13].

Project updates
-------------------
* Rename the 'Extended Maintenance' SIG to the 'Stable Maintenance' SIG[14]
* Retire the training-labs repo[15]
* Add a ProxySQL repository for OpenStack-Ansible[16]
* Retire js-openstack-lib [17]

4. How to contact the TC:
====================
If you would like to discuss or give feedback to the TC, you can reach out to us in multiple ways:
1. Email: you can send email with the tag [tc] on the openstack-discuss ML[18].
2. Weekly meeting: the Technical Committee conducts a weekly meeting every Thursday at 15 UTC [19]
3. Ping us using the 'tc-members' nickname on the #openstack-tc IRC channel.
[1] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [2] https://review.opendev.org/c/openstack/governance/+/815851 [3] https://review.opendev.org/c/openstack/governance/+/817329 [4] https://review.opendev.org/c/openstack/governance/+/817493 [5] https://etherpad.opendev.org/p/tc-yoga-tracker [6] https://review.opendev.org/q/projects:openstack/governance+status:open [7] http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025758.htmlhttp://lists.openstack.org/pipermail/openstack-discuss/2021-November/025797.html [8] https://etherpad.opendev.org/p/policy-popup-yoga-ptg [9] https://review.opendev.org/c/openstack/governance/+/815158 [10] https://review.opendev.org/c/openstack/governance/+/816587 [11]?http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025786.html [12] https://review.opendev.org/c/openstack/governance/+/814037 [13] http://lists.openstack.org/pipermail/openstack-discuss/2021-October/025571.html [14] https://review.opendev.org/c/openstack/governance-sigs/+/817499 [15] https://review.opendev.org/c/openstack/governance/+/817511 [16] https://review.opendev.org/c/openstack/governance/+/817245 [17] https://review.opendev.org/c/openstack/governance/+/807163 [18] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss [19] http://eavesdrop.openstack.org/#Technical_Committee_Meeting-gmann -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Sat Nov 20 01:32:47 2021 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 19 Nov 2021 20:32:47 -0500 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> Message-ID: Just a heads up even if you get things working you?re not going to be able to live migrate from centos to ubuntu and vice versa since there?s going to be things like apparmor and SELinux issues On Fri, Nov 19, 2021 at 11:35 AM wrote: > All; > > I feel like I've dealt with this issue before, but I can't find any > records of it. > > I've been swapping out the compute nodes in my cluster for newer and > better hardware. We also decided to abandon CentOS. All the differences > mean that we haven't been able to do live migrations. I now have 2 servers > with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like > to get live migration working again. > > I configured passwordless ssh access between the servers for the nova > users to get cold migration working. I have also configured passwordless > ssh for the root users in accordance with [1]. > > When I try to do a live migration, the origin server generates this error, > in the nova-compute log: > 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] > [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: > operation failed: Failed to connect to remote libvirt URI > qemu+tcp:///system: authentication failed: authentication > failed: libvirt.libvirtError: operation failed: Failed to connect to remote > libvirt URI qemu+tcp:///system: authentication failed: > authentication failed > > At one point, I came across a tutorial on configuring live-migration for > libvirt, which included a bunch of user configuration. I don't remember > having to do that before, but is that what I need to be looking for? > > Thank you, > > Dominic L. 
Hilsbos, MBA > Vice President - Information Technology > Perform Air International Inc. > DHilsbos at PerformAir.com > www.PerformAir.com > > 1: > https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations > > -- Mohammed Naser VEXXHOST, Inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Sat Nov 20 01:45:29 2021 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Fri, 19 Nov 2021 20:45:29 -0500 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> Message-ID: Which version of Openstack are you running? It seems to try to connect over qemu with auth over tcp. Without ssh? Is the cold migration working now? On Fri, Nov 19, 2021 at 8:35 PM Mohammed Naser wrote: > Just a heads up even if you get things working you?re not going to be able > to live migrate from centos to ubuntu and vice versa since there?s going to > be things like apparmor and SELinux issues > > On Fri, Nov 19, 2021 at 11:35 AM wrote: > >> All; >> >> I feel like I've dealt with this issue before, but I can't find any >> records of it. >> >> I've been swapping out the compute nodes in my cluster for newer and >> better hardware. We also decided to abandon CentOS. All the differences >> mean that we haven't been able to do live migrations. I now have 2 servers >> with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like >> to get live migration working again. >> >> I configured passwordless ssh access between the servers for the nova >> users to get cold migration working. I have also configured passwordless >> ssh for the root users in accordance with [1]. >> >> When I try to do a live migration, the origin server generates this >> error, in the nova-compute log: >> 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] >> [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: >> operation failed: Failed to connect to remote libvirt URI >> qemu+tcp:///system: authentication failed: authentication >> failed: libvirt.libvirtError: operation failed: Failed to connect to remote >> libvirt URI qemu+tcp:///system: authentication failed: >> authentication failed >> >> At one point, I came across a tutorial on configuring live-migration for >> libvirt, which included a bunch of user configuration. I don't remember >> having to do that before, but is that what I need to be looking for? >> >> Thank you, >> >> Dominic L. Hilsbos, MBA >> Vice President - Information Technology >> Perform Air International Inc. >> DHilsbos at PerformAir.com >> www.PerformAir.com >> >> 1: >> https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations >> >> -- > Mohammed Naser > VEXXHOST, Inc. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Sat Nov 20 10:05:16 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Sat, 20 Nov 2021 12:05:16 +0200 Subject: [neutron][nova] [kolla] vif plugged timeout Message-ID: Hi, Has anyone seen issue which I am currently facing ? When launching heat stack ( but it's same if I launch several of instances ) vif plugged in timeouts an I don't know why, sometimes it is OK ..sometimes is failing. 
Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes it's 100 and more seconds, it seems there is some race condition but I can't find out where the problem is. But on the end every instance is spawned ok (retry mechanism worked). Another finding is that it has to do something with security group, if noop driver is used ..everything is working good. Firewall security setup is openvswitch . Test env is wallaby. I will attach some logs when I will be near PC .. Thank you, Michal Arbet (Kevko) -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Sat Nov 20 12:20:19 2021 From: michal.arbet at ultimum.io (Michal Arbet) Date: Sat, 20 Nov 2021 14:20:19 +0200 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to some high number ..problem dissapear ... But it's only workaround D?a so 20. 11. 2021, 12:05 Michal Arbet nap?sal(a): > Hi, > > Has anyone seen issue which I am currently facing ? > > When launching heat stack ( but it's same if I launch several of instances > ) vif plugged in timeouts an I don't know why, sometimes it is OK > ..sometimes is failing. > > Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes > it's 100 and more seconds, it seems there is some race condition but I > can't find out where the problem is. But on the end every instance is > spawned ok (retry mechanism worked). > > Another finding is that it has to do something with security group, if > noop driver is used ..everything is working good. > > Firewall security setup is openvswitch . > > Test env is wallaby. > > I will attach some logs when I will be near PC .. > > Thank you, > Michal Arbet (Kevko) > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hello at dincercelik.com Sun Nov 21 06:51:17 2021 From: hello at dincercelik.com (Dincer Celik) Date: Sun, 21 Nov 2021 09:51:17 +0300 Subject: [kolla] Proposing Michal Arbet as Kolla core In-Reply-To: References: Message-ID: <139F5BE7-4083-4364-959F-A850C12FCCB1@dincercelik.com> +1 > On 16 Nov 2021, at 15:49, Micha? Nasiadka wrote: > > Hi, > > I would like to propose adding Michal Arbet (kevko) to the kolla-core and kolla-ansible-core groups. > Michal did some great work on ProxySQL, is a consistent maintainer of Debian related images and has provided some helpful > reviews. > > Cores - please reply +1/-1 before the end of Friday 26th November. > > Thanks, > Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: From kira034 at 163.com Sun Nov 21 09:48:30 2021 From: kira034 at 163.com (Hongbin Lu) Date: Sun, 21 Nov 2021 17:48:30 +0800 (CST) Subject: [neutron] Bug deputy report (Nov 15 - 21) Message-ID: <2e49efc.f7b.17d41e5ad46.Coremail.kira034@163.com> Hi, I am the bug deputy this week. 
Please find the report below:

High
* https://bugs.launchpad.net/neutron/+bug/1951010 Restarting Neutron floods Nova with segment aggregates calls
* https://bugs.launchpad.net/neutron/+bug/1951225 [OVN] Agent can't be found in functional test sometimes

Medium
* https://bugs.launchpad.net/neutron/+bug/1951149 [OVN] If "chassis" register is deleted, "chassis_private" can have 0 "chassis" associated
* https://bugs.launchpad.net/neutron/+bug/1951559 [OVN] Router ports gateway_mtu option should not always be set
* https://bugs.launchpad.net/neutron/+bug/1951564 snat random-fully supported with iptables 1.6.0

Low
* https://bugs.launchpad.net/neutron/+bug/1951272 [OVN] OVS to OVN migration should be stopped if OVS agent firewall is "iptables_hybrid"
* https://bugs.launchpad.net/neutron/+bug/1951429 Neutron API responses should not contain tracebacks
* https://bugs.launchpad.net/neutron/+bug/1951569

Undecided (might need further triaging from domain experts)
* https://bugs.launchpad.net/neutron/+bug/1951074 [OVN] default setting leak nameserver config from the host to instances
* https://bugs.launchpad.net/neutron/+bug/1951493 OVS DPDK restart results in that tap interfaces in network namespaces can't be opened
* https://bugs.launchpad.net/neutron/+bug/1951517 Segmentation ID should be lower or equal to 4095

Best regards,
Hongbin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From wodel.youchi at gmail.com Sun Nov 21 13:27:02 2021
From: wodel.youchi at gmail.com (wodel youchi)
Date: Sun, 21 Nov 2021 14:27:02 +0100
Subject: [kolla-ansible][wallaby][magnum] calico works, flannel does not work
Message-ID: 

Hi,

I don't have any experience with kubernetes yet; my goal at present is to test that the auto-scaling of a kubernetes cluster works properly using magnum.

I followed some tutorials to get the test done. I tried to configure the test, but I had problems with magnum-metrics-server not being able to provide metrics. After digging and reading a lot of things on the web, I found out that the kubernetes master couldn't connect to the metrics server (curl -k https://ip_metrics_server didn't work). So I created another template, this time using calico as the network provider, and this time it worked.

How can I find out why flannel does not work? Where should I look?

Regards.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mthode at mthode.org Sun Nov 21 23:51:28 2021
From: mthode at mthode.org (Matthew Thode)
Date: Sun, 21 Nov 2021 17:51:28 -0600
Subject: [requirements][cinder][manila] pyparsing update needs handling
Message-ID: <20211121235128.a6jlampynlotp4yy@mthode.org>

https://review.opendev.org/818614

For cinder it looks like operatorPrecedence is gone now.

For manila it looks like the same thing.

--
Matthew Thode

From michal.arbet at ultimum.io Mon Nov 22 08:02:35 2021
From: michal.arbet at ultimum.io (Michal Arbet)
Date: Mon, 22 Nov 2021 10:02:35 +0200
Subject: [neutron][nova] [kolla] vif plugged timeout
In-Reply-To: 
References: 
Message-ID: 

Hi,

It seems it's the same issue as the one reported on Launchpad: https://bugs.launchpad.net/nova/+bug/1944779

Thanks,
Kevko

On Sat, 20 Nov 2021 at 14:20, Michal Arbet wrote:
> + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to
> some high number ..problem dissapear ... But it's only workaround
>
> On Sat, 20 Nov 2021 at 12:05, Michal Arbet wrote:
>> Hi,
>>
>> Has anyone seen issue which I am currently facing ?
>> >> When launching heat stack ( but it's same if I launch several of >> instances ) vif plugged in timeouts an I don't know why, sometimes it is OK >> ..sometimes is failing. >> >> Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes >> it's 100 and more seconds, it seems there is some race condition but I >> can't find out where the problem is. But on the end every instance is >> spawned ok (retry mechanism worked). >> >> Another finding is that it has to do something with security group, if >> noop driver is used ..everything is working good. >> >> Firewall security setup is openvswitch . >> >> Test env is wallaby. >> >> I will attach some logs when I will be near PC .. >> >> Thank you, >> Michal Arbet (Kevko) >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Mon Nov 22 08:54:06 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Mon, 22 Nov 2021 09:54:06 +0100 Subject: [neutron][CI] How to reduce number of rechecks - brainstorming In-Reply-To: References: <2165480.iZASKD2KPV@p1> <3MOP2R.O83SZVO0NWN23@est.tech> Message-ID: Hello: I think the last idea Ronelled presented (a skiplist) could be feasible in Neutron. Of course, this list could grow indefinitely, but we can always keep an eye on it. There could be another issue with Neutron tempest tests when using the "advance" image. Despite the recent improvements done recently, we are frequently having problems with the RAM size of the testing VMs. We would like to have 20% more RAM, if possible. I wish we had the ability to pre-run some checks in specific HW (tempest plugin or grenade tests). Slawek commented the different number of backends we need to provide support and testing. However I think we can remove the Linux Bridge tempest plugin from the "gate" list (it is already tested in the "check" list). Tempest plugin tests are expensive in time and prone to errors. This paragraph falls under the shoulders of the Neutron team. We can also identify those long running tests that usually fail (those that take more than 1000 seconds). A test that takes around 15 mins to run, will probably fail. We need to find those tests, investigate the slowest parts of those tests and try to improve/optimize/remove them. Thank you all for your comments and proposals. That will help a lot to improve the Neutron CI stability. Regards. On Fri, Nov 19, 2021 at 12:53 AM Ronelle Landy wrote: > > > On Wed, Nov 17, 2021 at 5:22 AM Balazs Gibizer > wrote: > >> >> >> On Wed, Nov 17 2021 at 09:13:34 AM +0100, Slawek Kaplonski >> wrote: >> > Hi, >> > >> > Recently I spent some time to check how many rechecks we need in >> > Neutron to >> > get patch merged and I compared it to some other OpenStack projects >> > (see [1] >> > for details). >> > TL;DR - results aren't good for us and I think we really need to do >> > something >> > with that. >> >> I really like the idea of collecting such stats. Thank you for doing >> it. I can even imagine to make a public dashboard somewhere with this >> information as it is a good indication about the health of our projects >> / testing. >> >> > >> > Of course "easiest" thing to say is that we should fix issues which >> > we are >> > hitting in the CI to make jobs more stable. But it's not that easy. >> > We are >> > struggling with those jobs for very long time. We have CI related >> > meeting >> > every week and we are fixing what we can there. 
>> > Unfortunately there is still bunch of issues which we can't fix so >> > far because >> > they are intermittent and hard to reproduce locally or in some cases >> > the >> > issues aren't realy related to the Neutron or there are new bugs >> > which we need >> > to investigate and fix :) >> >> >> I have couple of suggestion based on my experience working with CI in >> nova. >> > > We've struggled with unstable tests in TripleO as well. Here are some > things we tried and implemented: > > 1. Created job dependencies so we only ran check tests once we knew we had > the resources we needed (example we had pulled containers successfully) > > 2. Moved some testing to third party where we have easier control of the > environment (note that third party cannot stop a change merging) > > 3. Used dependency pipelines to pre-qualify some dependencies ahead of > letting them run wild on our check jobs > > 4. Requested testproject runs of changes in a less busy environment before > running a full set of tests in a public zuul > > 5. Used a skiplist to keep track of tech debt and skip known failures that > we could temporarily ignore to keep CI moving along if we're waiting on an > external fix. > > > >> >> 1) we try to open bug reports for intermittent gate failures too and >> keep them tagged in a list [1] so when a job fail it is easy to check >> if the bug is known. >> >> 2) I offer my help here now that if you see something in neutron runs >> that feels non neutron specific then ping me with it. Maybe we are >> struggling with the same problem too. >> >> 3) there was informal discussion before about a possibility to re-run >> only some jobs with a recheck instead for re-running the whole set. I >> don't know if this is feasible with Zuul and I think this only treat >> the symptom not the root case. But still this could be a direction if >> all else fails. >> >> Cheers, >> gibi >> >> > So this is never ending battle for us. The problem is that we have >> > to test >> > various backends, drivers, etc. so as a result we have many jobs >> > running on >> > each patch - excluding UT, pep8 and docs jobs we have around 19 jobs >> > in check >> > and 14 jobs in gate queue. >> > >> > In the past we made a lot of improvements, like e.g. we improved >> > irrelevant >> > files lists for jobs to run less jobs on some of the patches, >> > together with QA >> > team we did "integrated-networking" template to run only Neutron and >> > Nova >> > related scenario tests in the Neutron queues, we removed and >> > consolidated some >> > of the jobs (there is still one patch in progress for that but it >> > should just >> > remove around 2 jobs from the check queue). All of that are good >> > improvements >> > but still not enough to make our CI really stable :/ >> > >> > Because of all of that, I would like to ask community about any other >> > ideas >> > how we can improve that. If You have any ideas, please send it in >> > this email >> > thread or reach out to me directly on irc. >> > We want to discuss about them in the next video CI meeting which will >> > be on >> > November 30th. 
If You would have any idea and would like to join that >> > discussion, You are more than welcome in that meeting of course :) >> > >> > [1] >> > http://lists.openstack.org/pipermail/openstack-discuss/2021-November/ >> > 025759.html >> >> >> [1] >> >> https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure&orderby=-date_last_updated&start=0 >> >> >> > >> > -- >> > Slawek Kaplonski >> > Principal Software Engineer >> > Red Hat >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jake.yip at ardc.edu.au Mon Nov 22 09:01:28 2021 From: jake.yip at ardc.edu.au (Jake Yip) Date: Mon, 22 Nov 2021 20:01:28 +1100 Subject: [kolla-ansible][wallaby][magnum] calico works, flannel does not work In-Reply-To: References: Message-ID: <89b9fdf7-dcd9-87f1-fbac-114776c0cf25@ardc.edu.au> Hi, It could be due to https://github.com/flannel-io/flannel/issues/1155. One way to find out is to restart all the flannel pods after the cluster has been created, kubectl -n kube-system delete pod -l app=flannel Hope this helps. Regards, Jake On 22/11/21 12:27 am, wodel youchi wrote: > Hi, > I don't have any experience in kubernetes yet, my goal in the present is > to test that the auto-sacele of a kubernetes cluster is working properly > using magnum. > > I followed some tutorials to get the test done. I tried to configure the > test but I had problems with magnum-metrics-server not being able to > provide metrics, after digging and reading a lot of things on the web, I > found out that kubernetes master couldn't connect to the metrics server > (curl -k https://ip_metrics_server didn't work) > So I created another template but this time using calico as a network > provider, and this time it worked. > > How can I find why flannel does not work? where should I look? > > Regards. From thierry at openstack.org Mon Nov 22 11:08:09 2021 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 22 Nov 2021 12:08:09 +0100 Subject: [largescale-sig] Next meeting: Nov 24th, 15utc Message-ID: Hi everyone, The Large Scale SIG will be meeting this Wednesday in #openstack-operators on OFTC IRC, at 15UTC. You can doublecheck how that time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20211124T15 We will be discussing our upcoming OpenInfra Live! episode on Dec 9. Feel free to add topics to our agenda at: https://etherpad.openstack.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From DHilsbos at performair.com Mon Nov 22 15:52:59 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Mon, 22 Nov 2021 15:52:59 +0000 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> Message-ID: <0670B960225633449A24709C291A525251D4FF1B@COM03.performair.local> Mohammed; Yep, I'm aware. I have 3 Nova servers, 1 runs CentOS, 2 run Ubuntu. I'm only trying to live migrate between the 2 Ubuntu servers. Thank you, Dominic L. Hilsbos, MBA Vice President ? Information Technology Perform Air International Inc. 
DHilsbos at PerformAir.com www.PerformAir.com From: Mohammed Naser [mailto:mnaser at vexxhost.com] Sent: Friday, November 19, 2021 6:33 PM To: Dominic Hilsbos Cc: openstack-discuss at lists.openstack.org Subject: Re: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Just a heads up even if you get things working you?re not going to be able to live migrate from centos to ubuntu and vice versa since there?s going to be things like apparmor and SELinux issues? On Fri, Nov 19, 2021 at 11:35 AM wrote: All; I feel like I've dealt with this issue before, but I can't find any records of it. I've been swapping out the compute nodes in my cluster for newer and better hardware.? We also decided to abandon CentOS.? All the differences mean that we haven't been able to do live migrations.? I now have 2 servers with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like to get live migration working again. I configured passwordless ssh access between the servers for the nova users to get cold migration working.? I have also configured passwordless ssh for the root users in accordance with [1]. When I try to do a live migration, the origin server generates this error, in the nova-compute log: 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed At one point, I came across a tutorial on configuring live-migration for libvirt, which included a bunch of user configuration.? I don't remember having to do that before, but is that what I need to be looking for? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com 1: https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations -- Mohammed Naser VEXXHOST, Inc. From DHilsbos at performair.com Mon Nov 22 16:09:23 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Mon, 22 Nov 2021 16:09:23 +0000 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> Message-ID: <0670B960225633449A24709C291A525251D4FF98@COM03.performair.local> Laurent; We're running Victoria. Here are specific package versions: Ubuntu: 20.10 nova-compute: 22.2.1-0ubuntu1 (both) nova-compute-kvm: 22.2.1-0ubuntu1 (both) qemu-kvm: 5.0-5unbuntu9.9 (both) libvirt-daemon: 6.6.0-1ubuntu3.5 (both) As I said, this has come up for me before, but I can't find records of how it was addressed. I don't remember an issue of authentication from before, however. From before, I do remember that after the ssh connection to setup the new host, qemu/kvm on the old host makes a connection to qemu/kvm on the new host, in order to coordinate the transfer of memory contents, and other dynamic elements. Yes, I can cold migrate between all 3 servers, which makes this a non-critical issue. 
While I have a CentOS Nova host, I'm not going to attempt to get live-migration working between the Ubuntu Servers Changing the configuration of libvirt from system sockets to native listeners got me past a connection refused error (it appears that lbvirt also connects from one server to another?) Thank you, Dominic L. Hilsbos, MBA Vice President ? Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From: Laurent Dumont [mailto:laurentfdumont at gmail.com] Sent: Friday, November 19, 2021 6:45 PM To: Mohammed Naser Cc: Dominic Hilsbos; openstack-discuss Subject: Re: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Which version of Openstack are you running? It seems to try to connect over qemu with auth over tcp. Without ssh? Is the cold migration working now? On Fri, Nov 19, 2021 at 8:35 PM Mohammed Naser wrote: Just a heads up even if you get things working you?re not going to be able to live migrate from centos to ubuntu and vice versa since there?s going to be things like apparmor and SELinux issues? On Fri, Nov 19, 2021 at 11:35 AM wrote: All; I feel like I've dealt with this issue before, but I can't find any records of it. I've been swapping out the compute nodes in my cluster for newer and better hardware.? We also decided to abandon CentOS.? All the differences mean that we haven't been able to do live migrations.? I now have 2 servers with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like to get live migration working again. I configured passwordless ssh access between the servers for the nova users to get cold migration working.? I have also configured passwordless ssh for the root users in accordance with [1]. When I try to do a live migration, the origin server generates this error, in the nova-compute log: 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed At one point, I came across a tutorial on configuring live-migration for libvirt, which included a bunch of user configuration.? I don't remember having to do that before, but is that what I need to be looking for? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com 1: https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations -- Mohammed Naser VEXXHOST, Inc. From jake.yip at unimelb.edu.au Mon Nov 22 09:24:26 2021 From: jake.yip at unimelb.edu.au (Jake Yip) Date: Mon, 22 Nov 2021 20:24:26 +1100 Subject: Migration from Midonet to OVN Message-ID: <4d91ad35-5fda-3164-4d9a-a624cb8faf4c@unimelb.edu.au> Hi all, We are planning a migration from Midonet to OVN. The general idea is to: - pause neutron - do a `neutron-ovn-db-sync-util` - change networks/ports to geneve/ovs - hard reboot the instances We are wondering if anyone has done a migration like this before, and will like to share their experiences. Any input will be greatly appreciated. 
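For concreteness, the sync step we have in mind looks roughly like the following. This is only a sketch: the config file paths are the usual defaults rather than anything from our deployment, and the exact flags and sync mode still need to be confirmed against the neutron release we run:

    neutron-ovn-db-sync-util \
        --config-file /etc/neutron/neutron.conf \
        --config-file /etc/neutron/plugins/ml2/ml2_conf.ini \
        --ovn-neutron_sync_mode repair

The "change networks/ports to geneve/ovs" step would then be direct updates to the neutron database (for example rewriting network_type on the affected networksegments rows and the vif_type/vif_details on ml2_port_bindings), which we would rehearse on a copy of the database before the hard reboots.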
Regards, Jake -- Jake Yip DevOps Engineer, ARDC Nectar Research Cloud From mathur.hitesh at gmail.com Mon Nov 22 10:00:50 2021 From: mathur.hitesh at gmail.com (Hitesh Mathur) Date: Mon, 22 Nov 2021 12:00:50 +0200 Subject: Cinder NFS Encryption Message-ID: Hi, I am not able to find whether Cinder support NFS data-in-transit encryption or not. Can you please provide the information on this and how to use it ? -- Regards Hitesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From damien.rannou at ovhcloud.com Mon Nov 22 16:11:22 2021 From: damien.rannou at ovhcloud.com (Damien Rannou) Date: Mon, 22 Nov 2021 16:11:22 +0000 Subject: [neutron] default QOS on L3 gateway Message-ID: Hello We are currently playing with QOS on L3 agent, mostly for SNAT, but it can apply also on FIP. Everything is working properly, but I?m wondering if there is a way to define a ? default ? QOS that would be applied on Router creation, but also when the user is setting ? no_qos ? on his router. On a public cloud environnement, we cannot let the customers without any QOS limitation. Thanks ! Damien From DHilsbos at performair.com Mon Nov 22 17:01:55 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Mon, 22 Nov 2021 17:01:55 +0000 Subject: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed In-Reply-To: <0670B960225633449A24709C291A525251D4FF98@COM03.performair.local> References: <0670B960225633449A24709C291A525251D4DE7F@COM03.performair.local> <0670B960225633449A24709C291A525251D4FF98@COM03.performair.local> Message-ID: <0670B960225633449A24709C291A525251D50128@COM03.performair.local> Laurent; Your message ended up pointing me in the right direction. I started asking myself why libvirtd came from Ubuntu configured incorrectly for live migrations. The obvious answer is: it didn't. That suggested that I ad configured something incorrectly. That realization, together with the discussion from [1] led me to set libvirt.live_migration_scheme="ssh" in nova.conf. After restarting nova-compute, I can now live migrate instances between Ubuntu servers. Thank you for your assistance. Thank you, Dominic L. Hilsbos, MBA Vice President ? Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com [1] https://bugzilla.redhat.com/show_bug.cgi?id=1254307 -----Original Message----- From: DHilsbos at performair.com [mailto:DHilsbos at performair.com] Sent: Monday, November 22, 2021 9:09 AM To: laurentfdumont at gmail.com; mnaser at vexxhost.com Cc: openstack-discuss at lists.openstack.org Subject: RE: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Laurent; We're running Victoria. Here are specific package versions: Ubuntu: 20.10 nova-compute: 22.2.1-0ubuntu1 (both) nova-compute-kvm: 22.2.1-0ubuntu1 (both) qemu-kvm: 5.0-5unbuntu9.9 (both) libvirt-daemon: 6.6.0-1ubuntu3.5 (both) As I said, this has come up for me before, but I can't find records of how it was addressed. I don't remember an issue of authentication from before, however. From before, I do remember that after the ssh connection to setup the new host, qemu/kvm on the old host makes a connection to qemu/kvm on the new host, in order to coordinate the transfer of memory contents, and other dynamic elements. Yes, I can cold migrate between all 3 servers, which makes this a non-critical issue. 
While I have a CentOS Nova host, I'm not going to attempt to get live-migration working between the Ubuntu Servers Changing the configuration of libvirt from system sockets to native listeners got me past a connection refused error (it appears that lbvirt also connects from one server to another?) Thank you, Dominic L. Hilsbos, MBA Vice President ? Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From: Laurent Dumont [mailto:laurentfdumont at gmail.com] Sent: Friday, November 19, 2021 6:45 PM To: Mohammed Naser Cc: Dominic Hilsbos; openstack-discuss Subject: Re: [ops][nova][victoria] QEMU/KVM libvirt Authentication Failed Which version of Openstack are you running? It seems to try to connect over qemu with auth over tcp. Without ssh? Is the cold migration working now? On Fri, Nov 19, 2021 at 8:35 PM Mohammed Naser wrote: Just a heads up even if you get things working you?re not going to be able to live migrate from centos to ubuntu and vice versa since there?s going to be things like apparmor and SELinux issues? On Fri, Nov 19, 2021 at 11:35 AM wrote: All; I feel like I've dealt with this issue before, but I can't find any records of it. I've been swapping out the compute nodes in my cluster for newer and better hardware.? We also decided to abandon CentOS.? All the differences mean that we haven't been able to do live migrations.? I now have 2 servers with the same CPUs, OS (Ubuntu), OS Version (20.10), etc., and would like to get live migration working again. I configured passwordless ssh access between the servers for the nova users to get cold migration working.? I have also configured passwordless ssh for the root users in accordance with [1]. When I try to do a live migration, the origin server generates this error, in the nova-compute log: 2021-11-19 15:52:31.130 15610 ERROR nova.virt.libvirt.driver [-] [instance: 5935c07d-0c7f-48cc-a4b9-674504fc6005] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed: libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tcp:///system: authentication failed: authentication failed At one point, I came across a tutorial on configuring live-migration for libvirt, which included a bunch of user configuration.? I don't remember having to do that before, but is that what I need to be looking for? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com 1: https://docs.openstack.org/nova/victoria/admin/configuring-migrations.html#section-configuring-compute-migrations -- Mohammed Naser VEXXHOST, Inc. From katonalala at gmail.com Mon Nov 22 17:45:26 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 22 Nov 2021 18:45:26 +0100 Subject: [neutron] default QOS on L3 gateway In-Reply-To: References: Message-ID: Hi, There is an RFE to inherit network QoS: https://bugs.launchpad.net/neutron/+bug/1950454 The patch series: https://review.opendev.org/q/topic:%22bug%252F1950454%22+(status:open%20OR%20status:merged) Hope this covers your usecase. Lajos Katona (lajoskatona) Damien Rannou ezt ?rta (id?pont: 2021. nov. 22., H, 17:23): > Hello > We are currently playing with QOS on L3 agent, mostly for SNAT, but it can > apply also on FIP. > Everything is working properly, but I?m wondering if there is a way to > define a ? default ? 
QOS that would be applied on Router creation, > but also when the user is setting ? no_qos ? on his router. > > On a public cloud environnement, we cannot let the customers without any > QOS limitation. > > Thanks ! > Damien -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Mon Nov 22 17:52:15 2021 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 22 Nov 2021 18:52:15 +0100 Subject: Migration from Midonet to OVN In-Reply-To: <4d91ad35-5fda-3164-4d9a-a624cb8faf4c@unimelb.edu.au> References: <4d91ad35-5fda-3164-4d9a-a624cb8faf4c@unimelb.edu.au> Message-ID: Hi For tripleo there's a documentation, worth read it to have a general view what tripleo does to migrate from OVS to OVN: https://docs.openstack.org/neutron/latest/ovn/migration.html There are playbooks for it in Neutron repo, that will be helpful as well I hope: https://opendev.org/openstack/neutron/src/branch/master/tools/ovn_migration Regards Lajos Katona (lajoskatona) Jake Yip ezt ?rta (id?pont: 2021. nov. 22., H, 17:23): > Hi all, > > We are planning a migration from Midonet to OVN. The general idea is to: > > - pause neutron > - do a `neutron-ovn-db-sync-util` > - change networks/ports to geneve/ovs > - hard reboot the instances > > We are wondering if anyone has done a migration like this before, and > will like to share their experiences. > > Any input will be greatly appreciated. > > Regards, > Jake > > -- > Jake Yip > DevOps Engineer, ARDC Nectar Research Cloud > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Mon Nov 22 20:21:52 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Mon, 22 Nov 2021 15:21:52 -0500 Subject: [cinder] reminder: this week's meeting in video+IRC Message-ID: Quick reminder that this week's Cinder team meeting on Wednesday 24 November, being the final meeting of the month, will be held in both videoconference and IRC at the regularly scheduled time of 1400 UTC. These are the video meeting rules we've agreed to: * Everyone will keep IRC open during the meeting. * We'll take notes in IRC to leave a record similar to what we have for our regular IRC meetings. * Some people are more comfortable communicating in written English. So at any point, any attendee may request that the discussion of the current topic be conducted entirely in IRC. * The meeting will be recorded. connection info: https://bluejeans.com/3228528973 meeting agenda: https://etherpad.opendev.org/p/cinder-yoga-meetings cheers, brian From gouthampravi at gmail.com Mon Nov 22 20:28:14 2021 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Mon, 22 Nov 2021 12:28:14 -0800 Subject: [manila] No IRC meeting on 25th Nov Message-ID: Hello Zorillas, In lieu of holidays this week, we'll be skipping the weekly meeting earlier scheduled at 1500 UTC on 25th Nov 2021. Please reach out here or on OFTC's #openstack-manila should you have any matters that need to be addressed. Thanks, Goutham From sombrafam at gmail.com Mon Nov 22 20:46:55 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Mon, 22 Nov 2021 17:46:55 -0300 Subject: [neutron] Neutron OVN+QoS Support Message-ID: Hi folks, I have a question related to the Neutron supportability of OVN+QoS. I have checked the config reference for both Victoria and Xena[1] [2] and they are shown as supported (bw limit, eggress/ingress), but I tried to set up an env with OVN+QoS but the rules are not being effective (VMs still download at maximum speed). 
I double-checked the configuration in the neutron API and it brings the QoS settings[3] [4] [5] , and the versions[6] [7] I'm using should support it. What makes me more confused is that there's a document[8] [9] with a gap analysis of the OVN vs OVS QoS functionality and the document *is* being updated over the releases, but it still shows that QoS is not supported in OVN. So, is there something I'm missing? Erlon _______________ [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html [3] QoS Config: https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 [4] neutron.conf: https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 [5] ml2_conf.ini: https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd [6] neutron-api-0 versions: https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 [7] nova-compute-0 versions: https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 [8] Gaps from ML2/OVS-OVN Xena: https://docs.openstack.org/neutron/xena/ovn/gaps.html [9] Gaps from ML2/OVS-OVN Victoria: https://docs.openstack.org/neutron/victoria/ovn/gaps.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From rsafrono at redhat.com Mon Nov 22 21:45:35 2021 From: rsafrono at redhat.com (Roman Safronov) Date: Mon, 22 Nov 2021 23:45:35 +0200 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Erlon, I have a couple of questions that probably will help to understand the issue better. Have you applied the QoS rules on a port, network or floating ip? Have you applied the QoS rules before starting the VM (before it's port is active) or after? Thanks On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz wrote: > Hi folks, > > I have a question related to the Neutron supportability of OVN+QoS. I have > checked the config reference for both > Victoria and Xena[1] > [2] > and they > are shown as supported (bw limit, eggress/ingress), but I tried to set up > an env > with OVN+QoS but the rules are not being effective (VMs still download at > maximum speed). I double-checked > the configuration in the neutron API and it brings the QoS settings[3] > [4] > [5] > , and > the versions[6] > [7] > I'm > using should support it. > > What makes me more confused is that there's a document[8] > [9] > with a gap > analysis of the OVN vs OVS QoS functionality > and the document *is* being updated over the releases, but it still shows > that QoS is not supported in OVN. > > So, is there something I'm missing? > > Erlon > _______________ > [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html > [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html > [3] QoS Config: > https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 > [4] neutron.conf: > https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 > [5] ml2_conf.ini: > https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd > [6] neutron-api-0 versions: > https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 > [7] nova-compute-0 versions: > https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 > [8] Gaps from ML2/OVS-OVN Xena: > https://docs.openstack.org/neutron/xena/ovn/gaps.html > [9] Gaps from ML2/OVS-OVN Victoria: > https://docs.openstack.org/neutron/victoria/ovn/gaps.html > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ces.eduardo98 at gmail.com Mon Nov 22 22:03:10 2021 From: ces.eduardo98 at gmail.com (Carlos Silva) Date: Mon, 22 Nov 2021 19:03:10 -0300 Subject: [manila] First Yoga cycle bug squash - Nov 29th - 3rd Dec Message-ID: Greetings Zorillas and interested stackers! As mentioned in the previous weekly meetings, we will soon be meeting for the first bugsquash of the Yoga release! The event will be held from Nov 29th to 3rd December, 2021, providing an extended contribution window. We will start the event with a call on the first day (Nov 29th). There will be three calls (one of them using our Manila upstream meeting time slot). All of the three calls of the week will be held in a Jitsi room [1]. Nov 29th - 1500 to 1540 UTC - Event opening and bug assignments Dec 2nd - 1500 to 1600 UTC - Collab review session (no regular Manila meeting on this day) Dec 3rd - 1500 to 1540 UTC - Status update and event wrap up A list of selected bugs will be shared here [2]. Please feel free to add any additional bugs you would like to address during the event. [1] https://meetpad.opendev.org/ManilaYogaM1Bugsquash [2] https://ethercalc.openstack.org/wvb2oa23rxbb Thank you in advance! Hope to see you there :) carloss -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Mon Nov 22 22:05:47 2021 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Mon, 22 Nov 2021 17:05:47 -0500 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: How high did you have to raise it? If it does appear after X amount of time, then the VIF plug is not lost? On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet wrote: > + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to > some high number ..problem dissapear ... But it's only workaround > > D?a so 20. 11. 2021, 12:05 Michal Arbet > nap?sal(a): > >> Hi, >> >> Has anyone seen issue which I am currently facing ? >> >> When launching heat stack ( but it's same if I launch several of >> instances ) vif plugged in timeouts an I don't know why, sometimes it is OK >> ..sometimes is failing. >> >> Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes >> it's 100 and more seconds, it seems there is some race condition but I >> can't find out where the problem is. But on the end every instance is >> spawned ok (retry mechanism worked). >> >> Another finding is that it has to do something with security group, if >> noop driver is used ..everything is working good. >> >> Firewall security setup is openvswitch . >> >> Test env is wallaby. >> >> I will attach some logs when I will be near PC .. >> >> Thank you, >> Michal Arbet (Kevko) >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Mon Nov 22 22:20:33 2021 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Mon, 22 Nov 2021 14:20:33 -0800 Subject: [requirements][cinder][manila] pyparsing update needs handling In-Reply-To: <20211121235128.a6jlampynlotp4yy@mthode.org> References: <20211121235128.a6jlampynlotp4yy@mthode.org> Message-ID: On Sun, Nov 21, 2021 at 4:02 PM Matthew Thode wrote: > > https://review.opendev.org/818614 > > For cinder it looks like operatorPrecedence is gone now. > > For manila it looks like the same thing. Thanks for pointing it out. 
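For anyone else hitting this: pyparsing 3.x dropped the old operatorPrecedence name in favour of infix_notation, which takes the same (base_expr, op_list) arguments, so the change is usually just the rename. A minimal sketch (not the actual cinder/manila change), assuming pyparsing 3.x:

    import pyparsing as pp

    integer = pp.Word(pp.nums)
    # before: expr = pp.operatorPrecedence(integer, [...])
    expr = pp.infix_notation(
        integer,
        [
            ("*", 2, pp.opAssoc.LEFT),  # binary operator, left-associative
            ("+", 2, pp.opAssoc.LEFT),
        ],
    )
    print(expr.parse_string("1 + 2 * 3"))

(If the code still needs to run against older pyparsing as well, the camelCase infixNotation spelling should be available in both series, but that is worth double-checking against the minimum version in requirements.)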
Fixing it with https://review.opendev.org/c/openstack/manila/+/818829 and https://review.opendev.org/c/openstack/cinder/+/818834 > > -- > Matthew Thode > From gmann at ghanshyammann.com Tue Nov 23 01:11:08 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 22 Nov 2021 19:11:08 -0600 Subject: [all][refstack][neutron][kolla][ironic][heat][trove][senlin][barbican][manila] Fixing Zuul Config Errors In-Reply-To: <17d39ff7f1e.db45f09a985367.5739424682253609439@ghanshyammann.com> References: <5072b00c-bbf6-42e9-830f-d6598a76beb7@www.fastmail.com> <5517904.DvuYhMxLoT@p1> <4359750.LvFx2qVVIh@p1> <5fea46fa-f304-48fa-a171-ce1114700379@www.fastmail.com> <17d39ff7f1e.db45f09a985367.5739424682253609439@ghanshyammann.com> Message-ID: <17d4a58ba7d.e83d82681106387.2917303785697015051@ghanshyammann.com> ---- On Fri, 19 Nov 2021 14:59:45 -0600 Ghanshyam Mann wrote ---- > ---- On Thu, 18 Nov 2021 08:40:38 -0600 Clark Boylan wrote ---- > > On Thu, Nov 18, 2021, at 3:11 AM, Pavlo Shchelokovskyy wrote: > > > Hi Clark, > > > > > > Why is the retirement of openstack/neutron-lbaas being a problem? > > > > > > The repo is there and accessible under the same URL, it has > > > (potentially working) stable/pike and stable/queens branches, and was > > > not retired at the time of Pike or Queens, so IMO it is a valid request > > > for testing configuration in the same branches of other projects, > > > openstack/heat in this case. > > > > > > Maybe we should leave some minimal zuul configs in retired projects for > > > zuul to find them? > > > > The reason for this is one of the steps for project retirement is to remove the repo from zuul [0]. If the upstream for the project has retired the project I think it is reasonable to remove it from our systems like Zuul. The project isn't retired on a per branch basis. > > > > I do not think the Zuul operators should be responsible for keeping retired projects alive in the CI system if their maintainers have stopped maintaining them. Instead we should remove them from the CI system. > > I agree, in case of deprecation we take care of stable branch and in retirement case, it means it is gone completely so usage of retired repo has to be completely > cleanup from everywhere (master or stable). I am adding this to the TC meeting agenda so that we do not forget it. Also, created the below etherpad to track the progress, please add the progress and patch links under your project: - https://etherpad.opendev.org/p/zuul-config-error-openstack -gmann > > -gmann > > > > > [0] https://opendev.org/openstack/project-config/commit/3832c8fafc4d4e03c306c21f37b2d39bd7c5bd2b > > > > > > From gmann at ghanshyammann.com Tue Nov 23 01:12:20 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 22 Nov 2021 19:12:20 -0600 Subject: [all][tc] Technical Committee next weekly meeting on Nov 25th at 1500 UTC Message-ID: <17d4a59d276.fd9316fb1106394.2758893334791698593@ghanshyammann.com> Hello Everyone, Technical Committee's next weekly meeting is scheduled for Nov 25th at 1500 UTC. If you would like to add topics for discussion, please add them to the below wiki page by Wednesday, Nov 24th, at 2100 UTC. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting -gmann From gagehugo at gmail.com Tue Nov 23 04:55:46 2021 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 22 Nov 2021 22:55:46 -0600 Subject: [openstack-helm] No Meeting Tomorrow Message-ID: Hey team, Since there are no agenda items [0] for the IRC meeting tomorrow, the meeting is cancelled. 
Our next meeting will be November 30th. [0] https://etherpad.opendev.org/p/openstack-helm-weekly-meeting -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.vedel at univ-grenoble-alpes.fr Tue Nov 23 07:57:51 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Tue, 23 Nov 2021 08:57:51 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> Message-ID: <30680C7F-7E55-4098-8438-F3CEE48C4A90@univ-grenoble-alpes.fr> Ignazio, Radoslaw, thanks to you, I made some modifications and my environment seems to work better (the images are placed on the iiscsi bay on which the volumes are stored). I installed the cache for glance. It works, well I think it does. My question is: between the different formats (qcow2, raw or other), which is the most efficient if - we create a volume then an instance from the volume - we create an instance from the image - we create an instance without volume - we create a snapshot then an instance from the snapshot Franck > > >> Le 19 nov. 2021 ? 14:50, Ignazio Cassano > a ?crit : >> >> Franck, this help you a lot. >> Thanks Radoslaw >> Ignazio >> >> Il giorno ven 19 nov 2021 alle ore 12:03 Rados?aw Piliszek > ha scritto: >> If one sets glance_file_datadir_volume to non-default, then glance-api >> gets deployed on all hosts. >> >> -yoctozepto >> >> On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano > wrote: >> > >> > Hello Franck, glance is not deployed on all nodes at default. >> > I got the same problem >> > In my case I have 3 controllers. >> > I created an nfs share on a storage server where to store images. >> > Before deploying glance, I create /var/lib/glance/images on the 3 controllers and I mount the nfs share. >> > This is my fstab on the 3 controllers: >> > >> > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime >> > >> > In my globals.yml I have: >> > glance_file_datadir_volume: "/var/lib/glance" >> > glance_backend_file: "yes" >> > >> > This means images are on /var/lib/glance and since it is a nfs share all my 3 controlles can share images. >> > Then you must deploy. >> > To be sure the glance container is started on all controllers, since I have 3 controllers, I deployed 3 times changing the order in the inventory. >> > First time: >> > [control] >> > A >> > B >> > C >> > >> > Second time: >> > [control] >> > B >> > C >> > A >> > >> > Third time: >> > [control] >> > C >> > B >> > A >> > >> > Or you can deploy glance 3 times using -t glance and -l >> > >> > As far as the instance stopped, I got I bug with a version of kolla. >> > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 >> > Now is corrected and with kolla 12.2.0 it works. >> > Ignazio >> > >> > >> > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL > ha scritto: -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Tue Nov 23 08:14:49 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 23 Nov 2021 09:14:49 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: <30680C7F-7E55-4098-8438-F3CEE48C4A90@univ-grenoble-alpes.fr> References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> <30680C7F-7E55-4098-8438-F3CEE48C4A90@univ-grenoble-alpes.fr> Message-ID: Franck, If the cache works fine , I think glance image format could be qcow2. 
The volume is created in raw format but the download phase is executed only the fisrt time you create a volume from a new image. With this setup I can create 20-30 instance in a shot and it takes few minutes to complete. I always use general purpose small images and colplete the instance configuration (package installation and so on) with heat or ansible. Ignazio Il giorno mar 23 nov 2021 alle ore 08:57 Franck VEDEL < franck.vedel at univ-grenoble-alpes.fr> ha scritto: > Ignazio, Radoslaw, > > thanks to you, I made some modifications and my environment seems to work > better (the images are placed on the iiscsi bay on which the volumes are > stored). > I installed the cache for glance. It works, well I think it does. > > My question is: between the different formats (qcow2, raw or other), which > is the most efficient if > - we create a volume then an instance from the volume > - we create an instance from the image > - we create an instance without volume > - we create a snapshot then an instance from the snapshot > > Franck > > > > > Le 19 nov. 2021 ? 14:50, Ignazio Cassano a > ?crit : > > Franck, this help you a lot. > Thanks Radoslaw > Ignazio > > Il giorno ven 19 nov 2021 alle ore 12:03 Rados?aw Piliszek < > radoslaw.piliszek at gmail.com> ha scritto: > >> If one sets glance_file_datadir_volume to non-default, then glance-api >> gets deployed on all hosts. >> >> -yoctozepto >> >> On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano >> wrote: >> > >> > Hello Franck, glance is not deployed on all nodes at default. >> > I got the same problem >> > In my case I have 3 controllers. >> > I created an nfs share on a storage server where to store images. >> > Before deploying glance, I create /var/lib/glance/images on the 3 >> controllers and I mount the nfs share. >> > This is my fstab on the 3 controllers: >> > >> > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs >> rw,user=glance,soft,intr,noatime,nodiratime >> > >> > In my globals.yml I have: >> > glance_file_datadir_volume: "/var/lib/glance" >> > glance_backend_file: "yes" >> > >> > This means images are on /var/lib/glance and since it is a nfs share >> all my 3 controlles can share images. >> > Then you must deploy. >> > To be sure the glance container is started on all controllers, since I >> have 3 controllers, I deployed 3 times changing the order in the inventory. >> > First time: >> > [control] >> > A >> > B >> > C >> > >> > Second time: >> > [control] >> > B >> > C >> > A >> > >> > Third time: >> > [control] >> > C >> > B >> > A >> > >> > Or you can deploy glance 3 times using -t glance and -l >> > >> > As far as the instance stopped, I got I bug with a version of kolla. >> > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 >> > Now is corrected and with kolla 12.2.0 it works. >> > Ignazio >> > >> > >> > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL < >> franck.vedel at univ-grenoble-alpes.fr> ha scritto: >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nsitlani03 at gmail.com Tue Nov 23 09:29:08 2021 From: nsitlani03 at gmail.com (Namrata Sitlani) Date: Tue, 23 Nov 2021 14:59:08 +0530 Subject: [magnum] [victoria] [fedora-coreos] Message-ID: Hello All, We run release Victoria(11.1.1) and we deployed Kubernetes version 1.18.20 on Magnum with Fedora CoreOS version 33 and recently we ran into issues with Cinder CSI plugin. Multiple master nodes break the CSI plugin, but if I use a single master everything works fine. 
In my understanding it could be related to Fedora CoreOS version. It is not clear which versions are supported. Can somebody give me information on which version is supported? Thanks, Namrata Sitlani -------------- next part -------------- An HTML attachment was scrubbed... URL: From sombrafam at gmail.com Tue Nov 23 10:38:38 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Tue, 23 Nov 2021 07:38:38 -0300 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Roman, Forgot to add that detail, since I run the same routine in a non-ovn deployment and it worked. But this is how I did it: openstack network qos policy list openstack network qos policy create bw-limiter openstack network qos rule create --type bandwidth-limit --max-kbps 512 --max-burst-kbits 512 --egress bw-limiter openstack network qos rule create --type bandwidth-limit --max-kbps 512 --max-burst-kbits 512 --ingress bw-limiter openstack network set --qos-policy bw-limiter ext_net I didn't set it in the port though, which is something I should do. I'll set it in the port too for testing but I think the above should work regardless. Erlon Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov escreveu: > Hi Erlon, > > I have a couple of questions that probably will help to understand the > issue better. > Have you applied the QoS rules on a port, network or floating ip? > Have you applied the QoS rules before starting the VM (before it's port is > active) or after? > > Thanks > > > On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz wrote: > >> Hi folks, >> >> I have a question related to the Neutron supportability of OVN+QoS. I >> have checked the config reference for both >> Victoria and Xena[1] >> [2] >> and they >> are shown as supported (bw limit, eggress/ingress), but I tried to set up >> an env >> with OVN+QoS but the rules are not being effective (VMs still download at >> maximum speed). I double-checked >> the configuration in the neutron API and it brings the QoS settings[3] >> [4] >> [5] >> , >> and the versions[6] >> [7] >> I'm >> using should support it. >> >> What makes me more confused is that there's a document[8] >> [9] >> with a gap >> analysis of the OVN vs OVS QoS functionality >> and the document *is* being updated over the releases, but it still shows >> that QoS is not supported in OVN. >> >> So, is there something I'm missing? >> >> Erlon >> _______________ >> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html >> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >> [3] QoS Config: >> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >> [4] neutron.conf: >> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >> [5] ml2_conf.ini: >> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >> [6] neutron-api-0 versions: >> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >> [7] nova-compute-0 versions: >> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >> [8] Gaps from ML2/OVS-OVN Xena: >> https://docs.openstack.org/neutron/xena/ovn/gaps.html >> [9] Gaps from ML2/OVS-OVN Victoria: >> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rsafrono at redhat.com Tue Nov 23 11:12:03 2021 From: rsafrono at redhat.com (Roman Safronov) Date: Tue, 23 Nov 2021 13:12:03 +0200 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Erlon, There was a bug with setting QoS on a network but it had been fixed long ago. https://bugs.launchpad.net/neutron/+bug/1851362 or https://bugzilla.redhat.com/show_bug.cgi?id=1934096 At least in our downstream CI we do not observe such issues with QoS+OVN. >From the commands I see that you apply the QoS rule on the external network, right? On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz wrote: > Hi Roman, > > Forgot to add that detail, since I run the same routine in a non-ovn > deployment and it worked. But this is how I did it: > > openstack network qos policy list > openstack network qos policy create bw-limiter > openstack network qos rule create --type bandwidth-limit --max-kbps 512 > --max-burst-kbits 512 --egress bw-limiter > openstack network qos rule create --type bandwidth-limit --max-kbps 512 > --max-burst-kbits 512 --ingress bw-limiter > openstack network set --qos-policy bw-limiter ext_net > > I didn't set it in the port though, which is something I should do. I'll > set it in the port too for testing but I think the above should > work regardless. > > Erlon > > > Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov > escreveu: > >> Hi Erlon, >> >> I have a couple of questions that probably will help to understand the >> issue better. >> Have you applied the QoS rules on a port, network or floating ip? >> Have you applied the QoS rules before starting the VM (before it's port >> is active) or after? >> >> Thanks >> >> >> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz wrote: >> >>> Hi folks, >>> >>> I have a question related to the Neutron supportability of OVN+QoS. I >>> have checked the config reference for both >>> Victoria and Xena[1] >>> [2] >>> and >>> they are shown as supported (bw limit, eggress/ingress), but I tried to set >>> up an env >>> with OVN+QoS but the rules are not being effective (VMs still download >>> at maximum speed). I double-checked >>> the configuration in the neutron API and it brings the QoS settings[3] >>> [4] >>> [5] >>> , >>> and the versions[6] >>> [7] >>> I'm >>> using should support it. >>> >>> What makes me more confused is that there's a document[8] >>> [9] >>> with a gap >>> analysis of the OVN vs OVS QoS functionality >>> and the document *is* being updated over the releases, but it still >>> shows that QoS is not supported in OVN. >>> >>> So, is there something I'm missing? >>> >>> Erlon >>> _______________ >>> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>> [3] QoS Config: >>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>> [4] neutron.conf: >>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>> [5] ml2_conf.ini: >>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>> [6] neutron-api-0 versions: >>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>> [7] nova-compute-0 versions: >>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>> [8] Gaps from ML2/OVS-OVN Xena: >>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>> [9] Gaps from ML2/OVS-OVN Victoria: >>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>> >>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lokendrarathour at gmail.com Tue Nov 23 11:45:38 2021 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Tue, 23 Nov 2021 17:15:38 +0530 Subject: [Triple 0] Undercloud deployment Getting failed Message-ID: Hi Team getting strange error when installing triple O Train on Centos 8.4 '--volume', '/var/lib/container-puppet/puppetlabs/facter.conf:/etc/puppetlabs/facter/facter.conf:ro', '--volume', '/var/lib/container-puppet/puppetlabs/:/opt/puppetlabs/:ro', '--volume', '/dev/log:/dev/log:rw', '--rm', '--log-driver', 'k8s-file', '--log-opt', 'path=/var/log/containers/stdouts/container-puppet-zaqar.log', '--security-opt', 'label=disable', '--volume', '/usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro', '--entrypoint', '/var/lib/container-puppet/container-puppet.sh', '--net', 'host', '--volume', '/etc/hosts:/etc/hosts:ro', '--volume', '/var/lib/container-puppet/container-puppet.sh:/var/lib/container-puppet/container-puppet.sh:ro', 'quay.io/tripleotraincentos8/centos-binary-zaqar-wsgi:current-tripleo'] run failed after Error: container_linux.go:370: starting container process caused: error adding seccomp filter rule for syscall bdflush: permission denied: OCI permission denied", " attempt(s): 3", "2021-11-23 11:00:38,384 WARNING: 58791 -- Retrying running container: zaqar", "2021-11-23 11:00:38,384 ERROR: 58791 -- Failed running container for zaqar", "2021-11-23 11:00:38,385 INFO: 58791 -- Finished processing puppet configs for zaqar", "2021-11-23 11:00:38,385 ERROR: 58782 -- ERROR configuring crond", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring glance_api", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring heat_api", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring heat", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic_api", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring ironic_inspector", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring neutron", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring iscsid", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring keystone", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring memcached", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring mistral", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring mysql", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring nova", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring rabbitmq", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring placement", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring swift", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring swift_ringbuilder", "2021-11-23 11:00:38,386 ERROR: 58782 -- ERROR configuring zaqar" ], "stderr_lines": [], "_ansible_no_log": false, "attempts": 15 } ] ] Not cleaning working directory /home/stack/tripleo-heat-installer-templates Not cleaning ansible directory /home/stack/undercloud-ansible-mie5k51_ Install artifact is located at /home/stack/undercloud-install-20211123110040.tar.bzip2 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Deployment Failed! This issue is recurring multiple times please advise. -- ~ Lokendra -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralonsoh at redhat.com Tue Nov 23 11:47:37 2021 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 23 Nov 2021 12:47:37 +0100 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hello Erlon: We really need to review the gaps document, at least for Xena. As Roman said, we have been testing QoS in OVN successfully. The current status of QoS in OVN is (at least for Xena): - Fixed ports (VM ports): support for BW limit rules (egress/ingress) and DSCP (only egress). Neutron supports port network QoS inheritance (same as in your example). This is not for OVN but for any backend. - FIPs: support for BW limit rules (egress/ingress). Still no network QoS inheritance (in progress). - GW IP: no support yet. Ping me in #openstack-neutron channel (ralonsoh) if you have more questions. Regards. On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov wrote: > Hi Erlon, > > There was a bug with setting QoS on a network but it had been fixed long > ago. > https://bugs.launchpad.net/neutron/+bug/1851362 or > https://bugzilla.redhat.com/show_bug.cgi?id=1934096 > At least in our downstream CI we do not observe such issues with QoS+OVN. > > From the commands I see that you apply the QoS rule on the external > network, right? > > On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz wrote: > >> Hi Roman, >> >> Forgot to add that detail, since I run the same routine in a non-ovn >> deployment and it worked. But this is how I did it: >> >> openstack network qos policy list >> openstack network qos policy create bw-limiter >> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >> --max-burst-kbits 512 --egress bw-limiter >> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >> --max-burst-kbits 512 --ingress bw-limiter >> openstack network set --qos-policy bw-limiter ext_net >> >> I didn't set it in the port though, which is something I should do. I'll >> set it in the port too for testing but I think the above should >> work regardless. >> >> Erlon >> >> >> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov >> escreveu: >> >>> Hi Erlon, >>> >>> I have a couple of questions that probably will help to understand the >>> issue better. >>> Have you applied the QoS rules on a port, network or floating ip? >>> Have you applied the QoS rules before starting the VM (before it's port >>> is active) or after? >>> >>> Thanks >>> >>> >>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz wrote: >>> >>>> Hi folks, >>>> >>>> I have a question related to the Neutron supportability of OVN+QoS. I >>>> have checked the config reference for both >>>> Victoria and Xena[1] >>>> [2] >>>> and >>>> they are shown as supported (bw limit, eggress/ingress), but I tried to set >>>> up an env >>>> with OVN+QoS but the rules are not being effective (VMs still download >>>> at maximum speed). I double-checked >>>> the configuration in the neutron API and it brings the QoS settings[3] >>>> [4] >>>> [5] >>>> , >>>> and the versions[6] >>>> [7] >>>> I'm >>>> using should support it. >>>> >>>> What makes me more confused is that there's a document[8] >>>> [9] >>>> with a gap >>>> analysis of the OVN vs OVS QoS functionality >>>> and the document *is* being updated over the releases, but it still >>>> shows that QoS is not supported in OVN. >>>> >>>> So, is there something I'm missing? 
>>>> >>>> Erlon >>>> _______________ >>>> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>>> [3] QoS Config: >>>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>>> [4] neutron.conf: >>>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>>> [5] ml2_conf.ini: >>>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>>> [6] neutron-api-0 versions: >>>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>>> [7] nova-compute-0 versions: >>>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>>> [8] Gaps from ML2/OVS-OVN Xena: >>>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>>> [9] Gaps from ML2/OVS-OVN Victoria: >>>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>>> >>>> >>>> >>> >>> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sombrafam at gmail.com Tue Nov 23 13:01:41 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Tue, 23 Nov 2021 10:01:41 -0300 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Roman, Rodolfo, I tested setting the QoS policy to the port (internal) instead of the network (external), and it works! I did some more testing on the OVS vs OVN deployments and I can confirm the status you are saying. What I got was: OVS: FIP: Setting on port: FAIL Setting on network: OK Private network: Setting on port: OK Setting on network: OK Router: Internal port: OK External port: OK OVN: FIP: Setting on port: FAIL Setting on network: FAIL (I was trying this) Private network: Setting on port: OK Setting on network: OK Router: Internal port: FAIL External port: FAIL Thanks a lot for your help!! Erlon Em ter., 23 de nov. de 2021 ?s 08:47, Rodolfo Alonso Hernandez < ralonsoh at redhat.com> escreveu: > Hello Erlon: > > We really need to review the gaps document, at least for Xena. > > As Roman said, we have been testing QoS in OVN successfully. > > The current status of QoS in OVN is (at least for Xena): > - Fixed ports (VM ports): support for BW limit rules (egress/ingress) and > DSCP (only egress). Neutron supports port network QoS inheritance (same as > in your example). This is not for OVN but for any backend. > - FIPs: support for BW limit rules (egress/ingress). Still no network QoS > inheritance (in progress). > - GW IP: no support yet. > > Ping me in #openstack-neutron channel (ralonsoh) if you have more > questions. > > Regards. > > > On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov > wrote: > >> Hi Erlon, >> >> There was a bug with setting QoS on a network but it had been fixed long >> ago. >> https://bugs.launchpad.net/neutron/+bug/1851362 or >> https://bugzilla.redhat.com/show_bug.cgi?id=1934096 >> At least in our downstream CI we do not observe such issues with QoS+OVN. >> >> From the commands I see that you apply the QoS rule on the external >> network, right? >> >> On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz wrote: >> >>> Hi Roman, >>> >>> Forgot to add that detail, since I run the same routine in a non-ovn >>> deployment and it worked. 
But this is how I did it: >>> >>> openstack network qos policy list >>> openstack network qos policy create bw-limiter >>> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >>> --max-burst-kbits 512 --egress bw-limiter >>> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >>> --max-burst-kbits 512 --ingress bw-limiter >>> openstack network set --qos-policy bw-limiter ext_net >>> >>> I didn't set it in the port though, which is something I should do. I'll >>> set it in the port too for testing but I think the above should >>> work regardless. >>> >>> Erlon >>> >>> >>> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov < >>> rsafrono at redhat.com> escreveu: >>> >>>> Hi Erlon, >>>> >>>> I have a couple of questions that probably will help to understand the >>>> issue better. >>>> Have you applied the QoS rules on a port, network or floating ip? >>>> Have you applied the QoS rules before starting the VM (before it's port >>>> is active) or after? >>>> >>>> Thanks >>>> >>>> >>>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz >>>> wrote: >>>> >>>>> Hi folks, >>>>> >>>>> I have a question related to the Neutron supportability of OVN+QoS. I >>>>> have checked the config reference for both >>>>> Victoria and Xena[1] >>>>> [2] >>>>> and >>>>> they are shown as supported (bw limit, eggress/ingress), but I tried to set >>>>> up an env >>>>> with OVN+QoS but the rules are not being effective (VMs still download >>>>> at maximum speed). I double-checked >>>>> the configuration in the neutron API and it brings the QoS settings[3] >>>>> >>>>> [4] >>>>> >>>>> [5] >>>>> , >>>>> and the versions[6] >>>>> >>>>> [7] >>>>> I'm >>>>> using should support it. >>>>> >>>>> What makes me more confused is that there's a document[8] >>>>> [9] >>>>> with a >>>>> gap analysis of the OVN vs OVS QoS functionality >>>>> and the document *is* being updated over the releases, but it still >>>>> shows that QoS is not supported in OVN. >>>>> >>>>> So, is there something I'm missing? >>>>> >>>>> Erlon >>>>> _______________ >>>>> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>>>> [3] QoS Config: >>>>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>>>> [4] neutron.conf: >>>>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>>>> [5] ml2_conf.ini: >>>>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>>>> [6] neutron-api-0 versions: >>>>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>>>> [7] nova-compute-0 versions: >>>>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>>>> [8] Gaps from ML2/OVS-OVN Xena: >>>>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>>>> [9] Gaps from ML2/OVS-OVN Victoria: >>>>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>>>> >>>>> >>>>> >>>> >>>> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Tue Nov 23 13:47:20 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 23 Nov 2021 18:47:20 +0500 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Erlon, You can check below url for testing qos on FIP. I have tested it and it works fine. 
https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst Ammad On Tue, Nov 23, 2021 at 6:06 PM Erlon Cruz wrote: > Hi Roman, Rodolfo, > > I tested setting the QoS policy to the port (internal) instead of the > network (external), and it works! I did some more testing on > the OVS vs OVN deployments and I can confirm the status you are saying. > What I got was: > > OVS: > FIP: > Setting on port: FAIL > Setting on network: OK > > Private network: > Setting on port: OK > Setting on network: OK > > Router: > Internal port: OK > External port: OK > > OVN: > FIP: > Setting on port: FAIL > Setting on network: FAIL (I was trying this) > > Private network: > Setting on port: OK > Setting on network: OK > > Router: > Internal port: FAIL > External port: FAIL > > Thanks a lot for your help!! > Erlon > > Em ter., 23 de nov. de 2021 ?s 08:47, Rodolfo Alonso Hernandez < > ralonsoh at redhat.com> escreveu: > >> Hello Erlon: >> >> We really need to review the gaps document, at least for Xena. >> >> As Roman said, we have been testing QoS in OVN successfully. >> >> The current status of QoS in OVN is (at least for Xena): >> - Fixed ports (VM ports): support for BW limit rules (egress/ingress) and >> DSCP (only egress). Neutron supports port network QoS inheritance (same as >> in your example). This is not for OVN but for any backend. >> - FIPs: support for BW limit rules (egress/ingress). Still no network QoS >> inheritance (in progress). >> - GW IP: no support yet. >> >> Ping me in #openstack-neutron channel (ralonsoh) if you have more >> questions. >> >> Regards. >> >> >> On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov >> wrote: >> >>> Hi Erlon, >>> >>> There was a bug with setting QoS on a network but it had been fixed long >>> ago. >>> https://bugs.launchpad.net/neutron/+bug/1851362 or >>> https://bugzilla.redhat.com/show_bug.cgi?id=1934096 >>> At least in our downstream CI we do not observe such issues with >>> QoS+OVN. >>> >>> From the commands I see that you apply the QoS rule on the external >>> network, right? >>> >>> On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz wrote: >>> >>>> Hi Roman, >>>> >>>> Forgot to add that detail, since I run the same routine in a non-ovn >>>> deployment and it worked. But this is how I did it: >>>> >>>> openstack network qos policy list >>>> openstack network qos policy create bw-limiter >>>> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >>>> --max-burst-kbits 512 --egress bw-limiter >>>> openstack network qos rule create --type bandwidth-limit --max-kbps 512 >>>> --max-burst-kbits 512 --ingress bw-limiter >>>> openstack network set --qos-policy bw-limiter ext_net >>>> >>>> I didn't set it in the port though, which is something I should do. >>>> I'll set it in the port too for testing but I think the above should >>>> work regardless. >>>> >>>> Erlon >>>> >>>> >>>> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov < >>>> rsafrono at redhat.com> escreveu: >>>> >>>>> Hi Erlon, >>>>> >>>>> I have a couple of questions that probably will help to understand the >>>>> issue better. >>>>> Have you applied the QoS rules on a port, network or floating ip? >>>>> Have you applied the QoS rules before starting the VM (before it's >>>>> port is active) or after? >>>>> >>>>> Thanks >>>>> >>>>> >>>>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz >>>>> wrote: >>>>> >>>>>> Hi folks, >>>>>> >>>>>> I have a question related to the Neutron supportability of OVN+QoS. 
I >>>>>> have checked the config reference for both >>>>>> Victoria and Xena[1] >>>>>> [2] >>>>>> and >>>>>> they are shown as supported (bw limit, eggress/ingress), but I tried to set >>>>>> up an env >>>>>> with OVN+QoS but the rules are not being effective (VMs still >>>>>> download at maximum speed). I double-checked >>>>>> the configuration in the neutron API and it brings the QoS settings >>>>>> [3] >>>>>> >>>>>> [4] >>>>>> >>>>>> [5] >>>>>> , >>>>>> and the versions[6] >>>>>> >>>>>> [7] >>>>>> I'm >>>>>> using should support it. >>>>>> >>>>>> What makes me more confused is that there's a document[8] >>>>>> [9] >>>>>> with a >>>>>> gap analysis of the OVN vs OVS QoS functionality >>>>>> and the document *is* being updated over the releases, but it still >>>>>> shows that QoS is not supported in OVN. >>>>>> >>>>>> So, is there something I'm missing? >>>>>> >>>>>> Erlon >>>>>> _______________ >>>>>> [1] https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>>>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>>>>> [3] QoS Config: >>>>>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>>>>> [4] neutron.conf: >>>>>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>>>>> [5] ml2_conf.ini: >>>>>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>>>>> [6] neutron-api-0 versions: >>>>>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>>>>> [7] nova-compute-0 versions: >>>>>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>>>>> [8] Gaps from ML2/OVS-OVN Xena: >>>>>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>>>>> [9] Gaps from ML2/OVS-OVN Victoria: >>>>>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>> >>> >>> -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From sombrafam at gmail.com Tue Nov 23 13:51:15 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Tue, 23 Nov 2021 10:51:15 -0300 Subject: Cinder NFS Encryption In-Reply-To: References: Message-ID: Hi Hitesh, Have you checked this[1] ? As far as I remember, when you do data encryption on Cinder, using volume encryption[2] and attach the volume via network, the data is transferred encrypted and nova has to share the keys. Erlon _____________ [1] Data Encryption: https://docs.openstack.org/security-guide/tenant-data/data-encryption.html [2] Cinder Volume Encryption: https://docs.openstack.org/cinder/pike/configuration/block-storage/volume-encryption.html Em seg., 22 de nov. de 2021 ?s 13:16, Hitesh Mathur escreveu: > Hi, > > I am not able to find whether Cinder support NFS data-in-transit > encryption or not. Can you please provide the information on this and how > to use it ? > > -- > Regards > Hitesh > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sombrafam at gmail.com Tue Nov 23 13:55:58 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Tue, 23 Nov 2021 10:55:58 -0300 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Ammad, What OpenStack version did you tested? I have just performed the FIP test on Xena and it didn't work for me. See the results I posted. Erlon Em ter., 23 de nov. de 2021 ?s 10:47, Ammad Syed escreveu: > Hi Erlon, > > You can check below url for testing qos on FIP. I have tested it and it > works fine. 
> > > https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst > > Ammad > On Tue, Nov 23, 2021 at 6:06 PM Erlon Cruz wrote: > >> Hi Roman, Rodolfo, >> >> I tested setting the QoS policy to the port (internal) instead of the >> network (external), and it works! I did some more testing on >> the OVS vs OVN deployments and I can confirm the status you are saying. >> What I got was: >> >> OVS: >> FIP: >> Setting on port: FAIL >> Setting on network: OK >> >> Private network: >> Setting on port: OK >> Setting on network: OK >> >> Router: >> Internal port: OK >> External port: OK >> >> OVN: >> FIP: >> Setting on port: FAIL >> Setting on network: FAIL (I was trying this) >> >> Private network: >> Setting on port: OK >> Setting on network: OK >> >> Router: >> Internal port: FAIL >> External port: FAIL >> >> Thanks a lot for your help!! >> Erlon >> >> Em ter., 23 de nov. de 2021 ?s 08:47, Rodolfo Alonso Hernandez < >> ralonsoh at redhat.com> escreveu: >> >>> Hello Erlon: >>> >>> We really need to review the gaps document, at least for Xena. >>> >>> As Roman said, we have been testing QoS in OVN successfully. >>> >>> The current status of QoS in OVN is (at least for Xena): >>> - Fixed ports (VM ports): support for BW limit rules (egress/ingress) >>> and DSCP (only egress). Neutron supports port network QoS inheritance (same >>> as in your example). This is not for OVN but for any backend. >>> - FIPs: support for BW limit rules (egress/ingress). Still no network >>> QoS inheritance (in progress). >>> - GW IP: no support yet. >>> >>> Ping me in #openstack-neutron channel (ralonsoh) if you have more >>> questions. >>> >>> Regards. >>> >>> >>> On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov >>> wrote: >>> >>>> Hi Erlon, >>>> >>>> There was a bug with setting QoS on a network but it had been fixed >>>> long ago. >>>> https://bugs.launchpad.net/neutron/+bug/1851362 or >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1934096 >>>> At least in our downstream CI we do not observe such issues with >>>> QoS+OVN. >>>> >>>> From the commands I see that you apply the QoS rule on the external >>>> network, right? >>>> >>>> On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz >>>> wrote: >>>> >>>>> Hi Roman, >>>>> >>>>> Forgot to add that detail, since I run the same routine in a non-ovn >>>>> deployment and it worked. But this is how I did it: >>>>> >>>>> openstack network qos policy list >>>>> openstack network qos policy create bw-limiter >>>>> openstack network qos rule create --type bandwidth-limit --max-kbps >>>>> 512 --max-burst-kbits 512 --egress bw-limiter >>>>> openstack network qos rule create --type bandwidth-limit --max-kbps >>>>> 512 --max-burst-kbits 512 --ingress bw-limiter >>>>> openstack network set --qos-policy bw-limiter ext_net >>>>> >>>>> I didn't set it in the port though, which is something I should do. >>>>> I'll set it in the port too for testing but I think the above should >>>>> work regardless. >>>>> >>>>> Erlon >>>>> >>>>> >>>>> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov < >>>>> rsafrono at redhat.com> escreveu: >>>>> >>>>>> Hi Erlon, >>>>>> >>>>>> I have a couple of questions that probably will help to understand >>>>>> the issue better. >>>>>> Have you applied the QoS rules on a port, network or floating ip? >>>>>> Have you applied the QoS rules before starting the VM (before it's >>>>>> port is active) or after? 
>>>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz >>>>>> wrote: >>>>>> >>>>>>> Hi folks, >>>>>>> >>>>>>> I have a question related to the Neutron supportability of OVN+QoS. >>>>>>> I have checked the config reference for both >>>>>>> Victoria and Xena[1] >>>>>>> [2] >>>>>>> and >>>>>>> they are shown as supported (bw limit, eggress/ingress), but I tried to set >>>>>>> up an env >>>>>>> with OVN+QoS but the rules are not being effective (VMs still >>>>>>> download at maximum speed). I double-checked >>>>>>> the configuration in the neutron API and it brings the QoS settings >>>>>>> [3] >>>>>>> >>>>>>> [4] >>>>>>> >>>>>>> [5] >>>>>>> , >>>>>>> and the versions[6] >>>>>>> >>>>>>> [7] >>>>>>> I'm >>>>>>> using should support it. >>>>>>> >>>>>>> What makes me more confused is that there's a document[8] >>>>>>> [9] >>>>>>> with a >>>>>>> gap analysis of the OVN vs OVS QoS functionality >>>>>>> and the document *is* being updated over the releases, but it still >>>>>>> shows that QoS is not supported in OVN. >>>>>>> >>>>>>> So, is there something I'm missing? >>>>>>> >>>>>>> Erlon >>>>>>> _______________ >>>>>>> [1] >>>>>>> https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>>>>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>>>>>> [3] QoS Config: >>>>>>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>>>>>> [4] neutron.conf: >>>>>>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>>>>>> [5] ml2_conf.ini: >>>>>>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>>>>>> [6] neutron-api-0 versions: >>>>>>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>>>>>> [7] nova-compute-0 versions: >>>>>>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>>>>>> [8] Gaps from ML2/OVS-OVN Xena: >>>>>>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>>>>>> [9] Gaps from ML2/OVS-OVN Victoria: >>>>>>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>> >>>> >>>> -- > Regards, > > > Syed Ammad Ali > -------------- next part -------------- An HTML attachment was scrubbed... URL: From syedammad83 at gmail.com Tue Nov 23 14:38:12 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 23 Nov 2021 19:38:12 +0500 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hi Erlon, I have tested on xena and it works fine. See if you have qos-fip extension loaded in neution. # openstack extension list | grep -i qos-fip | Floating IP QoS | qos-fip | The floating IP Quality of Service extension | Ammad On Tue, Nov 23, 2021 at 6:56 PM Erlon Cruz wrote: > Hi Ammad, > > What OpenStack version did you tested? I have just performed the FIP test > on Xena and it didn't work for me. See the results I posted. > > Erlon > > Em ter., 23 de nov. de 2021 ?s 10:47, Ammad Syed > escreveu: > >> Hi Erlon, >> >> You can check below url for testing qos on FIP. I have tested it and it >> works fine. >> >> >> https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst >> >> Ammad >> On Tue, Nov 23, 2021 at 6:06 PM Erlon Cruz wrote: >> >>> Hi Roman, Rodolfo, >>> >>> I tested setting the QoS policy to the port (internal) instead of the >>> network (external), and it works! I did some more testing on >>> the OVS vs OVN deployments and I can confirm the status you are saying. 
>>> What I got was: >>> >>> OVS: >>> FIP: >>> Setting on port: FAIL >>> Setting on network: OK >>> >>> Private network: >>> Setting on port: OK >>> Setting on network: OK >>> >>> Router: >>> Internal port: OK >>> External port: OK >>> >>> OVN: >>> FIP: >>> Setting on port: FAIL >>> Setting on network: FAIL (I was trying this) >>> >>> Private network: >>> Setting on port: OK >>> Setting on network: OK >>> >>> Router: >>> Internal port: FAIL >>> External port: FAIL >>> >>> Thanks a lot for your help!! >>> Erlon >>> >>> Em ter., 23 de nov. de 2021 ?s 08:47, Rodolfo Alonso Hernandez < >>> ralonsoh at redhat.com> escreveu: >>> >>>> Hello Erlon: >>>> >>>> We really need to review the gaps document, at least for Xena. >>>> >>>> As Roman said, we have been testing QoS in OVN successfully. >>>> >>>> The current status of QoS in OVN is (at least for Xena): >>>> - Fixed ports (VM ports): support for BW limit rules (egress/ingress) >>>> and DSCP (only egress). Neutron supports port network QoS inheritance (same >>>> as in your example). This is not for OVN but for any backend. >>>> - FIPs: support for BW limit rules (egress/ingress). Still no network >>>> QoS inheritance (in progress). >>>> - GW IP: no support yet. >>>> >>>> Ping me in #openstack-neutron channel (ralonsoh) if you have more >>>> questions. >>>> >>>> Regards. >>>> >>>> >>>> On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov >>>> wrote: >>>> >>>>> Hi Erlon, >>>>> >>>>> There was a bug with setting QoS on a network but it had been fixed >>>>> long ago. >>>>> https://bugs.launchpad.net/neutron/+bug/1851362 or >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1934096 >>>>> At least in our downstream CI we do not observe such issues with >>>>> QoS+OVN. >>>>> >>>>> From the commands I see that you apply the QoS rule on the external >>>>> network, right? >>>>> >>>>> On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz >>>>> wrote: >>>>> >>>>>> Hi Roman, >>>>>> >>>>>> Forgot to add that detail, since I run the same routine in a non-ovn >>>>>> deployment and it worked. But this is how I did it: >>>>>> >>>>>> openstack network qos policy list >>>>>> openstack network qos policy create bw-limiter >>>>>> openstack network qos rule create --type bandwidth-limit --max-kbps >>>>>> 512 --max-burst-kbits 512 --egress bw-limiter >>>>>> openstack network qos rule create --type bandwidth-limit --max-kbps >>>>>> 512 --max-burst-kbits 512 --ingress bw-limiter >>>>>> openstack network set --qos-policy bw-limiter ext_net >>>>>> >>>>>> I didn't set it in the port though, which is something I should do. >>>>>> I'll set it in the port too for testing but I think the above should >>>>>> work regardless. >>>>>> >>>>>> Erlon >>>>>> >>>>>> >>>>>> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov < >>>>>> rsafrono at redhat.com> escreveu: >>>>>> >>>>>>> Hi Erlon, >>>>>>> >>>>>>> I have a couple of questions that probably will help to understand >>>>>>> the issue better. >>>>>>> Have you applied the QoS rules on a port, network or floating ip? >>>>>>> Have you applied the QoS rules before starting the VM (before it's >>>>>>> port is active) or after? >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> >>>>>>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz >>>>>>> wrote: >>>>>>> >>>>>>>> Hi folks, >>>>>>>> >>>>>>>> I have a question related to the Neutron supportability of OVN+QoS. 
>>>>>>>> I have checked the config reference for both >>>>>>>> Victoria and Xena[1] >>>>>>>> [2] >>>>>>>> >>>>>>>> and they are shown as supported (bw limit, eggress/ingress), but I tried to >>>>>>>> set up an env >>>>>>>> with OVN+QoS but the rules are not being effective (VMs still >>>>>>>> download at maximum speed). I double-checked >>>>>>>> the configuration in the neutron API and it brings the QoS settings >>>>>>>> [3] >>>>>>>> >>>>>>>> [4] >>>>>>>> >>>>>>>> [5] >>>>>>>> , >>>>>>>> and the versions[6] >>>>>>>> >>>>>>>> [7] >>>>>>>> I'm >>>>>>>> using should support it. >>>>>>>> >>>>>>>> What makes me more confused is that there's a document[8] >>>>>>>> [9] >>>>>>>> with a >>>>>>>> gap analysis of the OVN vs OVS QoS functionality >>>>>>>> and the document *is* being updated over the releases, but it still >>>>>>>> shows that QoS is not supported in OVN. >>>>>>>> >>>>>>>> So, is there something I'm missing? >>>>>>>> >>>>>>>> Erlon >>>>>>>> _______________ >>>>>>>> [1] >>>>>>>> https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>>>>>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>>>>>>> [3] QoS Config: >>>>>>>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>>>>>>> [4] neutron.conf: >>>>>>>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>>>>>>> [5] ml2_conf.ini: >>>>>>>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>>>>>>> [6] neutron-api-0 versions: >>>>>>>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>>>>>>> [7] nova-compute-0 versions: >>>>>>>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>>>>>>> [8] Gaps from ML2/OVS-OVN Xena: >>>>>>>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>>>>>>> [9] Gaps from ML2/OVS-OVN Victoria: >>>>>>>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>>>> -- >> Regards, >> >> >> Syed Ammad Ali >> > -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.therond at bitswalk.com Tue Nov 23 14:58:04 2021 From: gael.therond at bitswalk.com (=?UTF-8?Q?Ga=C3=ABl_THEROND?=) Date: Tue, 23 Nov 2021 14:58:04 +0000 Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin Message-ID: Hi everyone! Today I faced a weird situation with one of our cloud platforms using victoria release. When trying to get a summary of projects rates would it be through Horizon or CLI using the admin user of the platform we've got the following error message: https://paste.opendev.org/show/bIgG6owrN9B2F3O7iqYG/ >From my understanding of the default policies of cloudkitty, this error seems to be a bit odd as the admin user profile actually match the default rules. At least as exposed in: https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/base.py and https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/v1/report.py Unless I misunderstood something (please correct me if I'm wrong), it's supposed to at least be ok with the matching. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rafaelweingartner at gmail.com Tue Nov 23 15:06:48 2021 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Tue, 23 Nov 2021 12:06:48 -0300 Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin In-Reply-To: References: Message-ID: Can you check this one? https://review.opendev.org/c/openstack/cloudkitty/+/785132 On Tue, Nov 23, 2021 at 12:01 PM Ga?l THEROND wrote: > Hi everyone! > > Today I faced a weird situation with one of our cloud platforms using > victoria release. > > When trying to get a summary of projects rates would it be through Horizon > or CLI using the admin user of the platform we've got the following error > message: > > https://paste.opendev.org/show/bIgG6owrN9B2F3O7iqYG/ > > From my understanding of the default policies of cloudkitty, this error > seems to be a bit odd as the admin user profile actually match the default > rules. > > At least as exposed in: > > > https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/base.py > and > > https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/v1/report.py > > Unless I misunderstood something (please correct me if I'm wrong), it's > supposed to at least be ok with the matching. > -- Rafael Weing?rtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaelweingartner at gmail.com Tue Nov 23 15:08:58 2021 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Tue, 23 Nov 2021 12:08:58 -0300 Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin In-Reply-To: References: Message-ID: I guess that the rule "context_is_admin" might have some weird definition in your version. Can you check it? On Tue, Nov 23, 2021 at 12:06 PM Rafael Weing?rtner < rafaelweingartner at gmail.com> wrote: > Can you check this one? > https://review.opendev.org/c/openstack/cloudkitty/+/785132 > > On Tue, Nov 23, 2021 at 12:01 PM Ga?l THEROND > wrote: > >> Hi everyone! >> >> Today I faced a weird situation with one of our cloud platforms using >> victoria release. >> >> When trying to get a summary of projects rates would it be through >> Horizon or CLI using the admin user of the platform we've got the following >> error message: >> >> https://paste.opendev.org/show/bIgG6owrN9B2F3O7iqYG/ >> >> From my understanding of the default policies of cloudkitty, this error >> seems to be a bit odd as the admin user profile actually match the default >> rules. >> >> At least as exposed in: >> >> >> https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/base.py >> and >> >> https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/v1/report.py >> >> Unless I misunderstood something (please correct me if I'm wrong), it's >> supposed to at least be ok with the matching. >> > > > -- > Rafael Weing?rtner > -- Rafael Weing?rtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.therond at bitswalk.com Tue Nov 23 15:15:40 2021 From: gael.therond at bitswalk.com (=?UTF-8?Q?Ga=C3=ABl_THEROND?=) Date: Tue, 23 Nov 2021 15:15:40 +0000 Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin In-Reply-To: References: Message-ID: aaaaah nice catch! I'll check that out as I use CentOS packages; it may actually just be that! Thanks a lot! Le mar. 23 nov. 2021 ? 
15:09, Rafael Weing?rtner < rafaelweingartner at gmail.com> a ?crit : > I guess that the rule "context_is_admin" might have some weird definition > in your version. Can you check it? > > On Tue, Nov 23, 2021 at 12:06 PM Rafael Weing?rtner < > rafaelweingartner at gmail.com> wrote: > >> Can you check this one? >> https://review.opendev.org/c/openstack/cloudkitty/+/785132 >> >> On Tue, Nov 23, 2021 at 12:01 PM Ga?l THEROND >> wrote: >> >>> Hi everyone! >>> >>> Today I faced a weird situation with one of our cloud platforms using >>> victoria release. >>> >>> When trying to get a summary of projects rates would it be through >>> Horizon or CLI using the admin user of the platform we've got the following >>> error message: >>> >>> https://paste.opendev.org/show/bIgG6owrN9B2F3O7iqYG/ >>> >>> From my understanding of the default policies of cloudkitty, this error >>> seems to be a bit odd as the admin user profile actually match the default >>> rules. >>> >>> At least as exposed in: >>> >>> >>> https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/base.py >>> and >>> >>> https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/v1/report.py >>> >>> Unless I misunderstood something (please correct me if I'm wrong), it's >>> supposed to at least be ok with the matching. >>> >> >> >> -- >> Rafael Weing?rtner >> > > > -- > Rafael Weing?rtner > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.therond at bitswalk.com Tue Nov 23 15:28:51 2021 From: gael.therond at bitswalk.com (=?UTF-8?Q?Ga=C3=ABl_THEROND?=) Date: Tue, 23 Nov 2021 15:28:51 +0000 Subject: [CLOUDKITTY][VICTORIA] - policy prohibit report:get_summary to admin In-Reply-To: References: Message-ID: ah ah! Was exactly that indeed! So, CentOS cloudkitty common package is not using the latest patch fixing the issue -> http://mirror.centos.org/centos/8/cloud/x86_64/openstack-victoria/Packages/o/openstack-cloudkitty-common-13.0.0-1.el8.noarch.rpm Thanks a lot for the hint! Will patch it downstream waiting for COS patch. Le mar. 23 nov. 2021 ? 15:15, Ga?l THEROND a ?crit : > aaaaah nice catch! I'll check that out as I use CentOS packages; it may > actually just be that! > > Thanks a lot! > > Le mar. 23 nov. 2021 ? 15:09, Rafael Weing?rtner < > rafaelweingartner at gmail.com> a ?crit : > >> I guess that the rule "context_is_admin" might have some weird definition >> in your version. Can you check it? >> >> On Tue, Nov 23, 2021 at 12:06 PM Rafael Weing?rtner < >> rafaelweingartner at gmail.com> wrote: >> >>> Can you check this one? >>> https://review.opendev.org/c/openstack/cloudkitty/+/785132 >>> >>> On Tue, Nov 23, 2021 at 12:01 PM Ga?l THEROND >>> wrote: >>> >>>> Hi everyone! >>>> >>>> Today I faced a weird situation with one of our cloud platforms using >>>> victoria release. >>>> >>>> When trying to get a summary of projects rates would it be through >>>> Horizon or CLI using the admin user of the platform we've got the following >>>> error message: >>>> >>>> https://paste.opendev.org/show/bIgG6owrN9B2F3O7iqYG/ >>>> >>>> From my understanding of the default policies of cloudkitty, this error >>>> seems to be a bit odd as the admin user profile actually match the default >>>> rules. 
>>>> >>>> At least as exposed in: >>>> >>>> >>>> https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/base.py >>>> and >>>> >>>> https://opendev.org/openstack/cloudkitty/src/branch/stable/victoria/cloudkitty/common/policies/v1/report.py >>>> >>>> Unless I misunderstood something (please correct me if I'm wrong), it's >>>> supposed to at least be ok with the matching. >>>> >>> >>> >>> -- >>> Rafael Weing?rtner >>> >> >> >> -- >> Rafael Weing?rtner >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sombrafam at gmail.com Tue Nov 23 17:41:52 2021 From: sombrafam at gmail.com (Erlon Cruz) Date: Tue, 23 Nov 2021 14:41:52 -0300 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: Hmm, My OVN deployment doesn't show the extension, and the OVS brings it by default, thought its not listed in the l3_agent.ini file. Where do you set that for OVN? The OVS deployment have the l3_agent.ini, but OVN does not have an L3 agent. Erlon Em ter., 23 de nov. de 2021 ?s 11:38, Ammad Syed escreveu: > Hi Erlon, > > I have tested on xena and it works fine. See if you have qos-fip > extension loaded in neution. > > # openstack extension list | grep -i qos-fip > > | Floating IP QoS > > | qos-fip | The floating IP > Quality of Service extension > | > Ammad > > On Tue, Nov 23, 2021 at 6:56 PM Erlon Cruz wrote: > >> Hi Ammad, >> >> What OpenStack version did you tested? I have just performed the FIP test >> on Xena and it didn't work for me. See the results I posted. >> >> Erlon >> >> Em ter., 23 de nov. de 2021 ?s 10:47, Ammad Syed >> escreveu: >> >>> Hi Erlon, >>> >>> You can check below url for testing qos on FIP. I have tested it and it >>> works fine. >>> >>> >>> https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst >>> >>> Ammad >>> On Tue, Nov 23, 2021 at 6:06 PM Erlon Cruz wrote: >>> >>>> Hi Roman, Rodolfo, >>>> >>>> I tested setting the QoS policy to the port (internal) instead of the >>>> network (external), and it works! I did some more testing on >>>> the OVS vs OVN deployments and I can confirm the status you are saying. >>>> What I got was: >>>> >>>> OVS: >>>> FIP: >>>> Setting on port: FAIL >>>> Setting on network: OK >>>> >>>> Private network: >>>> Setting on port: OK >>>> Setting on network: OK >>>> >>>> Router: >>>> Internal port: OK >>>> External port: OK >>>> >>>> OVN: >>>> FIP: >>>> Setting on port: FAIL >>>> Setting on network: FAIL (I was trying this) >>>> >>>> Private network: >>>> Setting on port: OK >>>> Setting on network: OK >>>> >>>> Router: >>>> Internal port: FAIL >>>> External port: FAIL >>>> >>>> Thanks a lot for your help!! >>>> Erlon >>>> >>>> Em ter., 23 de nov. de 2021 ?s 08:47, Rodolfo Alonso Hernandez < >>>> ralonsoh at redhat.com> escreveu: >>>> >>>>> Hello Erlon: >>>>> >>>>> We really need to review the gaps document, at least for Xena. >>>>> >>>>> As Roman said, we have been testing QoS in OVN successfully. >>>>> >>>>> The current status of QoS in OVN is (at least for Xena): >>>>> - Fixed ports (VM ports): support for BW limit rules (egress/ingress) >>>>> and DSCP (only egress). Neutron supports port network QoS inheritance (same >>>>> as in your example). This is not for OVN but for any backend. >>>>> - FIPs: support for BW limit rules (egress/ingress). Still no network >>>>> QoS inheritance (in progress). >>>>> - GW IP: no support yet. >>>>> >>>>> Ping me in #openstack-neutron channel (ralonsoh) if you have more >>>>> questions. 
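A practical consequence of the support matrix above is that, on OVN, the policy has to be attached directly to the objects listed as supported instead of to the external network. A minimal sketch, reusing the bw-limiter policy created earlier in this thread (the port and floating IP IDs are placeholders):

openstack port set --qos-policy bw-limiter <port-id>
openstack floating ip set --qos-policy bw-limiter <floating-ip-id>
openstack port show <port-id> -c qos_policy_id

Since network QoS inheritance for FIPs is still in progress, setting the policy on ext_net alone is not expected to throttle floating IP traffic with the OVN backend.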
>>>>> >>>>> Regards. >>>>> >>>>> >>>>> On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov >>>>> wrote: >>>>> >>>>>> Hi Erlon, >>>>>> >>>>>> There was a bug with setting QoS on a network but it had been fixed >>>>>> long ago. >>>>>> https://bugs.launchpad.net/neutron/+bug/1851362 or >>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1934096 >>>>>> At least in our downstream CI we do not observe such issues with >>>>>> QoS+OVN. >>>>>> >>>>>> From the commands I see that you apply the QoS rule on the external >>>>>> network, right? >>>>>> >>>>>> On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz >>>>>> wrote: >>>>>> >>>>>>> Hi Roman, >>>>>>> >>>>>>> Forgot to add that detail, since I run the same routine in a non-ovn >>>>>>> deployment and it worked. But this is how I did it: >>>>>>> >>>>>>> openstack network qos policy list >>>>>>> openstack network qos policy create bw-limiter >>>>>>> openstack network qos rule create --type bandwidth-limit --max-kbps >>>>>>> 512 --max-burst-kbits 512 --egress bw-limiter >>>>>>> openstack network qos rule create --type bandwidth-limit --max-kbps >>>>>>> 512 --max-burst-kbits 512 --ingress bw-limiter >>>>>>> openstack network set --qos-policy bw-limiter ext_net >>>>>>> >>>>>>> I didn't set it in the port though, which is something I should do. >>>>>>> I'll set it in the port too for testing but I think the above should >>>>>>> work regardless. >>>>>>> >>>>>>> Erlon >>>>>>> >>>>>>> >>>>>>> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov < >>>>>>> rsafrono at redhat.com> escreveu: >>>>>>> >>>>>>>> Hi Erlon, >>>>>>>> >>>>>>>> I have a couple of questions that probably will help to understand >>>>>>>> the issue better. >>>>>>>> Have you applied the QoS rules on a port, network or floating ip? >>>>>>>> Have you applied the QoS rules before starting the VM (before it's >>>>>>>> port is active) or after? >>>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi folks, >>>>>>>>> >>>>>>>>> I have a question related to the Neutron supportability of >>>>>>>>> OVN+QoS. I have checked the config reference for both >>>>>>>>> Victoria and Xena[1] >>>>>>>>> [2] >>>>>>>>> >>>>>>>>> and they are shown as supported (bw limit, eggress/ingress), but I tried to >>>>>>>>> set up an env >>>>>>>>> with OVN+QoS but the rules are not being effective (VMs still >>>>>>>>> download at maximum speed). I double-checked >>>>>>>>> the configuration in the neutron API and it brings the QoS settings >>>>>>>>> [3] >>>>>>>>> >>>>>>>>> [4] >>>>>>>>> >>>>>>>>> [5] >>>>>>>>> , >>>>>>>>> and the versions[6] >>>>>>>>> >>>>>>>>> [7] >>>>>>>>> I'm >>>>>>>>> using should support it. >>>>>>>>> >>>>>>>>> What makes me more confused is that there's a document[8] >>>>>>>>> [9] >>>>>>>>> with >>>>>>>>> a gap analysis of the OVN vs OVS QoS functionality >>>>>>>>> and the document *is* being updated over the releases, but it >>>>>>>>> still shows that QoS is not supported in OVN. >>>>>>>>> >>>>>>>>> So, is there something I'm missing? 
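One way to tell whether the rules are reaching the backend at all, independent of what the gaps document says, is to look at the OVN northbound database on a controller node, since the QoS rules Neutron creates are written into the NB QoS table. A rough check, assuming the usual ovn-nbctl tooling is available (the logical switch name is a placeholder):

ovn-nbctl list QoS
ovn-nbctl qos-list <neutron-logical-switch-name>

If both come back empty while the Neutron API still shows the policy attached, the rules never made it into OVN.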
>>>>>>>>> >>>>>>>>> Erlon >>>>>>>>> _______________ >>>>>>>>> [1] >>>>>>>>> https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>>>>>>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>>>>>>>> [3] QoS Config: >>>>>>>>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>>>>>>>> [4] neutron.conf: >>>>>>>>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>>>>>>>> [5] ml2_conf.ini: >>>>>>>>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>>>>>>>> [6] neutron-api-0 versions: >>>>>>>>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>>>>>>>> [7] nova-compute-0 versions: >>>>>>>>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>>>>>>>> [8] Gaps from ML2/OVS-OVN Xena: >>>>>>>>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>>>>>>>> [9] Gaps from ML2/OVS-OVN Victoria: >>>>>>>>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>>>> -- >>> Regards, >>> >>> >>> Syed Ammad Ali >>> >> > > -- > Regards, > > > Syed Ammad Ali > -------------- next part -------------- An HTML attachment was scrubbed... URL: From damien.rannou at ovhcloud.com Tue Nov 23 10:53:53 2021 From: damien.rannou at ovhcloud.com (Damien Rannou) Date: Tue, 23 Nov 2021 10:53:53 +0000 Subject: [neutron] default QOS on L3 gateway In-Reply-To: References: Message-ID: Yes exactly Just On point that I?m not sure: what append if the client ask for a router update with ?no-qos option ? Will it remove the QOS completely or just re apply the default value ? Damien Le 22 nov. 2021 ? 18:45, Lajos Katona > a ?crit : Hi, There is an RFE to inherit network QoS: https://bugs.launchpad.net/neutron/+bug/1950454 The patch series: https://review.opendev.org/q/topic:%22bug%252F1950454%22+(status:open%20OR%20status:merged) Hope this covers your usecase. Lajos Katona (lajoskatona) Damien Rannou > ezt ?rta (id?pont: 2021. nov. 22., H, 17:23): Hello We are currently playing with QOS on L3 agent, mostly for SNAT, but it can apply also on FIP. Everything is working properly, but I?m wondering if there is a way to define a ? default ? QOS that would be applied on Router creation, but also when the user is setting ? no_qos ? on his router. On a public cloud environnement, we cannot let the customers without any QOS limitation. Thanks ! Damien -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.vedel at univ-grenoble-alpes.fr Tue Nov 23 19:28:12 2021 From: franck.vedel at univ-grenoble-alpes.fr (Franck VEDEL) Date: Tue, 23 Nov 2021 20:28:12 +0100 Subject: [kolla-ansible][wallaby][glance] Problem with image list after reconfigure In-Reply-To: References: <87804956-038F-4F88-A467-33851679B5B4@univ-grenoble-alpes.fr> <30680C7F-7E55-4098-8438-F3CEE48C4A90@univ-grenoble-alpes.fr> Message-ID: Thanks again. I'll do some speed tests. In my case, I need ready-to-use images (debian, centos, ubuntu, pfsense, kali, windows 10, windows 2016), sometimes big images. This is why I am trying to find out what is the best solution with the use of an iscsi bay. Ah .... if I could change that and use disks and ceph ... Franck > Le 23 nov. 2021 ? 09:14, Ignazio Cassano a ?crit : > > Franck, If the cache works fine , I think glance image format could be qcow2. The volume is created in raw format but > the download phase is executed only the fisrt time you create a volume from a new image. 
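The behaviour described here, where the image is downloaded and converted only the first time a volume is created from it, matches what the Cinder image-volume cache provides. If it is not already enabled, it is switched on per backend in cinder.conf along these lines (the backend section name and the sizes are placeholders, and the cache needs an internal tenant configured):

[DEFAULT]
cinder_internal_tenant_project_id = <project-id>
cinder_internal_tenant_user_id = <user-id>

[<backend-section>]
image_volume_cache_enabled = True
image_volume_cache_max_size_gb = 200
image_volume_cache_max_count = 50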
> With this setup I can create 20-30 instance in a shot and it takes few minutes to complete. > I always use general purpose small images and colplete the instance configuration (package installation and so on) with heat or ansible. > Ignazio > > > Il giorno mar 23 nov 2021 alle ore 08:57 Franck VEDEL > ha scritto: > Ignazio, Radoslaw, > > thanks to you, I made some modifications and my environment seems to work better (the images are placed on the iiscsi bay on which the volumes are stored). > I installed the cache for glance. It works, well I think it does. > > My question is: between the different formats (qcow2, raw or other), which is the most efficient if > - we create a volume then an instance from the volume > - we create an instance from the image > - we create an instance without volume > - we create a snapshot then an instance from the snapshot > > Franck > > >> >> >>> Le 19 nov. 2021 ? 14:50, Ignazio Cassano > a ?crit : >>> >>> Franck, this help you a lot. >>> Thanks Radoslaw >>> Ignazio >>> >>> Il giorno ven 19 nov 2021 alle ore 12:03 Rados?aw Piliszek > ha scritto: >>> If one sets glance_file_datadir_volume to non-default, then glance-api >>> gets deployed on all hosts. >>> >>> -yoctozepto >>> >>> On Fri, 19 Nov 2021 at 10:51, Ignazio Cassano > wrote: >>> > >>> > Hello Franck, glance is not deployed on all nodes at default. >>> > I got the same problem >>> > In my case I have 3 controllers. >>> > I created an nfs share on a storage server where to store images. >>> > Before deploying glance, I create /var/lib/glance/images on the 3 controllers and I mount the nfs share. >>> > This is my fstab on the 3 controllers: >>> > >>> > 10.102.189.182:/netappopenstacktst2_glance /var/lib/glance nfs rw,user=glance,soft,intr,noatime,nodiratime >>> > >>> > In my globals.yml I have: >>> > glance_file_datadir_volume: "/var/lib/glance" >>> > glance_backend_file: "yes" >>> > >>> > This means images are on /var/lib/glance and since it is a nfs share all my 3 controlles can share images. >>> > Then you must deploy. >>> > To be sure the glance container is started on all controllers, since I have 3 controllers, I deployed 3 times changing the order in the inventory. >>> > First time: >>> > [control] >>> > A >>> > B >>> > C >>> > >>> > Second time: >>> > [control] >>> > B >>> > C >>> > A >>> > >>> > Third time: >>> > [control] >>> > C >>> > B >>> > A >>> > >>> > Or you can deploy glance 3 times using -t glance and -l >>> > >>> > As far as the instance stopped, I got I bug with a version of kolla. >>> > https://bugs.launchpad.net/kolla-ansible/+bug/1941706 >>> > Now is corrected and with kolla 12.2.0 it works. >>> > Ignazio >>> > >>> > >>> > Il giorno mer 17 nov 2021 alle ore 23:17 Franck VEDEL > ha scritto: > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Tue Nov 23 21:15:15 2021 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 23 Nov 2021 16:15:15 -0500 Subject: [cinder] spec opportunities and deadline Message-ID: <25f62afc-e7a6-6876-9808-33dce952438a@gmail.com> To anyone working on a cinder spec for Yoga: This is a reminder that all Cinder Specs for features to be implemented in Yoga must be approved by Friday 17 December 2021 (23:59 UTC). There are two upcoming opportunities to get feedback on your spec proposal and/or ask questions about your spec from the cinder team: 1. 
Tomorrow's cinder weekly meeting (Wednesday 24 November, 1400 UTC) is being held in videoconference; if you'd like a discussion, please put a topic on the weekly agenda (which also has connection details): https://etherpad.opendev.org/p/cinder-yoga-meetings 2. The cinder yoga R-17 virtual midcycle meeting is being held next week on Wednesday 1 December 1400-1600 UTC; if you'd like to discuss your proposal, please add it to the planning etherpad: https://etherpad.opendev.org/p/cinder-yoga-midcycles cheers, brian From tonyliu0592 at hotmail.com Tue Nov 23 22:51:52 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 23 Nov 2021 22:51:52 +0000 Subject: [Neutron][OVN] nova-compute timeout while waiting for VIF activation confirmed by Neutron Message-ID: Hi, I see such problem from time to time. It's not consistently reproduceable. ====================== 2021-11-23 22:16:28.532 7 INFO nova.compute.manager [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state the instance has a pending task (spawning). Skip. 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 3a4c320d64664d9cb6784b7ea52d618a 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for [('network-vif-plugged', '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for instance with vm_state building and task_state spawning.: eventlet.timeout.Timeout: 300 seconds ====================== The VIF/port is activated by OVN ovn-controller to ovn-sb. It seems that, either Neutron didn't capture the update or didn't send message back to nova-compute. Is there any known fix for this problem? Thanks! Tony From DHilsbos at performair.com Tue Nov 23 23:34:39 2021 From: DHilsbos at performair.com (DHilsbos at performair.com) Date: Tue, 23 Nov 2021 23:34:39 +0000 Subject: [ops]RabbitMQ High Availability Message-ID: <0670B960225633449A24709C291A525251D511CD@COM03.performair.local> All; In the time I've been part of this mailing list, the subject of RabbitMQ high availability has come up several times, and each time specific recommendations for both Rabbit and Open Stack are provided. I remember it being an A or B kind of recommendation (i.e. configure Rabbit like A1, and Open Stack like A2, OR configure Rabbit like B1, and Open Stack like B2). Unfortunately, I can't find the previous threads on this topic. Does anyone have this information, that they would care to share with me? Thank you, Dominic L. Hilsbos, MBA Vice President - Information Technology Perform Air International Inc. DHilsbos at PerformAir.com www.PerformAir.com From tonyliu0592 at hotmail.com Tue Nov 23 23:34:52 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 23 Nov 2021 23:34:52 +0000 Subject: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed Message-ID: Hi, Is this known issue? Any filed bug or fix to it? 
======================================= 2021-11-23 23:04:20.943 40 ERROR futurist.periodics [-] Failed to call periodic 'networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports' (it runs every 600.00 seconds): RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent call last): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield self._nested_txns_map[cur_thread_id] 2021-11-23 23:04:20.943 40 ERROR futurist.periodics KeyError: 140347734591640 2021-11-23 23:04:20.943 40 ERROR futurist.periodics 2021-11-23 23:04:20.943 40 ERROR futurist.periodics During handling of the above exception, another exception occurred: 2021-11-23 23:04:20.943 40 ERROR futurist.periodics 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent call last): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 290, in run 2021-11-23 23:04:20.943 40 ERROR futurist.periodics work() 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 64, in __call__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return self.callback(*self.args, **self.kwargs) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 178, in decorator 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return f(*args, **kwargs) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py", line 667, in check_for_mcast_flood_reports 2021-11-23 23:04:20.943 40 ERROR futurist.periodics for port in self._nb_idl.lsp_list().execute(check_error=True): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 42, in execute 2021-11-23 23:04:20.943 40 ERROR futurist.periodics t.add(self) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", line 183, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield t 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics del self._nested_txns_map[cur_thread_id] 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics self.result = self.commit() 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 62, in commit 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise result.ex 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File 
"/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 128, in run 2021-11-23 23:04:20.943 40 ERROR futurist.periodics txn.results.put(txn.do_commit()) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 115, in do_commit 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise RuntimeError(msg) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it 2021-11-23 23:04:20.943 40 ERROR futurist.periodics ======================================= Thanks! Tony From rsafrono at redhat.com Tue Nov 23 23:51:08 2021 From: rsafrono at redhat.com (Roman Safronov) Date: Wed, 24 Nov 2021 01:51:08 +0200 Subject: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed In-Reply-To: References: Message-ID: https://bugs.launchpad.net/neutron/+bug/1927077 On Wed, Nov 24, 2021 at 1:41 AM Tony Liu wrote: > Hi, > > Is this known issue? Any filed bug or fix to it? > ======================================= > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics [-] Failed to call > periodic > 'networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports' > (it runs every 600.00 seconds): RuntimeError: OVSDB Error: The transaction > failed because the IDL has been configured to require a database lock but > didn't get it yet or has already lost it > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent > call last): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in > transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield > self._nested_txns_map[cur_thread_id] > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics KeyError: > 140347734591640 > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics During handling of the > above exception, another exception occurred: > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent > call last): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 290, in run > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics work() > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 64, in > __call__ > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return > self.callback(*self.args, **self.kwargs) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 178, in > decorator > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return f(*args, > **kwargs) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py", > line 667, in check_for_mcast_flood_reports > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics for port in > self._nb_idl.lsp_list().execute(check_error=True): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", > line 42, in execute > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics t.add(self) > 2021-11-23 
23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", > line 183, in transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield t > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in > transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics del > self._nested_txns_map[cur_thread_id] > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__ > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics self.result = > self.commit() > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", > line 62, in commit > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise result.ex > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", > line 128, in run > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > txn.results.put(txn.do_commit()) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", > line 115, in do_commit > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise > RuntimeError(msg) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics RuntimeError: OVSDB > Error: The transaction failed because the IDL has been configured to > require a database lock but didn't get it yet or has already lost it > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > ======================================= > > > Thanks! > Tony > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rsafrono at redhat.com Tue Nov 23 23:51:08 2021 From: rsafrono at redhat.com (Roman Safronov) Date: Wed, 24 Nov 2021 01:51:08 +0200 Subject: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed In-Reply-To: References: Message-ID: https://bugs.launchpad.net/neutron/+bug/1927077 On Wed, Nov 24, 2021 at 1:41 AM Tony Liu wrote: > Hi, > > Is this known issue? Any filed bug or fix to it? 
> ======================================= > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics [-] Failed to call > periodic > 'networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports' > (it runs every 600.00 seconds): RuntimeError: OVSDB Error: The transaction > failed because the IDL has been configured to require a database lock but > didn't get it yet or has already lost it > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent > call last): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in > transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield > self._nested_txns_map[cur_thread_id] > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics KeyError: > 140347734591640 > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics During handling of the > above exception, another exception occurred: > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent > call last): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 290, in run > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics work() > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 64, in > __call__ > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return > self.callback(*self.args, **self.kwargs) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 178, in > decorator > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return f(*args, > **kwargs) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py", > line 667, in check_for_mcast_flood_reports > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics for port in > self._nb_idl.lsp_list().execute(check_error=True): > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", > line 42, in execute > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics t.add(self) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", > line 183, in transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield t > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in > transaction > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics del > self._nested_txns_map[cur_thread_id] > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__ > > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics self.result = > self.commit() > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > 
"/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", > line 62, in commit > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise result.ex > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", > line 128, in run > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > txn.results.put(txn.do_commit()) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", > line 115, in do_commit > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise > RuntimeError(msg) > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics RuntimeError: OVSDB > Error: The transaction failed because the IDL has been configured to > require a database lock but didn't get it yet or has already lost it > > 2021-11-23 23:04:20.943 40 ERROR futurist.periodics > ======================================= > > > Thanks! > Tony > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyliu0592 at hotmail.com Wed Nov 24 00:16:00 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Wed, 24 Nov 2021 00:16:00 +0000 Subject: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed In-Reply-To: References: Message-ID: Thank you roman for the prompt response! Tony ________________________________________ From: Roman Safronov Sent: November 23, 2021 03:51 PM To: Tony Liu Cc: openstack-discuss; openstack-dev at lists.openstack.org Subject: Re: [Neutron][OVN] networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports failed https://bugs.launchpad.net/neutron/+bug/1927077 On Wed, Nov 24, 2021 at 1:41 AM Tony Liu > wrote: Hi, Is this known issue? Any filed bug or fix to it? 
======================================= 2021-11-23 23:04:20.943 40 ERROR futurist.periodics [-] Failed to call periodic 'networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_for_mcast_flood_reports' (it runs every 600.00 seconds): RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent call last): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield self._nested_txns_map[cur_thread_id] 2021-11-23 23:04:20.943 40 ERROR futurist.periodics KeyError: 140347734591640 2021-11-23 23:04:20.943 40 ERROR futurist.periodics 2021-11-23 23:04:20.943 40 ERROR futurist.periodics During handling of the above exception, another exception occurred: 2021-11-23 23:04:20.943 40 ERROR futurist.periodics 2021-11-23 23:04:20.943 40 ERROR futurist.periodics Traceback (most recent call last): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 290, in run 2021-11-23 23:04:20.943 40 ERROR futurist.periodics work() 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 64, in __call__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return self.callback(*self.args, **self.kwargs) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 178, in decorator 2021-11-23 23:04:20.943 40 ERROR futurist.periodics return f(*args, **kwargs) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py", line 667, in check_for_mcast_flood_reports 2021-11-23 23:04:20.943 40 ERROR futurist.periodics for port in self._nb_idl.lsp_list().execute(check_error=True): 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 42, in execute 2021-11-23 23:04:20.943 40 ERROR futurist.periodics t.add(self) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", line 183, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics yield t 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics next(self.gen) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in transaction 2021-11-23 23:04:20.943 40 ERROR futurist.periodics del self._nested_txns_map[cur_thread_id] 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__ 2021-11-23 23:04:20.943 40 ERROR futurist.periodics self.result = self.commit() 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 62, in commit 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise result.ex 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File 
"/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 128, in run 2021-11-23 23:04:20.943 40 ERROR futurist.periodics txn.results.put(txn.do_commit()) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 115, in do_commit 2021-11-23 23:04:20.943 40 ERROR futurist.periodics raise RuntimeError(msg) 2021-11-23 23:04:20.943 40 ERROR futurist.periodics RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it 2021-11-23 23:04:20.943 40 ERROR futurist.periodics ======================================= Thanks! Tony From tonyliu0592 at hotmail.com Wed Nov 24 00:21:27 2021 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Wed, 24 Nov 2021 00:21:27 +0000 Subject: [neutron][nova] [kolla] vif plugged timeout In-Reply-To: References: Message-ID: I hit the same problem, from time to time, not consistently. I am using OVN. Typically, it takes no more than a few seconds for neutron to confirm the port is up. The default timeout in my setup is 600s. Even the ports shows up in both OVN SB and NB, nova-compute still didn't get confirmation from neutron. Either neutron didn't pick it up or the message was lost and didn't get to nova-compute. Hoping someone could share more thoughts. Thanks! Tony ________________________________________ From: Laurent Dumont Sent: November 22, 2021 02:05 PM To: Michal Arbet Cc: openstack-discuss Subject: Re: [neutron][nova] [kolla] vif plugged timeout How high did you have to raise it? If it does appear after X amount of time, then the VIF plug is not lost? On Sat, Nov 20, 2021 at 7:23 AM Michal Arbet > wrote: + if i raise vif_plugged_timeout ( hope i rember it correct ) in nova to some high number ..problem dissapear ... But it's only workaround D?a so 20. 11. 2021, 12:05 Michal Arbet > nap?sal(a): Hi, Has anyone seen issue which I am currently facing ? When launching heat stack ( but it's same if I launch several of instances ) vif plugged in timeouts an I don't know why, sometimes it is OK ..sometimes is failing. Sometimes neutron reports vif plugged in < 10 sec ( test env ) sometimes it's 100 and more seconds, it seems there is some race condition but I can't find out where the problem is. But on the end every instance is spawned ok (retry mechanism worked). Another finding is that it has to do something with security group, if noop driver is used ..everything is working good. Firewall security setup is openvswitch . Test env is wallaby. I will attach some logs when I will be near PC .. Thank you, Michal Arbet (Kevko) From syedammad83 at gmail.com Wed Nov 24 05:47:39 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Wed, 24 Nov 2021 10:47:39 +0500 Subject: [neutron] Neutron OVN+QoS Support In-Reply-To: References: Message-ID: For OVN qos, you need to set below in neutron.conf core_plugin = ml2 service_plugins = ovn-router, qos, segments, port_forwarding and below ones in ml2_conf.ini [ml2] type_drivers = flat,geneve,vlan tenant_network_types = geneve mechanism_drivers = ovn extension_drivers = port_security, qos Ammad On Tue, Nov 23, 2021 at 10:42 PM Erlon Cruz wrote: > Hmm, > > My OVN deployment doesn't show the extension, and the OVS brings it by > default, thought its not listed in the l3_agent.ini file. > Where do you set that for OVN? The OVS deployment have the l3_agent.ini, > but OVN does not have an L3 agent. > > Erlon > > Em ter., 23 de nov. 
de 2021 ?s 11:38, Ammad Syed > escreveu: > >> Hi Erlon, >> >> I have tested on xena and it works fine. See if you have qos-fip >> extension loaded in neution. >> >> # openstack extension list | grep -i qos-fip >> >> | Floating IP QoS >> >> | qos-fip | The floating IP >> Quality of Service extension >> | >> Ammad >> >> On Tue, Nov 23, 2021 at 6:56 PM Erlon Cruz wrote: >> >>> Hi Ammad, >>> >>> What OpenStack version did you tested? I have just performed the FIP >>> test on Xena and it didn't work for me. See the results I posted. >>> >>> Erlon >>> >>> Em ter., 23 de nov. de 2021 ?s 10:47, Ammad Syed >>> escreveu: >>> >>>> Hi Erlon, >>>> >>>> You can check below url for testing qos on FIP. I have tested it and it >>>> works fine. >>>> >>>> >>>> https://github.com/openstack/neutron/blob/master/doc/source/admin/config-qos.rst >>>> >>>> Ammad >>>> On Tue, Nov 23, 2021 at 6:06 PM Erlon Cruz wrote: >>>> >>>>> Hi Roman, Rodolfo, >>>>> >>>>> I tested setting the QoS policy to the port (internal) instead of the >>>>> network (external), and it works! I did some more testing on >>>>> the OVS vs OVN deployments and I can confirm the status you are >>>>> saying. What I got was: >>>>> >>>>> OVS: >>>>> FIP: >>>>> Setting on port: FAIL >>>>> Setting on network: OK >>>>> >>>>> Private network: >>>>> Setting on port: OK >>>>> Setting on network: OK >>>>> >>>>> Router: >>>>> Internal port: OK >>>>> External port: OK >>>>> >>>>> OVN: >>>>> FIP: >>>>> Setting on port: FAIL >>>>> Setting on network: FAIL (I was trying this) >>>>> >>>>> Private network: >>>>> Setting on port: OK >>>>> Setting on network: OK >>>>> >>>>> Router: >>>>> Internal port: FAIL >>>>> External port: FAIL >>>>> >>>>> Thanks a lot for your help!! >>>>> Erlon >>>>> >>>>> Em ter., 23 de nov. de 2021 ?s 08:47, Rodolfo Alonso Hernandez < >>>>> ralonsoh at redhat.com> escreveu: >>>>> >>>>>> Hello Erlon: >>>>>> >>>>>> We really need to review the gaps document, at least for Xena. >>>>>> >>>>>> As Roman said, we have been testing QoS in OVN successfully. >>>>>> >>>>>> The current status of QoS in OVN is (at least for Xena): >>>>>> - Fixed ports (VM ports): support for BW limit rules (egress/ingress) >>>>>> and DSCP (only egress). Neutron supports port network QoS inheritance (same >>>>>> as in your example). This is not for OVN but for any backend. >>>>>> - FIPs: support for BW limit rules (egress/ingress). Still no network >>>>>> QoS inheritance (in progress). >>>>>> - GW IP: no support yet. >>>>>> >>>>>> Ping me in #openstack-neutron channel (ralonsoh) if you have more >>>>>> questions. >>>>>> >>>>>> Regards. >>>>>> >>>>>> >>>>>> On Tue, Nov 23, 2021 at 12:12 PM Roman Safronov >>>>>> wrote: >>>>>> >>>>>>> Hi Erlon, >>>>>>> >>>>>>> There was a bug with setting QoS on a network but it had been fixed >>>>>>> long ago. >>>>>>> https://bugs.launchpad.net/neutron/+bug/1851362 or >>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1934096 >>>>>>> At least in our downstream CI we do not observe such issues with >>>>>>> QoS+OVN. >>>>>>> >>>>>>> From the commands I see that you apply the QoS rule on the external >>>>>>> network, right? >>>>>>> >>>>>>> On Tue, Nov 23, 2021 at 12:39 PM Erlon Cruz >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Roman, >>>>>>>> >>>>>>>> Forgot to add that detail, since I run the same routine in a >>>>>>>> non-ovn deployment and it worked. 
But this is how I did it: >>>>>>>> >>>>>>>> openstack network qos policy list >>>>>>>> openstack network qos policy create bw-limiter >>>>>>>> openstack network qos rule create --type bandwidth-limit --max-kbps >>>>>>>> 512 --max-burst-kbits 512 --egress bw-limiter >>>>>>>> openstack network qos rule create --type bandwidth-limit --max-kbps >>>>>>>> 512 --max-burst-kbits 512 --ingress bw-limiter >>>>>>>> openstack network set --qos-policy bw-limiter ext_net >>>>>>>> >>>>>>>> I didn't set it in the port though, which is something I should do. >>>>>>>> I'll set it in the port too for testing but I think the above should >>>>>>>> work regardless. >>>>>>>> >>>>>>>> Erlon >>>>>>>> >>>>>>>> >>>>>>>> Em seg., 22 de nov. de 2021 ?s 18:45, Roman Safronov < >>>>>>>> rsafrono at redhat.com> escreveu: >>>>>>>> >>>>>>>>> Hi Erlon, >>>>>>>>> >>>>>>>>> I have a couple of questions that probably will help to understand >>>>>>>>> the issue better. >>>>>>>>> Have you applied the QoS rules on a port, network or floating ip? >>>>>>>>> Have you applied the QoS rules before starting the VM (before it's >>>>>>>>> port is active) or after? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Nov 22, 2021 at 10:53 PM Erlon Cruz >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi folks, >>>>>>>>>> >>>>>>>>>> I have a question related to the Neutron supportability of >>>>>>>>>> OVN+QoS. I have checked the config reference for both >>>>>>>>>> Victoria and Xena[1] >>>>>>>>>> >>>>>>>>>> [2] >>>>>>>>>> >>>>>>>>>> and they are shown as supported (bw limit, eggress/ingress), but I tried to >>>>>>>>>> set up an env >>>>>>>>>> with OVN+QoS but the rules are not being effective (VMs still >>>>>>>>>> download at maximum speed). I double-checked >>>>>>>>>> the configuration in the neutron API and it brings the QoS >>>>>>>>>> settings[3] >>>>>>>>>> >>>>>>>>>> [4] >>>>>>>>>> >>>>>>>>>> [5] >>>>>>>>>> , >>>>>>>>>> and the versions[6] >>>>>>>>>> >>>>>>>>>> [7] >>>>>>>>>> I'm >>>>>>>>>> using should support it. >>>>>>>>>> >>>>>>>>>> What makes me more confused is that there's a document[8] >>>>>>>>>> [9] >>>>>>>>>> with >>>>>>>>>> a gap analysis of the OVN vs OVS QoS functionality >>>>>>>>>> and the document *is* being updated over the releases, but it >>>>>>>>>> still shows that QoS is not supported in OVN. >>>>>>>>>> >>>>>>>>>> So, is there something I'm missing? 
>>>>>>>>>> >>>>>>>>>> Erlon >>>>>>>>>> _______________ >>>>>>>>>> [1] >>>>>>>>>> https://docs.openstack.org/neutron/victoria/admin/config-qos.html >>>>>>>>>> [2] https://docs.openstack.org/neutron/xena/admin/config-qos.html >>>>>>>>>> [3] QoS Config: >>>>>>>>>> https://gist.github.com/sombrafam/f8434c0505ed4dd3f912574e7ccebb82 >>>>>>>>>> [4] neutron.conf: >>>>>>>>>> https://gist.github.com/sombrafam/785beb10f20439c4e50eb633f294ae82 >>>>>>>>>> [5] ml2_conf.ini: >>>>>>>>>> https://gist.github.com/sombrafam/b171a38d8cd16bd4dc77cfee3916dccd >>>>>>>>>> [6] neutron-api-0 versions: >>>>>>>>>> https://gist.github.com/sombrafam/5d098daa1df3f116d599c09c96eab173 >>>>>>>>>> [7] nova-compute-0 versions: >>>>>>>>>> https://gist.github.com/sombrafam/d51102e3a32be5dc8ca03d7a23b6a998 >>>>>>>>>> [8] Gaps from ML2/OVS-OVN Xena: >>>>>>>>>> https://docs.openstack.org/neutron/xena/ovn/gaps.html >>>>>>>>>> [9] Gaps from ML2/OVS-OVN Victoria: >>>>>>>>>> https://docs.openstack.org/neutron/victoria/ovn/gaps.html >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>> Regards, >>>> >>>> >>>> Syed Ammad Ali >>>> >>> >> >> -- >> Regards, >> >> >> Syed Ammad Ali >> > -- Regards, Syed Ammad Ali -------------- next part -------------- An HTML attachment was scrubbed... URL: From akekane at redhat.com Wed Nov 24 05:57:55 2021 From: akekane at redhat.com (Abhishek Kekane) Date: Wed, 24 Nov 2021 11:27:55 +0530 Subject: [Glance] No weekly meeting Message-ID: Hello Team, As this is the holiday season and ThanksGiving is on our weekly meeting day, we are not meeting this week. Next meeting will be on 02 December. Thanks & Best Regards, Abhishek Kekane -------------- next part -------------- An HTML attachment was scrubbed... URL: From manchandavishal143 at gmail.com Wed Nov 24 07:08:41 2021 From: manchandavishal143 at gmail.com (vishal manchanda) Date: Wed, 24 Nov 2021 12:38:41 +0530 Subject: [horizon] Cancelling Today's weekly meeting Message-ID: Hello Team, Since there are no agenda items [1] to discuss for today's horizon weekly meeting. Also, Today is a holiday for me or maybe for others as well So let's cancel today's weekly meeting. Next weekly meeting will be on 01 December. Thanks & Regards, Vishal Manchanda [1] https://etherpad.opendev.org/p/horizon-release-priorities -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Wed Nov 24 08:25:01 2021 From: wodel.youchi at gmail.com (wodel youchi) Date: Wed, 24 Nov 2021 09:25:01 +0100 Subject: [kolla-ansible][wallaby][magnum][Kubernetes] Cannot auto-scale workers Message-ID: Hi, I have a new kolla-ansible deployment with wallaby. I have created a kubernetes cluster using calico (flannel didn't work for me). I configured an autoscale test to see if it works. - pods autoscale is working. - worker nodes autoscale is not working. 
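Worker node scaling in Magnum is done by the Kubernetes cluster-autoscaler rather than by the HPA, and the autoscaler is only deployed when the cluster or its template carries the autoscaling labels. A quick sanity check, assuming the default label names (the cluster name is a placeholder):

openstack coe cluster show <cluster-name> -c labels
kubectl -n kube-system get deploy | grep -i autoscaler

If auto_scaling_enabled=true together with min_node_count and max_node_count is not in the labels, the HPA will keep scaling pods but no new worker nodes will ever be created.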
This is my deployment file :*cat php-apache.yaml* apiVersion: apps/v1 kind: Deployment metadata: name: php-apache-deployment spec: selector: matchLabels: app: php-apache replicas: 2 template: metadata: labels: app: php-apache spec: containers: - name: php-apache image: k8s.gcr.io/hpa-example ports: - containerPort: 80 resources: limits: cpu: 500m requests: cpu: 200m --- apiVersion: v1 kind: Service metadata: name: php-apache-service labels: app: php-apache spec: ports: - port: 80 targetPort: 80 protocol: TCP selector: app: php-apache type: LoadBalancer This is my HPA file :*cat php-apache-hpa.yaml* apiVersion: autoscaling/v2beta2 kind: HorizontalPodAutoscaler metadata: name: php-apache-hpa namespace: default labels: ser