From tony at bakeyournoodle.com Mon May 1 02:35:34 2023 From: tony at bakeyournoodle.com (Tony Breeds) Date: Mon, 1 May 2023 12:35:34 +1000 Subject: Offline Openstack Release:Train Deployment In-Reply-To: References: Message-ID: On Sat, 29 Apr 2023 at 00:30, Vineet Thakur wrote: > > > Greetings openstack community members, > > It's regarding the deployment of openstack release: train on CentoOS 7.9 environment by using openstack-ansible tool. (Offline environment) I'll say upfront that train is .... quite old and you should absolutely evaluate newer releases. > We are facing challenges to download the packages, some package links are not accessible or somewhere PIP packages are not available as per the constraint requirement list. > See: > https://artifacts.ci.centos.org/sig-cloudinstance/centos-7-191001/x86-64/centos-7-x86_64-docker.tar.xz You can generate this yourself by pulling centos:centos7.9.2009 from dockerhub, I can't see an old release like that on quay.io, and then using docker save. > For pip packages: > https://opendev.org/openstack/requirements/raw/0cfc2ef6318b639bb4287fa051eae8f3ac8cc426/upper-constraints.txt You can/should replace that URL with: https://releases.openstack.org/constraints/upper/train That will persist even after train has been marked end-of-life Using Ubuntu Bionic (as that's what I have easy access to) I was able to get a full mirror of all the wheels needed for the train release. It's been a very long time since I looked at integrating any of that in OSA Tony From thierry at openstack.org Mon May 1 10:21:58 2023 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 1 May 2023 12:21:58 +0200 Subject: [largescale-sig] Next meeting: May 3, 8utc Message-ID: <42675d5c-e4d1-4819-5121-703975ecb2bc@openstack.org> Hi everyone, The Large Scale SIG will be meeting this Wednesday in #openstack-operators on OFTC IRC, at 8UTC, our EU+APAC-friendly time. You can doublecheck how that UTC time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20230503T08 Feel free to add topics to the agenda: https://etherpad.opendev.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From tomas at leypold.cz Mon May 1 13:17:02 2023 From: tomas at leypold.cz (=?UTF-8?B?VG9tw6HFoSBMZXlwb2xk?=) Date: Mon, 1 May 2023 15:17:02 +0200 Subject: [kolla-ansible][zed][15.1.0] Horizon failed to load trunk and QoS In-Reply-To: References: Message-ID: Hi, I just bumped into the same issue and it does not seem to be some config issue in kolla, because it doesn't work on packstack too. I reported a bug on launchpad: https://bugs.launchpad.net/horizon/+bug/2018232 . On 4/4/23 10:31, Stefan Bryzga wrote: > Hi all, > > Some time ago I deployed a small cluster with kolla-ansible v15.1.0. > Before deployment I enabled neutron_qos and neutron_trunk flags in > /etc/kolla/globals.yml. Both functions?work?fine via cli and I can > create new entries via dashboard however horizon won't display created > trunks/qos on the page (It shows "No items to display"). In web browser > console I get this error while loading page: > > TypeError: Cannot read properties of undefined (reading 'data') > ? ? at addTrackBy (output.0a9483334bbd.js:1115:59) > ? ? at processQueue (output.43a09e34a303.js:1581:786) > ? ? at output.43a09e34a303.js:1585:110 > ? ? at Scope.$digest (output.43a09e34a303.js:1636:200) > ? ? at Scope.$apply (output.43a09e34a303.js:1643:269) > ? ? at done (output.43a09e34a303.js:1356:123) > ? ? at completeRequest (output.43a09e34a303.js:1375:20) > ? ? 
at XMLHttpRequest.requestLoaded (output.43a09e34a303.js:1368:1) > 'Possibly unhandled rejection: {}' > > Does anyone knows how to fix this? Should I modify?any configs to > display?new entries in horizon? > > Best Regards, > Stefan > --- Best Regards, Tomas Leypold From knikolla at bu.edu Mon May 1 13:52:24 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Mon, 1 May 2023 13:52:24 +0000 Subject: [tc][all] Technical Committee Weekly Summary -- April 28, 2023 Message-ID: <5591E58E-E583-4E85-828F-E37504F196FB@bu.edu> Hi all, Here?s another edition of ?What?s happening on the Technical Committee.? Meeting ======= The Technical Committee met on April 25 on the #openstack-tc channel on OFTC IRC. The next meeting will be held on May 2 at 18:00UTC on Zoom. For more information visit https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting Happenings ========== Holding on dropping Python 3.8 ------------------------------ We are drafting guidelines to keep Python 3.8 support in OpenStack for 2023.2. This was discussed during Tuesday's TC meeting as it was causing significant breakage to the gate. [0]. https://lists.openstack.org/pipermail/openstack-discuss/2023-April/033469.html New proposal to add release naming recommendations for package maintainers ------------------------------------------------------- During the last TC meeting we discussed the confusion arising from upstream OpenStack standardizing on the release names (ex. 2023.2) whereas some distros were favoring codenames (ex. Bobcat). We have a proposal up to recommend package maintainers follow our naming.[1] [0]. https://meetings.opendev.org/meetings/tc/2023/tc.2023-04-25-18.00.log.html#l-103 [1]. https://review.opendev.org/c/openstack/governance/+/881712 New Upstream Investment Opportunity: Ironic ARM Support ------------------------------------------------------- Thanks to Jay Faulkner for proposing a new investment opportunity that we just merged![0] "The OpenStack community is seeking ARM hardware and system administrators or developers with background in provisioning ARM devices to partner with the Ironic bare metal team. The Ironic project produces the OpenStack service and libraries to manage and provision physical machines." [0]. https://governance.openstack.org/tc/reference/upstream-investment-opportunities/2023/ironic-arm-contibutions.html Changes ======= ? Merged ? Add charmed openstack-hypervisor OpenStack Charms | https://review.opendev.org/c/openstack/governance/+/879437 ? Update to hacking v6 (code-change) | https://review.opendev.org/c/openstack/governance/+/881266 ? Switch to 2023.2 testing runtime py version (code-change) | https://review.opendev.org/c/openstack/governance/+/881137 ? Correct the old deprecated policies removal timeline for SLURP release (formal-vote) | https://review.opendev.org/c/openstack/governance/+/880238 ? Deprecate TripleO (formal-vote) | https://review.opendev.org/c/openstack/governance/+/877132 ? Add Contribution Opportunity: Ironic ARM support (formal-vote) | https://review.opendev.org/c/openstack/governance/+/879080 ? New Open ? Appoint Jerry Zhou as Sahara PTL (formal-vote) | https://review.opendev.org/c/openstack/governance/+/881186 ? Add recommendations for release naming to package maintainers (formal-vote) | https://review.opendev.org/c/openstack/governance/+/881712 ? Align release naming terminology (documentation-change) | https://review.opendev.org/c/openstack/governance/+/881706 ? Abandoned ? All Open ? 
https://review.opendev.org/q/project:openstack/governance+status:open How to contact the TC ===================== If you would like to discuss or give feedback to TC, you can reach out to us in multiple ways: 1. Email: you can send an email with the tag [tc] on the openstack-discuss mailing list. 2. Weekly meeting: The Technical Committee conduct a weekly meeting every Thursday 16:00 UTC 3. IRC: Ping us using the 'tc-members' keyword on the #openstack-tc IRC channel on OFTC. From aspeagle at toyon.com Mon May 1 16:26:37 2023 From: aspeagle at toyon.com (Andy Speagle) Date: Mon, 1 May 2023 16:26:37 +0000 Subject: Getting 403 from keystone as admin user. Message-ID: <6937b623979c5cb37d433abae313772454451402.camel@toyon.com> We're getting strange 403's from keystone on the CLI when trying to list users, groups, and role assignments. Running keystone 17.0.1 on ussuri. I've looked through our policy.json... can't find anything strange. Has anyone seen this behavior? Any ideas what can be done? Thanks. -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 14311 bytes Desc: not available URL: From knikolla at bu.edu Mon May 1 16:49:17 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Mon, 1 May 2023 16:49:17 +0000 Subject: [tc] Technical Committee next weekly meeting on May 2, 2023 Message-ID: <63434D52-438B-482D-B6A4-8CFD39C36277@bu.edu> Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held tomorrow (May 2) at 1800 UTC on Zoom. Items can be proposed by editing the wiki page at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting At the end of today I will send out an email with the finalized agenda. Thank you, Kristi Nikolla -------------- next part -------------- An HTML attachment was scrubbed... URL: From roger.riverac at gmail.com Mon May 1 17:44:32 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Mon, 1 May 2023 13:44:32 -0400 Subject: [openstack-ansible] Playbook/Role to install a Redis cluster for Gnocchi incoming measure storage? In-Reply-To: References: Message-ID: Hello Community, I was wondering if there is an Ansible playbook from OSA to install a Redis cluster to be configured as incoming measure for Gnocchi? There is this deployment example that points to an external Ansible role . Unfortunately, that role does not make use of LXC containers to keep consistency with OSA deployment on the infra nodes. We want the infra nodes to also host the Redis/Sentinel master-slave cluster, if possible. Questions: 1. Can anyone steer us in the right direction to do Telemetry with Gnocchi, Ceph and Redis with an Openstack-Ansible deployment with LXC containers? 2. Is there a better approach to use Redis as an incoming measure storage for the gnocchi installation? Any suggestions, guidance will be appreciated. -- *Roger Rivera* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jlibosva at redhat.com Mon May 1 18:07:09 2023 From: jlibosva at redhat.com (Jakub Libosvar) Date: Mon, 1 May 2023 14:07:09 -0400 Subject: [Neutron] Bug deputy Apr 24 - May 1 Message-ID: Hi all, I was the bug deputy for the last week. There are 2 critical bugs, one is pending merge at this time and one needs an assignee: https://bugs.launchpad.net/neutron/+bug/2017992 You can find the full report below. 
Kuba Critical - Jobs running on vexxhost provider failing with Mirror Issues https://bugs.launchpad.net/neutron/+bug/2017992 Needs an assignee - [master][functional] test_cascading_del_in_txn fails with ovsdbapp=2.3.0 https://bugs.launchpad.net/neutron/+bug/2018130 In progress: https://review.opendev.org/c/openstack/neutron/+/881896 Assigned to Yatin High - Tenant user cannot delete a port associated with an FIP belonging to the admin tenant https://bugs.launchpad.net/neutron/+bug/2017680 In progress: https://review.opendev.org/c/openstack/neutron/+/881827 Assigned to Fernando - OVN: ovnmeta namespaces missing during scalability test causing DHCP issues https://bugs.launchpad.net/neutron/+bug/2017748 In progress: https://review.opendev.org/c/openstack/neutron/+/881487 Assigned to Lucas - Incorrect use of "db_api.CONTEXT_*? decorators https://bugs.launchpad.net/neutron/+bug/2017784 In progress: https://review.opendev.org/c/openstack/neutron/+/881569 Assigned to Rodolfo - OVN trunk subport bouncing between compute while live-migrating https://bugs.launchpad.net/neutron/+bug/2017912 Assigned to Arnau Medium - Move OVN Metadata Agent code to the new OVN Neutron Agent https://bugs.launchpad.net/neutron/+bug/2017871 Assigned to Lucas - ML2 context not considering tags at network creation and update https://bugs.launchpad.net/neutron/+bug/ Low - Remove "neutron-ovn-tempest-ovs-release-ubuntu-old? job https://bugs.launchpad.net/neutron/+bug/2017500 In progress: https://review.opendev.org/c/openstack/neutron/+/881342 Assigned to Rodolfo - ``cmd.sanity.checks._get_ovn_version`` is returning a 3 element tuple, instead of 2 https://bugs.launchpad.net/neutron/+bug/2017878 In progress: https://review.opendev.org/c/openstack/neutron/+/881708 Assigned to Rodolfo - SQL warning retrieving ports if the network list is empty https://bugs.launchpad.net/neutron/+bug/2018000 In progress: https://review.opendev.org/c/openstack/neutron/+/881830 Assigned to Rodolfo From noonedeadpunk at gmail.com Mon May 1 18:24:02 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Mon, 1 May 2023 20:24:02 +0200 Subject: [openstack-ansible] Playbook/Role to install a Redis cluster for Gnocchi incoming measure storage? In-Reply-To: References: Message-ID: Hi, Roger, No, unfortunately we don't have any in-house way to deploy redis, as it's usage area is quite limited. Moreover there're more storage/incoming drivers that are supported by Gnocchi, including Ceph, S3 and just MySQL. But roles should not be aware of LXC containers or anything OSA specific as you can create any arbitrary containers whenever needed with default OSA roles and then using a playbook execute any third-party roles against these LXC containers. You can find some documentation on how inventory works with OSA here: https://docs.openstack.org/openstack-ansible/latest/reference/inventory/understanding-inventory.html So, basically what you should do to create extra set of containers for Redis: 1. Create /etc/openstack_deploy/env.d/redis.yml with content: component_skel: redis: belongs_to: - redis_all container_skel: redis_container: belongs_to: - redis_containers contains: - redis physical_skel: redis_containers: belongs_to: - all_containers redis_hosts: belongs_to: - hosts 2. In /etc/openstack_deploy/openstack_user_config.yml (or in a separate file under conf.d) define where containers should reside, ie: redis_hosts: infra1: ip: 172.29.236.11 infra2: ip: 172.29.236.12 infra3: ip: 172.29.236.13 3. 
Execute playbook: openstack-ansible playbooks/lxc-containers-create.yml --limit redis_all,lxc_hosts Once this is done, you can simply deploy redis to these containers, by targeting the redis_all group from corresponsive playbook. Hope this helps:) ??, 1 ??? 2023??. ? 19:46, Roger Rivera : > > Hello Community, > > I was wondering if there is an Ansible playbook from OSA to install a Redis cluster to be configured as incoming measure for Gnocchi? > > There is this deployment example that points to an external Ansible role. Unfortunately, that role does not make use of LXC containers to keep consistency with OSA deployment on the infra nodes. > > We want the infra nodes to also host the Redis/Sentinel master-slave cluster, if possible. > > Questions: > > Can anyone steer us in the right direction to do Telemetry with Gnocchi, Ceph and Redis with an Openstack-Ansible deployment with LXC containers? > Is there a better approach to use Redis as an incoming measure storage for the gnocchi installation? > > Any suggestions, guidance will be appreciated. > > -- > Roger Rivera From satish.txt at gmail.com Mon May 1 19:34:34 2023 From: satish.txt at gmail.com (Satish Patel) Date: Mon, 1 May 2023 15:34:34 -0400 Subject: [kolla-ansible] How to test local docker registry Message-ID: Folks, I have 3 controller nodes and on controller01 I have configured the local registry and pushed all images into the local registry. I have pointed out 2 other nodes to use controller01 local registry. When i run "kolla-ansible -i multinode deploy" it works on controller01 and docker start all containers but one controller02 and 03 I encounter error related docker registry Error: https://paste.opendev.org/show/bmwbM3PB33a8hrA1Wi6e/ docker daemon.json file on controller02 and 03 (docker-reg is point to controller01) root at controller02:~# cat /etc/docker/daemon.json { "bridge": "none", "insecure-registries": [ "docker-reg:4000" ], "ip-forward": false, "iptables": false, "log-opts": { "max-file": "5", "max-size": "50m" } How do I check my docker registry is functioning? I have tried following commands on 02 and 03 nodes but following error. root at controller02:~# docker pull docker-reg:4000/openstack.kolla/ubuntu-source-fluentd Using default tag: latest Error response from daemon: manifest for docker-reg:4000/openstack.kolla/ubuntu-source-fluentd:latest not found: manifest unknown: manifest unknown Are there any way to test if my registry is functional and nothing wrong. There are no network issues. I have a test with port ping using the nc command. -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Mon May 1 20:29:42 2023 From: satish.txt at gmail.com (Satish Patel) Date: Mon, 1 May 2023 16:29:42 -0400 Subject: [kolla-ansible] How to test local docker registry In-Reply-To: References: Message-ID: Ignore my post, it looks like I pull all images but not push, I don't know why it didn't push somehow. I found how to list images from the local registry. 
$ curl docker-reg:4000/v2/_catalog {"repositories":["openstack.kolla/ubuntu-source-cinder-api","openstack.kolla/ubuntu-source-cinder-backup","openstack.kolla/ubuntu-source-cinder-scheduler","openstack.kolla/ubuntu-source-cinder-volume","openstack.kolla/ubuntu-source-cron","openstack.kolla/ubuntu-source-fluentd","openstack.kolla/ubuntu-source-glance-api","openstack.kolla/ubuntu-source-haproxy","openstack.kolla/ubuntu-source-heat-api","openstack.kolla/ubuntu-source-heat-api-cfn","openstack.kolla/ubuntu-source-heat-engine","openstack.kolla/ubuntu-source-horizon","openstack.kolla/ubuntu-source-keepalived","openstack.kolla/ubuntu-source-keystone","openstack.kolla/ubuntu-source-keystone-fernet","openstack.kolla/ubuntu-source-keystone-ssh","openstack.kolla/ubuntu-source-kolla-toolbox","openstack.kolla/ubuntu-source-mariadb-clustercheck","openstack.kolla/ubuntu-source-mariadb-server","openstack.kolla/ubuntu-source-memcached","openstack.kolla/ubuntu-source-neutron-metadata-agent","openstack.kolla/ubuntu-source-neutron-server","openstack.kolla/ubuntu-source-nova-api","openstack.kolla/ubuntu-source-nova-compute","openstack.kolla/ubuntu-source-nova-conductor","openstack.kolla/ubuntu-source-nova-libvirt","openstack.kolla/ubuntu-source-nova-novncproxy","openstack.kolla/ubuntu-source-nova-scheduler","openstack.kolla/ubuntu-source-nova-ssh","openstack.kolla/ubuntu-source-openvswitch-db-server","openstack.kolla/ubuntu-source-openvswitch-vswitchd","openstack.kolla/ubuntu-source-ovn-controller","openstack.kolla/ubuntu-source-ovn-nb-db-server","openstack.kolla/ubuntu-source-ovn-northd","openstack.kolla/ubuntu-source-ovn-sb-db-server","openstack.kolla/ubuntu-source-placement-api","openstack.kolla/ubuntu-source-rabbitmq"]} On Mon, May 1, 2023 at 3:34?PM Satish Patel wrote: > Folks, > > I have 3 controller nodes and on controller01 I have configured the local > registry and pushed all images into the local registry. I have pointed > out 2 other nodes to use controller01 local registry. > > When i run "kolla-ansible -i multinode deploy" it works on controller01 > and docker start all containers but one controller02 and 03 I encounter > error related docker registry > > Error: https://paste.opendev.org/show/bmwbM3PB33a8hrA1Wi6e/ > > docker daemon.json file on controller02 and 03 (docker-reg is point to > controller01) > root at controller02:~# cat /etc/docker/daemon.json > { > "bridge": "none", > "insecure-registries": [ > "docker-reg:4000" > ], > "ip-forward": false, > "iptables": false, > "log-opts": { > "max-file": "5", > "max-size": "50m" > } > > How do I check my docker registry is functioning? > > I have tried following commands on 02 and 03 nodes but following error. > > root at controller02:~# docker pull > docker-reg:4000/openstack.kolla/ubuntu-source-fluentd > Using default tag: latest > Error response from daemon: manifest for > docker-reg:4000/openstack.kolla/ubuntu-source-fluentd:latest not found: > manifest unknown: manifest unknown > > Are there any way to test if my registry is functional and nothing wrong. > There are no network issues. I have a test with port ping using the nc > command. > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knikolla at bu.edu Mon May 1 21:43:48 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Mon, 1 May 2023 21:43:48 +0000 Subject: [tc] Technical Committee next weekly meeting on May 2, 2023 In-Reply-To: <63434D52-438B-482D-B6A4-8CFD39C36277@bu.edu> References: <63434D52-438B-482D-B6A4-8CFD39C36277@bu.edu> Message-ID: <7332C1BF-7E9F-4ACA-AC3A-8A01ADC4C865@bu.edu> Please find below the agenda for tomorrow's Technical Committee meeting (May 2nd at 1800UTC on Zoom) * Roll call * Follow up on past action items ? noonedeadpunk to propose a patch to reference that makes the recommendation for downstream packagers to use the version name rather than codename. ? noonedeadpunk write the words for "The Smith Plan(tm)" (the script of the movie about changing PTI and saving the world from the dangers of getting rid of py38) ? gmann send an email on ML asking project/release team to hold dropping the py38 until we get the pti change merged * Gate health check * 2023.2 cycle Leaderless projects ** https://etherpad.opendev.org/p/2023.2-leaderless * Broken docs due to inconsistent release naming * Schedule of removing support for Python versions by libraries - how it should align with coordinated releases (tooz case) * Recurring tasks check ** Bare 'recheck' state *** https://etherpad.opendev.org/p/recheck-weekly-summary * Open Reviews ** https://review.opendev.org/q/projects:openstack/governance+is:open > On May 1, 2023, at 12:49 PM, Nikolla, Kristi wrote: > > Hi all, > > This is a reminder that the next weekly Technical Committee meeting is to be held tomorrow (May 2) at 1800 UTC on Zoom. > > Items can be proposed by editing the wiki page at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > At the end of today I will send out an email with the finalized agenda. > > Thank you, > Kristi Nikolla From rlandy at redhat.com Tue May 2 00:14:31 2023 From: rlandy at redhat.com (Ronelle Landy) Date: Mon, 1 May 2023 20:14:31 -0400 Subject: [tripleo] hold rechecks content provider gate blocker - CentOS-9 stream mirrors Message-ID: Hello All, We have a check/gate blocker impacting all CentOS-9 jobs. Details are in: https://bugs.launchpad.net/tripleo/+bug/2018265 - CentOS-9 stream mirrors are returning Status code: 404. Per https://review.opendev.org/868392 - the Rackspace mirrors are being used to sync CentOS-9 repos - so possibly there is a lag/roll back there. We will post out once there is a fix/workaround or if the situation gets unblocked dueto an update. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlandy at redhat.com Tue May 2 10:40:24 2023 From: rlandy at redhat.com (Ronelle Landy) Date: Tue, 2 May 2023 06:40:24 -0400 Subject: [tripleo] hold rechecks content provider gate blocker - CentOS-9 stream mirrors In-Reply-To: References: Message-ID: On Mon, May 1, 2023 at 8:14?PM Ronelle Landy wrote: > Hello All, > > We have a check/gate blocker impacting all CentOS-9 jobs. Details are in: > > https://bugs.launchpad.net/tripleo/+bug/2018265 - CentOS-9 stream > mirrors are returning Status code: 404. > > Per https://review.opendev.org/868392 - the Rackspace mirrors are being > used to sync CentOS-9 repos - so possibly there is a lag/roll back there. > > We will post out once there is a fix/workaround or if the situation gets > unblocked dueto an update. > > Thank you. > Update: It looks like http://mirror.rackspace.com/centos-stream/9-stream/BaseOS/x86_64/os/repodata/ got updated with multiple entries today. 
There are some held (successful) c9 content-provider jobs. And RDO jobs started clearing about one/two hours ago. So we look to have cleared this blocker. Please go ahead and recheck any failed jobs from yesterday. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kozhukalov at gmail.com Tue May 2 11:11:22 2023 From: kozhukalov at gmail.com (Vladimir Kozhukalov) Date: Tue, 2 May 2023 14:11:22 +0300 Subject: [openstack-helm] Cancelling monthly meetings Message-ID: Dear Openstack-helmers, I would like to announce that we are going to stop having monthly IRC meetings due to the lack of interest. There are two major concerns: 1) It is usually not a good idea to wait few weeks for the next meeting to happen if you have something to discuss 2) If only few people are attending the meeting then it is anyway not enough to have a strong community consensus and to make any important decisions We discussed this a couple months ago in the IRC meeting and now it is time to do the next step and to cancel it. So, if you have something to discuss please feel free to start the discussion directly in the #openstack-helm Slack chat (Slack is our preferable way of communication). Today's meeting (May/02/2023) is also canceled. -- Best regards, Kozhukalov Vladimir -------------- next part -------------- An HTML attachment was scrubbed... URL: From vineetthakur09 at gmail.com Tue May 2 05:30:00 2023 From: vineetthakur09 at gmail.com (Vineet Thakur) Date: Tue, 2 May 2023 11:00:00 +0530 Subject: Offline Openstack Release:Train Deployment In-Reply-To: References: Message-ID: Hi Tony, Thank you for your valuable inputs. We have used OSA for various recent releases were introduced post train, but didn't face any such package dependency issues. Due to limited time (delivery time), we have decided to go for *manual installation* and it's been tested that we have all the required packages available for that. Hope that would work for us. Once again, many thanks to you and other community members who shared their feedback. Kind Regards, Vineet On Mon, May 1, 2023 at 8:05?AM Tony Breeds wrote: > On Sat, 29 Apr 2023 at 00:30, Vineet Thakur > wrote: > > > > > > Greetings openstack community members, > > > > It's regarding the deployment of openstack release: train on CentoOS 7.9 > environment by using openstack-ansible tool. (Offline environment) > > I'll say upfront that train is .... quite old and you should > absolutely evaluate newer releases. > > > We are facing challenges to download the packages, some package links > are not accessible or somewhere PIP packages are not available as per the > constraint requirement list. > > See: > > > https://artifacts.ci.centos.org/sig-cloudinstance/centos-7-191001/x86-64/centos-7-x86_64-docker.tar.xz > > You can generate this yourself by pulling centos:centos7.9.2009 from > dockerhub, I can't see an old release like that on quay.io, and then > using docker save. > > > For pip packages: > > > https://opendev.org/openstack/requirements/raw/0cfc2ef6318b639bb4287fa051eae8f3ac8cc426/upper-constraints.txt > > You can/should replace that URL with: > https://releases.openstack.org/constraints/upper/train > > That will persist even after train has been marked end-of-life > > Using Ubuntu Bionic (as that's what I have easy access to) I was able > to get a full mirror of all the wheels needed for the train release. 
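A rough sketch of building such a wheel mirror with plain pip, using the
constraints URL above (the output directory is illustrative, and some
packages will need distro -dev headers installed before their wheels build):

    curl -sSL https://releases.openstack.org/constraints/upper/train -o upper-constraints.txt
    # build a wheel for every pinned requirement into a local mirror directory
    pip wheel --wheel-dir ./train-wheels -r upper-constraints.txt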
> > It's been a very long time since I looked at integrating any of that in OSA > > Tony > -- Thanks & Regards, Vineet Thakur 886164765 -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Tue May 2 15:34:49 2023 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Tue, 2 May 2023 15:34:49 +0000 Subject: [cinder][kolla][OpenstackAnsible] 2023.1 Antelope Cycle-Trailing Release Deadline Message-ID: Hello teams with trailing projects, The 2023.1 Antelope cycle-trailing release deadline is in ~1 months [1], and all projects following the cycle-trailing release model must release their 2023.1 Antelope deliverables by June 1st, 2023. The following trailing projects haven't been released yet for 2023.1 Antelope (aside the release candidates versions if exists). Cinder team's deliverables: - cinderlib OSA team's deliverables: - openstack-ansible-roles - openstack-ansible Kolla team's deliverables: - kayobe - kolla - kolla-ansible - ansible-collection-kolla This is just a friendly reminder to allow you to release these projects in time. Do not hesitate to ping us if you have any questions or concerns. [1] https://releases.openstack.org/bobcat/schedule.html#b-cycle-trail Thanks, El?d irc: elodilles @ #openstack-release -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue May 2 15:43:07 2023 From: smooney at redhat.com (Sean Mooney) Date: Tue, 02 May 2023 16:43:07 +0100 Subject: [kolla-ansible][all][nova][zed][latest] USB passthrough / Hot-plug / Cold-plug In-Reply-To: References: Message-ID: <9e9193d19892a3cc4216049a10a489b7ebd85b3b.camel@redhat.com> On Sun, 2023-04-30 at 13:34 +0330, Modyngs wrote: > 090c090cFolks, > > A USB device (not necessarily a mass/Flash drive) needs to be connected to > one of the VMs (an openstack instance). > > Openstack installed with kolla-ansible{Latest version/ Also tested on Zed} > ALL IN ONE Deployments on ubuntu server 22.04 ( Core i9 12900K). > > I found the [nova-libvirt] container which contains *virsh *and is able to > edit it or use custom config for VMs. > > I've gone through lots of the docs, to name a few: > https://libvirt.org/formatdomain.html > https://wiki.openstack.org/wiki/Nova/USB_device_hot_cold_plug > https://wiki.openstack.org/wiki/Nova/proposal_about_usb_passthrough > https://documentation.suse.com/sles/15-SP1/html/SLES-all/cha-libvirt-config-virsh.html > https://wiki.openstack.org/wiki/Nova/USB_device_hot_cold_plug > https://docs.nxp.com/bundle/GUID-487B2E69-BB19-42CB-AC38-7EF18C0FE3AE/page/GUID-A658B311-CE08-4496-9491-1F00687EC4AE.html > > but none of them worked for me!! we do not supprot usb passthough in nova the only way to do this is by doing pci passthough of a usb contoller but there is no support offically for usb pasthoough currently we do not plan to add it in the future either and instead suggest that support be added to cybrog and then that leveraged with nova. for stateles device is would not be hard to add supprot in nova but there has been relucatance to do that. > > To reproduce: > > *Cold-Plug :* > *$ lsusb (on host)* > Bus 001 Device 014: ID 090c:1000 Silicon Motion >>>> Note Device number > changed every time I disconnect the device. 
So it might be different in the
> changed attempt shown below)
>
> *$ docker exec -it nova_libvirt /bin/bash*
> *%Turn the Desired VM off*
> *# virsh list --all*
>  Id   Name                State
> ---------------------------------------------
>  2    instance-00000002   running
>  19   instance-00000008   running
>  -    instance-0000000a   shut off
> *# virsh edit instance-0000000a*
>              Add the changes  [1][2][3][4],... ( many efforts have
> been done but few samples of them are)
> [1]: under <...> added: [libvirt XML snippet stripped by the list archiver]
> [2]: under <...> added: [libvirt XML snippet stripped by the list archiver]
> [3]: under <...> added, following
>      https://egallen.com/openstack-usb-passthrough/
>      [libvirt XML snippet stripped by the list archiver]
> [4]: under <...> added: [libvirt XML snippet stripped by the list archiver]
> [5]: [libvirt XML snippet stripped by the list archiver]
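Note: the stripped snippets above were, by the look of them, variations on a
libvirt USB <hostdev> entry. A minimal sketch for the 090c:1000 device in
question, in the style of the egallen.com guide referenced in [3] (the file
name and the exact stanza here are illustrative, not the poster's original
XML), would be:

    cat > usb.xml <<'EOF'
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x090c'/>
        <product id='0x1000'/>
      </source>
    </hostdev>
    EOF
    # cold-plug: paste the same stanza inside <devices> of the domain XML
    # while the instance is shut off ("virsh edit instance-0000000a")
    # hot-plug: attach it to the running domain instead
    virsh attach-device instance-0000000a usb.xml --live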
> > > > > *%Start the VM* > *expected behavior:* > when login to the VM, lsusb or df -h shows the USB > *what happened:* > it wont show the USB from the VM > > *OR *virsh dumpxml instance-0000000a > instance-0000000a.xml > and then change the configs as above and then > > virsh attach-device instance-0000000a --file > /path/to/updated-instance-0000000a.xml --config > > > *Hot-Plug :* > *$ lsusb (on host)* > Bus 001 Device 014: ID 090c:1000 Silicon Motion >>>> Note Device number > changed every time I disconnect the device. So it might be different in the > changed attempt shown below) > > *$ docker exec -it nova_libvirt /bin/bash* > *# virsh list --all* > Id Name State > --------------------------------------------- > 2 instance-00000002 running > 19 instance-00000008 running > 20 instance-0000000a running > > > *#nano USB.xml* > *%add changes explained in *[1][2][3][4],... > > *$ virsh attach-device instance-0000000a /path/to/USB.xml/file* > > *expected behavior:* > lsusb or df -h shows the USB > *what happened:* > it wont show the USB from the VM > > > > > *Can you please guide me through this? Any recommendation would be much > appreciated!Any custom changes comes to your mind ( Reply it) would be > solution for this problem /;* as noted above this is not a supported feature of nova. there is no offical way to do usb passthough. the unoffical way to do this is https://egallen.com/openstack-usb-passthrough/ > Thanks > Best regards From rafaelweingartner at gmail.com Tue May 2 16:12:53 2023 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Tue, 2 May 2023 13:12:53 -0300 Subject: [horizon][nova][masakari] Instances created with "Any AZ" problem In-Reply-To: References: <173abaa4b89efc8594b08c1c256bc873f3192828.camel@redhat.com> Message-ID: Bug report created at: https://bugs.launchpad.net/nova/+bug/2018318 On Wed, Apr 26, 2023 at 12:23?PM Sylvain Bauza wrote: > > > Le mer. 26 avr. 2023 ? 14:46, Rafael Weing?rtner < > rafaelweingartner at gmail.com> a ?crit : > >> Adding the response to this thread, as I replied in the wrong one. Sorry >> for the confusion. >> >> Hello Sylvain Bauza and Sean Mooney, >> >> The patch pointed out in [1] is doing exactly what you said that we >> should not do. I mean, changing the AZ in the request spec of the virtual >> machine (VM) that the user created. The patch we propose in [2] is intended >> to avoid exactly that. >> >> As pointed out by Florian, the configuration "cross_az_attach" is a >> constraint of the cloud environment, and as such it should be considered >> when selecting hosts to execute VMs migrations. Therefore, for those >> situations, and only for those (cross_az_attach=False), we send the current >> AZ of the VM to placement, to enable it (placement) to filter out hosts >> that are not from the current AZ of the VM. >> >> Looking at the patch suggested by you in [1], and later testing it, we >> can confirm that the problem is still there in main/upstream. This happens >> because the patch in [1] is only addressing the cases when the VM is >> created based on volumes, and then it sets the AZ of the volumes in the >> request spec of the VM. That is why everything works for the setups where >> cross_az_attach=False. However, if we create a VM based on an image, and >> then it (Nova) creates a new volume in Cinder, the AZ is not set in the >> request spec (but it is used to execute the first call to placement to >> select the hosts); thus, the issues described in [2] can still happen. 
>> >> Anyways, the proposal presented in [2] is simpler and works nicely. We >> can discuss it further in the patchset then, if you guys think it is worth >> it. >> >> > As I replied in the gerrit change > https://review.opendev.org/c/openstack/nova/+/864760/comments/4a302ce3_9805e7c6 > then you should create a Launchpad bug report but fwiw, you should also > modify the implementation as it would rather do the same for image metadata > that what we do for volumes with [1] > > -Sylvain > > [1] >> https://review.opendev.org/c/openstack/nova/+/469675/12/nova/compute/api.py#1173 >> [2] https://review.opendev.org/c/openstack/nova/+/864760 >> >> On Wed, Mar 29, 2023 at 10:26?AM Nguy?n H?u Kh?i < >> nguyenhuukhoinw at gmail.com> wrote: >> >>> "If they *don't* provide this parameter, then depending on the >>> default_schedule_zone config option, either the instance will eventually >>> use a specific AZ (and then it's like if the enduser was asking for this >>> AZ), or none of AZ is requested and then the instance can be created and >>> moved between any hosts within *all* AZs." >>> >>> I ask aftet that, although without az when launch instances but they >>> still have az. But i still mv to diffent host in diffent az when mirgrating >>> or spawn which masakari. i am not clear, I tested. >>> >>> >>> On Wed, Mar 29, 2023, 7:38 PM Nguy?n H?u Kh?i >>> wrote: >>> >>>> Yes. Thanks, but the things I would like to know: after instances are >>>> created, how do we know if it was launched with specified AZ or without it? >>>> I mean the way to distinguish between specified instances and non specified >>>> instances? >>>> >>>> Nguyen Huu Khoi >>>> >>>> >>>> On Wed, Mar 29, 2023 at 5:05?PM Sylvain Bauza >>>> wrote: >>>> >>>>> >>>>> >>>>> Le mer. 29 mars 2023 ? 08:06, Nguy?n H?u Kh?i < >>>>> nguyenhuukhoinw at gmail.com> a ?crit : >>>>> >>>>>> Hello. >>>>>> I have one question. >>>>>> Follow this >>>>>> >>>>>> https://docs.openstack.org/nova/latest/admin/availability-zones.html >>>>>> >>>>>> If the server was not created in a specific zone then it is free to >>>>>> be moved to other zones. but when I use >>>>>> >>>>>> openstack server show [server id] >>>>>> >>>>>> I still see the "OS-EXT-AZ:availability_zone" value belonging to my >>>>>> instance. >>>>>> >>>>>> >>>>> Correct, this is normal. If the operators creates some AZs, then the >>>>> enduser should see where the instance in which AZ. >>>>> >>>>> >>>>>> Could you tell the difference which causes "if the server was not >>>>>> created in a specific zone then it is free to be moved to other zones. >>>>>> " >>>>>> >>>>>> >>>>> To be clear, an operator can create Availability Zones. Those AZs can >>>>> then be seen by an enduser using the os-availability-zones API [1]. Then, >>>>> either the enduser wants to use a specific AZ for their next instance >>>>> creation (and if so, he/she adds --availability-zone parameter to their >>>>> instance creation client) or they don't want and then they don't provide >>>>> this parameter. >>>>> >>>>> If they provide this parameter, then the server will be created only >>>>> in one host in the specific AZ and then when moving the instance later, it >>>>> will continue to move to any host within the same AZ. 
>>>>> If they *don't* provide this parameter, then depending on the >>>>> default_schedule_zone config option, either the instance will eventually >>>>> use a specific AZ (and then it's like if the enduser was asking for this >>>>> AZ), or none of AZ is requested and then the instance can be created and >>>>> moved between any hosts within *all* AZs. >>>>> >>>>> That being said, as I said earlier, the enduser can still verify the >>>>> AZ from where the instance is by the server show parameter you told. >>>>> >>>>> We also have a documentation explaining about Availability Zones, >>>>> maybe this would help you more to understand about AZs : >>>>> https://docs.openstack.org/nova/latest/admin/availability-zones.html >>>>> >>>>> >>>>> [1] >>>>> https://docs.openstack.org/api-ref/compute/#availability-zones-os-availability-zone >>>>> (tbc, the enduser won't see the hosts, but they can see the list of >>>>> existing AZs) >>>>> >>>>> >>>>> >>>>>> Nguyen Huu Khoi >>>>>> >>>>>> >>>>>> On Mon, Mar 27, 2023 at 8:37?PM Nguy?n H?u Kh?i < >>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>> >>>>>>> Hello guys. >>>>>>> >>>>>>> I just suggest to openstack nova works better. My story because >>>>>>> >>>>>>> >>>>>>> 1. >>>>>>> >>>>>>> The server was created in a specific zone with the POST /servers request >>>>>>> containing the availability_zone parameter. >>>>>>> >>>>>>> It will be nice when we attach randow zone when we create instances >>>>>>> then It will only move to the same zone when migrating or masakari ha. >>>>>>> >>>>>>> Currently we can force it to zone by default zone shedule in >>>>>>> nova.conf. >>>>>>> >>>>>>> Sorry because I am new to Openstack and I am just an operator. I try >>>>>>> to verify some real cases. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Nguyen Huu Khoi >>>>>>> >>>>>>> >>>>>>> On Mon, Mar 27, 2023 at 7:43?PM Sylvain Bauza >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Le lun. 27 mars 2023 ? 14:28, Sean Mooney a >>>>>>>> ?crit : >>>>>>>> >>>>>>>>> On Mon, 2023-03-27 at 14:06 +0200, Sylvain Bauza wrote: >>>>>>>>> > Le lun. 27 mars 2023 ? 13:51, Sean Mooney >>>>>>>>> a ?crit : >>>>>>>>> > >>>>>>>>> > > On Mon, 2023-03-27 at 10:19 +0200, Sylvain Bauza wrote: >>>>>>>>> > > > Le dim. 26 mars 2023 ? 14:30, Rafael Weing?rtner < >>>>>>>>> > > > rafaelweingartner at gmail.com> a ?crit : >>>>>>>>> > > > >>>>>>>>> > > > > Hello Nguy?n H?u Kh?i, >>>>>>>>> > > > > You might want to take a look at: >>>>>>>>> > > > > https://review.opendev.org/c/openstack/nova/+/864760. We >>>>>>>>> created a >>>>>>>>> > > patch >>>>>>>>> > > > > to avoid migrating VMs to any AZ, once the VM has been >>>>>>>>> bootstrapped in >>>>>>>>> > > an >>>>>>>>> > > > > AZ that has cross zone attache equals to false. >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > Well, I'll provide some comments in the change, but I'm >>>>>>>>> afraid we can't >>>>>>>>> > > > just modify the request spec like you would want. >>>>>>>>> > > > >>>>>>>>> > > > Anyway, if you want to discuss about it in the vPTG, just >>>>>>>>> add it in the >>>>>>>>> > > > etherpad and add your IRC nick so we could try to find a >>>>>>>>> time where we >>>>>>>>> > > > could be discussing it : >>>>>>>>> https://etherpad.opendev.org/p/nova-bobcat-ptg >>>>>>>>> > > > Also, this kind of behaviour modification is more a new >>>>>>>>> feature than a >>>>>>>>> > > > bugfix, so fwiw you should create a launchpad blueprint so >>>>>>>>> we could >>>>>>>>> > > better >>>>>>>>> > > > see it. 
>>>>>>>>> > > >>>>>>>>> > > i tought i left review feedback on that too that the approch >>>>>>>>> was not >>>>>>>>> > > correct. >>>>>>>>> > > i guess i did not in the end. >>>>>>>>> > > >>>>>>>>> > > modifying the request spec as sylvain menthioned is not >>>>>>>>> correct. >>>>>>>>> > > i disucssed this topic on irc a few weeks back with mohomad >>>>>>>>> for vxhost. >>>>>>>>> > > what can be done is as follows. >>>>>>>>> > > >>>>>>>>> > > we can add a current_az field to the Destination object >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> https://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L1092-L1122 >>>>>>>>> > > The conductor can read the instance.AZ and populate it in that >>>>>>>>> new field. >>>>>>>>> > > We can then add a new weigher to prefer hosts that are in the >>>>>>>>> same az. >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> > >>>>>>>>> > I tend to disagree this approach as people would think that the >>>>>>>>> > Destination.az field would be related to the current AZ for an >>>>>>>>> instance, >>>>>>>>> > while we only look at the original AZ. >>>>>>>>> > That being said, we could have a weigher that would look at >>>>>>>>> whether the >>>>>>>>> > host is in the same AZ than the instance.host. >>>>>>>>> you miss understood what i wrote >>>>>>>>> >>>>>>>>> i suggested addint Destination.current_az to store teh curernt AZ >>>>>>>>> of the instance before scheduling. >>>>>>>>> >>>>>>>>> so my proposal is if RequestSpec.AZ is not set and >>>>>>>>> Destination.current_az is set then the new >>>>>>>>> weigher would prefer hosts that are in the same az as >>>>>>>>> Destination.current_az >>>>>>>>> >>>>>>>>> we coudl also call Destination.current_az Destination.prefered_az >>>>>>>>> >>>>>>>>> >>>>>>>> I meant, I think we don't need to provide a new field, we can >>>>>>>> already know about what host an existing instance uses if we want (using >>>>>>>> [1]) >>>>>>>> Anyway, let's stop to discuss about it here, we should rather >>>>>>>> review that for a Launchpad blueprint or more a spec. >>>>>>>> >>>>>>>> -Sylvain >>>>>>>> >>>>>>>> [1] >>>>>>>> https://github.com/openstack/nova/blob/b9a49ffb04cb5ae2d8c439361a3552296df02988/nova/scheduler/host_manager.py#L369-L370 >>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > This will provide soft AZ affinity for the vm and preserve the >>>>>>>>> fact that if >>>>>>>>> > > a vm is created without sepcifying >>>>>>>>> > > An AZ the expectaiton at the api level woudl be that it can >>>>>>>>> migrate to any >>>>>>>>> > > AZ. >>>>>>>>> > > >>>>>>>>> > > To provide hard AZ affintiy we could also add prefileter that >>>>>>>>> would use >>>>>>>>> > > the same data but instead include it in the >>>>>>>>> > > placement query so that only the current AZ is considered. >>>>>>>>> This would have >>>>>>>>> > > to be disabled by default. >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> > Sure, we could create a new prefilter so we could then deprecate >>>>>>>>> the >>>>>>>>> > AZFilter if we want. >>>>>>>>> we already have an AZ prefilter and the AZFilter is deprecate for >>>>>>>>> removal >>>>>>>>> i ment to delete it in zed but did not have time to do it in zed >>>>>>>>> of Antielope >>>>>>>>> i deprecated the AZ| filter in >>>>>>>>> https://github.com/openstack/nova/commit/7c7a2a142d74a7deeda2a79baf21b689fe32cd08 >>>>>>>>> xena when i enabeld the az prefilter by default. >>>>>>>>> >>>>>>>>> >>>>>>>> Ah whoops, indeed I forgot the fact we already have the prefilter, >>>>>>>> so the hard support for AZ is already existing. 
>>>>>>>> >>>>>>>> >>>>>>>>> i will try an delete teh AZ filter before m1 if others dont. >>>>>>>>> >>>>>>>> >>>>>>>> OK. >>>>>>>> >>>>>>>> >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > > That woudl allow operators to choose the desired behavior. >>>>>>>>> > > curret behavior (disable weigher and dont enabel prefilter) >>>>>>>>> > > new default, prefer current AZ (weigher enabeld prefilter >>>>>>>>> disabled) >>>>>>>>> > > hard affintiy(prefilter enabled.) >>>>>>>>> > > >>>>>>>>> > > there are other ways to approch this but updating the request >>>>>>>>> spec is not >>>>>>>>> > > one of them. >>>>>>>>> > > we have to maintain the fact the enduser did not request an AZ. >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> > Anyway, if folks want to discuss about AZs, this week is the >>>>>>>>> good time :-) >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > > > >>>>>>>>> > > > -Sylvain >>>>>>>>> > > > >>>>>>>>> > > > >>>>>>>>> > > > >>>>>>>>> > > > > On Sun, Mar 26, 2023 at 8:20?AM Nguy?n H?u Kh?i < >>>>>>>>> > > nguyenhuukhoinw at gmail.com> >>>>>>>>> > > > > wrote: >>>>>>>>> > > > > >>>>>>>>> > > > > > Hello guys. >>>>>>>>> > > > > > I playing with Nova AZ and Masakari >>>>>>>>> > > > > > >>>>>>>>> > > > > > >>>>>>>>> https://docs.openstack.org/nova/latest/admin/availability-zones.html >>>>>>>>> > > > > > >>>>>>>>> > > > > > Masakari will move server by nova scheduler. >>>>>>>>> > > > > > >>>>>>>>> > > > > > Openstack Docs describe that: >>>>>>>>> > > > > > >>>>>>>>> > > > > > If the server was not created in a specific zone then it >>>>>>>>> is free to >>>>>>>>> > > be >>>>>>>>> > > > > > moved to other zones, i.e. the AvailabilityZoneFilter >>>>>>>>> > > > > > < >>>>>>>>> > > >>>>>>>>> https://docs.openstack.org/nova/latest/admin/scheduling.html#availabilityzonefilter >>>>>>>>> > >>>>>>>>> > > is >>>>>>>>> > > > > > a no-op. >>>>>>>>> > > > > > >>>>>>>>> > > > > > I see that everyone usually creates instances with "Any >>>>>>>>> Availability >>>>>>>>> > > > > > Zone" on Horzion and also we don't specify AZ when >>>>>>>>> creating >>>>>>>>> > > instances by >>>>>>>>> > > > > > cli. >>>>>>>>> > > > > > >>>>>>>>> > > > > > By this way, when we use Masakari or we miragrated >>>>>>>>> instances( or >>>>>>>>> > > > > > evacuate) so our instance will be moved to other zones. >>>>>>>>> > > > > > >>>>>>>>> > > > > > Can we attach AZ to server create requests API based on >>>>>>>>> Any >>>>>>>>> > > > > > Availability Zone to limit instances moved to other >>>>>>>>> zones? >>>>>>>>> > > > > > >>>>>>>>> > > > > > Thank you. Regards >>>>>>>>> > > > > > >>>>>>>>> > > > > > Nguyen Huu Khoi >>>>>>>>> > > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > -- >>>>>>>>> > > > > Rafael Weing?rtner >>>>>>>>> > > > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> >>>>>>>>> >> >> -- >> Rafael Weing?rtner >> > -- Rafael Weing?rtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue May 2 17:02:56 2023 From: smooney at redhat.com (Sean Mooney) Date: Tue, 02 May 2023 18:02:56 +0100 Subject: [horizon][nova][masakari] Instances created with "Any AZ" problem In-Reply-To: References: <173abaa4b89efc8594b08c1c256bc873f3192828.camel@redhat.com> Message-ID: <88e3fdbc45a473ee791acc575070150491bed358.camel@redhat.com> unfortnetly while i under stand the usecase this bug report is invlaid. the behavior you are desciribe is the expected and intended behavior of nova. 
if a vm does not specify an AZ when it is created and
https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.default_schedule_zone
is not set, then the expected behavior is that it can migrate across AZs.
as a result, using [cinder]/cross_az_attach=false without also setting
DEFAULT.default_schedule_zone is effectively an invalid configuration.
it is only valid to use [cinder]/cross_az_attach=false and expect move operations to function
if all vms have an AZ requested when they are created, and that is the condition that is not
being met in your current usecase.
i have discussed this a few times in the past.
a weigher (not a filter) could be created to give preference to an instance's current AZ, but
we cannot prevent the instance from being scheduled to a different AZ.
This would be a new feature, not a bug, and should be discussed as such, which is why i marked
the bug as invalid.

On Tue, 2023-05-02 at 13:12 -0300, Rafael Weingärtner wrote:
> Bug report created at: https://bugs.launchpad.net/nova/+bug/2018318
>
> On Wed, Apr 26, 2023 at 12:23 PM Sylvain Bauza wrote:
> >
> >
> > Le mer. 26 avr. 2023 à 14:46, Rafael Weingärtner <
> > rafaelweingartner at gmail.com> a écrit :
> >
> > > Adding the response to this thread, as I replied in the wrong one. Sorry
> > > for the confusion.
> > >
> > > Hello Sylvain Bauza and Sean Mooney,
> > >
> > > The patch pointed out in [1] is doing exactly what you said that we
> > > should not do. I mean, changing the AZ in the request spec of the virtual
> > > machine (VM) that the user created. The patch we propose in [2] is intended
> > > to avoid exactly that.
> > >
> > > As pointed out by Florian, the configuration "cross_az_attach" is a
> > > constraint of the cloud environment, and as such it should be considered
> > > when selecting hosts to execute VMs migrations. Therefore, for those
> > > situations, and only for those (cross_az_attach=False), we send the current
> > > AZ of the VM to placement, to enable it (placement) to filter out hosts
> > > that are not from the current AZ of the VM.
> > >
> > > Looking at the patch suggested by you in [1], and later testing it, we
> > > can confirm that the problem is still there in main/upstream. This happens
> > > because the patch in [1] is only addressing the cases when the VM is
> > > created based on volumes, and then it sets the AZ of the volumes in the
> > > request spec of the VM. That is why everything works for the setups where
> > > cross_az_attach=False. However, if we create a VM based on an image, and
> > > then it (Nova) creates a new volume in Cinder, the AZ is not set in the
> > > request spec (but it is used to execute the first call to placement to
> > > select the hosts); thus, the issues described in [2] can still happen.
> > > > > > > > As I replied in the gerrit change > > https://review.opendev.org/c/openstack/nova/+/864760/comments/4a302ce3_9805e7c6 > > then you should create a Launchpad bug report but fwiw, you should also > > modify the implementation as it would rather do the same for image metadata > > that what we do for volumes with [1] > > > > -Sylvain > > > > [1] > > > https://review.opendev.org/c/openstack/nova/+/469675/12/nova/compute/api.py#1173 > > > [2] https://review.opendev.org/c/openstack/nova/+/864760 > > > > > > On Wed, Mar 29, 2023 at 10:26?AM Nguy?n H?u Kh?i < > > > nguyenhuukhoinw at gmail.com> wrote: > > > > > > > "If they *don't* provide this parameter, then depending on the > > > > default_schedule_zone config option, either the instance will eventually > > > > use a specific AZ (and then it's like if the enduser was asking for this > > > > AZ), or none of AZ is requested and then the instance can be created and > > > > moved between any hosts within *all* AZs." > > > > > > > > I ask aftet that, although without az when launch instances but they > > > > still have az. But i still mv to diffent host in diffent az when mirgrating > > > > or spawn which masakari. i am not clear, I tested. > > > > > > > > > > > > On Wed, Mar 29, 2023, 7:38 PM Nguy?n H?u Kh?i > > > > wrote: > > > > > > > > > Yes. Thanks, but the things I would like to know: after instances are > > > > > created, how do we know if it was launched with specified AZ or without it? > > > > > I mean the way to distinguish between specified instances and non specified > > > > > instances? > > > > > > > > > > Nguyen Huu Khoi > > > > > > > > > > > > > > > On Wed, Mar 29, 2023 at 5:05?PM Sylvain Bauza > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Le mer. 29 mars 2023 ? 08:06, Nguy?n H?u Kh?i < > > > > > > nguyenhuukhoinw at gmail.com> a ?crit : > > > > > > > > > > > > > Hello. > > > > > > > I have one question. > > > > > > > Follow this > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/admin/availability-zones.html > > > > > > > > > > > > > > If the server was not created in a specific zone then it is free to > > > > > > > be moved to other zones. but when I use > > > > > > > > > > > > > > openstack server show [server id] > > > > > > > > > > > > > > I still see the "OS-EXT-AZ:availability_zone" value belonging to my > > > > > > > instance. > > > > > > > > > > > > > > > > > > > > Correct, this is normal. If the operators creates some AZs, then the > > > > > > enduser should see where the instance in which AZ. > > > > > > > > > > > > > > > > > > > Could you tell the difference which causes "if the server was not > > > > > > > created in a specific zone then it is free to be moved to other zones. > > > > > > > " > > > > > > > > > > > > > > > > > > > > To be clear, an operator can create Availability Zones. Those AZs can > > > > > > then be seen by an enduser using the os-availability-zones API [1]. Then, > > > > > > either the enduser wants to use a specific AZ for their next instance > > > > > > creation (and if so, he/she adds --availability-zone parameter to their > > > > > > instance creation client) or they don't want and then they don't provide > > > > > > this parameter. > > > > > > > > > > > > If they provide this parameter, then the server will be created only > > > > > > in one host in the specific AZ and then when moving the instance later, it > > > > > > will continue to move to any host within the same AZ. 
> > > > > > If they *don't* provide this parameter, then depending on the > > > > > > default_schedule_zone config option, either the instance will eventually > > > > > > use a specific AZ (and then it's like if the enduser was asking for this > > > > > > AZ), or none of AZ is requested and then the instance can be created and > > > > > > moved between any hosts within *all* AZs. > > > > > > > > > > > > That being said, as I said earlier, the enduser can still verify the > > > > > > AZ from where the instance is by the server show parameter you told. > > > > > > > > > > > > We also have a documentation explaining about Availability Zones, > > > > > > maybe this would help you more to understand about AZs : > > > > > > https://docs.openstack.org/nova/latest/admin/availability-zones.html > > > > > > > > > > > > > > > > > > [1] > > > > > > https://docs.openstack.org/api-ref/compute/#availability-zones-os-availability-zone > > > > > > (tbc, the enduser won't see the hosts, but they can see the list of > > > > > > existing AZs) > > > > > > > > > > > > > > > > > > > > > > > > > Nguyen Huu Khoi > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 27, 2023 at 8:37?PM Nguy?n H?u Kh?i < > > > > > > > nguyenhuukhoinw at gmail.com> wrote: > > > > > > > > > > > > > > > Hello guys. > > > > > > > > > > > > > > > > I just suggest to openstack nova works better. My story because > > > > > > > > > > > > > > > > > > > > > > > > 1. > > > > > > > > > > > > > > > > The server was created in a specific zone with the POST /servers request > > > > > > > > containing the availability_zone parameter. > > > > > > > > > > > > > > > > It will be nice when we attach randow zone when we create instances > > > > > > > > then It will only move to the same zone when migrating or masakari ha. > > > > > > > > > > > > > > > > Currently we can force it to zone by default zone shedule in > > > > > > > > nova.conf. > > > > > > > > > > > > > > > > Sorry because I am new to Openstack and I am just an operator. I try > > > > > > > > to verify some real cases. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Nguyen Huu Khoi > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 27, 2023 at 7:43?PM Sylvain Bauza > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Le lun. 27 mars 2023 ? 14:28, Sean Mooney a > > > > > > > > > ?crit : > > > > > > > > > > > > > > > > > > > On Mon, 2023-03-27 at 14:06 +0200, Sylvain Bauza wrote: > > > > > > > > > > > Le lun. 27 mars 2023 ? 13:51, Sean Mooney > > > > > > > > > > a ?crit : > > > > > > > > > > > > > > > > > > > > > > > On Mon, 2023-03-27 at 10:19 +0200, Sylvain Bauza wrote: > > > > > > > > > > > > > Le dim. 26 mars 2023 ? 14:30, Rafael Weing?rtner < > > > > > > > > > > > > > rafaelweingartner at gmail.com> a ?crit : > > > > > > > > > > > > > > > > > > > > > > > > > > > Hello Nguy?n H?u Kh?i, > > > > > > > > > > > > > > You might want to take a look at: > > > > > > > > > > > > > > https://review.opendev.org/c/openstack/nova/+/864760. We > > > > > > > > > > created a > > > > > > > > > > > > patch > > > > > > > > > > > > > > to avoid migrating VMs to any AZ, once the VM has been > > > > > > > > > > bootstrapped in > > > > > > > > > > > > an > > > > > > > > > > > > > > AZ that has cross zone attache equals to false. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Well, I'll provide some comments in the change, but I'm > > > > > > > > > > afraid we can't > > > > > > > > > > > > > just modify the request spec like you would want. > > > > > > > > > > > > > > > > > > > > > > > > > > Anyway, if you want to discuss about it in the vPTG, just > > > > > > > > > > add it in the > > > > > > > > > > > > > etherpad and add your IRC nick so we could try to find a > > > > > > > > > > time where we > > > > > > > > > > > > > could be discussing it : > > > > > > > > > > https://etherpad.opendev.org/p/nova-bobcat-ptg > > > > > > > > > > > > > Also, this kind of behaviour modification is more a new > > > > > > > > > > feature than a > > > > > > > > > > > > > bugfix, so fwiw you should create a launchpad blueprint so > > > > > > > > > > we could > > > > > > > > > > > > better > > > > > > > > > > > > > see it. > > > > > > > > > > > > > > > > > > > > > > > > i tought i left review feedback on that too that the approch > > > > > > > > > > was not > > > > > > > > > > > > correct. > > > > > > > > > > > > i guess i did not in the end. > > > > > > > > > > > > > > > > > > > > > > > > modifying the request spec as sylvain menthioned is not > > > > > > > > > > correct. > > > > > > > > > > > > i disucssed this topic on irc a few weeks back with mohomad > > > > > > > > > > for vxhost. > > > > > > > > > > > > what can be done is as follows. > > > > > > > > > > > > > > > > > > > > > > > > we can add a current_az field to the Destination object > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L1092-L1122 > > > > > > > > > > > > The conductor can read the instance.AZ and populate it in that > > > > > > > > > > new field. > > > > > > > > > > > > We can then add a new weigher to prefer hosts that are in the > > > > > > > > > > same az. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I tend to disagree this approach as people would think that the > > > > > > > > > > > Destination.az field would be related to the current AZ for an > > > > > > > > > > instance, > > > > > > > > > > > while we only look at the original AZ. > > > > > > > > > > > That being said, we could have a weigher that would look at > > > > > > > > > > whether the > > > > > > > > > > > host is in the same AZ than the instance.host. > > > > > > > > > > you miss understood what i wrote > > > > > > > > > > > > > > > > > > > > i suggested addint Destination.current_az to store teh curernt AZ > > > > > > > > > > of the instance before scheduling. > > > > > > > > > > > > > > > > > > > > so my proposal is if RequestSpec.AZ is not set and > > > > > > > > > > Destination.current_az is set then the new > > > > > > > > > > weigher would prefer hosts that are in the same az as > > > > > > > > > > Destination.current_az > > > > > > > > > > > > > > > > > > > > we coudl also call Destination.current_az Destination.prefered_az > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I meant, I think we don't need to provide a new field, we can > > > > > > > > > already know about what host an existing instance uses if we want (using > > > > > > > > > [1]) > > > > > > > > > Anyway, let's stop to discuss about it here, we should rather > > > > > > > > > review that for a Launchpad blueprint or more a spec. 
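For operators reading along, and until a weigher or prefilter along the lines discussed above exists, the knobs already available can be sketched roughly like this in nova.conf (the AZ name az1 is only a placeholder and defaults may differ per release):

[DEFAULT]
# pin instances that did not request an AZ to a concrete one at boot time,
# so later migrations/evacuations behave as if the user had asked for it
default_schedule_zone = az1

[scheduler]
# requested AZs are resolved through this placement pre-filter
# (enabled by default since Xena, replacing the deprecated AvailabilityZoneFilter)
query_placement_for_availability_zone = true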
> > > > > > > > > > > > > > > > > > -Sylvain > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > https://github.com/openstack/nova/blob/b9a49ffb04cb5ae2d8c439361a3552296df02988/nova/scheduler/host_manager.py#L369-L370 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This will provide soft AZ affinity for the vm and preserve the > > > > > > > > > > fact that if > > > > > > > > > > > > a vm is created without sepcifying > > > > > > > > > > > > An AZ the expectaiton at the api level woudl be that it can > > > > > > > > > > migrate to any > > > > > > > > > > > > AZ. > > > > > > > > > > > > > > > > > > > > > > > > To provide hard AZ affintiy we could also add prefileter that > > > > > > > > > > would use > > > > > > > > > > > > the same data but instead include it in the > > > > > > > > > > > > placement query so that only the current AZ is considered. > > > > > > > > > > This would have > > > > > > > > > > > > to be disabled by default. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sure, we could create a new prefilter so we could then deprecate > > > > > > > > > > the > > > > > > > > > > > AZFilter if we want. > > > > > > > > > > we already have an AZ prefilter and the AZFilter is deprecate for > > > > > > > > > > removal > > > > > > > > > > i ment to delete it in zed but did not have time to do it in zed > > > > > > > > > > of Antielope > > > > > > > > > > i deprecated the AZ| filter in > > > > > > > > > > https://github.com/openstack/nova/commit/7c7a2a142d74a7deeda2a79baf21b689fe32cd08 > > > > > > > > > > xena when i enabeld the az prefilter by default. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ah whoops, indeed I forgot the fact we already have the prefilter, > > > > > > > > > so the hard support for AZ is already existing. > > > > > > > > > > > > > > > > > > > > > > > > > > > > i will try an delete teh AZ filter before m1 if others dont. > > > > > > > > > > > > > > > > > > > > > > > > > > > > OK. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > That woudl allow operators to choose the desired behavior. > > > > > > > > > > > > curret behavior (disable weigher and dont enabel prefilter) > > > > > > > > > > > > new default, prefer current AZ (weigher enabeld prefilter > > > > > > > > > > disabled) > > > > > > > > > > > > hard affintiy(prefilter enabled.) > > > > > > > > > > > > > > > > > > > > > > > > there are other ways to approch this but updating the request > > > > > > > > > > spec is not > > > > > > > > > > > > one of them. > > > > > > > > > > > > we have to maintain the fact the enduser did not request an AZ. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Anyway, if folks want to discuss about AZs, this week is the > > > > > > > > > > good time :-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Sylvain > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Mar 26, 2023 at 8:20?AM Nguy?n H?u Kh?i < > > > > > > > > > > > > nguyenhuukhoinw at gmail.com> > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hello guys. 
> > > > > > > > > > > > > > > I playing with Nova AZ and Masakari > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/admin/availability-zones.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Masakari will move server by nova scheduler. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Openstack Docs describe that: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the server was not created in a specific zone then it > > > > > > > > > > is free to > > > > > > > > > > > > be > > > > > > > > > > > > > > > moved to other zones, i.e. the AvailabilityZoneFilter > > > > > > > > > > > > > > > < > > > > > > > > > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/admin/scheduling.html#availabilityzonefilter > > > > > > > > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > a no-op. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I see that everyone usually creates instances with "Any > > > > > > > > > > Availability > > > > > > > > > > > > > > > Zone" on Horzion and also we don't specify AZ when > > > > > > > > > > creating > > > > > > > > > > > > instances by > > > > > > > > > > > > > > > cli. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > By this way, when we use Masakari or we miragrated > > > > > > > > > > instances( or > > > > > > > > > > > > > > > evacuate) so our instance will be moved to other zones. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Can we attach AZ to server create requests API based on > > > > > > > > > > Any > > > > > > > > > > > > > > > Availability Zone to limit instances moved to other > > > > > > > > > > zones? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you. Regards > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Nguyen Huu Khoi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > Rafael Weing?rtner > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > Rafael Weing?rtner > > > > > > From fungi at yuggoth.org Tue May 2 17:24:28 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 2 May 2023 17:24:28 +0000 Subject: [tc] Reminder: 2023-05-10 OpenInfra Board Sync Message-ID: <20230502172428.vnk4gynx54tpxqj6@yuggoth.org> The Open Infrastructure Foundation Board of Directors is endeavoring to engage in regular check-ins with official OpenInfra projects. The goal is for a loosely structured discussion one-hour in length, involving members of the board and the OpenStack TC, along with other interested community members. This is not intended to be a formal presentation, and no materials need to be prepared in advance. I've started an Etherpad where participants can brainstorm potential topics of conversation, time-permitting: https://etherpad.opendev.org/p/2023-05-board-openstack-sync As previously announced[*], we're planning to hold the next call at 20:00 UTC on Wednesday, May 10. A link to the conference bridge will be added to the pad prior to that time. [*] https://lists.openstack.org/pipermail/openstack-discuss/2023-April/033244.html -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From kozhukalov at gmail.com Tue May 2 19:06:48 2023 From: kozhukalov at gmail.com (Vladimir Kozhukalov) Date: Tue, 2 May 2023 22:06:48 +0300 Subject: [openstack-helm] Nominate Karl Kloppenborg for openstack-helm-core Message-ID: Dear Openstack-helmers, I would like to suggest Karl Kloppenborg as a member of the core review team. He's been actively doing code review recently and also contributed as a developer. He has been using Openstack-Helm for a few years now, so his experience as a user is also to our advantage. I hope for your support. -- Best regards, Kozhukalov Vladimir -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.morin at gmail.com Tue May 2 19:13:25 2023 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Tue, 2 May 2023 19:13:25 +0000 Subject: [OpenvSwitch][Neutron] native flow based firewall Vs LinuxBridge Iptables firewall In-Reply-To: References: Message-ID: Hello, We are using it in production since few years now, it works correctly. But, if you think it will be easier to debug, that will surprise me :) Openflow rules are hard to read, understand and debug. We tried working on a tool that help debugging such stuff (see [1]) which is partially used by the team, but that's far from perfect :( [1] https://github.com/openstack/osops/blob/master/contrib/neutron/br-int-flows-analyze.py Cheers, Arnaud. On 24.04.23 - 13:32, Satish Patel wrote: > Thanks, I'll check it out. > > This is great! so no harm to turn it on :) > > On Mon, Apr 24, 2023 at 2:49?AM Lajos Katona wrote: > > > H, > > The OVS flow based Neutron firewall driver is long supported by the > > community and used by many operators in production, please check the > > documentation: > > https://docs.openstack.org/neutron/latest/admin/config-ovsfwdriver.html > > > > For some details how it works please check the related internals doc: > > > > https://docs.openstack.org/neutron/latest/contributor/internals/openvswitch_firewall.html > > > > Best wished > > Lajos (lajoskatona) > > > > Satish Patel ezt ?rta (id?pont: 2023. ?pr. 24., H, > > 3:40): > > > >> Folks, > >> > >> As we know, openvswitch uses a linuxbridge based firewall to implement > >> security-groups on openstack. It works great but it has so many packet > >> hops. It also makes troubleshooting a little complicated. > >> > >> OpenvSwitch does support native firewall features in flows, Does it > >> mature enough to implement in production and replace it with LinuxBridge > >> based IPtables firewall? > >> > >> ~S > >> > >> From steveftaylor at gmail.com Tue May 2 19:25:21 2023 From: steveftaylor at gmail.com (Steve Taylor) Date: Tue, 02 May 2023 13:25:21 -0600 Subject: [openstack-helm] Nominate Karl Kloppenborg for openstack-helm-core In-Reply-To: References: Message-ID: I second that nomination. Karl has demonstrated his experience and has contributed to the project consistently. He will be a significant asset as a core reviewer. Steve Taylor On 5/2/2023 1:22:24 PM, Vladimir Kozhukalov wrote: Dear?Openstack-helmers, I would like to suggest?Karl Kloppenborg as a member of the core review team. He's been actively doing code review recently and also contributed as a developer. He has been using Openstack-Helm for a few years now, so his experience as a user is also to our advantage.? ? ?? I hope for your support.? -- Best regards, Kozhukalov Vladimir -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From noonedeadpunk at gmail.com Wed May 3 11:08:48 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Wed, 3 May 2023 13:08:48 +0200 Subject: [all][tc][vitrage] Final call for Vitrage project maintainers In-Reply-To: <1877ba5bc71.12a20ec8f588247.2493892904205072376@ghanshyammann.com> References: <1877ba5bc71.12a20ec8f588247.2493892904205072376@ghanshyammann.com> Message-ID: Hey there, I'm interested in keeping this project alive for a while. I won't promise any active development, but can keep CI green and do reviews and releasing of the project for now. ??, 13 ???. 2023??. ? 19:28, Ghanshyam Mann : > > Hello Everyone, > > You might have noticed that the Vitrage project is leaderless in this cycle[1]. Eyal, who was the > previous PTL and the only maintainer of this project for more than two years[2], will not be > able to continue in this project[3]. We thank and appreciate all previous maintainers' work on this > project. > > This is the final call for maintainers. If anyone is using or interested in this project, this is the right > time to step up and let us know in this email or IRC(OFTC) #openstack-tc channel. If there is no > response by the end of April 2023, we will discuss the retirement of this project. > > > [1] https://etherpad.opendev.org/p/2023.2-leaderless > [2] https://www.stackalytics.io/?module=vitrage-group&release=wallaby > [3] https://lists.openstack.org/pipermail/openstack-discuss/2023-April/033333.html > > -gmann > From senrique at redhat.com Wed May 3 11:21:21 2023 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 3 May 2023 12:21:21 +0100 Subject: Cinder Bug Report 2023-05-03 Message-ID: Hello Argonauts, Cinder Bug Meeting Etherpad *Low* - Infinidat driver should use the pool's compression setting when creating volumes. - *Status: *Fix proposed to master . - Cinder retype fails with "no host supplied" for SolidFire driver. - *Status*: Unassigned. *Incomplete* - When cinder-backup and nova-compute at the same node,backup restore will delete multipath. - *Status*: Waiting for reported reply. Cheers, -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Wed May 3 11:40:11 2023 From: eblock at nde.ag (Eugen Block) Date: Wed, 03 May 2023 11:40:11 +0000 Subject: Getting 403 from keystone as admin user. In-Reply-To: <6937b623979c5cb37d433abae313772454451402.camel@toyon.com> Message-ID: <20230503114011.Horde.0GEPOkCMAL9TiHy-ckwYlZe@webmail.nde.ag> A few more details about the exact error under which conditions (always, sometimes?) would be useful and how the cluster is deployed (HA?) and used (e.g. terraform). It could be a memcache config issue, but at this point it's just wild guessing. Zitat von Andy Speagle : > We're getting strange 403's from keystone on the CLI when trying to > list users, groups, and role assignments. > > Running keystone 17.0.1 on ussuri. > > I've looked through our policy.json... can't find anything strange. Has > anyone seen this behavior? Any ideas what can be done? > > Thanks. 
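For gathering those details, a rough starting point could be the following (log locations depend on how keystone is deployed):

# confirm the admin credentials still produce a token
openstack token issue

# re-run one failing call with full request/response logging and note the request id
openstack --debug user list 2>&1 | grep -iE 'RESP|request-id|403'

# then search for that request id (req-...) in the keystone logs on the controllers
grep req-REPLACE_ME /var/log/keystone/keystone.log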
From thierry at openstack.org Wed May 3 11:56:43 2023 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 3 May 2023 13:56:43 +0200 Subject: [largescale-sig] Next meeting: May 3, 8utc In-Reply-To: <42675d5c-e4d1-4819-5121-703975ecb2bc@openstack.org> References: <42675d5c-e4d1-4819-5121-703975ecb2bc@openstack.org> Message-ID: <5be38a01-4f52-eec3-976d-5e1b1ff24ada@openstack.org> Here is the summary of our SIG meeting today. We discussed our next OpenInfra Live episode, which will probably be held in September after a Summit+summer break. We are considering two topics: one around RabbitMQ, and the other around a deep dive in a public cloud deployment. You can read the detailed meeting logs at: https://meetings.opendev.org/meetings/large_scale_sig/2023/large_scale_sig.2023-05-03-08.00.html Our next IRC meeting will be May 17, 15:00UTC on #openstack-operators on OFTC. Regards, -- Thierry Carrez (ttx) From wodel.youchi at gmail.com Wed May 3 15:13:16 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Wed, 3 May 2023 16:13:16 +0100 Subject: [kolla-ansible][yoga] fluentd problem with elasticsearch after update Message-ID: Hi, I have finished the update of my openstack platform with newer containers. While verifying I noticed that fluentd container keeps restarting. In the log file I am having this : > 2023-05-03 16:07:59 +0100 [error]: #0 config error > file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError > error="Using Elasticsearch client 8.7.1 is not compatible for your > Elasticsearch server. Please check your using elasticsearch gem version and > Elasticsearch server." > 2023-05-03 16:07:59 +0100 [error]: Worker 0 finished unexpectedly with > status 2 > 2023-05-03 16:07:59 +0100 [info]: Received graceful stop > Those are the images I am using : (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i elas 192.168.1.16:4000/openstack.kolla/centos-source-prometheus-elasticsearch-exporter yoga030523 b48f63ed0072 12 hours ago 539MB 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch yoga030523 3558611b0cf4 12 hours ago 1.2GB 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch-curator yoga030523 83a6b48339ea 12 hours ago 637MB (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i fluen 192.168.1.16:4000/openstack.kolla/centos-source-fluentd yoga030523 bf6596e139e2 12 hours ago 847MB Any ideas? Regards. Virus-free.www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Wed May 3 15:28:00 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 3 May 2023 15:28:00 +0000 Subject: [dev][infra][qa][tact-sig] Future of Fedora and CentOS Stream Test Images Message-ID: <20230503152800.vatnnvvwi767vwfc@yuggoth.org> tl;dr is that the OpenDev Collaboratory is looking at potentially scaling back on Fedora-based test platform support. Several options are presented in this service-discuss post: https://lists.opendev.org/archives/list/service-discuss at lists.opendev.org/thread/IOYIYWGTZW3TM4TR2N47XY6X7EB2W2A6/ If you have an opinion on using Fedora in our CI jobs, please follow up to the discussion there. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From kristin at openinfra.dev Wed May 3 15:58:24 2023 From: kristin at openinfra.dev (Kristin Barrientos) Date: Wed, 3 May 2023 10:58:24 -0500 Subject: OpenInfra Live - May 4, 2023 at 9 a.m. CT / 14:00 UTC Message-ID: Hi everyone, This week?s OpenInfra Live episode is brought to you by the OpenInfra Foundation Staff. Episode: May the OpenInfra Force Be With You: Preview of the OpenInfra Summit! Join us for an insightful and informative discussion as we give you the inside scoop on what?s in store at the OpenInfra Summit, Vancouver, happening, June 13-15! Don?t miss your chance to learn how you can get involved! Speakers: Wes Wilson, Allison Price, Jimmy McArthur, Kendall Nelson Date and time: May 4, 2023, at 9 a.m. CT. (14:00 UTC) You can watch us live on: YouTube: https://www.youtube.com/live/g_nTOpLGftw?feature=share LinkedIn: https://www.linkedin.com/events/7056677974057648128/comments/ WeChat: recording will be posted on OpenStack WeChat after the live stream Have an idea for a future episode? Share it now at ideas.openinfra.live. Thanks, Kristin Barrientos Marketing Coordinator OpenInfra Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed May 3 17:21:07 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 03 May 2023 10:21:07 -0700 Subject: [all][tc][vitrage] Final call for Vitrage project maintainers In-Reply-To: References: <1877ba5bc71.12a20ec8f588247.2493892904205072376@ghanshyammann.com> Message-ID: <187e2a1cd9f.e620a21a346038.6141593357851738000@ghanshyammann.com> ---- On Wed, 03 May 2023 04:08:48 -0700 Dmitriy Rabotyagov wrote --- > Hey there, > > I'm interested in keeping this project alive for a while. > I won't promise any active development, but can keep CI green and do > reviews and releasing of the project for now. Thanks Dmitriy for stepping up. -gmann > > ??, 13 ???. 2023??. ? 19:28, Ghanshyam Mann gmann at ghanshyammann.com>: > > > > Hello Everyone, > > > > You might have noticed that the Vitrage project is leaderless in this cycle[1]. Eyal, who was the > > previous PTL and the only maintainer of this project for more than two years[2], will not be > > able to continue in this project[3]. We thank and appreciate all previous maintainers' work on this > > project. > > > > This is the final call for maintainers. If anyone is using or interested in this project, this is the right > > time to step up and let us know in this email or IRC(OFTC) #openstack-tc channel. If there is no > > response by the end of April 2023, we will discuss the retirement of this project. > > > > > > [1] https://etherpad.opendev.org/p/2023.2-leaderless > > [2] https://www.stackalytics.io/?module=vitrage-group&release=wallaby > > [3] https://lists.openstack.org/pipermail/openstack-discuss/2023-April/033333.html > > > > -gmann > > > > From angelsantana at neoscloudllc.com Thu May 4 02:22:06 2023 From: angelsantana at neoscloudllc.com (Angel Santana) Date: Wed, 3 May 2023 22:22:06 -0400 Subject: [dev][ceilometer] Add custom uptime metric to Ceilometer Message-ID: <924CA4A5-D459-4061-BAD7-2F1B6926C86A@neoscloudllc.com> Guys, I deployed Ceilometer into my OpenStack deployment using the Juju charm. I?m trying to add a custom metric that records the uptime of instances every hour. I?m totally lost on how to archive this or if it?s possible. Any help would be appreciate. 
So far I tried playing around with the meters.yaml and polling.yaml to see if I could pull it off with no luck. Is this possible? how should I do it? Thanks in advance. Regards, ?ngel From mnasiadka at gmail.com Thu May 4 06:42:38 2023 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Thu, 4 May 2023 08:42:38 +0200 Subject: [kolla-ansible][yoga] fluentd problem with elasticsearch after update In-Reply-To: References: Message-ID: <2E93A6DB-2BA6-44A4-B1A0-97AA7205B791@gmail.com> Hello, That probably is a Kolla bug - can you please raise a bug in launchpad.net ? The other alternative is to migrate to OpenSearch (we?ve back ported this functionality recently) - https://docs.openstack.org/kolla-ansible/yoga/reference/logging-and-monitoring/central-logging-guide-opensearch.html#migration Best regards, Michal > On 3 May 2023, at 17:13, wodel youchi wrote: > > Hi, > > I have finished the update of my openstack platform with newer containers. > > While verifying I noticed that fluentd container keeps restarting. > > In the log file I am having this : >> 2023-05-03 16:07:59 +0100 [error]: #0 config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Using Elasticsearch client 8.7.1 is not compatible for your Elasticsearch server. Please check your using elasticsearch gem version and Elasticsearch server." >> 2023-05-03 16:07:59 +0100 [error]: Worker 0 finished unexpectedly with status 2 >> 2023-05-03 16:07:59 +0100 [info]: Received graceful stop > > Those are the images I am using : > (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i elas > 192.168.1.16:4000/openstack.kolla/centos-source-prometheus-elasticsearch-exporter yoga030523 b48f63ed0072 12 hours ago 539MB > 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch yoga030523 3558611b0cf4 12 hours ago 1.2GB > 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch-curator yoga030523 83a6b48339ea 12 hours ago 637MB > > (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i fluen > 192.168.1.16:4000/openstack.kolla/centos-source-fluentd yoga030523 bf6596e139e2 12 hours ago 847MB > > Any ideas? > > Regards. > > Virus-free.www.avast.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.rohmann at inovex.de Thu May 4 08:07:43 2023 From: christian.rohmann at inovex.de (Christian Rohmann) Date: Thu, 4 May 2023 10:07:43 +0200 Subject: [designate] Proposal to deprecate the agent framework and agent based backends In-Reply-To: References: <16546577-07af-c24a-5ff6-c45eeeba9517@inovex.de> Message-ID: <5727cbd3-833a-965a-5f77-1fb704ec0d98@inovex.de> On 12/04/2023 09:14, Christian Rohmann wrote: >>> And I am in no way asking for a fast-lane or special treatment. I would >>> just like to be able advertise this as: >>> "If we write the code, someone is going to look at it and helps to get >>> this merged for Bobcat." >> I think that is a fair statement. Just note that others in the >> community may not have a lot of time to co-develop on it. >> > We agreed to pick this up and implement the catalog zone support. > So expect a change to appear for review at some point ;-) > Just an update to our commitment:? We started working on the implementation. Regards Christian -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alex.kavanagh at canonical.com Thu May 4 08:19:33 2023 From: alex.kavanagh at canonical.com (Alex Kavanagh) Date: Thu, 4 May 2023 10:19:33 +0200 Subject: [dev][ceilometer] Add custom uptime metric to Ceilometer In-Reply-To: <924CA4A5-D459-4061-BAD7-2F1B6926C86A@neoscloudllc.com> References: <924CA4A5-D459-4061-BAD7-2F1B6926C86A@neoscloudllc.com> Message-ID: Hi Angel So for nova instances, the ceilometer-agent charm is added as a subordinate (by relating the application to the nova-compute application). This "enable-all-pollsters" on the ceilometer-agent charm then enables all available pollsters on the the nova-compute node. The relevant template that gets written is: https://opendev.org/openstack/charm-ceilometer-agent/src/branch/master/templates/polling.yaml If this doesn't help then please report back here, or join us in either #openstack-charms on IRC or in the OpenStack Charms channel in our public mattermost (https://chat.charmhub.io/charmhub/channels/openstack-charms) if you'd like to chat or have more 'real time' help. Thanks Alex. On Thu, 4 May 2023 at 04:28, Angel Santana wrote: > Guys, > > I deployed Ceilometer into my OpenStack deployment using the Juju > charm. I?m trying to add a custom metric that records the uptime of > instances every hour. I?m totally lost on how to archive this or if it?s > possible. Any help would be appreciate. > > So far I tried playing around with the meters.yaml and polling.yaml to > see if I could pull it off with no luck. Is this possible? how should I do > it? > > Thanks in advance. > > Regards, > ?ngel > -- Alex Kavanagh OpenStack Engineering - Canonical Ltd -------------- next part -------------- An HTML attachment was scrubbed... URL: From kozhukalov at gmail.com Thu May 4 10:30:19 2023 From: kozhukalov at gmail.com (Vladimir Kozhukalov) Date: Thu, 4 May 2023 13:30:19 +0300 Subject: [openstack-helm] Nominate Karl Kloppenborg for openstack-helm-core In-Reply-To: References: Message-ID: Thanks for the support of this nomination. Karl, Congrats. You are now a member of the openstack-helm-core. On Tue, May 2, 2023 at 10:25?PM Steve Taylor wrote: > I second that nomination. Karl has demonstrated his experience and has > contributed to the project consistently. He will be a significant asset as > a core reviewer. > > Steve Taylor > > On 5/2/2023 1:22:24 PM, Vladimir Kozhukalov wrote: > Dear Openstack-helmers, > > I would like to suggest Karl Kloppenborg as a member of the core review > team. He's been actively doing code review recently and also contributed as > a developer. He has been using Openstack-Helm for a few years now, so his > experience as a user is also to our advantage. > > I hope for your support. > > -- > Best regards, > Kozhukalov Vladimir > > -- Best regards, Kozhukalov Vladimir -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Thu May 4 11:26:06 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Thu, 4 May 2023 13:26:06 +0200 Subject: [openstack-ansible] Proposing Neil Hanlon for core reviewer In-Reply-To: References: Message-ID: Hey there, Since no objections were raised, I warmly welcome Neil to the team! ??, 25 ???. 2023??. ? 09:21, Jonathan Rosser : > > This would be great. +2 from me. > > Jon. > > On 18/04/2023 20:00, Dmitriy Rabotyagov wrote: > > Hi everyone, > > > > I'm pleased to propose Neil Hanlon as OpenStack-Ansible Core Reviewer. 
> > Neil has helped us a lot lately with maintenance of RHEL-like distros > > - > > both CentOS Stream and Rocky Linux, and basically has brought in > > support for the last one. > > > > Neil is present at meetings and always is responsive in IRC. At the > > same time they were providing useful reviews lately [1] > > > > If there are no objections, I will add Neil to the team on 25th of > > April 2023. Until then, please feel free to provide your > > feedback/opinions on the matter. > > > > [1] https://www.stackalytics.io/report/contribution?module=openstackansible-group&project_type=openstack&days=120 > > > > > From derekokeeffe85 at yahoo.ie Thu May 4 12:13:43 2023 From: derekokeeffe85 at yahoo.ie (Derek O keeffe) Date: Thu, 4 May 2023 12:13:43 +0000 (UTC) Subject: Certbot auto renew References: <1254945930.5559770.1683202423942.ref@mail.yahoo.com> Message-ID: <1254945930.5559770.1683202423942@mail.yahoo.com> Hi all, We're having a problem with renewing letsencrypt certs via certbot in an external Neutron network where a security group is locking down HTTP+HTTPS access to select IP ranges. As far as we know the IP address for the Certbot ACME challenge server is always changing and therefore a static security group can't be set up to allow in traffic from that server. We have experimented with using UFW rules instead thinking we may be able to write a script to open port 80 periodically to allow the ACME challenge through, then close it back up, but it hasn't worked as we'd hoped either (either all traffic is blocked or the security group immediately takes precedence). Is there any way to programmatically enable + disable a security group as needed using something like OpenstackSDK to achieve the same thing? Thanks in advance. Regards,Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From donny at fortnebula.com Thu May 4 12:30:32 2023 From: donny at fortnebula.com (Donny D) Date: Thu, 4 May 2023 07:30:32 -0500 Subject: Certbot auto renew In-Reply-To: <1254945930.5559770.1683202423942@mail.yahoo.com> References: <1254945930.5559770.1683202423942.ref@mail.yahoo.com> <1254945930.5559770.1683202423942@mail.yahoo.com> Message-ID: On Thu, May 4, 2023 at 7:14?AM Derek O keeffe wrote: > Hi all, > > We're having a problem with renewing letsencrypt certs via certbot in an > external Neutron network where a security group is locking down HTTP+HTTPS > access to select IP ranges. As far as we know the IP address for the > Certbot ACME challenge server is always changing and therefore a static > security group can't be set up to allow in traffic from that server. We > have experimented with using UFW rules instead thinking we may be able to > write a script to open port 80 periodically > to allow the ACME challenge through, then close it back up, but it hasn't > worked as we'd hoped either (either all traffic is blocked or the security > group immediately takes precedence). Is there any way to programmatically > enable + disable a security group as needed using something like > OpenstackSDK to achieve the same thing? > > Thanks in advance. > > Regards, > Derek > > Derek, Instead of thinking about the security group rule being enabled or disabled - maybe think about it existing or not existing. Prior to your certbot run, you add a rule to a security group to allow 80 inbound and then when certbot is done, you delete the rule. Personally I like Ansible, but you could use literally anything to accomplish this task - even bash. 
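A rough bash sketch of that add-rule/renew/delete-rule cycle, using placeholder names (security group web-sg, port 80), could be:

#!/bin/bash
# open port 80 to the world only for the renewal window (placeholder group name)
RULE_ID=$(openstack security group rule create --ingress --protocol tcp \
    --dst-port 80 --remote-ip 0.0.0.0/0 -f value -c id web-sg)
# make sure the rule is removed again even if the renewal fails
trap 'openstack security group rule delete "$RULE_ID"' EXIT
certbot renew

The same pair of calls is also available in openstacksdk (create_security_group_rule / delete_security_group_rule on the network proxy) and in the Ansible collection below: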
https://docs.ansible.com/ansible/latest/collections/openstack/cloud/security_group_rule_info_module.html#ansible-collections-openstack-cloud-security-group-rule-info-module -- ~/DonnyD "No mission too difficult. No sacrifice too great. Duty First" -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Thu May 4 12:33:48 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Thu, 4 May 2023 13:33:48 +0100 Subject: [kolla-ansible][yoga] fluentd problem with elasticsearch after update In-Reply-To: <2E93A6DB-2BA6-44A4-B1A0-97AA7205B791@gmail.com> References: <2E93A6DB-2BA6-44A4-B1A0-97AA7205B791@gmail.com> Message-ID: Hi, I'll try to open a bug for this. I am using elasticsearch also with Cloudkitty : cloudkitty_storage_backend: "elasticsearch" instead of influxdb to get some HA. Will I still get the fluentd problem even if I migrate to Opensearch leaving Cloudkitty with elasticsearch??? Regards. Virus-free.www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> Le jeu. 4 mai 2023 ? 07:42, Micha? Nasiadka a ?crit : > Hello, > > That probably is a Kolla bug - can you please raise a bug in launchpad.net > ? > The other alternative is to migrate to OpenSearch (we?ve back ported this > functionality recently) - > https://docs.openstack.org/kolla-ansible/yoga/reference/logging-and-monitoring/central-logging-guide-opensearch.html#migration > > Best regards, > Michal > > On 3 May 2023, at 17:13, wodel youchi wrote: > > Hi, > > I have finished the update of my openstack platform with newer containers. > > While verifying I noticed that fluentd container keeps restarting. > > In the log file I am having this : > >> 2023-05-03 16:07:59 +0100 [error]: #0 config error >> file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError >> error="Using Elasticsearch client 8.7.1 is not compatible for your >> Elasticsearch server. Please check your using elasticsearch gem version and >> Elasticsearch server." >> 2023-05-03 16:07:59 +0100 [error]: Worker 0 finished unexpectedly with >> status 2 >> 2023-05-03 16:07:59 +0100 [info]: Received graceful stop >> > > Those are the images I am using : > (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i elas > > 192.168.1.16:4000/openstack.kolla/centos-source-prometheus-elasticsearch-exporter > yoga030523 b48f63ed0072 12 hours ago 539MB > 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch > yoga030523 3558611b0cf4 12 hours ago 1.2GB > 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch-curator > yoga030523 83a6b48339ea 12 hours ago 637MB > > (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i fluen > 192.168.1.16:4000/openstack.kolla/centos-source-fluentd > yoga030523 bf6596e139e2 12 hours ago 847MB > > Any ideas? > > Regards. > > > > Virus-free.www.avast.com > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From derekokeeffe85 at yahoo.ie Thu May 4 12:45:29 2023 From: derekokeeffe85 at yahoo.ie (Derek O keeffe) Date: Thu, 4 May 2023 13:45:29 +0100 Subject: Certbot auto renew In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: From mnasiadka at gmail.com Thu May 4 13:12:29 2023 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Thu, 4 May 2023 15:12:29 +0200 Subject: [kolla-ansible][yoga] fluentd problem with elasticsearch after update In-Reply-To: References: <2E93A6DB-2BA6-44A4-B1A0-97AA7205B791@gmail.com> Message-ID: <02C03BF9-E20D-4844-93B6-E472C1AD7E50@gmail.com> Hello, Kolla-Ansible is not supporting both opensearch and elasticsearch running at the same time - so if you?re using cloudkitty - it?s better to stick for Elasticsearch for now (CK does not support OpenSearch yet). I started working on the bug - will let you know in the bug report when a fix will be merged and images published. In the meantime you can try to uninstall the too-new elasticsearch gems using td-agent-gem uninstall in your running container image. Best regards, Michal > On 4 May 2023, at 14:33, wodel youchi wrote: > > Hi, > > I'll try to open a bug for this. > > I am using elasticsearch also with Cloudkitty : cloudkitty_storage_backend: "elasticsearch" instead of influxdb to get some HA. > Will I still get the fluentd problem even if I migrate to Opensearch leaving Cloudkitty with elasticsearch??? > > Regards. > > Virus-free.www.avast.com > Le jeu. 4 mai 2023 ? 07:42, Micha? Nasiadka > a ?crit : >> Hello, >> >> That probably is a Kolla bug - can you please raise a bug in launchpad.net ? >> The other alternative is to migrate to OpenSearch (we?ve back ported this functionality recently) - https://docs.openstack.org/kolla-ansible/yoga/reference/logging-and-monitoring/central-logging-guide-opensearch.html#migration >> >> Best regards, >> Michal >> >>> On 3 May 2023, at 17:13, wodel youchi > wrote: >>> >>> Hi, >>> >>> I have finished the update of my openstack platform with newer containers. >>> >>> While verifying I noticed that fluentd container keeps restarting. >>> >>> In the log file I am having this : >>>> 2023-05-03 16:07:59 +0100 [error]: #0 config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Using Elasticsearch client 8.7.1 is not compatible for your Elasticsearch server. Please check your using elasticsearch gem version and Elasticsearch server." >>>> 2023-05-03 16:07:59 +0100 [error]: Worker 0 finished unexpectedly with status 2 >>>> 2023-05-03 16:07:59 +0100 [info]: Received graceful stop >>> >>> Those are the images I am using : >>> (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i elas >>> 192.168.1.16:4000/openstack.kolla/centos-source-prometheus-elasticsearch-exporter yoga030523 b48f63ed0072 12 hours ago 539MB >>> 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch yoga030523 3558611b0cf4 12 hours ago 1.2GB >>> 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch-curator yoga030523 83a6b48339ea 12 hours ago 637MB >>> >>> (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i fluen >>> 192.168.1.16:4000/openstack.kolla/centos-source-fluentd yoga030523 bf6596e139e2 12 hours ago 847MB >>> >>> Any ideas? >>> >>> Regards. >>> >>> Virus-free.www.avast.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fungi at yuggoth.org Thu May 4 13:30:50 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 4 May 2023 13:30:50 +0000 Subject: Certbot auto renew In-Reply-To: References: Message-ID: <20230504133050.p2cjutoyenovtrzv@yuggoth.org> On 2023-05-04 13:45:29 +0100 (+0100), Derek O keeffe wrote: > We didn?t really want to interact with the vm afterwards, we have > many machines that need to be locked down but then need to certbot > renew which they can?t. We were thinking of a script that uses > openstack sdk to remove the security group, update the cert and > then add the security group back. [...] If you have an easy way to push records into DNS, using the DNS-based issuance and renewal workflow may be easier than orchestrating connectivity from the registrar's servers to your virtual machines. For our servers, we orchestrate the acme.sh tool and associated DNS record updates with Ansible roles: https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles (specifically the ones there named like letsencrypt-*). Since we also operate our own name servers it's relatively easy for us, but if your DNS provider has an API or supports the dynamic update protocol then it's probably still pretty simple to do. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rafaelweingartner at gmail.com Thu May 4 20:38:06 2023 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Thu, 4 May 2023 17:38:06 -0300 Subject: [dev][ceilometer] Add custom uptime metric to Ceilometer In-Reply-To: <924CA4A5-D459-4061-BAD7-2F1B6926C86A@neoscloudllc.com> References: <924CA4A5-D459-4061-BAD7-2F1B6926C86A@neoscloudllc.com> Message-ID: Yes it is. You should check at the Dynamic pollsters sub-system in Ceilometer [1]. You can basically monitor/collect data from anything you want or need with that. [1] https://docs.openstack.org/ceilometer/latest/admin/telemetry-dynamic-pollster.html On Wed, May 3, 2023 at 11:30?PM Angel Santana wrote: > Guys, > > I deployed Ceilometer into my OpenStack deployment using the Juju > charm. I?m trying to add a custom metric that records the uptime of > instances every hour. I?m totally lost on how to archive this or if it?s > possible. Any help would be appreciate. > > So far I tried playing around with the meters.yaml and polling.yaml to > see if I could pull it off with no luck. Is this possible? how should I do > it? > > Thanks in advance. > > Regards, > ?ngel > -- Rafael Weing?rtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Fri May 5 07:43:19 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 5 May 2023 09:43:19 +0200 Subject: [neutron] Drivers meeting cancelled today Message-ID: Hello Neutrinos: Due to the lack of agenda [1], today's drivers meeting is cancelled. Have a nice weekend. [1]https://wiki.openstack.org/wiki/Meetings/NeutronDrivers -------------- next part -------------- An HTML attachment was scrubbed... URL: From anbanerj at redhat.com Fri May 5 11:25:46 2023 From: anbanerj at redhat.com (Ananya Banerjee) Date: Fri, 5 May 2023 13:25:46 +0200 Subject: [gate][tripleo] gate blocker Message-ID: Hello, All Centos 8 jobs which deploy standalone are failing at the moment. Please hold rechecks if you hit standalone deploy failure on Centos 8 jobs. 
We are working on the bug: https://bugs.launchpad.net/tripleo/+bug/2018588 Thanks, Ananya -- Ananya Banerjee, RHCSA, RHCE-OSP Software Engineer Red Hat EMEA anbanerj at redhat.com M: +491784949931 IM: frenzy_friday @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Fri May 5 14:30:58 2023 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Fri, 5 May 2023 14:30:58 +0000 Subject: [release] Release countdown for week R-21, May 08-12 Message-ID: Development Focus ----------------- The Bobcat-1 milestone is next week, on May 11th, 2023! Project team plans for the 2023.2 Bobcat cycle should now be solidified. General Information ------------------- Libraries need to be released at least once per milestone period. Next week, the release team will propose releases for any library which had changes but has not been otherwise released since the 2023.1 Antelope release. PTLs or release liaisons, please watch for these and give a +1 to acknowledge them. If there is some reason to hold off on a release, let us know that as well, by posting a -1. If we do not hear anything at all by the end of the week, we will assume things are OK to proceed. NB: If one of your libraries is still releasing 0.x versions, start thinking about when it will be appropriate to do a 1.0 version. The version number does signal the state, real or perceived, of the library, so we strongly encourage going to a full major version once things are in a good and usable state. Upcoming Deadlines & Dates -------------------------- Bobcat-1 milestone: May 11th, 2023 OpenInfra Summit Vancouver (including PTG): June 13-15, 2023 Final 2023.2 Bobcat release: October 4th, 2023 El?d Ill?s irc: elodilles @ #openstack-release -------------- next part -------------- An HTML attachment was scrubbed... URL: From gsteinmuller at vexxhost.com Fri May 5 18:03:48 2023 From: gsteinmuller at vexxhost.com (=?iso-8859-1?Q?Guilherme_Steinm=FCller?=) Date: Fri, 5 May 2023 18:03:48 +0000 Subject: Kubernetes Conformance 1.24 + 1.25 In-Reply-To: References: <387290b9-c3be-559f-afd2-b41d508d0fac@ardc.edu.au> <908b2eb4-dae4-28f8-02e2-3b9f4660ce2f@ardc.edu.au> Message-ID: Hey there! I am trying to run conformance against 1.25 and 1.26 now, but it looks like we are still with this ongoing? https://review.opendev.org/c/openstack/magnum/+/874092 Im still facing issues to create the cluster due to "PodSecurityPolicy\" is unknown. Thank you, Guilherme Steinmuller ________________________________ From: Kendall Nelson Sent: 21 February 2023 17:38 To: Guilherme Steinm?ller Cc: Jake Yip ; OpenStack Discuss ; dale at catalystcloud.nz Subject: Re: Kubernetes Conformance 1.24 + 1.25 Circling back to this thread- Thanks Jake for getting this rolling! https://review.opendev.org/c/openstack/magnum/+/874092 -Kendall On Wed, Feb 15, 2023 at 6:34 AM Guilherme Steinm?ller > wrote: Hi Jake, Yeah, that could be it. On devstack magnum master, the kube-apiserver pod fails to start with rancher 1.25 hyperkube image with: Feb 14 20:24:06 k8s-cluster-dgpwfkugdna5-master-0 conmon[119164]: E0214 20:24:06.615919 1 run.go:74] "command failed" err="admission-control plugin \"PodSecurityPolicy\" is unknown" Regards, Guilherme Steinmuller On Tue, Feb 14, 2023 at 10:03 AM Jake Yip > wrote: Hi Guilherme Steinmuller, Is the issue with 1.25 the removal of PodSecurityPolicy? And that there are pieces of PSP in Magnum code. I've been trying to remove it. 
Regards, Jake On 14/2/2023 11:35 pm, Guilherme Steinm?ller wrote: > Hi everyone! > > Dale, thanks for your comments here. I no longer have my devstack which > I tested v1.25. However, you pointed out something I haven't noticed: > for v1.25 I tried using the fedora coreos that is shipped with devstack, > which is f36. > > I will try to reproduce it again, but now using a newer fedora coreos. > If it fails, I will be happy to share my results here for us to figure > out and get certified for 1.25! > > Keep in tune! > > Thank you, > Guilherme Steinmuller > > On Tue, Feb 14, 2023 at 9:26 AM Jake Yip > >> wrote: > > On 14/2/2023 6:53 am, Kendall Nelson wrote: > > Hello All! > > > > First of all, I want to say a huge thanks to Guilherme > Steinmuller for > > all his help ensuring that OpenStack Magnum remains Kubernetes > Certified > > [1]! We are certified for v1.24! > > > Wow great work Guilherme Steinmuller! > > - Jake > -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri May 5 19:14:18 2023 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 5 May 2023 15:14:18 -0400 Subject: [Nova] Clean up dead vms entries from DB Message-ID: Folks, I have a small environment where controllers and computers run on the same nodes and yes that is a bad idea. But now what happened the machine got OOM out and crashed and stuck somewhere in Zombie stat. I have deleted all vms but it didn't work. Later I used the "virsh destroy" command to delete those vms and everything recovered but my openstack hypervisor state show" command still says you have 91 VMs hanging in DB. How do i clean up vms entered from nova DB which doesn't exist at all. I have tried the command "nova-manage db archive_deleted_rows" but it didn't help. How does nova sync DB with current stat? # openstack hypervisor stats show This command is deprecated. +----------------------+--------+ | Field | Value | +----------------------+--------+ | count | 3 | | current_workload | 11 | | disk_available_least | 15452 | | free_disk_gb | 48375 | | free_ram_mb | 192464 | | local_gb | 50286 | | local_gb_used | 1911 | | memory_mb | 386570 | | memory_mb_used | 194106 | | running_vms | 91 | | vcpus | 144 | | vcpus_used | 184 | +----------------------+--------+ Technically I have only single VM # openstack server list --all +--------------------------------------+---------+--------+---------------------------------+------------+----------+ | ID | Name | Status | Networks | Image | Flavor | +--------------------------------------+---------+--------+---------------------------------+------------+----------+ | b33fd79d-2b90-41dd-a070-29f92ce205e7 | foo1 | ACTIVE | 1=100.100.75.22, 192.168.1.139 | ubuntu2204 | m1.small | +--------------------------------------+---------+--------+---------------------------------+------------+----------+ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nguyenhuukhoinw at gmail.com Sat May 6 02:29:21 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Sat, 6 May 2023 09:29:21 +0700 Subject: [OPENSTACK][rabbitmq] using quorum queues Message-ID: Hello guys. IS there any guy who uses the quorum queue for openstack? Could you give some feedback to compare with classic queue? Thank you. Nguyen Huu Khoi -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From fadhel.bedda at gmail.com Sat May 6 08:18:32 2023
From: fadhel.bedda at gmail.com (BEDDA Fadhel)
Date: Sat, 6 May 2023 10:18:32 +0200
Subject: Re: openstack-discuss Digest, Vol 55, Issue 16
In-Reply-To:
References:
Message-ID:

Hello,
I have set up OpenStack on an Ubuntu 22.04 VM.
The installation went well:
I created an account and a project, and went through all the steps for
creating an Ubuntu 18 instance.
But when I launch the instance, I only get error messages.
My question:
Is there someone who can help me and provide me with a complete, tested
procedure?
With my sincere thanks

On Fri, 5 May 2023 at 20:09,  wrote:

> Send openstack-discuss mailing list submissions to
>         openstack-discuss at lists.openstack.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>
> https://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss
>
> or, via email, send a message with subject or body 'help' to
>         openstack-discuss-request at lists.openstack.org
>
> You can reach the person managing the list at
>         openstack-discuss-owner at lists.openstack.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of openstack-discuss digest..."
>
>
> Today's Topics:
>
>    1. [release] Release countdown for week R-21, May 08-12 (Előd Illés)
>    2. Re: Kubernetes Conformance 1.24 + 1.25 (Guilherme Steinmüller)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 5 May 2023 14:30:58 +0000
> From: Előd Illés
> To: "openstack-discuss at lists.openstack.org"
> Subject: [release] Release countdown for week R-21, May 08-12
> Message-ID:
>         <VI1P18901MB07511B14C9655567C4A5EC88FF729 at VI1P18901MB0751.EURP189.PROD.OUTLOOK.COM>
> Content-Type: text/plain; charset="utf-8"
>
> Development Focus
> -----------------
>
> The Bobcat-1 milestone is next week, on May 11th, 2023! Project team
> plans for the 2023.2 Bobcat cycle should now be solidified.
>
> General Information
> -------------------
>
> Libraries need to be released at least once per milestone period. Next
> week, the release team will propose releases for any library which had
> changes but has not been otherwise released since the 2023.1 Antelope
> release.
> PTLs or release liaisons, please watch for these and give a +1 to
> acknowledge them. If there is some reason to hold off on a release, let
> us know that as well, by posting a -1. If we do not hear anything at all
> by the end of the week, we will assume things are OK to proceed.
>
> NB: If one of your libraries is still releasing 0.x versions, start
> thinking about when it will be appropriate to do a 1.0 version. The
> version number does signal the state, real or perceived, of the library,
> so we strongly encourage going to a full major version once things are
> in a good and usable state.
>
> Upcoming Deadlines & Dates
> --------------------------
>
> Bobcat-1 milestone: May 11th, 2023
> OpenInfra Summit Vancouver (including PTG): June 13-15, 2023
> Final 2023.2 Bobcat release: October 4th, 2023
>
>
> Előd Illés
> irc: elodilles @ #openstack-release
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: < > https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230505/df4fecbf/attachment-0001.htm > > > > ------------------------------ > > Message: 2 > Date: Fri, 5 May 2023 18:03:48 +0000 > From: Guilherme Steinm?ller > To: Kendall Nelson > Cc: Jake Yip , OpenStack Discuss > , "dale at catalystcloud.nz" > > Subject: Re: Kubernetes Conformance 1.24 + 1.25 > Message-ID: > < > YT3P288MB02720867AA20B5F11C042EFCAB729 at YT3P288MB0272.CANP288.PROD.OUTLOOK.COM > > > > Content-Type: text/plain; charset="iso-8859-1" > > Hey there! > > I am trying to run conformance against 1.25 and 1.26 now, but it looks > like we are still with this ongoing? > https://review.opendev.org/c/openstack/magnum/+/874092 > > Im still facing issues to create the cluster due to "PodSecurityPolicy\" > is unknown. > > Thank you, > Guilherme Steinmuller > ________________________________ > From: Kendall Nelson > Sent: 21 February 2023 17:38 > To: Guilherme Steinm?ller > Cc: Jake Yip ; OpenStack Discuss < > openstack-discuss at lists.openstack.org>; dale at catalystcloud.nz < > dale at catalystcloud.nz> > Subject: Re: Kubernetes Conformance 1.24 + 1.25 > > Circling back to this thread- > > Thanks Jake for getting this rolling! > https://review.opendev.org/c/openstack/magnum/+/874092 > > -Kendall > > On Wed, Feb 15, 2023 at 6:34 AM Guilherme Steinm?ller < > gsteinmuller at vexxhost.com> wrote: > Hi Jake, > > Yeah, that could be it. > > On devstack magnum master, the kube-apiserver pod fails to start with > rancher 1.25 hyperkube image with: > > Feb 14 20:24:06 k8s-cluster-dgpwfkugdna5-master-0 conmon[119164]: E0214 > 20:24:06.615919 1 run.go:74] "command failed" err="admission-control > plugin \"PodSecurityPolicy\" is unknown" > > Regards, > Guilherme Steinmuller > > On Tue, Feb 14, 2023 at 10:03 AM Jake Yip jake.yip at ardc.edu.au>> wrote: > Hi Guilherme Steinmuller, > > Is the issue with 1.25 the removal of PodSecurityPolicy? And that there > are pieces of PSP in Magnum code. I've been trying to remove it. > > Regards, > Jake > > > On 14/2/2023 11:35 pm, Guilherme Steinm?ller wrote: > > Hi everyone! > > > > Dale, thanks for your comments here. I no longer have my devstack which > > I tested v1.25. However, you pointed out something I haven't noticed: > > for v1.25 I tried using the fedora coreos that is shipped with devstack, > > which is f36. > > > > I will try to reproduce it again, but now using a newer fedora coreos. > > If it fails, I will be happy to share my results here for us to figure > > out and get certified for 1.25! > > > > Keep in tune! > > > > Thank you, > > Guilherme Steinmuller > > > > On Tue, Feb 14, 2023 at 9:26 AM Jake Yip jake.yip at ardc.edu.au> > > >> wrote: > > > > On 14/2/2023 6:53 am, Kendall Nelson wrote: > > > Hello All! > > > > > > First of all, I want to say a huge thanks to Guilherme > > Steinmuller for > > > all his help ensuring that OpenStack Magnum remains Kubernetes > > Certified > > > [1]! We are certified for v1.24! > > > > > Wow great work Guilherme Steinmuller! > > > > - Jake > > > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: < > https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230505/9afb1d74/attachment.htm > > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > openstack-discuss mailing list > openstack-discuss at lists.openstack.org > > > ------------------------------ > > End of openstack-discuss Digest, Vol 55, Issue 16 > ************************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fadhel.bedda at gmail.com Sat May 6 08:44:45 2023 From: fadhel.bedda at gmail.com (BEDDA Fadhel) Date: Sat, 6 May 2023 10:44:45 +0200 Subject: openstack-discuss Digest, Vol 55, Issue 17 In-Reply-To: References: Message-ID: Good morning, I set up openstack, on an Ubuntu 22.04 VM. Installation went well: I created an account and a project, as well as all the steps for creating an Ubuntu 18 instance. But when launching the instance, I only have error messages. My question: Is there someone who can help me and provide me with a procedure complete and tested. With my sincere thanks Le sam. 6 mai 2023 ? 10:21, a ?crit : > Send openstack-discuss mailing list submissions to > openstack-discuss at lists.openstack.org > > To subscribe or unsubscribe via the World Wide Web, visit > > https://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss > > or, via email, send a message with subject or body 'help' to > openstack-discuss-request at lists.openstack.org > > You can reach the person managing the list at > openstack-discuss-owner at lists.openstack.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of openstack-discuss digest..." > > > Today's Topics: > > 1. [Nova] Clean up dead vms entries from DB (Satish Patel) > 2. [OPENSTACK][rabbitmq] using quorum queues (Nguy?n H?u Kh?i) > 3. Re: openstack-discuss Digest, Vol 55, Issue 16 (BEDDA Fadhel) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 5 May 2023 15:14:18 -0400 > From: Satish Patel > To: OpenStack Discuss > Subject: [Nova] Clean up dead vms entries from DB > Message-ID: > S18K5-sX6XMCbjTWKA at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > Folks, > > I have a small environment where controllers and computers run on the same > nodes and yes that is a bad idea. But now what happened the machine got OOM > out and crashed and stuck somewhere in Zombie stat. I have deleted all vms > but it didn't work. Later I used the "virsh destroy" command to delete > those vms and everything recovered but my openstack hypervisor state show" > command still says you have 91 VMs hanging in DB. > > How do i clean up vms entered from nova DB which doesn't exist at all. > > I have tried the command "nova-manage db archive_deleted_rows" but it > didn't help. How does nova sync DB with current stat? > > # openstack hypervisor stats show > This command is deprecated. 
> +----------------------+--------+ > | Field | Value | > +----------------------+--------+ > | count | 3 | > | current_workload | 11 | > | disk_available_least | 15452 | > | free_disk_gb | 48375 | > | free_ram_mb | 192464 | > | local_gb | 50286 | > | local_gb_used | 1911 | > | memory_mb | 386570 | > | memory_mb_used | 194106 | > | running_vms | 91 | > | vcpus | 144 | > | vcpus_used | 184 | > +----------------------+--------+ > > > Technically I have only single VM > > # openstack server list --all > > +--------------------------------------+---------+--------+---------------------------------+------------+----------+ > | ID | Name | Status | Networks > | Image | Flavor | > > +--------------------------------------+---------+--------+---------------------------------+------------+----------+ > | b33fd79d-2b90-41dd-a070-29f92ce205e7 | foo1 | ACTIVE | 1=100.100.75.22, > 192.168.1.139 | ubuntu2204 | m1.small | > > +--------------------------------------+---------+--------+---------------------------------+------------+----------+ > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230505/529f5c69/attachment-0001.htm > > > > ------------------------------ > > Message: 2 > Date: Sat, 6 May 2023 09:29:21 +0700 > From: Nguy?n H?u Kh?i > To: OpenStack Discuss > Subject: [OPENSTACK][rabbitmq] using quorum queues > Message-ID: > 8ajDJxTXupj_7pXA+_46wQ at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > Hello guys. > IS there any guy who uses the quorum queue for openstack? Could you give > some feedback to compare with classic queue? > Thank you. > Nguyen Huu Khoi > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230506/63494316/attachment-0001.htm > > > > ------------------------------ > > Message: 3 > Date: Sat, 6 May 2023 10:18:32 +0200 > From: BEDDA Fadhel > To: openstack-discuss at lists.openstack.org > Subject: Re: openstack-discuss Digest, Vol 55, Issue 16 > Message-ID: > < > CAE1GhS4387gW87jZEdxULPgmXfcMcMY9yiZffpP36dBwSLhMww at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > Bonjour, > J?ai mis en place openstack, sur une VM Ubuntu 22.04. > L?installation s?est bien pass?: > J?ai cr?? un compte et un projet, ainsi que toute les ?tapes pour la > cr?ation d?une instance Ubuntu 18. > Mais lors de lancement de l?instance, je n?ai que des messages d?erreur. > Ma question: > Est ce qu?il y?a quelqu?un qui peut m?aider et me fournir une proc?dure > compl?te et test?e. > Avec mes vifs remerciements > > Le ven. 5 mai 2023 ? 20:09, > > a ?crit : > > > Send openstack-discuss mailing list submissions to > > openstack-discuss at lists.openstack.org > > > > To subscribe or unsubscribe via the World Wide Web, visit > > > > https://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss > > > > or, via email, send a message with subject or body 'help' to > > openstack-discuss-request at lists.openstack.org > > > > You can reach the person managing the list at > > openstack-discuss-owner at lists.openstack.org > > > > When replying, please edit your Subject line so it is more specific > > than "Re: Contents of openstack-discuss digest..." > > > > > > Today's Topics: > > > > 1. [release] Release countdown for week R-21, May 08-12 (El?d Ill?s) > > 2. 
Re: Kubernetes Conformance 1.24 + 1.25 (Guilherme Steinm?ller) > > > > > > ---------------------------------------------------------------------- > > > > Message: 1 > > Date: Fri, 5 May 2023 14:30:58 +0000 > > From: El?d Ill?s > > To: "openstack-discuss at lists.openstack.org" > > > > Subject: [release] Release countdown for week R-21, May 08-12 > > Message-ID: > > < > > > VI1P18901MB07511B14C9655567C4A5EC88FF729 at VI1P18901MB0751.EURP189.PROD.OUTLOOK.COM > > > > > > > Content-Type: text/plain; charset="utf-8" > > > > Development Focus > > ----------------- > > > > The Bobcat-1 milestone is next week, on May 11th, 2023! Project team > > plans for the 2023.2 Bobcat cycle should now be solidified. > > > > General Information > > ------------------- > > > > Libraries need to be released at least once per milestone period. Next > > week, the release team will propose releases for any library which had > > changes but has not been otherwise released since the 2023.1 Antelope > > release. > > PTLs or release liaisons, please watch for these and give a +1 to > > acknowledge them. If there is some reason to hold off on a release, let > > us know that as well, by posting a -1. If we do not hear anything at all > > by the end of the week, we will assume things are OK to proceed. > > > > NB: If one of your libraries is still releasing 0.x versions, start > > thinking about when it will be appropriate to do a 1.0 version. The > > version number does signal the state, real or perceived, of the library, > > so we strongly encourage going to a full major version once things are > > in a good and usable state. > > > > Upcoming Deadlines & Dates > > -------------------------- > > > > Bobcat-1 milestone: May 11th, 2023 > > OpenInfra Summit Vancouver (including PTG): June 13-15, 2023 > > Final 2023.2 Bobcat release: October 4th, 2023 > > > > > > El?d Ill?s > > irc: elodilles @ #openstack-release > > -------------- next part -------------- > > An HTML attachment was scrubbed... > > URL: < > > > https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230505/df4fecbf/attachment-0001.htm > > > > > > > ------------------------------ > > > > Message: 2 > > Date: Fri, 5 May 2023 18:03:48 +0000 > > From: Guilherme Steinm?ller > > To: Kendall Nelson > > Cc: Jake Yip , OpenStack Discuss > > , "dale at catalystcloud.nz" > > > > Subject: Re: Kubernetes Conformance 1.24 + 1.25 > > Message-ID: > > < > > > YT3P288MB02720867AA20B5F11C042EFCAB729 at YT3P288MB0272.CANP288.PROD.OUTLOOK.COM > > > > > > > Content-Type: text/plain; charset="iso-8859-1" > > > > Hey there! > > > > I am trying to run conformance against 1.25 and 1.26 now, but it looks > > like we are still with this ongoing? > > https://review.opendev.org/c/openstack/magnum/+/874092 > > > > Im still facing issues to create the cluster due to "PodSecurityPolicy\" > > is unknown. > > > > Thank you, > > Guilherme Steinmuller > > ________________________________ > > From: Kendall Nelson > > Sent: 21 February 2023 17:38 > > To: Guilherme Steinm?ller > > Cc: Jake Yip ; OpenStack Discuss < > > openstack-discuss at lists.openstack.org>; dale at catalystcloud.nz < > > dale at catalystcloud.nz> > > Subject: Re: Kubernetes Conformance 1.24 + 1.25 > > > > Circling back to this thread- > > > > Thanks Jake for getting this rolling! > > https://review.opendev.org/c/openstack/magnum/+/874092 > > > > -Kendall > > > > On Wed, Feb 15, 2023 at 6:34 AM Guilherme Steinm?ller < > > gsteinmuller at vexxhost.com> wrote: > > Hi Jake, > > > > Yeah, that could be it. 
> > > > On devstack magnum master, the kube-apiserver pod fails to start with > > rancher 1.25 hyperkube image with: > > > > Feb 14 20:24:06 k8s-cluster-dgpwfkugdna5-master-0 conmon[119164]: E0214 > > 20:24:06.615919 1 run.go:74] "command failed" > err="admission-control > > plugin \"PodSecurityPolicy\" is unknown" > > > > Regards, > > Guilherme Steinmuller > > > > On Tue, Feb 14, 2023 at 10:03 AM Jake Yip > jake.yip at ardc.edu.au>> wrote: > > Hi Guilherme Steinmuller, > > > > Is the issue with 1.25 the removal of PodSecurityPolicy? And that there > > are pieces of PSP in Magnum code. I've been trying to remove it. > > > > Regards, > > Jake > > > > > > On 14/2/2023 11:35 pm, Guilherme Steinm?ller wrote: > > > Hi everyone! > > > > > > Dale, thanks for your comments here. I no longer have my devstack which > > > I tested v1.25. However, you pointed out something I haven't noticed: > > > for v1.25 I tried using the fedora coreos that is shipped with > devstack, > > > which is f36. > > > > > > I will try to reproduce it again, but now using a newer fedora coreos. > > > If it fails, I will be happy to share my results here for us to figure > > > out and get certified for 1.25! > > > > > > Keep in tune! > > > > > > Thank you, > > > Guilherme Steinmuller > > > > > > On Tue, Feb 14, 2023 at 9:26 AM Jake Yip > jake.yip at ardc.edu.au> > > > >> wrote: > > > > > > On 14/2/2023 6:53 am, Kendall Nelson wrote: > > > > Hello All! > > > > > > > > First of all, I want to say a huge thanks to Guilherme > > > Steinmuller for > > > > all his help ensuring that OpenStack Magnum remains Kubernetes > > > Certified > > > > [1]! We are certified for v1.24! > > > > > > > Wow great work Guilherme Steinmuller! > > > > > > - Jake > > > > > -------------- next part -------------- > > An HTML attachment was scrubbed... > > URL: < > > > https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230505/9afb1d74/attachment.htm > > > > > > > ------------------------------ > > > > Subject: Digest Footer > > > > _______________________________________________ > > openstack-discuss mailing list > > openstack-discuss at lists.openstack.org > > > > > > ------------------------------ > > > > End of openstack-discuss Digest, Vol 55, Issue 16 > > ************************************************* > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < > https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230506/385b520c/attachment.htm > > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > openstack-discuss mailing list > openstack-discuss at lists.openstack.org > > > ------------------------------ > > End of openstack-discuss Digest, Vol 55, Issue 17 > ************************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Sun May 7 08:22:18 2023 From: eblock at nde.ag (Eugen Block) Date: Sun, 07 May 2023 08:22:18 +0000 Subject: openstack-discuss Digest, Vol 55, Issue 17 In-Reply-To: References: Message-ID: <20230507082218.Horde.r69OMWXOyAMiti3Y_USij1C@webmail.nde.ag> Hi, there are plenty of people able to help you. I would recommend to create your own thread with a meaningful subject and provide more details about your setup and the exact error messages. Often you can find those error messages and helpful steps to resolve them in a search engine of your choice. 
If you have tried a couple of things which didn?t work, write them down as well. Regards Eugen Zitat von BEDDA Fadhel : > Good morning, > I set up openstack, on an Ubuntu 22.04 VM. > Installation went well: > I created an account and a project, as well as all the steps for > creating an Ubuntu 18 instance. > But when launching the instance, I only have error messages. > My question: > Is there someone who can help me and provide me with a procedure > complete and tested. > With my sincere thanks > > > Le sam. 6 mai 2023 ? 10:21, > a ?crit : > >> Send openstack-discuss mailing list submissions to >> openstack-discuss at lists.openstack.org >> >> To subscribe or unsubscribe via the World Wide Web, visit >> >> https://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss >> >> or, via email, send a message with subject or body 'help' to >> openstack-discuss-request at lists.openstack.org >> >> You can reach the person managing the list at >> openstack-discuss-owner at lists.openstack.org >> >> When replying, please edit your Subject line so it is more specific >> than "Re: Contents of openstack-discuss digest..." >> >> >> Today's Topics: >> >> 1. [Nova] Clean up dead vms entries from DB (Satish Patel) >> 2. [OPENSTACK][rabbitmq] using quorum queues (Nguy?n H?u Kh?i) >> 3. Re: openstack-discuss Digest, Vol 55, Issue 16 (BEDDA Fadhel) >> >> >> ---------------------------------------------------------------------- >> >> Message: 1 >> Date: Fri, 5 May 2023 15:14:18 -0400 >> From: Satish Patel >> To: OpenStack Discuss >> Subject: [Nova] Clean up dead vms entries from DB >> Message-ID: >> > S18K5-sX6XMCbjTWKA at mail.gmail.com> >> Content-Type: text/plain; charset="utf-8" >> >> Folks, >> >> I have a small environment where controllers and computers run on the same >> nodes and yes that is a bad idea. But now what happened the machine got OOM >> out and crashed and stuck somewhere in Zombie stat. I have deleted all vms >> but it didn't work. Later I used the "virsh destroy" command to delete >> those vms and everything recovered but my openstack hypervisor state show" >> command still says you have 91 VMs hanging in DB. >> >> How do i clean up vms entered from nova DB which doesn't exist at all. >> >> I have tried the command "nova-manage db archive_deleted_rows" but it >> didn't help. How does nova sync DB with current stat? >> >> # openstack hypervisor stats show >> This command is deprecated. 
>> +----------------------+--------+ >> | Field | Value | >> +----------------------+--------+ >> | count | 3 | >> | current_workload | 11 | >> | disk_available_least | 15452 | >> | free_disk_gb | 48375 | >> | free_ram_mb | 192464 | >> | local_gb | 50286 | >> | local_gb_used | 1911 | >> | memory_mb | 386570 | >> | memory_mb_used | 194106 | >> | running_vms | 91 | >> | vcpus | 144 | >> | vcpus_used | 184 | >> +----------------------+--------+ >> >> >> Technically I have only single VM >> >> # openstack server list --all >> >> +--------------------------------------+---------+--------+---------------------------------+------------+----------+ >> | ID | Name | Status | Networks >> | Image | Flavor | >> >> +--------------------------------------+---------+--------+---------------------------------+------------+----------+ >> | b33fd79d-2b90-41dd-a070-29f92ce205e7 | foo1 | ACTIVE | 1=100.100.75.22, >> 192.168.1.139 | ubuntu2204 | m1.small | >> >> +--------------------------------------+---------+--------+---------------------------------+------------+----------+ >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: < >> https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230505/529f5c69/attachment-0001.htm >> > >> >> ------------------------------ >> >> Message: 2 >> Date: Sat, 6 May 2023 09:29:21 +0700 >> From: Nguy?n H?u Kh?i >> To: OpenStack Discuss >> Subject: [OPENSTACK][rabbitmq] using quorum queues >> Message-ID: >> > 8ajDJxTXupj_7pXA+_46wQ at mail.gmail.com> >> Content-Type: text/plain; charset="utf-8" >> >> Hello guys. >> IS there any guy who uses the quorum queue for openstack? Could you give >> some feedback to compare with classic queue? >> Thank you. >> Nguyen Huu Khoi >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: < >> https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230506/63494316/attachment-0001.htm >> > >> >> ------------------------------ >> >> Message: 3 >> Date: Sat, 6 May 2023 10:18:32 +0200 >> From: BEDDA Fadhel >> To: openstack-discuss at lists.openstack.org >> Subject: Re: openstack-discuss Digest, Vol 55, Issue 16 >> Message-ID: >> < >> CAE1GhS4387gW87jZEdxULPgmXfcMcMY9yiZffpP36dBwSLhMww at mail.gmail.com> >> Content-Type: text/plain; charset="utf-8" >> >> Bonjour, >> J?ai mis en place openstack, sur une VM Ubuntu 22.04. >> L?installation s?est bien pass?: >> J?ai cr?? un compte et un projet, ainsi que toute les ?tapes pour la >> cr?ation d?une instance Ubuntu 18. >> Mais lors de lancement de l?instance, je n?ai que des messages d?erreur. >> Ma question: >> Est ce qu?il y?a quelqu?un qui peut m?aider et me fournir une proc?dure >> compl?te et test?e. >> Avec mes vifs remerciements >> >> Le ven. 5 mai 2023 ? 20:09, > > >> a ?crit : >> >> > Send openstack-discuss mailing list submissions to >> > openstack-discuss at lists.openstack.org >> > >> > To subscribe or unsubscribe via the World Wide Web, visit >> > >> > https://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss >> > >> > or, via email, send a message with subject or body 'help' to >> > openstack-discuss-request at lists.openstack.org >> > >> > You can reach the person managing the list at >> > openstack-discuss-owner at lists.openstack.org >> > >> > When replying, please edit your Subject line so it is more specific >> > than "Re: Contents of openstack-discuss digest..." >> > >> > >> > Today's Topics: >> > >> > 1. [release] Release countdown for week R-21, May 08-12 (El?d Ill?s) >> > 2. 
Re: Kubernetes Conformance 1.24 + 1.25 (Guilherme Steinm?ller) >> > >> > >> > ---------------------------------------------------------------------- >> > >> > Message: 1 >> > Date: Fri, 5 May 2023 14:30:58 +0000 >> > From: El?d Ill?s >> > To: "openstack-discuss at lists.openstack.org" >> > >> > Subject: [release] Release countdown for week R-21, May 08-12 >> > Message-ID: >> > < >> > >> VI1P18901MB07511B14C9655567C4A5EC88FF729 at VI1P18901MB0751.EURP189.PROD.OUTLOOK.COM >> > > >> > >> > Content-Type: text/plain; charset="utf-8" >> > >> > Development Focus >> > ----------------- >> > >> > The Bobcat-1 milestone is next week, on May 11th, 2023! Project team >> > plans for the 2023.2 Bobcat cycle should now be solidified. >> > >> > General Information >> > ------------------- >> > >> > Libraries need to be released at least once per milestone period. Next >> > week, the release team will propose releases for any library which had >> > changes but has not been otherwise released since the 2023.1 Antelope >> > release. >> > PTLs or release liaisons, please watch for these and give a +1 to >> > acknowledge them. If there is some reason to hold off on a release, let >> > us know that as well, by posting a -1. If we do not hear anything at all >> > by the end of the week, we will assume things are OK to proceed. >> > >> > NB: If one of your libraries is still releasing 0.x versions, start >> > thinking about when it will be appropriate to do a 1.0 version. The >> > version number does signal the state, real or perceived, of the library, >> > so we strongly encourage going to a full major version once things are >> > in a good and usable state. >> > >> > Upcoming Deadlines & Dates >> > -------------------------- >> > >> > Bobcat-1 milestone: May 11th, 2023 >> > OpenInfra Summit Vancouver (including PTG): June 13-15, 2023 >> > Final 2023.2 Bobcat release: October 4th, 2023 >> > >> > >> > El?d Ill?s >> > irc: elodilles @ #openstack-release >> > -------------- next part -------------- >> > An HTML attachment was scrubbed... >> > URL: < >> > >> https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230505/df4fecbf/attachment-0001.htm >> > > >> > >> > ------------------------------ >> > >> > Message: 2 >> > Date: Fri, 5 May 2023 18:03:48 +0000 >> > From: Guilherme Steinm?ller >> > To: Kendall Nelson >> > Cc: Jake Yip , OpenStack Discuss >> > , "dale at catalystcloud.nz" >> > >> > Subject: Re: Kubernetes Conformance 1.24 + 1.25 >> > Message-ID: >> > < >> > >> YT3P288MB02720867AA20B5F11C042EFCAB729 at YT3P288MB0272.CANP288.PROD.OUTLOOK.COM >> > > >> > >> > Content-Type: text/plain; charset="iso-8859-1" >> > >> > Hey there! >> > >> > I am trying to run conformance against 1.25 and 1.26 now, but it looks >> > like we are still with this ongoing? >> > https://review.opendev.org/c/openstack/magnum/+/874092 >> > >> > Im still facing issues to create the cluster due to "PodSecurityPolicy\" >> > is unknown. >> > >> > Thank you, >> > Guilherme Steinmuller >> > ________________________________ >> > From: Kendall Nelson >> > Sent: 21 February 2023 17:38 >> > To: Guilherme Steinm?ller >> > Cc: Jake Yip ; OpenStack Discuss < >> > openstack-discuss at lists.openstack.org>; dale at catalystcloud.nz < >> > dale at catalystcloud.nz> >> > Subject: Re: Kubernetes Conformance 1.24 + 1.25 >> > >> > Circling back to this thread- >> > >> > Thanks Jake for getting this rolling! 
>> > https://review.opendev.org/c/openstack/magnum/+/874092 >> > >> > -Kendall >> > >> > On Wed, Feb 15, 2023 at 6:34 AM Guilherme Steinm?ller < >> > gsteinmuller at vexxhost.com> wrote: >> > Hi Jake, >> > >> > Yeah, that could be it. >> > >> > On devstack magnum master, the kube-apiserver pod fails to start with >> > rancher 1.25 hyperkube image with: >> > >> > Feb 14 20:24:06 k8s-cluster-dgpwfkugdna5-master-0 conmon[119164]: E0214 >> > 20:24:06.615919 1 run.go:74] "command failed" >> err="admission-control >> > plugin \"PodSecurityPolicy\" is unknown" >> > >> > Regards, >> > Guilherme Steinmuller >> > >> > On Tue, Feb 14, 2023 at 10:03 AM Jake Yip > > jake.yip at ardc.edu.au>> wrote: >> > Hi Guilherme Steinmuller, >> > >> > Is the issue with 1.25 the removal of PodSecurityPolicy? And that there >> > are pieces of PSP in Magnum code. I've been trying to remove it. >> > >> > Regards, >> > Jake >> > >> > >> > On 14/2/2023 11:35 pm, Guilherme Steinm?ller wrote: >> > > Hi everyone! >> > > >> > > Dale, thanks for your comments here. I no longer have my devstack which >> > > I tested v1.25. However, you pointed out something I haven't noticed: >> > > for v1.25 I tried using the fedora coreos that is shipped with >> devstack, >> > > which is f36. >> > > >> > > I will try to reproduce it again, but now using a newer fedora coreos. >> > > If it fails, I will be happy to share my results here for us to figure >> > > out and get certified for 1.25! >> > > >> > > Keep in tune! >> > > >> > > Thank you, >> > > Guilherme Steinmuller >> > > >> > > On Tue, Feb 14, 2023 at 9:26 AM Jake Yip > > jake.yip at ardc.edu.au> >> > > >> wrote: >> > > >> > > On 14/2/2023 6:53 am, Kendall Nelson wrote: >> > > > Hello All! >> > > > >> > > > First of all, I want to say a huge thanks to Guilherme >> > > Steinmuller for >> > > > all his help ensuring that OpenStack Magnum remains Kubernetes >> > > Certified >> > > > [1]! We are certified for v1.24! >> > > > >> > > Wow great work Guilherme Steinmuller! >> > > >> > > - Jake >> > > >> > -------------- next part -------------- >> > An HTML attachment was scrubbed... >> > URL: < >> > >> https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230505/9afb1d74/attachment.htm >> > > >> > >> > ------------------------------ >> > >> > Subject: Digest Footer >> > >> > _______________________________________________ >> > openstack-discuss mailing list >> > openstack-discuss at lists.openstack.org >> > >> > >> > ------------------------------ >> > >> > End of openstack-discuss Digest, Vol 55, Issue 16 >> > ************************************************* >> > >> -------------- next part -------------- >> An HTML attachment was scrubbed... 
>> URL: < >> https://lists.openstack.org/pipermail/openstack-discuss/attachments/20230506/385b520c/attachment.htm >> > >> >> ------------------------------ >> >> Subject: Digest Footer >> >> _______________________________________________ >> openstack-discuss mailing list >> openstack-discuss at lists.openstack.org >> >> >> ------------------------------ >> >> End of openstack-discuss Digest, Vol 55, Issue 17 >> ************************************************* >> From eblock at nde.ag Sun May 7 08:31:27 2023 From: eblock at nde.ag (Eugen Block) Date: Sun, 07 May 2023 08:31:27 +0000 Subject: [Nova] Clean up dead vms entries from DB In-Reply-To: Message-ID: <20230507083127.Horde.Ez5ht-iKZQGLT7ShpiWaN8d@webmail.nde.ag> Hi, check out this thread: https://lists.openstack.org/pipermail/openstack-discuss/2023-February/032365.html I had instances in a pending state, Sylvain helped to get rid of those. Although it might not be the exact same solution for you it could help. Regards Eugen Zitat von Satish Patel : > Folks, > > I have a small environment where controllers and computers run on the same > nodes and yes that is a bad idea. But now what happened the machine got OOM > out and crashed and stuck somewhere in Zombie stat. I have deleted all vms > but it didn't work. Later I used the "virsh destroy" command to delete > those vms and everything recovered but my openstack hypervisor state show" > command still says you have 91 VMs hanging in DB. > > How do i clean up vms entered from nova DB which doesn't exist at all. > > I have tried the command "nova-manage db archive_deleted_rows" but it > didn't help. How does nova sync DB with current stat? > > # openstack hypervisor stats show > This command is deprecated. > +----------------------+--------+ > | Field | Value | > +----------------------+--------+ > | count | 3 | > | current_workload | 11 | > | disk_available_least | 15452 | > | free_disk_gb | 48375 | > | free_ram_mb | 192464 | > | local_gb | 50286 | > | local_gb_used | 1911 | > | memory_mb | 386570 | > | memory_mb_used | 194106 | > | running_vms | 91 | > | vcpus | 144 | > | vcpus_used | 184 | > +----------------------+--------+ > > > Technically I have only single VM > > # openstack server list --all > +--------------------------------------+---------+--------+---------------------------------+------------+----------+ > | ID | Name | Status | Networks > | Image | Flavor | > +--------------------------------------+---------+--------+---------------------------------+------------+----------+ > | b33fd79d-2b90-41dd-a070-29f92ce205e7 | foo1 | ACTIVE | 1=100.100.75.22, > 192.168.1.139 | ubuntu2204 | m1.small | > +--------------------------------------+---------+--------+---------------------------------+------------+----------+ From felix.huettner at mail.schwarz Mon May 8 06:14:35 2023 From: felix.huettner at mail.schwarz (=?utf-8?B?RmVsaXggSMO8dHRuZXI=?=) Date: Mon, 8 May 2023 06:14:35 +0000 Subject: [OPENSTACK][rabbitmq] using quorum queues In-Reply-To: References: Message-ID: Hi Nguyen, we are using quorum queues for one of our deployments. So fare we did not have any issue with them. They also seem to survive restarts without issues (however reply queues are still broken afterwards in a small amount of cases, but they are no quorum/mirrored queues anyway). So I would recommend them for everyone that creates a new cluster. -- Felix Huettner From: Nguy?n H?u Kh?i Sent: Saturday, May 6, 2023 4:29 AM To: OpenStack Discuss Subject: [OPENSTACK][rabbitmq] using quorum queues Hello guys. 
IS there any guy who uses the quorum queue for openstack? Could you give some feedback to compare with classic queue? Thank you. Nguyen Huu Khoi Diese E Mail enth?lt m?glicherweise vertrauliche Inhalte und ist nur f?r die Verwertung durch den vorgesehenen Empf?nger bestimmt. Sollten Sie nicht der vorgesehene Empf?nger sein, setzen Sie den Absender bitte unverz?glich in Kenntnis und l?schen diese E Mail. Hinweise zum Datenschutz finden Sie hier. This e-mail may contain confidential content and is intended only for the specified recipient/s. If you are not the intended recipient, please inform the sender immediately and delete this e-mail. Information on data protection can be found here. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Mon May 8 12:06:42 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Mon, 8 May 2023 13:06:42 +0100 Subject: [kolla-ansible][yoga] fluentd problem with elasticsearch after update In-Reply-To: <02C03BF9-E20D-4844-93B6-E472C1AD7E50@gmail.com> References: <2E93A6DB-2BA6-44A4-B1A0-97AA7205B791@gmail.com> <02C03BF9-E20D-4844-93B6-E472C1AD7E50@gmail.com> Message-ID: Hi, I cannot access the fluentd container, it crashes almost instantly after each restart. Is there a workaround to make the container start and wait until the commands are executed? Regards. Le jeu. 4 mai 2023 ? 14:12, Micha? Nasiadka a ?crit : > Hello, > > Kolla-Ansible is not supporting both opensearch and elasticsearch running > at the same time - so if you?re using cloudkitty - it?s better to stick for > Elasticsearch for now (CK does not support OpenSearch yet). > > I started working on the bug - will let you know in the bug report when a > fix will be merged and images published. > In the meantime you can try to uninstall the too-new elasticsearch gems > using td-agent-gem uninstall in your running container image. > > Best regards, > Michal > > On 4 May 2023, at 14:33, wodel youchi wrote: > > Hi, > > I'll try to open a bug for this. > > I am using elasticsearch also with Cloudkitty : > cloudkitty_storage_backend: "elasticsearch" instead of influxdb to get some > HA. > Will I still get the fluentd problem even if I migrate to Opensearch > leaving Cloudkitty with elasticsearch??? > > Regards. > > > > Virus-free.www.avast.com > > > Le jeu. 4 mai 2023 ? 07:42, Micha? Nasiadka a > ?crit : > >> Hello, >> >> That probably is a Kolla bug - can you please raise a bug in >> launchpad.net? >> The other alternative is to migrate to OpenSearch (we?ve back ported this >> functionality recently) - >> https://docs.openstack.org/kolla-ansible/yoga/reference/logging-and-monitoring/central-logging-guide-opensearch.html#migration >> >> Best regards, >> Michal >> >> On 3 May 2023, at 17:13, wodel youchi wrote: >> >> Hi, >> >> I have finished the update of my openstack platform with newer containers. >> >> While verifying I noticed that fluentd container keeps restarting. >> >> In the log file I am having this : >> >>> 2023-05-03 16:07:59 +0100 [error]: #0 config error >>> file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError >>> error="Using Elasticsearch client 8.7.1 is not compatible for your >>> Elasticsearch server. Please check your using elasticsearch gem version and >>> Elasticsearch server." 
>>> 2023-05-03 16:07:59 +0100 [error]: Worker 0 finished unexpectedly with >>> status 2 >>> 2023-05-03 16:07:59 +0100 [info]: Received graceful stop >>> >> >> Those are the images I am using : >> (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i elas >> >> 192.168.1.16:4000/openstack.kolla/centos-source-prometheus-elasticsearch-exporter >> yoga030523 b48f63ed0072 12 hours ago 539MB >> 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch >> yoga030523 3558611b0cf4 12 hours ago 1.2GB >> 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch-curator >> yoga030523 83a6b48339ea 12 hours ago 637MB >> >> (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i fluen >> 192.168.1.16:4000/openstack.kolla/centos-source-fluentd >> yoga030523 bf6596e139e2 12 hours ago 847MB >> >> Any ideas? >> >> Regards. >> >> >> >> Virus-free.www.avast.com >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raghavendra-uddhav.tilay at hpe.com Mon May 8 12:18:13 2023 From: raghavendra-uddhav.tilay at hpe.com (U T, Raghavendra) Date: Mon, 8 May 2023 12:18:13 +0000 Subject: [cinder][dev] Add support in driver - Active/Active High Availability Message-ID: Hi, We wish to add Active/Active High Availability to: 1] HPE 3par driver - cinder/cinder/volume/drivers/hpe/hpe_3par_common.py 2] Nimble driver - cinder/cinder/volume/drivers/hpe/nimble.py Checked documentation at https://docs.openstack.org/cinder/latest/contributor/high_availability.html https://docs.openstack.org/cinder/latest/contributor/high_availability.html#cinder-volume https://docs.openstack.org/cinder/latest/contributor/high_availability.html#enabling-active-active-on-drivers Summary of steps: 1] In driver code, set SUPPORTS_ACTIVE_ACTIVE = True 2] Split the method failover_host() into two methods: failover() and failover_completed() 3] In cinder.conf, specify cluster name in [DEFAULT] section cluster = 4] Configure atleast two nodes in HA and perform testing Is this sufficient or anything else required ? Note: For Nimble driver, replication feature is not yet added. So can the above step 2 be skipped? Appreciate any suggestions / pointers. Regards, Raghavendra Tilay. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnasiadka at gmail.com Mon May 8 13:59:45 2023 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Mon, 8 May 2023 15:59:45 +0200 Subject: [kolla-ansible][yoga] fluentd problem with elasticsearch after update In-Reply-To: References: <2E93A6DB-2BA6-44A4-B1A0-97AA7205B791@gmail.com> <02C03BF9-E20D-4844-93B6-E472C1AD7E50@gmail.com> Message-ID: <4756FA70-F6AE-4317-B444-CEEA54D29B93@gmail.com> Hello, Then probably the only solution is to rebuild the container image with this patch: https://review.opendev.org/c/openstack/kolla/+/882289 It will take a couple days before it?s merged upstream. Best regards, Michal > On 8 May 2023, at 14:06, wodel youchi wrote: > > Hi, > > I cannot access the fluentd container, it crashes almost instantly after each restart. > Is there a workaround to make the container start and wait until the commands are executed? > > Regards. > > Le jeu. 4 mai 2023 ? 14:12, Micha? Nasiadka > a ?crit : >> Hello, >> >> Kolla-Ansible is not supporting both opensearch and elasticsearch running at the same time - so if you?re using cloudkitty - it?s better to stick for Elasticsearch for now (CK does not support OpenSearch yet). 
>> >> I started working on the bug - will let you know in the bug report when a fix will be merged and images published. >> In the meantime you can try to uninstall the too-new elasticsearch gems using td-agent-gem uninstall in your running container image. >> >> Best regards, >> Michal >> >>> On 4 May 2023, at 14:33, wodel youchi > wrote: >>> >>> Hi, >>> >>> I'll try to open a bug for this. >>> >>> I am using elasticsearch also with Cloudkitty : cloudkitty_storage_backend: "elasticsearch" instead of influxdb to get some HA. >>> Will I still get the fluentd problem even if I migrate to Opensearch leaving Cloudkitty with elasticsearch??? >>> >>> Regards. >>> >>> Virus-free.www.avast.com >>> Le jeu. 4 mai 2023 ? 07:42, Micha? Nasiadka > a ?crit : >>>> Hello, >>>> >>>> That probably is a Kolla bug - can you please raise a bug in launchpad.net ? >>>> The other alternative is to migrate to OpenSearch (we?ve back ported this functionality recently) - https://docs.openstack.org/kolla-ansible/yoga/reference/logging-and-monitoring/central-logging-guide-opensearch.html#migration >>>> >>>> Best regards, >>>> Michal >>>> >>>>> On 3 May 2023, at 17:13, wodel youchi > wrote: >>>>> >>>>> Hi, >>>>> >>>>> I have finished the update of my openstack platform with newer containers. >>>>> >>>>> While verifying I noticed that fluentd container keeps restarting. >>>>> >>>>> In the log file I am having this : >>>>>> 2023-05-03 16:07:59 +0100 [error]: #0 config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Using Elasticsearch client 8.7.1 is not compatible for your Elasticsearch server. Please check your using elasticsearch gem version and Elasticsearch server." >>>>>> 2023-05-03 16:07:59 +0100 [error]: Worker 0 finished unexpectedly with status 2 >>>>>> 2023-05-03 16:07:59 +0100 [info]: Received graceful stop >>>>> >>>>> Those are the images I am using : >>>>> (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i elas >>>>> 192.168.1.16:4000/openstack.kolla/centos-source-prometheus-elasticsearch-exporter yoga030523 b48f63ed0072 12 hours ago 539MB >>>>> 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch yoga030523 3558611b0cf4 12 hours ago 1.2GB >>>>> 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch-curator yoga030523 83a6b48339ea 12 hours ago 637MB >>>>> >>>>> (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i fluen >>>>> 192.168.1.16:4000/openstack.kolla/centos-source-fluentd yoga030523 bf6596e139e2 12 hours ago 847MB >>>>> >>>>> Any ideas? >>>>> >>>>> Regards. >>>>> >>>>> Virus-free.www.avast.com >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon May 8 14:46:44 2023 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 08 May 2023 16:46:44 +0200 Subject: [neutron][Secure RBAC] New policies enabled by default Message-ID: <2058629.NjohSO5q1h@p1> Hi, It's just a heads up that [1] was merged recently and Neutron is using new, secure RBAC policies by default now. If You would see any issues with that, please report bug(s) and let me know. [1] https://review.opendev.org/c/openstack/neutron/+/879827 -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. 
URL: From ralonsoh at redhat.com Mon May 8 15:05:59 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Mon, 8 May 2023 17:05:59 +0200 Subject: [neutron][Secure RBAC] New policies enabled by default In-Reply-To: <2058629.NjohSO5q1h@p1> References: <2058629.NjohSO5q1h@p1> Message-ID: Thank you Slawek! Good job. On Mon, May 8, 2023 at 4:47?PM Slawek Kaplonski wrote: > Hi, > > It's just a heads up that [1] was merged recently and Neutron is using > new, secure RBAC policies by default now. > > If You would see any issues with that, please report bug(s) and let me > know. > > [1] https://review.opendev.org/c/openstack/neutron/+/879827 > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Mon May 8 15:14:14 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Mon, 8 May 2023 15:14:14 +0000 Subject: [tc][all] Technical Committee Weekly Summary -- May 5, 2023 Message-ID: <16A86913-9344-483A-9A50-4EDA75CE4136@bu.edu> Hi all, Here?s another edition of ?What?s happening on the Technical Committee.? Meeting ======= The Technical Committee met on May 2, 2023 on the Zoom. A recording of the meeting can be found on Youtube at https://youtu.be/e-IUZ1ymi_A The next meeting will be held on May 9, 2023 at 18:00UTC. For more information visit https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting Happenings ========== Adding Python 3.8 to PTI and expectation of Python support ---------------------------------------------------------- Based on the TC discussion during the last weekly meeting, the TC proposed a change adding Python 3.8 as a PTI requirement for libraries[0] and a proposal attempting to clarify which Python versions we should aim to support in general[1]. [0]. https://review.opendev.org/c/openstack/governance/+/882165 [1]. https://review.opendev.org/c/openstack/governance/+/882154 Changes ======= ? Merged ? Align release naming terminology (documentation-change) | https://review.opendev.org/c/openstack/governance/+/881706 ? Appoint Jerry Zhou as Sahara PTL (formal-vote) | https://review.opendev.org/c/openstack/governance/+/881186 ? Retire puppet-rally - Step 3: Retire Repository (project-update) | https://review.opendev.org/c/openstack/governance/+/880018 ? New Open ? Appoint Dmitriy Rabotyagov as Vitrage PTL (formal-vote) | https://review.opendev.org/c/openstack/governance/+/882139 ? Appoint Hasan Acar as Monasca PTL (formal-vote) | https://review.opendev.org/c/openstack/governance/+/882374 ? Add py38 as a PTI requirement for libraries (formal-vote) | https://review.opendev.org/c/openstack/governance/+/882165 ? Clarify expectations on keeping Python versions (formal-vote) | https://review.opendev.org/c/openstack/governance/+/882154 ? Abandoned ? All Open ? https://review.opendev.org/q/project:openstack/governance+status:open How to contact the TC ===================== If you would like to discuss or give feedback to TC, you can reach out to us in multiple ways: 1. Email: you can send an email with the tag [tc] on the openstack-discuss mailing list. 2. Weekly meeting: The Technical Committee conduct a weekly meeting every Thursday 16:00 UTC 3. IRC: Ping us using the 'tc-members' keyword on the #openstack-tc IRC channel on OFTC. 
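As a concrete illustration of the Python 3.8 PTI item above: for a library that keeps 3.8 support, the floor is expressed in the project's packaging metadata and test environments. The following is only a minimal sketch with illustrative version lists, not taken from any particular repository:

    # setup.cfg (sketch)
    [metadata]
    classifiers =
        Programming Language :: Python :: 3
        Programming Language :: Python :: 3.8
        Programming Language :: Python :: 3.9
        Programming Language :: Python :: 3.10

    [options]
    # keeping the floor at 3.8 is what the proposed PTI change asks for
    python_requires = >=3.8

    # tox.ini (sketch)
    [tox]
    envlist = py38,py39,py310,pep8

Keeping python_requires at >=3.8, rather than bumping it to >=3.9, is what allows 2023.2 consumers still running Python 3.8 to install new releases of the library.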
From knikolla at bu.edu Mon May 8 15:18:42 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Mon, 8 May 2023 15:18:42 +0000 Subject: [tc] Technical Committee next weekly meeting on May 9, 2023 Message-ID: Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held on Tuesday, May 9, 2023 at 1800 UTC on #openstack-tc on OFTC IRC Please propose items to the agenda by editing the wiki page at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting At the end of today I will send out an email with the finalized agenda. Thank you, Kristi Nikolla From Arne.Wiebalck at cern.ch Mon May 8 15:45:00 2023 From: Arne.Wiebalck at cern.ch (Arne Wiebalck) Date: Mon, 8 May 2023 15:45:00 +0000 Subject: Stepping down as Ironic core Message-ID: Dear all, With a change in my role at work about a year ago, the time I have had available for upstream technical work has dropped significantly. I would therefore like to step down from my role as one of the Ironic cores. I am really grateful to have worked closely with this team of incredibly talented, helpful and kind people, and would like to use this occasion to thank them for all the support they have provided during these past years ... and of course for all the fun we had! Cheers, Arne -- Arne Wiebalck CERN IT From jay at gr-oss.io Mon May 8 15:58:11 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Mon, 8 May 2023 08:58:11 -0700 Subject: Stepping down as Ironic core In-Reply-To: References: Message-ID: Thank you Arne for your years of service to Ironic! I'm glad you remain a force for good in the larger community even if you don't have time for continued code contribution. As standard with former Ironic cores, we'll be happy to restore your access if you restart active technical contribution to the codebase. - Jay Faulkner Ironic PTL On Mon, May 8, 2023 at 8:53?AM Arne Wiebalck wrote: > Dear all, > > With a change in my role at work about a year ago, the time I have > had available for upstream technical work has dropped significantly. > I would therefore like to step down from my role as one of the Ironic > cores. > > I am really grateful to have worked closely with this team of incredibly > talented, helpful and kind people, and would like to use this occasion > to thank them for all the support they have provided during these past > years ... and of course for all the fun we had! > > Cheers, > Arne > > -- > Arne Wiebalck > CERN IT > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adivya1.singh at gmail.com Mon May 8 17:35:01 2023 From: adivya1.singh at gmail.com (Adivya Singh) Date: Mon, 8 May 2023 23:05:01 +0530 Subject: Fwd: (Openstack-Cinder) Error In-Reply-To: References: Message-ID: HI, Can you please guide me, There is this error which i am getting "fatal: [t1w_cinder_volumes_container-29d71abf]: FAILED! => {"msg": "The conditional check 'cinder_backend_lvm_inuse | bool' failed. 
The error was: error while evaluating conditional (cinder_backend_lvm_inuse | bool): {{ (cinder_backends|default(\"\")|to_json).find(\"cinder.volume.drivers.lvm.LVMVolumeDriver\") != -1 }}: {'lvm': {'iscsi_ip_address': '{{ 172.29.244.50 }}', 'volume_backend_name': 'LVM_iSCSI', 'volume_driver': 'cinder.volume.drivers.lvm.LVMVolumeDriver', 'volume_group': 'cloud'}}: float object has no element 244\n\nThe error appears to be in '/opt/openstack-ansible/playbooks/common-playbooks/cinder.yml': line 67, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: Configure container (cinder-volume) when lvm is in-use\n ^ here\n"}

I understand this error to some extent, but want to confirm with you whether anybody has seen a similar kind of error.

storage_hosts:
  s1w:
    ip: 172.29.236.50
    container_vars:
      #cinder_storage_availability_zone: nova
      #cinder_default_availability_zone: nova
      cinder_backends:
        lvm:
          volume_backend_name: LVM_iSCSI
          volume_driver: cinder.volume.drivers.lvm.LVMDriver
          volume_group: cloud-volume
          iscsi_ip_address: "{{ cinder_storage_address }}"
        limit_container_types: cinder_volume

Regards
Adivya Singh
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From amy at demarco.com  Mon May 8 19:35:49 2023
From: amy at demarco.com (Amy Marrich)
Date: Mon, 8 May 2023 14:35:49 -0500
Subject: [Diversity] Diversity and Inclusion WG Meeting reminder
Message-ID: 

This is a reminder that the Diversity and Inclusion WG will be meeting
tomorrow at 14:00 UTC in the #openinfra-diversity channel on OFTC. We
hope members of all OpenInfra projects join us as we review the
Foundation-wide diversity survey questions for release at Summit.

Thanks,

Amy (spotz)

From akanevsk at redhat.com  Mon May 8 22:08:49 2023
From: akanevsk at redhat.com (Arkady Kanevsky)
Date: Mon, 8 May 2023 17:08:49 -0500
Subject: Stepping down as Ironic core
In-Reply-To: 
References: 
Message-ID: 

Thank you Arne for all your work in Ironic.

On Mon, May 8, 2023 at 11:08 AM Jay Faulkner wrote:

> Thank you Arne for your years of service to Ironic! I'm glad you remain a
> force for good in the larger community even if you don't have time for
> continued code contribution.
>
> As standard with former Ironic cores, we'll be happy to restore your
> access if you restart active technical contribution to the codebase.
>
> -
> Jay Faulkner
> Ironic PTL
>
> On Mon, May 8, 2023 at 8:53 AM Arne Wiebalck wrote:
>
>> Dear all,
>>
>> With a change in my role at work about a year ago, the time I have
>> had available for upstream technical work has dropped significantly.
>> I would therefore like to step down from my role as one of the Ironic
>> cores.
>>
>> I am really grateful to have worked closely with this team of incredibly
>> talented, helpful and kind people, and would like to use this occasion
>> to thank them for all the support they have provided during these past
>> years ... and of course for all the fun we had!
>>
>> Cheers,
>> Arne
>>
>> --
>> Arne Wiebalck
>> CERN IT
>>
>

-- 
Arkady Kanevsky, Ph.D.
Phone: 972 707-6456
Corporate Phone: 919 729-5744 ext. 8176456
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From knikolla at bu.edu Tue May 9 00:13:25 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Tue, 9 May 2023 00:13:25 +0000 Subject: [tc] Technical Committee next weekly meeting on May 9, 2023 In-Reply-To: References: Message-ID: Please find below the agenda for tomorrow's meeting: * Roll call * Follow up on past action items ** noonedeadpunk write the words for "The Smith Plan(tm)" (the script of the movie about changing PTI and saving the world from the dangers of getting rid of py38) ** gmann to propose reverts for patches that bumped Python requires to 3.9 and add a voting job that depends on all those * Gate health check * Broken docs due to inconsistent release naming * Schedule of removing support for Python versions by libraries - how it should align with coordinated releases (tooz case) * Recurring tasks check ** Bare 'recheck' state *** https://etherpad.opendev.org/p/recheck-weekly-summary * Open Reviews ** https://review.opendev.org/q/projects:openstack/governance+is:open Thank you, Kristi On May 8, 2023, at 11:18 AM, Nikolla, Kristi wrote: Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held on Tuesday, May 9, 2023 at 1800 UTC on #openstack-tc on OFTC IRC Please propose items to the agenda by editing the wiki page at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting At the end of today I will send out an email with the finalized agenda. Thank you, Kristi Nikolla -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue May 9 03:44:48 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 08 May 2023 20:44:48 -0700 Subject: [neutron][Secure RBAC] New policies enabled by default In-Reply-To: <2058629.NjohSO5q1h@p1> References: <2058629.NjohSO5q1h@p1> Message-ID: <187fe9c979a.12697e078668954.331771289965446838@ghanshyammann.com> ---- On Mon, 08 May 2023 07:46:44 -0700 Slawek Kaplonski wrote --- > Hi, > > It's just a heads up that [1] was merged recently and Neutron is using new, secure RBAC policies by default now. > If You would see any issues with that, please report bug(s) and let me know. > > [1] https://review.opendev.org/c/openstack/neutron/+/879827 Thanks Slawek for doing this. I have one comment for testing of old and new defaults. As 'new defaults are enabled by default' is not yet released, let's continue testing the old defaults as default and continue with single job testing the new defaults. Once 879827 is released (after 2023.2 release), we can switch the testing. That is why devstack does not enable the new defaults as default configuration and after 882518 change, new defaults are not tested anywhere. I added more details about it in the revert of neutron-tempest-plugin change, please check. 
https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/882518 -gmann > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat > From gmann at ghanshyammann.com Tue May 9 03:57:46 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 08 May 2023 20:57:46 -0700 Subject: [all][tc][release] Hold the Python3.8 dropping In-Reply-To: <187b9db6cd1.d4ef9f41524517.2257044958377593573@ghanshyammann.com> References: <187b9db6cd1.d4ef9f41524517.2257044958377593573@ghanshyammann.com> Message-ID: <187fea8785f.c2854097669043.1420754679017443619@ghanshyammann.com> ---- On Tue, 25 Apr 2023 12:19:37 -0700 Ghanshyam Mann wrote --- > Hello Everyone, > > As you know the gate is breaking due to Python3.8 drop[1][2], which we discussed in the TC channel > and meeting[3]. TC is preparing the guidelines on not dropping Python3.8 in 2023.2 pti. > > Meanwhile, governance guidelines are prepared, this email is a heads-up to hold the > dropping of Python3.8 from your project/repo. Do not bump the min py version in > setup.cfg (do not bump version in python_requires ). Updates: * We reverted all the changes which dropped python .8 support[1], let me know/revert if any other projects have also merged the py3.8 dropped changes. * python 3.8 tests have been added to the general unit tests template[2]. You will be able to see the py3.8 unit test job running in the check/gate pipeline. * Governance changes to add Python 3.8 in PTI are under review[3][4] [1] https://review.opendev.org/q/topic:oslo-drop-py38+ [2] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/882175 [3] https://review.opendev.org/c/openstack/governance/+/882165 [4] https://review.opendev.org/c/openstack/governance/+/882154 -gmann > > @Release team, > It will be helpful if you can check/hold the release of any project/lib dropping the Python3.8. > > [1] https://lists.openstack.org/pipermail/openstack-discuss/2023-April/033450.html > [2] https://lists.openstack.org/pipermail/openstack-discuss/2023-April/033454.html > [3] https://meetings.opendev.org/meetings/tc/2023/tc.2023-04-25-18.00.log.html#l-191 > > -gmann > > From anbanerj at redhat.com Tue May 9 06:09:28 2023 From: anbanerj at redhat.com (Ananya Banerjee) Date: Tue, 9 May 2023 08:09:28 +0200 Subject: [gate][tripleo] gate blocker In-Reply-To: References: Message-ID: Hello, The Centos 8 jobs are now back to green. The standalone deployment failure is resolved. Thanks, Ananya On Fri, May 5, 2023 at 1:25?PM Ananya Banerjee wrote: > Hello, > > All Centos 8 jobs which deploy standalone are failing at the moment. > Please hold rechecks if you hit standalone deploy failure on Centos 8 jobs. > > We are working on the bug: https://bugs.launchpad.net/tripleo/+bug/2018588 > > Thanks, > Ananya > > -- > > Ananya Banerjee, RHCSA, RHCE-OSP > > Software Engineer > > Red Hat EMEA > > anbanerj at redhat.com > M: +491784949931 IM: frenzy_friday > @RedHat Red Hat > Red Hat > > > -- Ananya Banerjee, RHCSA, RHCE-OSP Software Engineer Red Hat EMEA anbanerj at redhat.com M: +491784949931 IM: frenzy_friday @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From skaplons at redhat.com  Tue May 9 07:05:40 2023
From: skaplons at redhat.com (Slawek Kaplonski)
Date: Tue, 09 May 2023 09:05:40 +0200
Subject: [neutron][Secure RBAC] New policies enabled by default
In-Reply-To: <187fe9c979a.12697e078668954.331771289965446838@ghanshyammann.com>
References: <2058629.NjohSO5q1h@p1>
 <187fe9c979a.12697e078668954.331771289965446838@ghanshyammann.com>
Message-ID: <1776516.Lrl0nTHhyt@p1>

Hi,

On Tuesday, 9 May 2023 at 05:44:48 CEST, Ghanshyam Mann wrote:
> ---- On Mon, 08 May 2023 07:46:44 -0700 Slawek Kaplonski wrote ---
>  > Hi,
>  >
>  > It's just a heads up that [1] was merged recently and Neutron is using new, secure RBAC policies by default now.
>  > If You would see any issues with that, please report bug(s) and let me know.
>  >
>  > [1] https://review.opendev.org/c/openstack/neutron/+/879827
>
> Thanks Slawek for doing this.
>
> I have one comment for testing of old and new defaults. As 'new defaults are enabled by default'
> is not yet released, let's continue testing the old defaults as default and continue with single
> job testing the new defaults. Once 879827 is released (after 2023.2 release), we can switch the
> testing. That is why devstack does not enable the new defaults as default configuration and after
> 882518 change, new defaults are not tested anywhere.
>
> I added more details about it in the revert of neutron-tempest-plugin change, please check.
>
> https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/882518
>
> -gmann
>
>  >
>  > --
>  > Slawek Kaplonski
>  > Principal Software Engineer
>  > Red Hat
>  >

Sorry, but I don't think I understand the reasons for this revert. IIRC, switching to the new
policies by default was part of phase 1 of the community goal and should have been finished
even in the 2023.1 cycle. We didn't make it then and we are catching up now (which was
discussed during the PTG and there weren't any objections).
As part of the switch I proposed the patch
https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/879828 which added the new job
"neutron-tempest-plugin-openvswitch-enforce-scope-old-defaults", and this new job is still
testing the old policies.
Why do You want us to wait with this switch and revert it now?

-- 
Slawek Kaplonski
Principal Software Engineer
Red Hat
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: This is a digitally signed message part.
URL: 
From robert.winbladh at atea.se  Tue May 9 07:37:36 2023
From: robert.winbladh at atea.se (Robert Winbladh)
Date: Tue, 9 May 2023 07:37:36 +0000
Subject: Using Veeam
Message-ID: 

Hi,

Not sure I am in the right place, but thought I would give it a go.

We have an S3 service using IBM Spectrum Scale that builds on/uses OpenStack Swift.
The problem we are having is that if the client/user is using Veeam, the bucket gets locked.
IBM explains that it has something to do with outstanding requests/timeouts. The bucket gets
locked, and the client/user/Veeam cannot access it any longer. This happens all the time.

So my questions are:

  *   What can I do about it? Anything that can be tuned in Swift?
  *   IBM tells me to spread the load across multiple buckets - but Veeam does not have support for that (as far as I understand)
  *   Any kind of tuning that needs to be done in Veeam (if anyone would know)
  *   We have tried lowering "concurrent tasks" to 1, but still the bucket gets locked.
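For the Swift side of the questions above, the settings most often involved in timeout-related failures live in the proxy server configuration. Whether IBM Spectrum Scale's object layer exposes a Swift-style proxy-server.conf, and whether raising these values actually helps with the Veeam bucket-lock symptom, would need to be confirmed with IBM; the snippet below is a sketch with illustrative values only:

    # proxy-server.conf (sketch, values are examples)
    [DEFAULT]
    # number of proxy worker processes; raise if the proxy is saturated
    workers = 8

    [app:proxy-server]
    use = egg:swift#proxy
    # how long the proxy waits for a storage node to respond (default 10s)
    node_timeout = 60
    # how long the proxy waits when establishing a backend connection
    conn_timeout = 5
    # how long the proxy waits on an idle client connection (default 60s)
    client_timeout = 120

These options exist in stock Swift; anything Veeam-specific (task slots, block size, immutability settings) is outside Swift's control and is better raised with Veeam or IBM support.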
[cid:image001.png at 01D98259.E2C0B2C0] Robert Winbladh Seniorkonsult Technical Operations Manager Incident Manager Problem Manager Change Control Board member Asigra Certified Engineer IBM Spectrum Protect -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 3801 bytes Desc: image001.png URL: From elfosardo at gmail.com Tue May 9 08:31:48 2023 From: elfosardo at gmail.com (Riccardo Pittau) Date: Tue, 9 May 2023 10:31:48 +0200 Subject: Stepping down as Ironic core In-Reply-To: References: Message-ID: Thank you Arne for your work and contributions It's been a pleasure and an honor working with you Riccardo On Tue, May 9, 2023 at 12:16?AM Arkady Kanevsky wrote: > Thank you Arne for all your work in Ironic. > > On Mon, May 8, 2023 at 11:08?AM Jay Faulkner wrote: > >> Thank you Arne for your years of service to Ironic! I'm glad you remain a >> force for good in the larger community even if you don't have time for >> continued code contribution. >> >> As standard with former Ironic cores, we'll be happy to restore your >> access if you restart active technical contribution to the codebase. >> >> - >> Jay Faulkner >> Ironic PTL >> >> On Mon, May 8, 2023 at 8:53?AM Arne Wiebalck >> wrote: >> >>> Dear all, >>> >>> With a change in my role at work about a year ago, the time I have >>> had available for upstream technical work has dropped significantly. >>> I would therefore like to step down from my role as one of the Ironic >>> cores. >>> >>> I am really grateful to have worked closely with this team of incredibly >>> talented, helpful and kind people, and would like to use this occasion >>> to thank them for all the support they have provided during these past >>> years ... and of course for all the fun we had! >>> >>> Cheers, >>> Arne >>> >>> -- >>> Arne Wiebalck >>> CERN IT >>> >> > > -- > Arkady Kanevsky, Ph.D. > Phone: 972 707-6456 > Corporate Phone: 919 729-5744 ext. 8176456 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanguangyu2 at gmail.com Tue May 9 08:42:24 2023 From: hanguangyu2 at gmail.com (=?UTF-8?B?6Z+p5YWJ5a6H?=) Date: Tue, 9 May 2023 08:42:24 +0000 Subject: [neutron] When num of port mappings is substantial, the response time of List API so slow Message-ID: Hello? I have a Victoria cluster with 1 controller node(also have nova-compute service) and two compute node. ALL node boasts 128 CPU cores and 512GB of memory. I established 1500 port mappings for a single floating IP A. At this point, the response time for the "List Floating IP Port Forwardings" API[1] becomes remarkably slow, approaching a duration of 9 minutes. Does anyone comprehend the cause of this phenomenon or how should I investigate it? I am immensely grateful for any assistance provided. Best regards, Han Guangyu [1] https://docs.openstack.org/api-ref/network/v2/index.html#list-floating-ip-port-forwardings From ralonsoh at redhat.com Tue May 9 09:08:12 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 9 May 2023 11:08:12 +0200 Subject: [neutron] Bug deputy May 01 - May 07 Message-ID: Hello Neutrinos: This is the bug report from last week. High: * https://bugs.launchpad.net/neutron/+bug/2018727: [SRBAC] API policies for get_policy_*_rule are wrong. Assigned to Slawek. Medium: * https://bugs.launchpad.net/neutron/+bug/2018009: ML2 context not considering tags at network creation and update. Assigned to Miro. 
* https://bugs.launchpad.net/neutron/+bug/2018289: OVN trunk subport missing bounding information. Assigned to Arnau. * https://bugs.launchpad.net/neutron/+bug/2018529: [OVN] Virtual ports cannot be used as VM ports. Assigned to Rodolfo. * https://bugs.launchpad.net/neutron/+bug/2018585: [SRBAC]New policies change the behavior for check rule type. Assigned to Rodolfo. * https://bugs.launchpad.net/neutron/+bug/2018599: Disable config option use_random_fully does not work. *Not assigned*. * https://bugs.launchpad.net/neutron/+bug/2018737: neutron-dynamic-routing announces routes for disabled routers. Assigned to Felix. Low: * https://bugs.launchpad.net/neutron/+bug/2018474: Unnecessary agent resource version. *Not assigned*. Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Tue May 9 09:25:43 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 9 May 2023 11:25:43 +0200 Subject: [neutron] When num of port mappings is substantial, the response time of List API so slow In-Reply-To: References: Message-ID: Hello Han: Please open a Launchpad bug (https://bugs.launchpad.net/neutron/), explaining the issue you have, the steps to reproduce it and the version used. Any other relevant detail will be appreciated. Regards. On Tue, May 9, 2023 at 10:43?AM ??? wrote: > Hello? > > I have a Victoria cluster with 1 controller node(also have > nova-compute service) and two compute node. ALL node boasts 128 CPU > cores and 512GB of memory. > > I established 1500 port mappings for a single floating IP A. At this > point, the response time for the "List Floating IP Port Forwardings" > API[1] becomes remarkably slow, approaching a duration of 9 minutes. > > > Does anyone comprehend the cause of this phenomenon or how should I > investigate it? > > I am immensely grateful for any assistance provided. > > Best regards, > Han Guangyu > > [1] > https://docs.openstack.org/api-ref/network/v2/index.html#list-floating-ip-port-forwardings > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue May 9 09:36:25 2023 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 09 May 2023 11:36:25 +0200 Subject: [neutron][Secure RBAC] New policies enabled by default In-Reply-To: <1776516.Lrl0nTHhyt@p1> References: <2058629.NjohSO5q1h@p1> <187fe9c979a.12697e078668954.331771289965446838@ghanshyammann.com> <1776516.Lrl0nTHhyt@p1> Message-ID: <1752701.khT16t2VZX@p1> Hi, Dnia wtorek, 9 maja 2023 09:05:40 CEST Slawek Kaplonski pisze: > Hi, > > Dnia wtorek, 9 maja 2023 05:44:48 CEST Ghanshyam Mann pisze: > > ---- On Mon, 08 May 2023 07:46:44 -0700 Slawek Kaplonski wrote --- > > > Hi, > > > > > > It's just a heads up that [1] was merged recently and Neutron is using new, secure RBAC policies by default now. > > > If You would see any issues with that, please report bug(s) and let me know. > > > > > > [1] https://review.opendev.org/c/openstack/neutron/+/879827 > > > > Thanks Slawek for doing this. > > > > I have one comment for testing of old and new defaults. As 'new defaults are enabled by default' > > is not yet released, let's continue testing the old defaults as default and continue with single > > job testing the new defaults. Once 879827 is released (after 2023.2 release), we can switch the > > testing. That is why devstack does not enable the new defaults as default configuration and after > > 882518 change, new defaults are not tested anywhere. 
> > > > I added more details about it in the revert of neutron-tempest-plugin change, please check. > > > > https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/882518 > > > > -gmann > > > > > > > > -- > > > Slawek Kaplonski > > > Principal Software Engineer > > > Red Hat > > > > > > > > > Sorry but I don't think I understand reasons of this revert. IIRC switch policies to new ones by default was part of the phase 1 of the community goal and should be finished even in 2023.1 cycle. We didn't made it then and we catch up now (what was discussed during PTG and there wasn't any objections). > As part of the switch I proposed patch https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/879828 which proposed new job "neutron-tempest-plugin-openvswitch-enforce-scope-old-defaults" and this new job is testing old policies still. > Why do You want us to wait with this switch and revert it now? Please ignore my previous email. I wrote it before first coffee :) Now I understand what the issue was. It's just with testing as devstack is for now always setting "enforce_new_defaults=False" and we didn't had any job which would test new policies after switch. > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From andrej.moravcik at aliter.com Tue May 9 11:46:03 2023 From: andrej.moravcik at aliter.com (=?utf-8?B?QW5kcmVqIE1vcmF2xI3DrWs=?=) Date: Tue, 9 May 2023 11:46:03 +0000 Subject: [kolla-ansible][xena][vmware][neutron] Message-ID: Hello community, I am evaluating OpenStack release Xena with VMware vSphere deployed using Kolla-ansible. I would really appreciate if someone with similar environment can confirm my findings or share his/her experience. I did some research and choose Xena, because it is still maintained release and vmware-nsx latest stable branch is xena. I have built customised Docker images with additional Neutron plugins from opendev.org/x/vmware-nsx, then I deployed single controller using all-in-one Ansible inventory with customised globals.yml according https://docs.openstack.org/kolla-ansible/xena/reference/compute/vmware-guide.html#vmware-nsx-dvs 1) Neutron creates flat or VLAN type provider networks and add related Distributed Virtual Port Group (PG) to vSphere Distributed Switch (vDS or DVS). PG is automatically created with naming convention ??-?", which may not be suitable for same use cases, eg. PG was created by Cisco ACI ahead of Neutron and is part of ACI EPG. Question: is this PG naming convention fixed by design or is it configurable? If I want to use existing vDS PG with Neutron how this can be achieved? 2) Neutron metadata and DHCP agents are not accessible to Nova instance attached to DVS PG. I can deploy simple instance from Cirros image, but I cannot provision DHCP or fixed IP address. Provisioning via config drive doesn?t work either, I filled bug report to Nova ? https://bugs.launchpad.net/nova/+bug/2018973 The documentation says ? "For VMware DVS, the Neutron DHCP agent does not attaches to Open vSwitch inside VMware environment, but attach to the Open vSwitch bridge called br-dvs on the OpenStack side and replies to/receives DHCP packets through VLAN. Similar to what the DHCP agent does, Neutron metadata agent attaches to br-dvs bridge and works through VLAN.? 
Question: Is it expected that Kolla-ansible creates working configurations for Nova and Neutron, if all related options in globals.yml are set up? Or further adjustment is needed on Openstack or vCenter side? Thank you. ? Andrej Moravcik ________________________________ T?to spr?va a jej pr?lohy s? d?vern?. Pros?m, nar?bajte s t?mito inform?ciami ako s d?vern?mi a nezverej?ujte, nekop?rujte ani neposielajte tieto inform?cie bez predch?dzaj?ceho s?hlasu spolo?nosti Aliter Technologies, a.s. Ak nie ste adres?tom indikovan?m v z?hlav? tejto spr?vy alebo ?elan? pr?jemca, upovedomte pros?m spolo?nos? Aliter Technologies, a.s. odpove?ou na tento e-mail a p?vodn? spr?vu vyma?te zo svojho syst?mu. Aliter Technologies - registrovan? ochrann? zn?mka pre EU, USA This message and any attachments are confidential. Please treat the information as confidential, and do not disclose, copy or deliver this message to anyone without Aliter Technologies' approval. If you are not the addressee indicated in this message or an intended recipient please notify Aliter Technologies a.s. by return e-mail and delete the message from your system. Aliter Technologies - Registered in WIPO & U.S. Patent and Trademark Office -------------- next part -------------- An HTML attachment was scrubbed... URL: From mgheorghe at cloudbasesolutions.com Tue May 9 13:17:27 2023 From: mgheorghe at cloudbasesolutions.com (Mihai Gheorghe) Date: Tue, 9 May 2023 13:17:27 +0000 Subject: [Octavia] importlib fails to read first 5 chars from file Message-ID: Hello, We are experiencing a strange problem with Octavia API in an Openstack deployment. A bit of context: we have an Openstack Ussuri deployment (HA) on Ubuntu Focal. Most Openstack services are running on LXD containers on 3 baremetals. There are also two OVN-dedicated-chassis which act as gateways and nova-compute nodes. A couple of days ago we updated all the packages in the environment to the latest versions (baremetals and lxc containers). After the updates, we started seeing problems with the Octavia API. The API would randomly give an internal server error. Octavia API is running under Apache2 WSGI on 3 LXD containers on 3 separate nodes. Looking into one of the Octavia container under /var/log/apache2/octavia_error.log, we found the following error: [Fri Apr 28 10:11:07.509268 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] mod_wsgi (pid=1312420): Failed to exec Python script file '/usr/bin/octavia-wsgi'. [Fri Apr 28 10:11:07.509455 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] mod_wsgi (pid=1312420): Exception occurred processing WSGI script '/usr/bin/octavia-wsgi'. 
[Fri Apr 28 10:11:07.511688 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] Traceback (most recent call last): [Fri Apr 28 10:11:07.511884 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "/usr/bin/octavia-wsgi", line 52, in [Fri Apr 28 10:11:07.512260 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] application = setup_app() [Fri Apr 28 10:11:07.512343 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "/usr/lib/python3/dist-packages/octavia/api/app.py", line 59, in setup_app [Fri Apr 28 10:11:07.512435 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] pecan_config = get_pecan_config() [Fri Apr 28 10:11:07.512513 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "/usr/lib/python3/dist-packages/octavia/api/app.py", line 40, in get_pecan_config [Fri Apr 28 10:11:07.512529 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] return pecan_configuration.conf_from_file(filename) [Fri Apr 28 10:11:07.512543 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "/usr/lib/python3/dist-packages/pecan/configuration.py", line 176, in conf_from_file [Fri Apr 28 10:11:07.512553 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] SourceFileLoader(module_name, abspath).load_module(module_name) [Fri Apr 28 10:11:07.512567 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 522, in _check_name_wrapper [Fri Apr 28 10:11:07.512705 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 1027, in load_module [Fri Apr 28 10:11:07.512725 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 852, in load_module [Fri Apr 28 10:11:07.512737 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 265, in _load_module_shim [Fri Apr 28 10:11:07.512750 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 702, in _load [Fri Apr 28 10:11:07.512763 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 671, in _load_unlocked [Fri Apr 28 10:11:07.512775 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 844, in exec_module [Fri Apr 28 10:11:07.512788 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 981, in get_code [Fri Apr 28 10:11:07.512800 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 911, in source_to_code [Fri Apr 28 10:11:07.512839 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "", line 219, in _call_with_frames_removed [Fri Apr 28 10:11:07.512869 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] File "/usr/lib/python3/dist-packages/octavia/api/config.py", line 1 [Fri Apr 28 10:11:07.512893 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] Copyright 2014 Rackspace [Fri Apr 28 10:11:07.512921 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] ^ [Fri Apr 28 10:11:07.512938 2023] [wsgi:error] [pid 1312420:tid 140012169602816] [remote 127.0.0.1:59914] IndentationError: unexpected indent The original /usr/lib/python3/dist-packages/octavia/api/config.py file is added to the bug report. 
As seen in the file, the first line starts with a comment, but python is trying to run it as code. Removing the first 5 charactersin the file so that string above starts right after the comment yelds the following traceback: [Wed Apr 26 17:08:22.609980 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] mod_wsgi (pid=330647): Failed to exec Python script file '/usr/bin/octavia-wsgi'. [Wed Apr 26 17:08:22.610114 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] mod_wsgi (pid=330647): Exception occurred processing WSGI script '/usr/bin/octavia-wsgi'. [Wed Apr 26 17:08:22.611039 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] Traceback (most recent call last): [Wed Apr 26 17:08:22.611097 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] File "/usr/bin/octavia-wsgi", line 52, in [Wed Apr 26 17:08:22.611106 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] application = setup_app() [Wed Apr 26 17:08:22.611125 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] File "/usr/lib/python3/dist-packages/octavia/api/app.py", line 59, in setup_app [Wed Apr 26 17:08:22.611135 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] pecan_config = get_pecan_config() [Wed Apr 26 17:08:22.611152 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] File "/usr/lib/python3/dist-packages/octavia/api/app.py", line 40, in get_pecan_config [Wed Apr 26 17:08:22.611160 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] return pecan_configuration.conf_from_file(filename) [Wed Apr 26 17:08:22.611178 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] File "/usr/lib/python3/dist-packages/pecan/configuration.py", line 168, in conf_from_file [Wed Apr 26 17:08:22.611327 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] compiled = compile(f.read(), abspath, 'exec') [Wed Apr 26 17:08:22.611355 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] File "/usr/lib/python3/dist-packages/octavia/api/config.py", line 1 [Wed Apr 26 17:08:22.611381 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] right 2014 Rackspace [Wed Apr 26 17:08:22.611419 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] ^ [Wed Apr 26 17:08:22.611437 2023] [wsgi:error] [pid 330647:tid 140629468657408] [remote 127.0.0.1:55142] SyntaxError: invalid syntax It seems that when the source file is read by python, the first 5 characters are skipped for some reason. Digging deeper, I see no reason for importlib to trim those characters. Internally it uses _io.open_code().read() which should return the entire file without trimming anything If adding the comment after the 5th char on the first line, the file is read as is should but the error moves to a different file: [Fri Apr 28 10:20:24.730194 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] mod_wsgi (pid=1316806): Failed to exec Python script file '/usr/bin/octavia-wsgi'. [Fri Apr 28 10:20:24.730379 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] mod_wsgi (pid=1316806): Exception occurred processing WSGI script '/usr/bin/octavia-wsgi'. 
[Fri Apr 28 10:20:24.732137 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] Traceback (most recent call last): [Fri Apr 28 10:20:24.732294 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/bin/octavia-wsgi", line 52, in [Fri Apr 28 10:20:24.732316 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] application = setup_app() [Fri Apr 28 10:20:24.732340 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/lib/python3/dist-packages/octavia/api/app.py", line 62, in setup_app [Fri Apr 28 10:20:24.732353 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] return pecan_make_app( [Fri Apr 28 10:20:24.732381 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/lib/python3/dist-packages/pecan/__init__.py", line 86, in make_app [Fri Apr 28 10:20:24.732400 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] app = Pecan(root, **kw) [Fri Apr 28 10:20:24.732453 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/lib/python3/dist-packages/pecan/core.py", line 832, in __init__ [Fri Apr 28 10:20:24.732475 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] super(Pecan, self).__init__(*args, **kw) [Fri Apr 28 10:20:24.732516 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/lib/python3/dist-packages/pecan/core.py", line 240, in __init__ [Fri Apr 28 10:20:24.732536 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] root = self.__translate_root__(root) [Fri Apr 28 10:20:24.732582 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/lib/python3/dist-packages/pecan/core.py", line 275, in __translate_root__ [Fri Apr 28 10:20:24.732618 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] module = __import__(name, fromlist=fromlist) [Fri Apr 28 10:20:24.732645 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/lib/python3/dist-packages/octavia/api/root_controller.py", line 24, in [Fri Apr 28 10:20:24.732716 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] from octavia.api.v2 import controllers as v2_controller [Fri Apr 28 10:20:24.732760 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/lib/python3/dist-packages/octavia/api/v2/controllers/__init__.py", line 19, in [Fri Apr 28 10:20:24.732821 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] from octavia.api.v2.controllers import availability_zone_profiles [Fri Apr 28 10:20:24.732908 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] File "/usr/lib/python3/dist-packages/octavia/api/v2/controllers/availability_zone_profiles.py", line 1 [Fri Apr 28 10:20:24.732952 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] Copyright 2019 Verizon Media [Fri Apr 28 10:20:24.733024 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] ^ [Fri Apr 28 10:20:24.733107 2023] [wsgi:error] [pid 1316806:tid 140384699016960] [remote 127.0.0.1:41802] SyntaxError: invalid syntax As far as we can tell, this problem appears when importlib reads the file and hands it over incomplete (missing the first 5 chars). 
This issue only occurs on the Octavia containers and not on the other Openstack services. The other thing is that the fault is not consistent. It happens at Apache2 reload or restart but not everytime. Once in a few restarts. When the problem occurs, Octavia API is down. Env details: OS: Ubuntu 20.04 Kernel: 5.4.0-147-generic Openstack: Ussuri Octavia: ii octavia-api 6.2.2-0ubuntu1 all OpenStack Load Balancer as a Service - API frontend ii octavia-common 6.2.2-0ubuntu1 all OpenStack Load Balancer as a Service - Common files ii octavia-driver-agent 6.2.2-0ubuntu1 all OpenStack Load Balancer Service - Driver Agent ii octavia-health-manager 6.2.2-0ubuntu1 all OpenStack Load Balancer Service - Health manager ii octavia-housekeeping 6.2.2-0ubuntu1 all OpenStack Load Balancer Service - Housekeeping manager ii octavia-worker 6.2.2-0ubuntu1 all OpenStack Load Balancer Service - Worker ii python3-octavia 6.2.2-0ubuntu1 all OpenStack Load Balancer as a Service - Python libraries ii python3-octavia-lib 2.0.0-0ubuntu1 all Library to support Octavia provider drivers ii python3-ovn-octavia-provider 0.1.0-0ubuntu1 all OpenStack Octavia Integration with OVN - Python 3 library Python: Python 3.8.10 LXD: lxd 5.13-cea5ee2 24758 latest/stable canonical? - A bug was filled on launchpad for this issue: https://bugs.launchpad.net/ubuntu/+source/importlib/+bug/2018011 Thanks! Bug #2018011 ?python3 - importlib fails to read first 5 chars fr...? : Bugs : importlib package : Ubuntu A bit of context: we have an Openstack Ussuri deployment (HA) on Ubuntu Focal. Most Openstack services are running on LXD containers on 3 baremetals. There are also two OVN-dedicated-chassis which act as gateways and nova-compute nodes. A couple of days ago we updated all the packages in the environment to the latest versions (baremetals and lxc containers). After the updates, we started seeing problems with the Octavia API. The API would randomly give an internal server error. Octavia API is... bugs.launchpad.net -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanguangyu2 at gmail.com Tue May 9 15:03:28 2023 From: hanguangyu2 at gmail.com (=?UTF-8?B?6Z+p5YWJ5a6H?=) Date: Tue, 9 May 2023 23:03:28 +0800 Subject: [neutron] When num of port mappings is substantial, the response time of List API so slow In-Reply-To: References: Message-ID: Hi Rodolfo: Thank for your advice. I have opened a launchpad bug: https://bugs.launchpad.net/neutron/+bug/2019012 Regards. Han Rodolfo Alonso Hernandez ?2023?5?9??? 17:25??? > > Hello Han: > > Please open a Launchpad bug (https://bugs.launchpad.net/neutron/), explaining the issue you have, the steps to reproduce it and the version used. Any other relevant detail will be appreciated. > > Regards. > > On Tue, May 9, 2023 at 10:43?AM ??? wrote: >> >> Hello? >> >> I have a Victoria cluster with 1 controller node(also have >> nova-compute service) and two compute node. ALL node boasts 128 CPU >> cores and 512GB of memory. >> >> I established 1500 port mappings for a single floating IP A. At this >> point, the response time for the "List Floating IP Port Forwardings" >> API[1] becomes remarkably slow, approaching a duration of 9 minutes. >> >> >> Does anyone comprehend the cause of this phenomenon or how should I >> investigate it? >> >> I am immensely grateful for any assistance provided. 
>> >> Best regards, >> Han Guangyu >> >> [1] https://docs.openstack.org/api-ref/network/v2/index.html#list-floating-ip-port-forwardings >> From swogatpradhan22 at gmail.com Tue May 9 15:59:36 2023 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Tue, 9 May 2023 21:29:36 +0530 Subject: Unable to provision HCI node | Wallaby Message-ID: Hi, When i am running node provision command i am getting the following error: The full traceback is: File "/tmp/ansible_metalsmith_instances_payload_tmufs89r/ansible_metalsmith_instances_payload.zip/ansible/modules/metalsmith_instances.py", line 427, in main File "/tmp/ansible_metalsmith_instances_payload_tmufs89r/ansible_metalsmith_instances_payload.zip/ansible/modules/metalsmith_instances.py", line 340, in provision File "/usr/lib64/python3.6/concurrent/futures/thread.py", line 56, in run result = self.fn(*self.args, **self.kwargs) File "/tmp/ansible_metalsmith_instances_payload_tmufs89r/ansible_metalsmith_instances_payload.zip/ansible/modules/metalsmith_instances.py", line 372, in _provision_instance File "/usr/lib/python3.6/site-packages/metalsmith/_provisioner.py", line 488, in wait_for_provisioning raise exceptions.DeploymentFailed(str(exc)) "networks": [ { "network": "ctlplane", "vif": true }, { "network": "internal_api", "subnet": "internal_apis2_subnet" }, { "network": "tenant", "subnet": "tenants2_subnet" }, { "network": "storage", "subnet": "storages2_subnet" }, { "network": "storage_mgmt", "subnet": "storage_mgmts2_subnet" } ], "nics": [ { "network": "ctlplane" } ], "ssh_public_keys": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCb/KTQTouURPWFO0R1zGEIKXYUDQb4+pIgNuTJ7zA43jz9nyGt/pH1pnZAq71NmfL+sICFPf4uvhqFkMU7T0eC3TVuP87kZ8pAMh0D+NPqtRQJxK0WZO2h67dUKBAtW5WamxABhDh+XqmRPXE8Fl1VvzlKO/KZGUClk24BNGjt/nqho9FGskNY/vZLQV/gZzrKfHUYpcDxIg6NpnNE6bGdy4tyL4JwYcdTP8ovU1JhKMXSTjk8WrGz/OQB8a4Pgq0WWgIRWvwZDuUwFWiK6aV7gkjhMXwwtZ7jq9fQOs/hcleATX7Cq77ayJW8DusxQLynbMrh/dBs3Smgp1Ncd3rthpJz2ujBg7ymScCq4ya0W1RkbzZ6H8kVT0hppO93Ip8VFTMnPDoWONQFdAbAVhaamycVMxxbGatHVYGGXIMMBEL1MO4ncIW4f46vNffIhmoTvG20ncy2zba7hk9D4NMbqEpyR3NG1BnMHW0h9bYp2K9+jOWS9MUfakD8kNi8Ff0= stack at hkg2director.bdxworld.com ", 2023-05-09 22:18:08.565576 | 48d539a1-1679-0477-63a8-000000000018 | FATAL | Provision instances | localhost | error={ "changed": false, "invocation": { "module_args": { "api_timeout": null, "auth": null, "auth_type": null, "availability_zone": null, "ca_cert": null, "clean_up": false, "client_cert": null, "client_key": null, "concurrency": 1, "instances": [ { "config_drive": { "meta_data": { "instance-type": "DistributedComputeHCI" } }, "hostname": "dcn01-hci-2", "image": { "href": "file:///var/lib/ironic/images/overcloud-full.raw", "kernel": "file:///var/lib/ironic/images/overcloud-full.vmlinuz", "ramdisk": "file:///var/lib/ironic/images/overcloud-full.initrd" }, "name": "17107583-48da-43eb-9e38-63e250848d05", "network_config": { "template": "/home/stack/dcn01/hci_network_bond.j2" }, "networks": [ { "network": "ctlplane", "vif": true }, { "network": "internal_api", "subnet": "internal_apis2_subnet" }, { "network": "tenant", "subnet": "tenants2_subnet" }, { "network": "storage", "subnet": "storages2_subnet" }, { "network": "storage_mgmt", "subnet": "storage_mgmts2_subnet" } ], "nics": [ { "network": "ctlplane" } ], "ssh_public_keys": "ssh-rsa 
AAAAB3NzaC1yc2EAAAADAQABAAABgQCb/KTQTouURPWFO0R1zGEIKXYUDQb4+pIgNuTJ7zA43jz9nyGt/pH1pnZAq71NmfL+sICFPf4uvhqFkMU7T0eC3TVuP87kZ8pAMh0D+NPqtRQJxK0WZO2h67dUKBAtW5WamxABhDh+XqmRPXE8Fl1VvzlKO/KZGUClk24BNGjt/nqho9FGskNY/vZLQV/gZzrKfHUYpcDxIg6NpnNE6bGdy4tyL4JwYcdTP8ovU1JhKMXSTjk8WrGz/OQB8a4Pgq0WWgIRWvwZDuUwFWiK6aV7gkjhMXwwtZ7jq9fQOs/hcleATX7Cq77ayJW8DusxQLynbMrh/dBs3Smgp1Ncd3rthpJz2ujBg7ymScCq4ya0W1RkbzZ6H8kVT0hppO93Ip8VFTMnPDoWONQFdAbAVhaamycVMxxbGatHVYGGXIMMBEL1MO4ncIW4f46vNffIhmoTvG20ncy2zba7hk9D4NMbqEpyR3NG1BnMHW0h9bYp2K9+jOWS9MUfakD8kNi8Ff0= stack at hkg2director.bdxworld.com ", "user_name": "heat-admin" } ], "interface": "public", "log_level": "info", "region_name": null, "state": "present", "timeout": 3600, "validate_certs": null, "wait": true } }, "logging": "Created port dcn01-hci-2-ctlplane (UUID ce763b38-b047-40bc-be44-9c9da8bd2e30) for node singapore-HCI3 (UUID 17107583-48da-43eb-9e38-63e250848d05) with {'network_id': '1fad76a3-aa2a-4213-8c37-eb89629da523', 'name': 'dcn01-hci-2-ctlplane'}\nAttached port dcn01-hci-2-ctlplane (UUID ce763b38-b047-40bc-be44-9c9da8bd2e30) to node singapore-HCI3 (UUID 17107583-48da-43eb-9e38-63e250848d05)\nProvisioning started on node singapore-HCI3 (UUID 17107583-48da-43eb-9e38-63e250848d05)\n", "msg": "Node 17107583-48da-43eb-9e38-63e250848d05 reached failure state \"deploy failed\"; the last error is Deploy step deploy.prepare_instance_boot failed: Failed to install a bootloader when deploying node 17107583-48da-43eb-9e38-63e250848d05. Error: Installing GRUB2 boot loader to device /dev/sda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmp8_wa6g6q /bin/sh -c \"grub2-install /dev/sda\"\nExit code: 1\nStdout: ''\nStderr: 'grub2-install: error: this utility cannot be used for EFI platforms because it does not support UEFI Secure Boot.\\n'." I have already provisioned 2 other nodes using the same hardware, but the node provisioning is failing in this particular node. With regards, Swogat Pradhan -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyliu0592 at hotmail.com Tue May 9 16:19:28 2023 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 9 May 2023 16:19:28 +0000 Subject: [cinder] create a multiattach volume in two ways? Message-ID: Hi, [1] says "Currently you can create a multiattach volume in two ways." I see one way is with volume type. What's the another way? [1] https://docs.openstack.org/cinder/xena/admin/blockstorage-volume-multiattach.html Thanks! Tony From michal.arbet at ultimum.io Tue May 9 16:22:57 2023 From: michal.arbet at ultimum.io (Michal Arbet) Date: Tue, 9 May 2023 18:22:57 +0200 Subject: [heat][magnum][tacker] Future of SoftwareDeployment support In-Reply-To: References: <6396cae4-9afd-b823-525f-90690d8c29d7@ardc.edu.au> Message-ID: Hi, We would like to see this feature merged in magnum soon and we can also help... Are there some storyboard action tasks ? Did you mean these reviews ? https://review.opendev.org/q/owner:john%2540johngarbutt.com Thanks Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Po???? 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook st 26. 4. 2023 v 12:08 odes?latel Jake Yip napsal: > Hi, > > Yes, there are a couple of WIP reviews in magnum by John. > > Regards, > Jake > > On 18/4/2023 11:16 pm, Michal Arbet wrote: > > Hi, > > > > Two cycles ? I thought that most of work was implemented by vexxhost ? 
> > https://github.com/vexxhost/magnum-cluster-api > > Do you have reviews somewhere if you are working on ClusterAPI ? > > > > > > Thanks > > > > On 4/18/23 13:53, Jake Yip wrote: > >> HI Takashi, > >> > >> Sorry I missed replying. > >> > >> On 30/3/2023 1:46 pm, Takashi Kajinami wrote: > >> > >>> > >>> 1. Magnum > >>> It seems SoftwareDeployment is used by k8s_fedora_atomic_v1 driver > >>> but I'm not too sure whether > >>> this driver is still supported, because Fedora Atomic was EOLed a > >>> while ago, right ? > >>> > >> > >> No It's still in the main k8s_fedora_coreos_v1 driver. It basically is > >> how everything is set up, so we still depend on this greatly for now. > >> > >> We are also working on a ClusterAPI driver who will bypass heat > >> altogether. We hope to get it working within two cycles, then we can > >> remove k8s_fedora_coreos_v1 together, possibly within another two > cycles. > >> > >> Thanks! > >> > >> Regards > >> Jake > >> > > -- > > Michal Arbet > > Openstack Engineer > > > > Ultimum Technologies a.s. > > Na Po???? 1047/26, 11000 Praha 1 > > Czech Republic > > > > +420 604 228 897 > > michal.arbet at ultimum.io > > _https://ultimum.io_ > > > > LinkedIn | > > Twitter | Facebook > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Tue May 9 16:38:58 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 9 May 2023 22:08:58 +0530 Subject: [cinder] create a multiattach volume in two ways? In-Reply-To: References: Message-ID: Hi Tony, There used to be a way to create a multiattach volume by passing "multiattach" parameter in the volume create request but that was deprecated and recently removed. We only recommend creating multiattach volumes by using a multiattach volume type and that is the only way. Thanks for pointing this out, looks like we need fixing in our documentation. - Rajat Dhasmana On Tue, May 9, 2023 at 9:54?PM Tony Liu wrote: > Hi, > > [1] says "Currently you can create a multiattach volume in two ways." > I see one way is with volume type. What's the another way? > > [1] > https://docs.openstack.org/cinder/xena/admin/blockstorage-volume-multiattach.html > > Thanks! > Tony > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Tue May 9 16:38:58 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 9 May 2023 22:08:58 +0530 Subject: [cinder] create a multiattach volume in two ways? In-Reply-To: References: Message-ID: Hi Tony, There used to be a way to create a multiattach volume by passing "multiattach" parameter in the volume create request but that was deprecated and recently removed. We only recommend creating multiattach volumes by using a multiattach volume type and that is the only way. Thanks for pointing this out, looks like we need fixing in our documentation. - Rajat Dhasmana On Tue, May 9, 2023 at 9:54?PM Tony Liu wrote: > Hi, > > [1] says "Currently you can create a multiattach volume in two ways." > I see one way is with volume type. What's the another way? > > [1] > https://docs.openstack.org/cinder/xena/admin/blockstorage-volume-multiattach.html > > Thanks! > Tony > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Tue May 9 17:01:11 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 9 May 2023 22:31:11 +0530 Subject: [cinder] create a multiattach volume in two ways? 
In-Reply-To: References: Message-ID: Proposed the fix here https://review.opendev.org/c/openstack/cinder/+/882729 On Tue, May 9, 2023 at 10:08?PM Rajat Dhasmana wrote: > Hi Tony, > > There used to be a way to create a multiattach volume by passing > "multiattach" parameter in the volume create request but that was > deprecated and recently removed. > We only recommend creating multiattach volumes by using a multiattach > volume type and that is the only way. > > Thanks for pointing this out, looks like we need fixing in our > documentation. > > - > Rajat Dhasmana > > > On Tue, May 9, 2023 at 9:54?PM Tony Liu wrote: > >> Hi, >> >> [1] says "Currently you can create a multiattach volume in two ways." >> I see one way is with volume type. What's the another way? >> >> [1] >> https://docs.openstack.org/cinder/xena/admin/blockstorage-volume-multiattach.html >> >> Thanks! >> Tony >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Tue May 9 17:01:11 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 9 May 2023 22:31:11 +0530 Subject: [cinder] create a multiattach volume in two ways? In-Reply-To: References: Message-ID: Proposed the fix here https://review.opendev.org/c/openstack/cinder/+/882729 On Tue, May 9, 2023 at 10:08?PM Rajat Dhasmana wrote: > Hi Tony, > > There used to be a way to create a multiattach volume by passing > "multiattach" parameter in the volume create request but that was > deprecated and recently removed. > We only recommend creating multiattach volumes by using a multiattach > volume type and that is the only way. > > Thanks for pointing this out, looks like we need fixing in our > documentation. > > - > Rajat Dhasmana > > > On Tue, May 9, 2023 at 9:54?PM Tony Liu wrote: > >> Hi, >> >> [1] says "Currently you can create a multiattach volume in two ways." >> I see one way is with volume type. What's the another way? >> >> [1] >> https://docs.openstack.org/cinder/xena/admin/blockstorage-volume-multiattach.html >> >> Thanks! >> Tony >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue May 9 19:57:21 2023 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 9 May 2023 12:57:21 -0700 Subject: Unable to provision HCI node | Wallaby In-Reply-To: References: Message-ID: A few different observations: 1) It appears you're using a partition image. We recommend whole disk images 2) The Image you're deploying doesn't support grub-install being used in UEFI enabled state. Whole disk images should contain contents supporting UEFI boot in the form of /boot/EFI/ contents. 3) Newer/more recent Ironic can navigate turning the UEFI boot loader pointer into a UEFI NVRAM entry. I recommend Wallaby or newer. This specifically will need to be the deployment ramdisk. 
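A rough sketch, assuming the image in question is the /var/lib/ironic/images/overcloud-full.raw referenced in the failing deploy and that libguestfs-tools is available on the undercloud, of one way to check points 1 and 2:

    # A partition image has no partition table; a UEFI-capable whole-disk
    # image should show a GPT with an "EFI System" partition.
    fdisk -l /var/lib/ironic/images/overcloud-full.raw

    # List the EFI system partition contents without booting the image
    # (the /boot/efi mount point is the usual layout, assumed here).
    virt-ls -a /var/lib/ironic/images/overcloud-full.raw /boot/efi/EFI
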
-Julia On Tue, May 9, 2023 at 9:05?AM Swogat Pradhan wrote: > Hi, > When i am running node provision command i am getting the following error: > > The full traceback is: > File > "/tmp/ansible_metalsmith_instances_payload_tmufs89r/ansible_metalsmith_instances_payload.zip/ansible/modules/metalsmith_instances.py", > line 427, in main > File > "/tmp/ansible_metalsmith_instances_payload_tmufs89r/ansible_metalsmith_instances_payload.zip/ansible/modules/metalsmith_instances.py", > line 340, in provision > File "/usr/lib64/python3.6/concurrent/futures/thread.py", line 56, in run > result = self.fn(*self.args, **self.kwargs) > File > "/tmp/ansible_metalsmith_instances_payload_tmufs89r/ansible_metalsmith_instances_payload.zip/ansible/modules/metalsmith_instances.py", > line 372, in _provision_instance > File "/usr/lib/python3.6/site-packages/metalsmith/_provisioner.py", line > 488, in wait_for_provisioning > raise exceptions.DeploymentFailed(str(exc)) > "networks": [ > { > "network": "ctlplane", > "vif": true > }, > { > "network": "internal_api", > "subnet": "internal_apis2_subnet" > }, > { > "network": "tenant", > "subnet": "tenants2_subnet" > }, > { > "network": "storage", > "subnet": "storages2_subnet" > }, > { > "network": "storage_mgmt", > "subnet": "storage_mgmts2_subnet" > } > ], > "nics": [ > { > "network": "ctlplane" > } > ], > "ssh_public_keys": "ssh-rsa > AAAAB3NzaC1yc2EAAAADAQABAAABgQCb/KTQTouURPWFO0R1zGEIKXYUDQb4+pIgNuTJ7zA43jz9nyGt/pH1pnZAq71NmfL+sICFPf4uvhqFkMU7T0eC3TVuP87kZ8pAMh0D+NPqtRQJxK0WZO2h67dUKBAtW5WamxABhDh+XqmRPXE8Fl1VvzlKO/KZGUClk24BNGjt/nqho9FGskNY/vZLQV/gZzrKfHUYpcDxIg6NpnNE6bGdy4tyL4JwYcdTP8ovU1JhKMXSTjk8WrGz/OQB8a4Pgq0WWgIRWvwZDuUwFWiK6aV7gkjhMXwwtZ7jq9fQOs/hcleATX7Cq77ayJW8DusxQLynbMrh/dBs3Smgp1Ncd3rthpJz2ujBg7ymScCq4ya0W1RkbzZ6H8kVT0hppO93Ip8VFTMnPDoWONQFdAbAVhaamycVMxxbGatHVYGGXIMMBEL1MO4ncIW4f46vNffIhmoTvG20ncy2zba7hk9D4NMbqEpyR3NG1BnMHW0h9bYp2K9+jOWS9MUfakD8kNi8Ff0= > stack at hkg2director.bdxworld.com ", > 2023-05-09 22:18:08.565576 | > 48d539a1-1679-0477-63a8-000000000018 | FATAL | Provision instances | > localhost | error={ > "changed": false, > "invocation": { > "module_args": { > "api_timeout": null, > "auth": null, > "auth_type": null, > "availability_zone": null, > "ca_cert": null, > "clean_up": false, > "client_cert": null, > "client_key": null, > "concurrency": 1, > "instances": [ > { > "config_drive": { > "meta_data": { > "instance-type": "DistributedComputeHCI" > } > }, > "hostname": "dcn01-hci-2", > "image": { > "href": > "file:///var/lib/ironic/images/overcloud-full.raw", > "kernel": > "file:///var/lib/ironic/images/overcloud-full.vmlinuz", > "ramdisk": > "file:///var/lib/ironic/images/overcloud-full.initrd" > }, > "name": "17107583-48da-43eb-9e38-63e250848d05", > "network_config": { > "template": "/home/stack/dcn01/hci_network_bond.j2" > }, > "networks": [ > { > "network": "ctlplane", > "vif": true > }, > { > "network": "internal_api", > "subnet": "internal_apis2_subnet" > }, > { > "network": "tenant", > "subnet": "tenants2_subnet" > }, > { > "network": "storage", > "subnet": "storages2_subnet" > }, > { > "network": "storage_mgmt", > "subnet": "storage_mgmts2_subnet" > } > ], > "nics": [ > { > "network": "ctlplane" > } > ], > "ssh_public_keys": "ssh-rsa > 
AAAAB3NzaC1yc2EAAAADAQABAAABgQCb/KTQTouURPWFO0R1zGEIKXYUDQb4+pIgNuTJ7zA43jz9nyGt/pH1pnZAq71NmfL+sICFPf4uvhqFkMU7T0eC3TVuP87kZ8pAMh0D+NPqtRQJxK0WZO2h67dUKBAtW5WamxABhDh+XqmRPXE8Fl1VvzlKO/KZGUClk24BNGjt/nqho9FGskNY/vZLQV/gZzrKfHUYpcDxIg6NpnNE6bGdy4tyL4JwYcdTP8ovU1JhKMXSTjk8WrGz/OQB8a4Pgq0WWgIRWvwZDuUwFWiK6aV7gkjhMXwwtZ7jq9fQOs/hcleATX7Cq77ayJW8DusxQLynbMrh/dBs3Smgp1Ncd3rthpJz2ujBg7ymScCq4ya0W1RkbzZ6H8kVT0hppO93Ip8VFTMnPDoWONQFdAbAVhaamycVMxxbGatHVYGGXIMMBEL1MO4ncIW4f46vNffIhmoTvG20ncy2zba7hk9D4NMbqEpyR3NG1BnMHW0h9bYp2K9+jOWS9MUfakD8kNi8Ff0= > stack at hkg2director.bdxworld.com ", > "user_name": "heat-admin" > } > ], > "interface": "public", > "log_level": "info", > "region_name": null, > "state": "present", > "timeout": 3600, > "validate_certs": null, > "wait": true > } > }, > "logging": "Created port dcn01-hci-2-ctlplane (UUID > ce763b38-b047-40bc-be44-9c9da8bd2e30) for node singapore-HCI3 (UUID > 17107583-48da-43eb-9e38-63e250848d05) with {'network_id': > '1fad76a3-aa2a-4213-8c37-eb89629da523', 'name': > 'dcn01-hci-2-ctlplane'}\nAttached port dcn01-hci-2-ctlplane (UUID > ce763b38-b047-40bc-be44-9c9da8bd2e30) to node singapore-HCI3 (UUID > 17107583-48da-43eb-9e38-63e250848d05)\nProvisioning started on node > singapore-HCI3 (UUID 17107583-48da-43eb-9e38-63e250848d05)\n", > "msg": "Node 17107583-48da-43eb-9e38-63e250848d05 reached failure > state \"deploy failed\"; the last error is Deploy step > deploy.prepare_instance_boot failed: Failed to install a bootloader when > deploying node 17107583-48da-43eb-9e38-63e250848d05. Error: Installing > GRUB2 boot loader to device /dev/sda failed with Unexpected error while > running command.\nCommand: chroot /tmp/tmp8_wa6g6q /bin/sh -c > \"grub2-install /dev/sda\"\nExit code: 1\nStdout: ''\nStderr: > 'grub2-install: error: this utility cannot be used for EFI platforms > because it does not support UEFI Secure Boot.\\n'." > > I have already provisioned 2 other nodes using the same hardware, but the > node provisioning is failing in this particular node. > > With regards, > Swogat Pradhan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkajinam at redhat.com Wed May 10 03:11:12 2023 From: tkajinam at redhat.com (Takashi Kajinami) Date: Wed, 10 May 2023 12:11:12 +0900 Subject: [heat][magnum][tacker] Future of SoftwareDeployment support In-Reply-To: References: Message-ID: (replying to the base message because there are a few separate threads going). Thanks for these inputs and sorry for my late reply. So according to Jake Magnum still requires SoftwareDeployment/Config. I got one off-list reply from the user using the feature. Based on these feedbacks, we can keep the feature for now. The main challenge with keeping this feature is that we have to maintain a specific image with additional tools installed. We can try our best to keep these maintained but if anyone can help maintaining the job/tools, test the features and share any test feedback, that is much appreciated. On Thu, Mar 30, 2023 at 11:46?AM Takashi Kajinami wrote: > Hello, > > > We discussed this briefly in the past thread where we discussed > maintenance of os-*-agent repos, > and also talked about this topic during Heat PTG, but I'd like to > formalize the discussion to get > a clear agreement. > > Heat has been supporting SoftwareDeployment resources to configure > software in instances using > some agents such as os-collect-config[1]. 
> [1] > https://docs.openstack.org/heat/latest/template_guide/software_deployment.html#software-deployment-resources > > This feature was initially developed to be used by TripleO (IIUC), but > TripleO is retired now and > we are losing the first motivation to maintain the feature. > # Even TripleO replaced most of its usage of softwaredeployment by > config-download lately. > > Because the heat project team has drunk dramatically recently, we'd like > to put more focus on core > features. For that aim we are now wondering if we can deprecate and remove > this feature, and would > like to hear from anyone who has any concerns about this. > > Quickly looking through the repos, it seems currently Magnum and Tacker > are using SoftwareDeployment, > and it'd be nice especially if we can understand their current > requirements. > > 1. Magnum > It seems SoftwareDeployment is used by k8s_fedora_atomic_v1 driver but I'm > not too sure whether > this driver is still supported, because Fedora Atomic was EOLed a while > ago, right ? > > 2. Tacker > SoftwareDeployment can be found in only test code in the tacker repo. We > have some references kept > in heat-translator which look related to TOSCA templates. > > Thank you, > Takashi Kajinami > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkajinam at redhat.com Wed May 10 04:08:18 2023 From: tkajinam at redhat.com (Takashi Kajinami) Date: Wed, 10 May 2023 13:08:18 +0900 Subject: [heat][magnum][tacker] Future of SoftwareDeployment support In-Reply-To: References: Message-ID: Rabi pointed out one quite important point which I missed when I wrote the previous email. So I noticed we do not run any tests to evaluate SoftwareDeployment/Config in heat CI. We have a scenario test but it was disabled when we migrated our jobs to zuul v3(during Ussuri) and we have had no test coverage since then. https://github.com/openstack/heat/commit/c8d1a9f901aa7b956c055668532967fd34202fe4#diff-52a39eb793da5f54635afbdceeeddc16482afb9ab022d981ebea7743dc81dc8cR446 Since TripleO has removed usage of SoftwareConfig/Deployment a while ago and Magnum is also getting rid of it, we'd have no test coverage left. Unfortunately currently resources around the heat project are quite small, and honestly I'm not too sure if anyone can work on adding the test coverage soon. We have to create a customized image with some toolings, as I earlier mentioned, but that has been kept as TODO for long. So I'm looking for any volunteers to restore the test coverage in upstream CI so that we can test the feature, which is required for maintenance. If no one appears then we likely have to deprecate the feature before 2024.1, so that we can remove it in 2025.1, instead of keeping the feature untested and unmaintained. On Wed, May 10, 2023 at 12:11?PM Takashi Kajinami wrote: > (replying to the base message because there are a few separate threads > going). > > Thanks for these inputs and sorry for my late reply. > So according to Jake Magnum still requires SoftwareDeployment/Config. I > got one off-list reply > from the user using the feature. Based on these feedbacks, we can keep the > feature for now. > > The main challenge with keeping this feature is that we have to maintain a > specific image with > additional tools installed. We can try our best to keep these maintained > but if anyone can help > maintaining the job/tools, test the features and share any test feedback, > that is much appreciated. 
> > > On Thu, Mar 30, 2023 at 11:46?AM Takashi Kajinami > wrote: > >> Hello, >> >> >> We discussed this briefly in the past thread where we discussed >> maintenance of os-*-agent repos, >> and also talked about this topic during Heat PTG, but I'd like to >> formalize the discussion to get >> a clear agreement. >> >> Heat has been supporting SoftwareDeployment resources to configure >> software in instances using >> some agents such as os-collect-config[1]. >> [1] >> https://docs.openstack.org/heat/latest/template_guide/software_deployment.html#software-deployment-resources >> >> This feature was initially developed to be used by TripleO (IIUC), but >> TripleO is retired now and >> we are losing the first motivation to maintain the feature. >> # Even TripleO replaced most of its usage of softwaredeployment by >> config-download lately. >> >> Because the heat project team has drunk dramatically recently, we'd like >> to put more focus on core >> features. For that aim we are now wondering if we can deprecate and >> remove this feature, and would >> like to hear from anyone who has any concerns about this. >> >> Quickly looking through the repos, it seems currently Magnum and Tacker >> are using SoftwareDeployment, >> and it'd be nice especially if we can understand their current >> requirements. >> >> 1. Magnum >> It seems SoftwareDeployment is used by k8s_fedora_atomic_v1 driver but >> I'm not too sure whether >> this driver is still supported, because Fedora Atomic was EOLed a while >> ago, right ? >> >> 2. Tacker >> SoftwareDeployment can be found in only test code in the tacker repo. We >> have some references kept >> in heat-translator which look related to TOSCA templates. >> >> Thank you, >> Takashi Kajinami >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyliu0592 at hotmail.com Wed May 10 05:32:57 2023 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Wed, 10 May 2023 05:32:57 +0000 Subject: [cinder] create a multiattach volume in two ways? In-Reply-To: References: Message-ID: Thank you Rajat for clarification! Tony ________________________________________ From: Rajat Dhasmana Sent: May 9, 2023 10:01 AM To: Tony Liu Cc: openstack-discuss; openstack-dev at lists.openstack.org Subject: Re: [cinder] create a multiattach volume in two ways? Proposed the fix here https://review.opendev.org/c/openstack/cinder/+/882729 On Tue, May 9, 2023 at 10:08?PM Rajat Dhasmana > wrote: Hi Tony, There used to be a way to create a multiattach volume by passing "multiattach" parameter in the volume create request but that was deprecated and recently removed. We only recommend creating multiattach volumes by using a multiattach volume type and that is the only way. Thanks for pointing this out, looks like we need fixing in our documentation. - Rajat Dhasmana On Tue, May 9, 2023 at 9:54?PM Tony Liu > wrote: Hi, [1] says "Currently you can create a multiattach volume in two ways." I see one way is with volume type. What's the another way? [1] https://docs.openstack.org/cinder/xena/admin/blockstorage-volume-multiattach.html Thanks! Tony From senrique at redhat.com Wed May 10 10:23:02 2023 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 10 May 2023 11:23:02 +0100 Subject: Cinder Bug Report 2023-05-10 Message-ID: Hello Argonauts, Cinder Bug Meeting Etherpad *Medium* - [rbac] Volume type's extra_specs dict is empty. - *Status:* Unassigned. - cinder services report_state hangs for a longer period than report_interval when losing database connection. 
- *Status:* Unassigned. *Low* - HPE 3par - Unable to use small QoS Latency value (less than 1) - *Status:* Fix proposed to master . Cheers -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Wed May 10 17:15:45 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 10 May 2023 17:15:45 +0000 Subject: [OSSA-2023-003] cinder, glance_store, nova, os-brick: Unauthorized volume access through deleted volume attachments (CVE-2023-2088) Message-ID: <20230510171544.jj6tbd2owpq5pog3@yuggoth.org> ============================================================================ OSSA-2023-003: Unauthorized volume access through deleted volume attachments ============================================================================ :Date: May 10, 2023 :CVE: CVE-2023-2088 Affects ~~~~~~~ - Cinder: <20.2.1, >=21.0.0 <21.2.1, ==22.0.0 - Glance_store: <3.0.1, >=4.0.0 <4.1.1, >=4.2.0 <4.3.1 - Nova: <25.1.2, >=26.0.0 <26.1.2, ==27.0.0 - Os-brick: <5.2.3, >=6.0.0 <6.1.1, >=6.2.0 <6.2.2 Description ~~~~~~~~~~~ An unauthorized access to a volume could occur when an iSCSI or FC connection from a host is severed due to a volume being unmapped on the storage system and the device is later reused for another volume on the same host. **Scope:** Only deployments with iSCSI or FC volumes are affected. However, the fix for this issue includes a configuration change in Nova and Cinder that may impact you on your next upgrade regardless of what backend storage technology you use. See the *Configuration change* section below, and item 4(B) in the *Patches and Associated Deployment Changes* for details. This data leak can be triggered by two different situations. **Accidental case:** If there is a problem with network connectivity during a normal detach operation, OpenStack may fail to clean the situation up properly. Instead of force-detaching the compute node device, Nova ignores the error, assuming the instance has already been deleted. Due to this incomplete operation OpenStack may end up selecting the wrong multipath device when connecting another volume to an instance. **Intentional case:** A regular user can create an instance with a volume, and then delete the volume attachment directly in Cinder, which neglects to notify Nova. The compute node SCSI plumbing (over iSCSI/FC) will continue trying to connect to the original host/port/LUN, not knowing the attachment has been deleted. If a subsequent volume attachment re-uses the host/port/LUN for a different instance and volume, the original instance will gain access to it once the SCSI plumbing reconnects. Configuration Change -------------------- To prevent the intentional case, the Block Storage API provided by Cinder must only accept attachment delete requests from Nova for instance-attached volumes. A complicating factor is that Nova deletes an attachment by making a call to the Block Storage API on behalf of the user (that is, by passing the user's token), which makes the request indistinguishable from the user making this request directly. The solution is to have Nova include a service token along with the user's token so that Cinder can determine that the detach request is coming from Nova. The ability for Nova to pass a service token has been supported since Ocata, but has not been required until now. 
Thus, deployments that are not currently sending service user credentials from Nova will need to apply the relevant code changes and also make configuration changes to solve the problem. Patches and Associated Deployment Changes ----------------------------------------- Given the above analysis, a thorough fix must include the following elements: 1. The os-brick library must implement the ``force`` option for fibre channel, which which has only been available for iSCSI until now (covered by the linked patches). 2. Nova must call os-brick with the ``force`` option when disconnecting volumes from deleted instances (covered by the linked patches). 3. In deployments where Glance uses the cinder glance_store driver, glance must call os-brick with the ``force`` option when disconnecting volumes (covered by the linked patches). 4. Cinder must distinguish between safe and unsafe attachment delete requests and reject the unsafe ones. This part of the fix has two components: a. The Block Storage API will return a 409 (Conflict) for a request to delete an attachment if there is an instance currently using the attachment, **unless** the request is being made by a service (for example, Nova) on behalf of a user (covered by the linked patches). b. In order to recognize that a request is being made by a service on behalf of a user, Nova must be configured to send a service token along with the user token. If this configuration change is not made, the cinder change will reject **any** request to delete an attachment associated with a volume that is attached to an instance. Nova must be configured to send a service token to Cinder, and Cinder must be configured to accept service tokens. This is described in the following document and **IS NOT AUTOMATICALLY APPLIED BY THE LINKED PATCHES:** (Using service tokens to prevent long-running job failures) https://docs.openstack.org/cinder/latest/configuration/block-storage/service-token.html The Nova patch mentioned in step 2 includes a similar document more focused on Nova: doc/source/admin/configuration/service-user-token.rst 5. The cinder glance_store driver does not attach volumes to instances; instead, it attaches volumes directly to the Glance node. Thus, the Cinder change in step 4 will recognize an attachment-delete request coming from Glance as safe and allow it. (Of course, we expect that you will have applied the patches in steps 1 and 3 to your Glance nodes.) Errata ~~~~~~ An additional nova patch is required to fix a minor regression in periodic tasks and some nova-manage actions (errata 1). Also a patch to tempest is needed to account for behavior changes with fixes in place (errata 2). 
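As a quick sanity check before and after patching, the versions installed on a node can be compared against the affected ranges listed in the Affects section above; for a pip-based installation something like this is enough (a sketch only; distribution packages need the vendor's own tooling instead):

    # Show installed versions of the affected components on this node
    pip show nova cinder os-brick glance-store 2>/dev/null | grep -E '^(Name|Version):'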
Patches ~~~~~~~ - https://review.opendev.org/882836 (2023.1/antelope cinder) - https://review.opendev.org/882851 (2023.1/antelope glance_store) - https://review.opendev.org/882858 (2023.1/antelope nova) - https://review.opendev.org/882859 (2023.1/antelope nova errata 1) - https://review.opendev.org/882843 (2023.1/antelope os-brick) - https://review.opendev.org/882835 (2023.2/bobcat cinder) - https://review.opendev.org/882834 (2023.2/bobcat glance_store) - https://review.opendev.org/882847 (2023.2/bobcat nova) - https://review.opendev.org/882852 (2023.2/bobcat nova errata 1) - https://review.opendev.org/882840 (2023.2/bobcat os-brick) - https://review.opendev.org/882876 (2023.2/bobcat tempest errata 2) - https://review.opendev.org/882869 (Wallaby nova) - https://review.opendev.org/882870 (Wallaby nova errata 1) - https://review.opendev.org/882839 (Xena cinder) - https://review.opendev.org/882855 (Xena glance_store) - https://review.opendev.org/882867 (Xena nova) - https://review.opendev.org/882868 (Xena nova errata 1) - https://review.opendev.org/882848 (Xena os-brick) - https://review.opendev.org/882838 (Yoga cinder) - https://review.opendev.org/882854 (Yoga glance_store) - https://review.opendev.org/882863 (Yoga nova) - https://review.opendev.org/882864 (Yoga nova errata 1) - https://review.opendev.org/882846 (Yoga os-brick) - https://review.opendev.org/882837 (Zed cinder) - https://review.opendev.org/882853 (Zed glance_store) - https://review.opendev.org/882860 (Zed nova) - https://review.opendev.org/882861 (Zed nova errata 1) - https://review.opendev.org/882844 (Zed os-brick) Credits ~~~~~~~ - Jan Wasilewski from Atman (CVE-2023-2088) - Gorka Eguileor from Red Hat (CVE-2023-2088) References ~~~~~~~~~~ - https://launchpad.net/bugs/2004555 - http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-2088 Notes ~~~~~ - Limited Protection Against Accidents... If you are only concerned with protecting against the accidental case described earlier in this document, steps 1-3 above should be sufficient. Note, however, that only applying steps 1-3 leaves your cloud wide open to the intentional exploitation of this vulnerability. Therefore, we recommend that the full fix be applied to all deployments. - Using Configuration as a Short-Term Mitigation... An alternative approach to mitigation can be found in OSSN-0092 https://wiki.openstack.org/wiki/OSSN/OSSN-0092 - The stable/xena and stable/wallaby branches are under extended maintenance and will receive no new point releases, but patches for them are provided as a courtesy where available. OSSA History ~~~~~~~~~~~~ - 2023-05-10 - Errata 2 - 2023-05-10 - Errata 1 - 2023-05-10 - Original Version -- Jeremy Stanley OpenStack Vulnerability Management Team -------------- next part -------------- Using Configuration as a Short-Term Mitigation for OSSA-2023-003 --- ### Summary ### An unauthorized access to a volume could occur when an iSCSI or FC connection from a host is severed due to a volume being unmapped on the storage system and the device is later reused for another volume on the same host. 
### Affected Services / Software ### - cinder: <20.2.1, >=21.0.0 <21.2.1, ==22.0.0 - glance_store: <3.0.1, >=4.0.0 <4.1.1, >=4.2.0 <4.3.1 - nova: <25.1.2, >=26.0.0 <26.1.2, ==27.0.0 - os-brick: <5.2.3, >=6.0.0 <6.1.1, >=6.2.0 <6.2.2 ### Discussion ### It is recommended to apply the fixes provided in OSSA-2023-003 https://security.openstack.org/ossa/OSSA-2023-003.html but updating an OpenStack deployment may take a long time requiring a proper maintenance window and may even require a validation process of the release prior to the deployment, so operators may prefer applying tactical configuration changes to their cloud to prevent harmful actions while they go through their standarized process. In this case the fastest way to prevent an unsafe attach deletion is twofold: 1. ensuring that Nova uses a user with a service role to send its token on all the requests made to Cinder on behalf of users, and 2. Cinder protects the vulnerable APIs via policy. ### Recommended Actions ### If the deployment has Glance using Cinder as a backend, in order to use this alternative short-term mitigation, Glance must be configured to use a single user having the service role for all its requests to Cinder. If your deployment is *not* using a single user (that is, instead of all the image-volumes being stored in a single project, they are stored in each user's project), then you cannot use this Short-Term Mitigation strategy, but must instead apply the full change described in the previous section. Steps for Mitigation: A. Ensure the users used by Nova (and Glance, if applicable) have the service role * In Nova, this is the user configured in the [service_user] section of nova.conf * In Glance, this user will only be defined if you are using Cinder as a Glance storage backend. It is defined in the [cinder] section of glance.conf ** if you do not use Cinder as a backend for Glance, you do not need to define a service user for Glance B. Configure Nova to send the service token https://docs.openstack.org/cinder/latest/configuration/block-storage/service-token.html * Glance does not have the ability to send a service token to Cinder; instead, the mitigation-by-policy strategy described below relies upon the user configured in glance in the [cinder]/cinder_store_user_name option in glance.conf having been granted the service role in Keystone C. Configure the cinder policies as per https://docs.openstack.org/cinder/latest/configuration/block-storage/policy-config-HOWTO.html to have the following: "is_service": "role:service or service_user_id:" "volume:attachment_delete": "rule:xena_system_admin_or_project_member and rule:is_service" "volume_extension:volume_actions:terminate_connection": "rule:xena_system_admin_or_project_member and rule:is_service" "volume_extension:volume_actions:detach": "rule:xena_system_admin_or_project_member and rule:is_service" "volume_extension:volume_admin_actions:force_detach": "!" Notes: - The operator should replace "" with the actual UUID of the user configured in the [service_user] section of nova.conf - If the role name in Keystone to identify a service is not "service"' then the operator should also replace "role:service" accordingly and also make appropriate adjustment to the [keystone_authtoken]/service_token_roles setting in your cinder configuration file. - It may not be obvious why there are four policy targets defined. 
It's because the Block Storage API v3 provides four different API calls by which an attachment-delete may be accomplished: ** DELETE /v3/attachments/{attachment_id} (introduced in microversion 3.27, the preferred way to do this) ** POST /v3/volumes/{volume_id} with 'os-detach' action in the request body ** POST /v3/volumes/{volume_id} with 'os-terminate_connection' action in the request body ** POST /v3/volumes/{volume_id} with 'os-force_detach' action in the request body Limitations: The drawback to this configuration-only approach is that while it protects against the intentional case described earlier, it does not protect against the accidental case. Additionally, it is not fine-grained enough to distinguish acceptable end-user Block Storage API attachment-delete requests from unsafe ones (the Cinder code change is required for that). For these reasons, we emphasize that this is only a short-term mitigation, and we recommend that the full fix be applied as soon as possible in your deployment. Warning: Note that if you deploy this short-term mitigation, you should roll back the policy changes after the full fix is applied, or end users will continue to be unable to make acceptable attachment-delete requests. ### Credits ### Jan Wasilewski, Atman Gorka Eguileor, Red Hat ### Contacts / References ### Authors: - Brian Rosmaita, Red Hat - Dan Smith, Red Hat - Gorka Eguileor, Red Hat - Jeremy Stanley, OpenInfra Foundation - Nick Tait, Red Hat This OSSN: https://wiki.openstack.org/wiki/OSSN/OSSN-0092#Discussion Original Launchpad bug: https://launchpad.net/bugs/2004555 Mailing List : [security-sig] tag on openstack-discuss at lists.openstack.org OpenStack Security Project : https://launchpad.net/~openstack-ossg CVE: CVE-2023-2088 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From haiwu.us at gmail.com Wed May 10 17:27:10 2023 From: haiwu.us at gmail.com (hai wu) Date: Wed, 10 May 2023 12:27:10 -0500 Subject: [nova] hw:numa_nodes question Message-ID: Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as long as that flavor can fit into one numa node? From juliaashleykreger at gmail.com Wed May 10 17:31:06 2023 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 10 May 2023 10:31:06 -0700 Subject: [neutron][ironic] Postgres support? Message-ID: Greetings folks, I have spent a substantial amount of time investigating CI issues as of recent, and I noticed that at some point the legacy "ironic-tempest-pxe_ipmitool-postgres" CI job which has been kept around as a non-voting postgres support canary, is now failing on the master branch. Specifically, looking through the logs[0], it appears that structurally the queries are no longer compatible. np0033933847 neutron-server[90677]: WARNING oslo_db.sqlalchemy.exc_filters [None req-155d0299-07ef-433a-8576-467127c82be4 None None] DBAPIError exception wrapped.: psycopg2.errors.GroupingError: column "subnetpools.address_scope_id" must appear in the GROUP BY clause or be used in an aggregate function May 03 18:41:57.893335 np0033933847 neutron-server[90677]: LINE 2: ...standard_attr_id AS floatingips_standard_attr_id, subnetpool... 
May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ^ May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR oslo_db.sqlalchemy.exc_filters self.dialect.do_execute( May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR oslo_db.sqlalchemy.exc_filters psycopg2.errors.GroupingError: column "subnetpools.address_scope_id" must appear in the GROUP BY clause or be used in an aggregate function May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR oslo_db.sqlalchemy.exc_filters LINE 2: ...standard_attr_id AS floatingips_standard_attr_id, subnetpool... May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR oslo_db.sqlalchemy.exc_filters ^ >From Ironic's point of view, we're wondering if this is expected? Is it likely to be fixed? If there are no plans to fix the DB queries, Ironic will have no choice but to drop the CI job. Thanks in advance! -Julia [0]: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_291/882164/1/check/ironic-tempest-pxe_ipmitool-postgres/291f2f8/controller/logs/screen-q-svc.txt -------------- next part -------------- An HTML attachment was scrubbed... URL: From alsotoes at gmail.com Wed May 10 17:50:43 2023 From: alsotoes at gmail.com (Alvaro Soto) Date: Wed, 10 May 2023 11:50:43 -0600 Subject: [nova] hw:numa_nodes question In-Reply-To: References: Message-ID: I don't think so. ~~~ The most common case will be that the admin only sets hw:numa_nodes and then the flavor vCPUs and memory will be divided equally across the NUMA nodes. When a NUMA policy is in effect, it is mandatory for the instance's memory allocations to come from the NUMA nodes to which it is bound except where overriden by hw:numa_mem.NN. ~~~ Here are the implementation documents since Juno release: https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 ? On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > long as that flavor can fit into one numa node? > > -- Alvaro Soto *Note: My work hours may not be your work hours. Please do not feel the need to respond during a time that is not convenient for you.* ---------------------------------------------------------- Great people talk about ideas, ordinary people talk about things, small people talk... about other people. -------------- next part -------------- An HTML attachment was scrubbed... 
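For reference, the extra specs discussed in this thread are set as flavor properties; a minimal sketch (the flavor name and sizes here are made up, and hw:mem_page_size is included because of the caveat raised later in the thread):

    # Hypothetical flavor pinned to a single NUMA node, using small (4k) guest pages
    openstack flavor create m1.numa.example \
        --vcpus 4 --ram 8192 --disk 40 \
        --property hw:numa_nodes=1 \
        --property hw:mem_page_size=small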
URL: From alsotoes at gmail.com Wed May 10 17:52:24 2023 From: alsotoes at gmail.com (Alvaro Soto) Date: Wed, 10 May 2023 11:52:24 -0600 Subject: [nova] hw:numa_nodes question In-Reply-To: References: Message-ID: Another good resource =) https://that.guru/blog/cpu-resources/ On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > I don't think so. > > ~~~ > The most common case will be that the admin only sets hw:numa_nodes and > then the flavor vCPUs and memory will be divided equally across the NUMA > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > memory allocations to come from the NUMA nodes to which it is bound except > where overriden by hw:numa_mem.NN. > ~~~ > > Here are the implementation documents since Juno release: > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > ? > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > >> Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as >> long as that flavor can fit into one numa node? >> >> > > -- > > Alvaro Soto > > *Note: My work hours may not be your work hours. Please do not feel the > need to respond during a time that is not convenient for you.* > ---------------------------------------------------------- > Great people talk about ideas, > ordinary people talk about things, > small people talk... about other people. > -- Alvaro Soto *Note: My work hours may not be your work hours. Please do not feel the need to respond during a time that is not convenient for you.* ---------------------------------------------------------- Great people talk about ideas, ordinary people talk about things, small people talk... about other people. -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed May 10 18:45:10 2023 From: smooney at redhat.com (Sean Mooney) Date: Wed, 10 May 2023 19:45:10 +0100 Subject: [nova] hw:numa_nodes question In-Reply-To: References: Message-ID: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> if you set hw:numa_nodes there are two things you should keep in mind first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 then hw:mem_page_size shoudl also be defiend on the falvor. if you dont set hw:mem_page_size then the vam will be pinned to a host numa node but the avaible memory on the host numa node will not be taken into account only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper in the kernel. i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any small will use your kernels default page size for guest memory, typically this is 4k pages large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small and having a seperate flavor for hugepages. its really up to you. the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 disables memory oversubsctipion. so you will not be able ot oversubscibe the memory on the host. in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. 
you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio if you are using numa affinity. https://that.guru/blog/the-numa-scheduling-story-in-nova/ and https://that.guru/blog/cpu-resources-redux/ are also good to read i do not think stephen has a dedicated block on the memory aspect but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting hw:numa_nodes=1 will casue. if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or hw_mem_page_size set in the image then that vm is not configure properly. On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > Another good resource =) > > https://that.guru/blog/cpu-resources/ > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > I don't think so. > > > > ~~~ > > The most common case will be that the admin only sets hw:numa_nodes and > > then the flavor vCPUs and memory will be divided equally across the NUMA > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > memory allocations to come from the NUMA nodes to which it is bound except > > where overriden by hw:numa_mem.NN. > > ~~~ > > > > Here are the implementation documents since Juno release: > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > ? > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > long as that flavor can fit into one numa node? > > > > > > > > > > -- > > > > Alvaro Soto > > > > *Note: My work hours may not be your work hours. Please do not feel the > > need to respond during a time that is not convenient for you.* > > ---------------------------------------------------------- > > Great people talk about ideas, > > ordinary people talk about things, > > small people talk... about other people. > > > > From haiwu.us at gmail.com Wed May 10 19:22:35 2023 From: haiwu.us at gmail.com (hai wu) Date: Wed, 10 May 2023 14:22:35 -0500 Subject: [nova] hw:numa_nodes question In-Reply-To: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> Message-ID: So there's no default value assumed/set for hw:mem_page_size for each flavor? Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical when using hw:numa_nodes=1. I did not hit an issue with 'hw:mem_page_size' not set, maybe I am missing some known test cases? It would be very helpful to have a test case where I could reproduce this issue with 'hw:numa_nodes=1' being set, but without 'hw:mem_page_size' being set. How to ensure this one for existing vms already running with 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > if you set hw:numa_nodes there are two things you should keep in mind > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > then hw:mem_page_size shoudl also be defiend on the falvor. > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > but the avaible memory on the host numa node will not be taken into account > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > in the kernel. 
> > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > small will use your kernels default page size for guest memory, typically this is 4k pages > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > and having a seperate flavor for hugepages. its really up to you. > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > disables memory oversubsctipion. > > so you will not be able ot oversubscibe the memory on the host. > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > if you are using numa affinity. > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > and > https://that.guru/blog/cpu-resources-redux/ > > are also good to read > > i do not think stephen has a dedicated block on the memory aspect > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > hw:numa_nodes=1 will casue. > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > hw_mem_page_size set in the image then that vm is not configure properly. > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > Another good resource =) > > > > https://that.guru/blog/cpu-resources/ > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > I don't think so. > > > > > > ~~~ > > > The most common case will be that the admin only sets hw:numa_nodes and > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > memory allocations to come from the NUMA nodes to which it is bound except > > > where overriden by hw:numa_mem.NN. > > > ~~~ > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > ? > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > -- > > > > > > Alvaro Soto > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > need to respond during a time that is not convenient for you.* > > > ---------------------------------------------------------- > > > Great people talk about ideas, > > > ordinary people talk about things, > > > small people talk... about other people. > > > > > > > > From smooney at redhat.com Wed May 10 19:45:46 2023 From: smooney at redhat.com (Sean Mooney) Date: Wed, 10 May 2023 20:45:46 +0100 Subject: [nova] hw:numa_nodes question In-Reply-To: References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> Message-ID: <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > So there's no default value assumed/set for hw:mem_page_size for each > flavor? 
> correct this is a known edgecase in the currnt design hw:mem_page_size=any would be a resonable default but techinially if just set hw:numa_nodes=1 nova allow memory over subscription in pratch if you try to do that you will almost always end up with vms being killed due to OOM events. so from a api point of view it woudl be a change of behvior for use to default to hw:mem_page_size=any but i think it would be the correct thign to do for operators in the long run. i could bring this up with the core team again but in the past we decided to be conservitive and just warn peopel to alwasy set hw:mem_page_size if using numa affinity. > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > when using hw:numa_nodes=1. > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > missing some known test cases? It would be very helpful to have a test > case where I could reproduce this issue with 'hw:numa_nodes=1' being > set, but without 'hw:mem_page_size' being set. > > How to ensure this one for existing vms already running with > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? you unfortuletly need to resize the instance. tehre are some image porpeties you can set on an instance via nova-manage but you cannot use nova-mange to update the enbedd flavor and set this. so you need to define a new flavour and resize. this is the main reason we have not changed the default as it may requrie you to move instnace around if there placement is now invalid now that per numa node memory allocatons are correctly being accounted for. if it was simple to change the default without any enduser or operator impact we would. > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > but the avaible memory on the host numa node will not be taken into account > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > in the kernel. > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > small will use your kernels default page size for guest memory, typically this is 4k pages > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > and having a seperate flavor for hugepages. its really up to you. > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > disables memory oversubsctipion. > > > > so you will not be able ot oversubscibe the memory on the host. > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > if you are using numa affinity. 
> > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > and > > https://that.guru/blog/cpu-resources-redux/ > > > > are also good to read > > > > i do not think stephen has a dedicated block on the memory aspect > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > hw:numa_nodes=1 will casue. > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > hw_mem_page_size set in the image then that vm is not configure properly. > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > Another good resource =) > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > I don't think so. > > > > > > > > ~~~ > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > where overriden by hw:numa_mem.NN. > > > > ~~~ > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > ? > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > -- > > > > > > > > Alvaro Soto > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > need to respond during a time that is not convenient for you.* > > > > ---------------------------------------------------------- > > > > Great people talk about ideas, > > > > ordinary people talk about things, > > > > small people talk... about other people. > > > > > > > > > > > > > From haiwu.us at gmail.com Wed May 10 20:06:11 2023 From: haiwu.us at gmail.com (hai wu) Date: Wed, 10 May 2023 15:06:11 -0500 Subject: [nova] hw:numa_nodes question In-Reply-To: <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> Message-ID: Is it possible to update something in the Openstack database for the relevant VMs in order to do the same, and then hard reboot the VM so that the VM would have this attribute? On Wed, May 10, 2023 at 2:47?PM Sean Mooney wrote: > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > So there's no default value assumed/set for hw:mem_page_size for each > > flavor? > > > correct this is a known edgecase in the currnt design > hw:mem_page_size=any would be a resonable default but > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > in pratch if you try to do that you will almost always end up with vms > being killed due to OOM events. > > so from a api point of view it woudl be a change of behvior for use to default > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > in the long run. 
> > i could bring this up with the core team again but in the past we > decided to be conservitive and just warn peopel to alwasy set > hw:mem_page_size if using numa affinity. > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > when using hw:numa_nodes=1. > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > missing some known test cases? It would be very helpful to have a test > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > set, but without 'hw:mem_page_size' being set. > > > > How to ensure this one for existing vms already running with > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > you unfortuletly need to resize the instance. > tehre are some image porpeties you can set on an instance via nova-manage > but you cannot use nova-mange to update the enbedd flavor and set this. > > so you need to define a new flavour and resize. > > this is the main reason we have not changed the default as it may requrie you to > move instnace around if there placement is now invalid now that per numa node memory > allocatons are correctly being accounted for. > > if it was simple to change the default without any enduser or operator impact we would. > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > but the avaible memory on the host numa node will not be taken into account > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > in the kernel. > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > disables memory oversubsctipion. > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > if you are using numa affinity. > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > and > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > are also good to read > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > hw:numa_nodes=1 will casue. > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > hw_mem_page_size set in the image then that vm is not configure properly. 
> > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > Another good resource =) > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > I don't think so. > > > > > > > > > > ~~~ > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > where overriden by hw:numa_mem.NN. > > > > > ~~~ > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > ? > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Alvaro Soto > > > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > > need to respond during a time that is not convenient for you.* > > > > > ---------------------------------------------------------- > > > > > Great people talk about ideas, > > > > > ordinary people talk about things, > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > From fkr at hazardous.org Thu May 11 07:46:24 2023 From: fkr at hazardous.org (Felix Kronlage-Dammers) Date: Thu, 11 May 2023 09:46:24 +0200 Subject: [publiccloud-sig] Bi-weekly meeting times adjusted to 0700 UTC Message-ID: Hi, since some of us (to be honest, probably mostly me ;) have a conflicting meeting during summer time with the slot of the publiccloud-sig we discussed moving it to 0700 UTC during summer time. Since nobody in the past two meetings spoke against that, this was concluded yesterday. During summer time the publiccloud SIG will meet at 0700 UTC, not at 0800 UTC. The change is also reflected in the wiki page: https://wiki.openstack.org/wiki/PublicCloudSIG See you next time (and I'll set myself reminder to send out the reminder ;) felix -- GPG: 824CE0F0 / 2082 651E 5104 F989 4D18 BB2E 0B26 6738 824C E0F0 https://hazardous.org/ - fkr at hazardous.org - fkr at irc - @felixkronlage -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From ralonsoh at redhat.com Thu May 11 08:10:50 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Thu, 11 May 2023 10:10:50 +0200 Subject: [neutron][ironic] Postgres support? In-Reply-To: References: Message-ID: Hi Julia: Let me open a LP bug for this issue. I'll check if this call is being tested in our CI and with the specific DB backend. Regards. On Wed, May 10, 2023 at 7:32?PM Julia Kreger wrote: > Greetings folks, > > I have spent a substantial amount of time investigating CI issues as of > recent, and I noticed that at some point the legacy > "ironic-tempest-pxe_ipmitool-postgres" CI job which has been kept around as > a non-voting postgres support canary, is now failing on the master branch. 
> > Specifically, looking through the logs[0], it appears that structurally > the queries are no longer compatible. > > np0033933847 neutron-server[90677]: WARNING oslo_db.sqlalchemy.exc_filters > [None req-155d0299-07ef-433a-8576-467127c82be4 None None] DBAPIError > exception wrapped.: psycopg2.errors.GroupingError: column > "subnetpools.address_scope_id" must appear in the GROUP BY clause or be > used in an aggregate function > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: LINE 2: > ...standard_attr_id AS floatingips_standard_attr_id, subnetpool... > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: > ^ > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > oslo_db.sqlalchemy.exc_filters File > "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line > 1900, in _execute_context > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > oslo_db.sqlalchemy.exc_filters self.dialect.do_execute( > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > oslo_db.sqlalchemy.exc_filters File > "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line > 736, in do_execute > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > oslo_db.sqlalchemy.exc_filters psycopg2.errors.GroupingError: column > "subnetpools.address_scope_id" must appear in the GROUP BY clause or be > used in an aggregate function > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > oslo_db.sqlalchemy.exc_filters LINE 2: ...standard_attr_id AS > floatingips_standard_attr_id, subnetpool... > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > oslo_db.sqlalchemy.exc_filters > ^ > > From Ironic's point of view, we're wondering if this is expected? Is it > likely to be fixed? If there are no plans to fix the DB queries, Ironic > will have no choice but to drop the CI job. > > Thanks in advance! > > -Julia > > [0]: > https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_291/882164/1/check/ironic-tempest-pxe_ipmitool-postgres/291f2f8/controller/logs/screen-q-svc.txt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Thu May 11 08:46:10 2023 From: kennelson11 at gmail.com (Kendall Nelson) Date: Thu, 11 May 2023 03:46:10 -0500 Subject: Extending the Forum Message-ID: Hello Everyone, We received feedback that critical Forum conversations were not able to make the schedule due to the intentional work on our part to avoid PTG / Forum overlap which ended up restricting the number of Forum discussion proposals we were able to accept. To mitigate this, we have opened up the Forum room for a third day. We have created a form to collect critical Forum proposals. We will leave the form open for one week to give all timezones fair representation. The deadline to submit a proposal for a forum discussion is Thursday May, 18th at 7:00 UTC. Shortly after, the OpenInfra Foundation will make final decisions and program the Forum on Thursday. -Kendall Nelson [1] https://openinfrafoundation.formstack.com/forms/forum_expansion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skaplons at redhat.com Thu May 11 09:22:12 2023 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 11 May 2023 11:22:12 +0200 Subject: [neutron][ironic] Postgres support? In-Reply-To: References: Message-ID: <2597921.4JXAxupmLU@p1> Hi, Dnia czwartek, 11 maja 2023 10:10:50 CEST Rodolfo Alonso Hernandez pisze: > Hi Julia: > > Let me open a LP bug for this issue. I'll check if this call is being > tested in our CI and with the specific DB backend. > > Regards. > > On Wed, May 10, 2023 at 7:32?PM Julia Kreger > wrote: > > > Greetings folks, > > > > I have spent a substantial amount of time investigating CI issues as of > > recent, and I noticed that at some point the legacy > > "ironic-tempest-pxe_ipmitool-postgres" CI job which has been kept around as > > a non-voting postgres support canary, is now failing on the master branch. > > > > Specifically, looking through the logs[0], it appears that structurally > > the queries are no longer compatible. > > > > np0033933847 neutron-server[90677]: WARNING oslo_db.sqlalchemy.exc_filters > > [None req-155d0299-07ef-433a-8576-467127c82be4 None None] DBAPIError > > exception wrapped.: psycopg2.errors.GroupingError: column > > "subnetpools.address_scope_id" must appear in the GROUP BY clause or be > > used in an aggregate function > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: LINE 2: > > ...standard_attr_id AS floatingips_standard_attr_id, subnetpool... > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: > > ^ > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > > oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > > oslo_db.sqlalchemy.exc_filters File > > "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line > > 1900, in _execute_context > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > > oslo_db.sqlalchemy.exc_filters self.dialect.do_execute( > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > > oslo_db.sqlalchemy.exc_filters File > > "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line > > 736, in do_execute > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > > oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > > oslo_db.sqlalchemy.exc_filters psycopg2.errors.GroupingError: column > > "subnetpools.address_scope_id" must appear in the GROUP BY clause or be > > used in an aggregate function > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > > oslo_db.sqlalchemy.exc_filters LINE 2: ...standard_attr_id AS > > floatingips_standard_attr_id, subnetpool... > > May 03 18:41:57.893335 np0033933847 neutron-server[90677]: ERROR > > oslo_db.sqlalchemy.exc_filters > > ^ > > > > From Ironic's point of view, we're wondering if this is expected? Is it > > likely to be fixed? If there are no plans to fix the DB queries, Ironic > > will have no choice but to drop the CI job. > > > > Thanks in advance! > > > > -Julia > > > > [0]: > > https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_291/882164/1/check/ironic-tempest-pxe_ipmitool-postgres/291f2f8/controller/logs/screen-q-svc.txt > > > We have only one postgres related job in our periodic queue neutron-ovn-tempest-postgres-full and it seems to be pretty stable [1]. 
I don't know however if it somehow skips this path but I don't see any errors like that in the neutron's log file. [1] https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-postgres-full&project=openstack/neutron -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From smooney at redhat.com Thu May 11 09:22:45 2023 From: smooney at redhat.com (Sean Mooney) Date: Thu, 11 May 2023 10:22:45 +0100 Subject: [nova] hw:numa_nodes question In-Reply-To: References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> Message-ID: <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > Is it possible to update something in the Openstack database for the > relevant VMs in order to do the same, and then hard reboot the VM so > that the VM would have this attribute? not really adding the missing hw:mem_page_size requirement to the flavor chagnes the requirements for node placement and numa affinity so you really can only change this via resizing the vm to a new flavor > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney wrote: > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > So there's no default value assumed/set for hw:mem_page_size for each > > > flavor? > > > > > correct this is a known edgecase in the currnt design > > hw:mem_page_size=any would be a resonable default but > > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > > > in pratch if you try to do that you will almost always end up with vms > > being killed due to OOM events. > > > > so from a api point of view it woudl be a change of behvior for use to default > > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > > in the long run. > > > > i could bring this up with the core team again but in the past we > > decided to be conservitive and just warn peopel to alwasy set > > hw:mem_page_size if using numa affinity. > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > > when using hw:numa_nodes=1. > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > > missing some known test cases? It would be very helpful to have a test > > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > > set, but without 'hw:mem_page_size' being set. > > > > > > How to ensure this one for existing vms already running with > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > > you unfortuletly need to resize the instance. > > tehre are some image porpeties you can set on an instance via nova-manage > > but you cannot use nova-mange to update the enbedd flavor and set this. > > > > so you need to define a new flavour and resize. > > > > this is the main reason we have not changed the default as it may requrie you to > > move instnace around if there placement is now invalid now that per numa node memory > > allocatons are correctly being accounted for. > > > > if it was simple to change the default without any enduser or operator impact we would. 
> > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > > but the avaible memory on the host numa node will not be taken into account > > > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > > in the kernel. > > > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > > disables memory oversubsctipion. > > > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > > if you are using numa affinity. > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > and > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > are also good to read > > > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > > hw:numa_nodes=1 will casue. > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > > hw_mem_page_size set in the image then that vm is not configure properly. > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > Another good resource =) > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > > > I don't think so. > > > > > > > > > > > > ~~~ > > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > > where overriden by hw:numa_mem.NN. > > > > > > ~~~ > > > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > ? 
> > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > > > need to respond during a time that is not convenient for you.* > > > > > > ---------------------------------------------------------- > > > > > > Great people talk about ideas, > > > > > > ordinary people talk about things, > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > From ralonsoh at redhat.com Thu May 11 09:40:54 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Thu, 11 May 2023 11:40:54 +0200 Subject: [neutron][ironic] Postgres support? In-Reply-To: <2597921.4JXAxupmLU@p1> References: <2597921.4JXAxupmLU@p1> Message-ID: LP bug: https://bugs.launchpad.net/neutron/+bug/2019186 Patch: https://review.opendev.org/c/openstack/neutron/+/882935 Ironic patch: https://review.opendev.org/c/openstack/ironic/+/882936 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ozzzo at yahoo.com Thu May 11 12:14:30 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Thu, 11 May 2023 12:14:30 +0000 (UTC) Subject: [kolla] [train] [neutron] References: <1535783846.481815.1683807270392.ref@mail.yahoo.com> Message-ID: <1535783846.481815.1683807270392@mail.yahoo.com> From ozzzo at yahoo.com Thu May 11 12:32:01 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Thu, 11 May 2023 12:32:01 +0000 (UTC) Subject: [kolla] [train] haproxy and controller restart causes user impact References: <314427760.909657.1683808321773.ref@mail.yahoo.com> Message-ID: <314427760.909657.1683808321773@mail.yahoo.com> We have our haproxy and controller nodes on KVM hosts. When those KVM hosts are restarted, customers who are building or deleting VMs see impact. VMs may go into error status, fail to get DNS records, fail to delete, etc. The obvious reason is because traffic that is being routed to the haproxy on the restarting KVM is lost. If we manually fail over haproxy before restarting the KVM, will that be sufficient to stop traffic being lost, or do we also need to do something with the controller? From haiwu.us at gmail.com Thu May 11 13:40:22 2023 From: haiwu.us at gmail.com (hai wu) Date: Thu, 11 May 2023 08:40:22 -0500 Subject: [nova] hw:numa_nodes question In-Reply-To: <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> Message-ID: Ok. Then I don't understand why 'hw:mem_page_size' is not made the default in case if hw:numa_node is set. There is a huge disadvantage if not having this one set (all existing VMs with hw:numa_node set will have to be taken down for resizing in order to get this one right). I could not find this point mentioned in any existing Openstack documentation: that we would have to set hw:mem_page_size explicitly if hw:numa_node is set. Also this slide at https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind of indicates that hw:mem_page_size `Default to small pages`. Another question: Let's say a VM runs on one host's numa node #0. 
If we live-migrate this VM to another host, and that host's numa node #1 has more free memory, is it possible for this VM to land on the other host's numa node #1? On Thu, May 11, 2023 at 4:25?AM Sean Mooney wrote: > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > Is it possible to update something in the Openstack database for the > > relevant VMs in order to do the same, and then hard reboot the VM so > > that the VM would have this attribute? > not really adding the missing hw:mem_page_size requirement to the flavor chagnes the > requirements for node placement and numa affinity > so you really can only change this via resizing the vm to a new flavor > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney wrote: > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > So there's no default value assumed/set for hw:mem_page_size for each > > > > flavor? > > > > > > > correct this is a known edgecase in the currnt design > > > hw:mem_page_size=any would be a resonable default but > > > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > > > > > in pratch if you try to do that you will almost always end up with vms > > > being killed due to OOM events. > > > > > > so from a api point of view it woudl be a change of behvior for use to default > > > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > > > in the long run. > > > > > > i could bring this up with the core team again but in the past we > > > decided to be conservitive and just warn peopel to alwasy set > > > hw:mem_page_size if using numa affinity. > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > > > when using hw:numa_nodes=1. > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > > > missing some known test cases? It would be very helpful to have a test > > > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > How to ensure this one for existing vms already running with > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > > > you unfortuletly need to resize the instance. > > > tehre are some image porpeties you can set on an instance via nova-manage > > > but you cannot use nova-mange to update the enbedd flavor and set this. > > > > > > so you need to define a new flavour and resize. > > > > > > this is the main reason we have not changed the default as it may requrie you to > > > move instnace around if there placement is now invalid now that per numa node memory > > > allocatons are correctly being accounted for. > > > > > > if it was simple to change the default without any enduser or operator impact we would. > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > > > but the avaible memory on the host numa node will not be taken into account > > > > > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > > > in the kernel. 
> > > > > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > > > disables memory oversubsctipion. > > > > > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > > > if you are using numa affinity. > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > and > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > are also good to read > > > > > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > > > hw_mem_page_size set in the image then that vm is not configure properly. > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > Another good resource =) > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > ~~~ > > > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > > > where overriden by hw:numa_mem.NN. > > > > > > > ~~~ > > > > > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. 
Please do not feel the > > > > > > > need to respond during a time that is not convenient for you.* > > > > > > > ---------------------------------------------------------- > > > > > > > Great people talk about ideas, > > > > > > > ordinary people talk about things, > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From juliaashleykreger at gmail.com Thu May 11 13:54:26 2023 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Thu, 11 May 2023 06:54:26 -0700 Subject: [neutron][ironic] Postgres support? In-Reply-To: References: <2597921.4JXAxupmLU@p1> Message-ID: Awesome! Thanks for the prompt reply and action on a patch for Neutron. The test patch proposed against Ironic passed, which is definitely a good sign! Thanks again! -Julia On Thu, May 11, 2023 at 2:41?AM Rodolfo Alonso Hernandez < ralonsoh at redhat.com> wrote: > LP bug: https://bugs.launchpad.net/neutron/+bug/2019186 > Patch: https://review.opendev.org/c/openstack/neutron/+/882935 > Ironic patch: https://review.opendev.org/c/openstack/ironic/+/882936 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Thu May 11 17:32:56 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Thu, 11 May 2023 10:32:56 -0700 Subject: [ironic] Meeting cancelled; Metal3 PTG next Monday Message-ID: Hi all, The Metal3 team informed us they are having a one day PTG, which will take place on May 15th, 2023 from 14:00 to 17:00 UTC. I strongly encourage any interested Ironic cores to review the invitation[1] and attend if interested. Because this overlaps our regularly scheduled Ironic meeting, and there is significant overlap between the Metal3 and Ironic communities, the Ironic meeting for Monday, May 15h is cancelled to allow our communities to collaborate without distraction. Thanks, Jay Faulkner Ironic PTL 1: https://groups.google.com/g/metal3-dev/c/62PCloXk-zQ -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.rohmann at inovex.de Thu May 11 18:51:35 2023 From: christian.rohmann at inovex.de (Christian Rohmann) Date: Thu, 11 May 2023 20:51:35 +0200 Subject: [neutron][ironic] Postgres support? In-Reply-To: References: <2597921.4JXAxupmLU@p1> Message-ID: <8910fa6c-9d28-d2a8-c863-c17cd2f8b190@inovex.de> On 11/05/2023 15:54, Julia Kreger wrote: > Awesome! Thanks for the prompt reply and action on a patch for > Neutron. The test patch proposed against Ironic passed, which is > definitely a good sign! While it's really cool to see how quickly this PostgreSQL issue was fixed for Ironic, I believe there is no widespread testing and validation of PostgreSQL for other projects (anymore). There even was an official communication from the TC in 2017 about not "supporting" anything but MySQL / MariaDB: https://governance.openstack.org/tc/resolutions/20170613-postgresql-status.html I myself ran into multiple issues, some during schema upgrades and some only at runtime because certain side-effect such as improper booleans or "free solo SQL" where used within the code, rendering the ORM abstraction of the actually used DBMS useless. This is also making it very hard to fully validate the DBMS is interchangeable without particularly testing for those kind of cases in all sorts of database fields or by doing code reviews. What I am saying is, that as a an operator I appreciate clear communication and clear deprecation processes over "best-effort" commitments. 
It does not help if there is a minefield of issues and raised bugs about incompatibilities are not considered really important since the user-base for PostgreSQL installs has diminished. If I use a set of services just fine, but as soon as I add another OS project to my cloud I could run into all sorts of problems, just because I chose the "wrong" DBMS. Regards Christian From jay at gr-oss.io Thu May 11 19:13:58 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Thu, 11 May 2023 12:13:58 -0700 Subject: [neutron][ironic] Postgres support? In-Reply-To: <8910fa6c-9d28-d2a8-c863-c17cd2f8b190@inovex.de> References: <2597921.4JXAxupmLU@p1> <8910fa6c-9d28-d2a8-c863-c17cd2f8b190@inovex.de> Message-ID: Hey Christian, You're 100% correct to use resolutions from the TC and governance policies to help you determine what is officially supported for an integrated OpenStack release. However, these guidelines, from a project perspective, are minimums. Many projects; Ironic included, run jobs to ensure postgresql support continues working. This is absolutely best-effort, and I believe is mostly grown out of trying to continue to support operators who made the choice to use postgresql. With my Ironic PTL hat on, I appreciate that we explicitly ensure we're not limiting our project to a single DBMS forever. We test Ironic against the following DBMS: mysql/mariadb (used by most users), postgresql, and sqlite (used by metal3, primarily). In the case of postgresql (as mentioned in this thread), we test more than Ironic, we ensure a full deployment is possible which tests the basic functionality of Ironic, Neutron, Nova, and many other projects. This doesn't mean postgresql or sqlite is supported from the perspective of an integrated OpenStack install, but it's there and available to use for deployers who might choose to use Ironic standalone (metal3/bifrost) or for operators with clouds that predate OpenStack's statement of non-support for postgres. I apologize if this appears inconsistent from the outside, but we want to continue to serve as many use cases as possible -- sometimes that means going above and beyond the minimum supported platforms. Thanks, Jay Faulkner Ironic PTL TC Vice-Chair On Thu, May 11, 2023 at 12:02?PM Christian Rohmann < christian.rohmann at inovex.de> wrote: > > On 11/05/2023 15:54, Julia Kreger wrote: > > Awesome! Thanks for the prompt reply and action on a patch for > > Neutron. The test patch proposed against Ironic passed, which is > > definitely a good sign! > > While it's really cool to see how quickly this PostgreSQL issue was > fixed for Ironic, I believe there is no widespread testing and > validation of PostgreSQL for other projects (anymore). > > There even was an official communication from the TC in 2017 about not > "supporting" anything but MySQL / MariaDB: > > https://governance.openstack.org/tc/resolutions/20170613-postgresql-status.html > > I myself ran into multiple issues, some during schema upgrades and some > only at runtime because certain side-effect such as improper booleans or > "free solo SQL" > where used within the code, rendering the ORM abstraction of the > actually used DBMS useless. This is also making it very hard to fully > validate the DBMS is interchangeable > without particularly testing for those kind of cases in all sorts of > database fields or by doing code reviews. > > > What I am saying is, that as a an operator I appreciate clear > communication and clear deprecation processes over "best-effort" > commitments. 
> It does not help if there is a minefield of issues and raised bugs about > incompatibilities are not considered really important since the > user-base for PostgreSQL installs has diminished. > > If I use a set of services just fine, but as soon as I add another OS > project to my cloud I could run into all sorts of problems, just because > I chose the "wrong" DBMS. > > > > Regards > > Christian > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu May 11 19:40:08 2023 From: smooney at redhat.com (Sean Mooney) Date: Thu, 11 May 2023 20:40:08 +0100 Subject: [nova] hw:numa_nodes question In-Reply-To: References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> Message-ID: <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > Ok. Then I don't understand why 'hw:mem_page_size' is not made the > default in case if hw:numa_node is set. There is a huge disadvantage > if not having this one set (all existing VMs with hw:numa_node set > will have to be taken down for resizing in order to get this one > right). There is an upgrade impact to changing the default. It's not impossible to do, but it's complicated: if we don't want to break existing deployments we would need to record a value for every current instance that was spawned before this default was changed and that had hw:numa_node without hw:mem_page_size, so they keep the old behavior, and make sure that is cleared when the VM is next moved so it can get the new default after a live migration. > > I could not find this point mentioned in any existing Openstack > documentation: that we would have to set hw:mem_page_size explicitly > if hw:numa_node is set. Also this slide at > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind of > indicates that hw:mem_page_size `Default to small pages`. It defaults to unset. That results in small pages by default, but it's not the same as hw:mem_page_size=small or hw:mem_page_size=any. > > Another question: Let's say a VM runs on one host's numa node #0. If > we live-migrate this VM to another host, and that host's numa node #1 > has more free memory, is it possible for this VM to land on the other > host's numa node #1? Yes, it is. On newer releases we will prefer to balance the load across NUMA nodes; on older releases nova would fill the first NUMA node and then move to the second. > > On Thu, May 11, 2023 at 4:25 AM Sean Mooney wrote: > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > Is it possible to update something in the Openstack database for the > > > relevant VMs in order to do the same, and then hard reboot the VM so > > > that the VM would have this attribute? > > not really adding the missing hw:mem_page_size requirement to the flavor chagnes the > > requirements for node placement and numa affinity > > so you really can only change this via resizing the vm to a new flavor > > > > > > On Wed, May 10, 2023 at 2:47 PM Sean Mooney wrote: > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > So there's no default value assumed/set for hw:mem_page_size for each > > > > > flavor?
> > > > > > > > > correct this is a known edgecase in the currnt design > > > > hw:mem_page_size=any would be a resonable default but > > > > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > > > > > > > in pratch if you try to do that you will almost always end up with vms > > > > being killed due to OOM events. > > > > > > > > so from a api point of view it woudl be a change of behvior for use to default > > > > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > > > > in the long run. > > > > > > > > i could bring this up with the core team again but in the past we > > > > decided to be conservitive and just warn peopel to alwasy set > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > > > > when using hw:numa_nodes=1. > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > > > > missing some known test cases? It would be very helpful to have a test > > > > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > > > How to ensure this one for existing vms already running with > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > > > > you unfortuletly need to resize the instance. > > > > tehre are some image porpeties you can set on an instance via nova-manage > > > > but you cannot use nova-mange to update the enbedd flavor and set this. > > > > > > > > so you need to define a new flavour and resize. > > > > > > > > this is the main reason we have not changed the default as it may requrie you to > > > > move instnace around if there placement is now invalid now that per numa node memory > > > > allocatons are correctly being accounted for. > > > > > > > > if it was simple to change the default without any enduser or operator impact we would. > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > > > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > > > > but the avaible memory on the host numa node will not be taken into account > > > > > > > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > > > > in the kernel. > > > > > > > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > > > > disables memory oversubsctipion. 
> > > > > > > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > > > > if you are using numa affinity. > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > and > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > are also good to read > > > > > > > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > > > > hw_mem_page_size set in the image then that vm is not configure properly. > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > Another good resource =) > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > > > ~~~ > > > > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > > > > where overriden by hw:numa_mem.NN. > > > > > > > > ~~~ > > > > > > > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > > > > > need to respond during a time that is not convenient for you.* > > > > > > > > ---------------------------------------------------------- > > > > > > > > Great people talk about ideas, > > > > > > > > ordinary people talk about things, > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From scott.little at windriver.com Thu May 11 20:27:56 2023 From: scott.little at windriver.com (Scott Little) Date: Thu, 11 May 2023 16:27:56 -0400 Subject: Train EOL Message-ID: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> Hello OpenStack community I'm one of the members of the StarlingX community.? We've had a lot of stability issues with our ability to compile both our development branch and supported release branches these last few weeks.? It all traces back to the Train EOL. 
We weren't monitoring openstack mailing lists, and missed the EOL announcement. We are actively moving off of Train, but aren't yet ready. What's really causing us grief is that some sub-projects, e.g. heat and nova, have started deleting elements of Train, e.g. git branches. Now please don't take this wrong. Ending support for an old branch is a totally normal thing, and we accept that. If StarlingX customers need support in that area, we'll provide it. However, I would plead with you NOT to delete the elements of Train that allow other projects to compile old openstack releases, e.g. your git branches. Sincerely Scott Little on behalf of StarlingX From jay at gr-oss.io Thu May 11 22:39:59 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Thu, 11 May 2023 15:39:59 -0700 Subject: Train EOL In-Reply-To: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> References: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> Message-ID: Hey Scott, Good news! It's all still there, just not as branches. When a branch is moved from "Extended Maintenance" (EM) to End of Life (EOL), we remove the branch but retain the git refs on a tag.[1] (e.g. https://opendev.org/openstack/ironic/src/tag/stein-eol is the tag representing Ironic stable/stein at the point of EOL). Look for the `train-eol` tag on the projects you're struggling with, and that should be the git ref you're looking for. Hopefully your tooling is happy getting any git ref and not just a branch ref. 1: https://docs.openstack.org/project-team-guide/stable-branches.html#end-of-life Thanks, Jay Faulkner Ironic PTL TC Vice-Chair On Thu, May 11, 2023 at 3:20 PM Scott Little wrote: > Hello OpenStack community > > I'm one of the members of the StarlingX community. We've had a lot of > stability issues with our ability to compile both our development branch > and supported release branches these last few weeks. It all traces back > to the Train EOL.
> The obvious reason is because traffic that is being routed to the haproxy > on the restarting KVM is lost. If we manually fail over haproxy before > restarting the KVM, will that be sufficient to stop traffic being lost, or > do we also need to do something with the controller? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.vanommen at gmail.com Fri May 12 04:16:38 2023 From: john.vanommen at gmail.com (John van Ommen) Date: Thu, 11 May 2023 21:16:38 -0700 Subject: Did StarlingX Break? Message-ID: I've installed StarlingX 10+ times using the iso "bootimage.iso" from their website. I opted to do a reinstall yesterday, and it keeps hitting the same error at the same spot in the install. I've been able to replicate the error three times using two different HP servers. The installer basically reaches the point where it's installing package number 866 out of 1,217 packages, and fails with this error: process [/usr/libexec/anaconda/anaconda-yum --config /tmp/anaconda-yum.conf --tsfile /mnt/sysimage/anaconda-yum.yumtx --rpmlog /tmp/rpm-script.log --installroot /mnt/sysimage --release 7 --arch x86_64 --macro __dbi_htconfig hash nofsync /etc/selinux/targeted/contexts/files/file_contexts exited with status 1 I did a Google search on this error, and there's a slightly similar error reported on a RedHat forum from someone who had a bad ISO. But I know my ISO is working; it's the exact same ISO I've already installed from 10+ times. In the event that I have a bad DIMM or a bad NVME, I tried an identical server, and was able to reproduce the same error. In an earlier message on the list today, someone reported that missing repos may be impacting StarlingX, which definitely makes me wonder if I'm being impacted by that. I'm on the CentOS based version of StarlingX, version 7.0 I also tried to install from the Debian based version (8.0) and that one wouldn't even boot properly. It basically asks you what type of install you want to do, then crashes. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Fri May 12 07:33:27 2023 From: eblock at nde.ag (Eugen Block) Date: Fri, 12 May 2023 07:33:27 +0000 Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> Message-ID: <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> Hi Albert, how is your haproxy placement controlled, something like pacemaker or similar? I would always do a failover when I'm aware of interruptions (maintenance window), that should speed things up for clients. We have a pacemaker controlled HA control plane, it takes more time until pacemaker realizes that the resource is gone if I just rebooted a server without failing over. I have no benchmarks though. There's always a risk of losing a couple of requests during the failover but we didn't have complaints yet, I believe most of the components try to resend the lost messages. In one of our customer's cluster with many resources (they also use terraform) I haven't seen issues during a regular maintenance window. When they had a DNS outage a few months back it resulted in a mess, manual cleaning was necessary, but the regular failovers seem to work just fine. And I don't see rabbitmq issues either after rebooting a server, usually the haproxy (and virtual IP) failover suffice to prevent interruptions. 
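For a pacemaker-managed control plane, a minimal sketch of that kind of controlled failover could look like the following (the node name "controller1" and the exact resource names are only placeholders, adjust them to your cluster):

   # see where the VIP / haproxy resources currently run
   pcs status resources
   # put the node into standby so pacemaker moves its resources away
   pcs node standby controller1
   # ... reboot the KVM host / do the maintenance ...
   pcs node unstandby controller1

With keepalived instead of pacemaker, stopping keepalived (or lowering the VRRP priority) on the node to be rebooted achieves the same controlled VIP move.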
Regards, Eugen Zitat von Satish Patel : > Are you running your stack on top of the kvm virtual machine? How many > controller nodes do you have? mostly rabbitMQ causing issues if you restart > controller nodes. > > On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: > >> We have our haproxy and controller nodes on KVM hosts. When those KVM >> hosts are restarted, customers who are building or deleting VMs see impact. >> VMs may go into error status, fail to get DNS records, fail to delete, etc. >> The obvious reason is because traffic that is being routed to the haproxy >> on the restarting KVM is lost. If we manually fail over haproxy before >> restarting the KVM, will that be sufficient to stop traffic being lost, or >> do we also need to do something with the controller? >> >> From nicolas at kektus.xyz Fri May 12 08:21:52 2023 From: nicolas at kektus.xyz (Nicolas Froger) Date: Fri, 12 May 2023 10:21:52 +0200 Subject: [kolla-ansible] TLS and internal VIP Message-ID: <2281c390-e1c8-da98-cafe-df1c2ffc8fb7@kektus.xyz> Hello, I was debugging the monitoring stack of our deployment and I noticed that our Prometheus could not reach the OpenStack Exporter. The error is about a certificate name mismatch because Prometheus is scraping the exporter with the internal IP address instead of the internal FQDN while the certificate we have is only valid for the internal FQDN. Indeed, the Prometheus config specifies kolla_internal_vip_address as a target and uses HTTPS when kolla_enable_tls_internal is true. Replacing the target with kolla_internal_fqdn which is a DNS name for which the certificate is valid fixed my issue. My question is the following: should the internal certificate also be valid for the internal VIP when kolla_enable_tls_internal is set to true or is it okay if it's only valid for the FQDN? In the later case, does it make sense if I open an issue to use the FQDN instead of the IP address in the Prometheus config? Regards, -- Nicolas Froger From fungi at yuggoth.org Fri May 12 11:49:44 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 12 May 2023 11:49:44 +0000 Subject: Train EOL In-Reply-To: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> References: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> Message-ID: <20230512114943.opgaevawn5dzzlvg@yuggoth.org> On 2023-05-11 16:27:56 -0400 (-0400), Scott Little wrote: [...] > I would plea to you to NOT delete the elements of Train that allow > other projects to compile old openstack releases, e.g. your gits > branches. For future reference, it's unsafe to assume branches stick around forever. Projects may EOL and delete them as soon as 18 months from the coordinated release (when normal maintenance for those branches ends). Tags, on the other hand, are kept indefinitely. If you need to know the most recent stable point release tag for a given coordinated release series, you can find them listed on the releases site: As an example, https://releases.openstack.org/train/index.html lists them for Train. While many projects practice an "extended maintenance" to allow interested members of the community to continue backporting fixes on stable branches after that point, those branches are intended as a point of coordination for downstream maintainers who want to collaborate upstream on developing and reviewing backports for older releases. 
Branches under extended maintenance don't receive point releases and aren't intended to be consumed directly for deployment, but are rather meant as a source for cherry-picking relevant fixes. Not all projects practice extended maintenance and may EOL and delete their branches at any point after normal maintenance ends, but even those who do extend maintenance as long as possible are eventually sunset in a similar fashion. Mandatory sunset is currently about 5 years from the initial coordinated release for that series, but we probably need to shorten it given challenges with testing 10 different branches across projects and lack of actual extended maintenance interest from downstream consumers and redistributors (they're interested in things being maintained of course, but not doing the work to make it happen). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Fri May 12 11:53:53 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 12 May 2023 11:53:53 +0000 Subject: Did StarlingX Break? In-Reply-To: References: Message-ID: <20230512115353.idsrrxb75q56lbw6@yuggoth.org> On 2023-05-11 21:16:38 -0700 (-0700), John van Ommen wrote: > I've installed StarlingX 10+ times using the iso "bootimage.iso" > from their website. [...] StarlingX has its own mailing lists, where you're far more likely to find people who know anything about it: https://lists.starlingx.io/ -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From scott.little at windriver.com Fri May 12 13:32:19 2023 From: scott.little at windriver.com (Scott Little) Date: Fri, 12 May 2023 09:32:19 -0400 Subject: Train EOL In-Reply-To: References: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> Message-ID: <451f9dd7-1a0f-5b77-de7d-205a0bf17ea2@windriver.com> Thanks for your response, Jay. Yes, the -eol tag is somewhat useful, but it doesn't appear to be created until the branch is removed. There is no way for a downstream consumer to prepare for the forthcoming branch deletion. There is no way to avoid a period of breakage. I'd like to suggest the branches be 'frozen', as in accepting no further updates, and tagged with -eol, but otherwise allowed to continue to exist for several months if not a year. That would allow downstream projects a reasonable period to switch from branch to tag and avoid a period of breakage. Second point: Is there a separate mailing list to announce major events such as EOL of a branch? It's hard to pick such announcements out of a general mailing list. Scott On 2023-05-11 18:39, Jay Faulkner wrote: > ** > *CAUTION: This email comes from a non Wind River email account!* > Do not click links or open attachments unless you recognize the sender > and know the content is safe. > Hey Scott, > > Good news! It's all still there, just not as branches. When a branch > is moved from "Extended Maintenance" (EM) to End of Life (EOL), we > remove the branch but retain the git refs on a tag.[1] (e.g. > https://opendev.org/openstack/ironic/src/tag/stein-eol > > is the tag representing Ironic stable/stein at the point of EOL). > > Look for the `train-eol` tag on the projects you're struggling with, > and that should be the git ref you're looking for.
Hopefully your > tooling is happy getting any git ref and not just a branch ref. > > 1: > https://docs.openstack.org/project-team-guide/stable-branches.html#end-of-life > > > Thanks, > Jay Faulkner > Ironic PTL > TC Vice-Chair > > > On Thu, May 11, 2023 at 3:20?PM Scott Little > wrote: > > Hello OpenStack community > > I'm one of the members of the StarlingX community.? We've had a > lot of > stability issues with our ability to compile both our development > branch > and supported release branches these last few weeks.? It all > traces back > to the Train EOL. We weren't monitoring openstack mailing lists, and > missed the EOL announcement.? We are actively moving off of Train, > but > aren't yet ready. > > What's really causing us grief is that some sub-projects, e.g.heat > and > nova, have started deleting elements of Train, e.g. git branches. > > Now please don't take this wrong.? Ending support for an old > branch is a > totally normal thing, and we accept that.? If StarlingX customers > need > support in that area, we'll provide it. However I would plea to > you to > NOT delete the elements of Train that allow other projects to compile > old openstack releases, e.g. your gits branches. > > Sincerely > > Scott Little on behalf of StarlingX > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vineetthakur09 at gmail.com Fri May 12 13:45:24 2023 From: vineetthakur09 at gmail.com (Vineet Thakur) Date: Fri, 12 May 2023 19:15:24 +0530 Subject: Ussuri - Volume delete status not updated in horizon dashboard Message-ID: Hi Team, The issue is about quota update in the horizon dashboard where volume deletion status is not updated. For example: We deleted the volume, cinder logs showing the deletion status as success and even in the dashboard's "volume" section as well. But in the dashboard quota view, the entry for volume existence is still available, and this problem is only for one volume . (not for all volumes) We have tried almost all the possible ways suggested in online blogs to remove this volume id. But still that volume is persisting in the dashboard quota. Could you please guide us what is the best practice to remove this volume id from the quota without affecting the production environment. Appreciate your help. Thank you. Kind Regards, Vineet -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri May 12 14:41:23 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 12 May 2023 14:41:23 +0000 Subject: Train EOL In-Reply-To: <451f9dd7-1a0f-5b77-de7d-205a0bf17ea2@windriver.com> References: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> <451f9dd7-1a0f-5b77-de7d-205a0bf17ea2@windriver.com> Message-ID: <20230512144123.gsykkgt4b7ocrjw3@yuggoth.org> On 2023-05-12 09:32:19 -0400 (-0400), Scott Little wrote: [...] > I'd like to suggest the branches be 'frozen', as in accepting no > further updates, and tagged with -eol, but otherwise > allowed to continue to exist for several months if not a year. > That would allow downstream projects a reasonable period to switch > from branch to tag and avoid a period of breakage. We can't easily make changes proposed for those branches get automatically rejected without deleting the branches, but also the deletion is meant to send a clear and noticeable signal to anyone still trying to pull from it, while an EOL tag may go unnoticed. 
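As a rough sketch of the consumer-side switch (nova is used here purely as an example repository, and a remote named "origin" pointing at it is assumed), moving a build from the deleted stable/train branch to the EOL tag looks something like:

   # see which refs still exist on the server
   git ls-remote --heads --tags https://opendev.org/openstack/nova stable/train train-eol
   # fetch the EOL tag and build from it instead of the branch
   git fetch origin tag train-eol
   git checkout -B train-build train-eol

Any git ref works for building; only the name your tooling points at changes.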
The branch was effectively frozen (as in no new stable point releases were made) the moment normal maintenance of that branch ended, and then several years went by where users were expected to stop relying on it. The EOL tagging and subsequent branch cleanup is the final signal to anyone why may have not otherwise noticed we stopped maintaining and releasing the branch years prior. We've had some discussions about the term "extended maintenance" being confusing to consumers who think that means the branch is still maintained and recommended for direct use. I've been advocating we refer to everything after the end of normal maintenance as "unmaintained" in order to make that more clear, though there was another suggestion for "community maintenance" which might be closer to the truth for some projects. > Second point.? Is there a separate mailing list to announce major > events such as EOL of a branch?? It's hard to pick such > announcements out of a general mailing list. I can see making a case for announcing the end of normal maintenance on the openstack-announce mailing list. EOL of individual projects on the other hand is going to be too high-volume for an announcements since they are not required to coordinate that with each other and can do so any time between the end of normal maintenance and the mandatory sunset. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jay at gr-oss.io Fri May 12 14:50:17 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Fri, 12 May 2023 07:50:17 -0700 Subject: Train EOL In-Reply-To: <451f9dd7-1a0f-5b77-de7d-205a0bf17ea2@windriver.com> References: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> <451f9dd7-1a0f-5b77-de7d-205a0bf17ea2@windriver.com> Message-ID: Scott, I acknowledge and empathize with the pain -- I've operated OpenStack forks downstream at previous jobs and have had to push changes to swap from the branch to the EOL-tag. With that being said, I still wouldn't be in favor of that kind of change in policy, because the downsides -- such as it not being clear to contributors what branches were open for contribution -- are pretty high. We have an announcement list[1]. I don't think it's a bad idea to start publishing posts there when a branch changes support status. I'm curious what other contributors would think of this. Thanks, Jay Faulkner Ironic PTL TC Vice-Chair 1: https://lists.openstack.org/pipermail/openstack-announce/ On Fri, May 12, 2023 at 6:32?AM Scott Little wrote: > Thanks for your response Jay > > Yes the -eol tag is somewhat useful, but it doesn't appear to be > created until the branch is removed. There is no way for a downstream > consumer to to prepare for the forthcoming branch deletion. There is no > way to avoid a period of breakage. > > I'd like to suggest the branches be 'frozen', as in accepting no further > updates, and tagged with -eol, but otherwise allowed to continue to > exist for several months if not a year. That would allow downstream > projects a reasonable period to switch from branch to tag and avoid a > period of breakage. > > Second point. Is there a separate mailing list to announce major events > such as EOL of a branch? It's hard to pick such announcements out of a > general mailing list. 
> > Scott > > > > On 2023-05-11 18:39, Jay Faulkner wrote: > > *CAUTION: This email comes from a non Wind River email account!* > Do not click links or open attachments unless you recognize the sender and > know the content is safe. > Hey Scott, > > Good news! It's all still there, just not as branches. When a branch is > moved from "Extended Maintenance" (EM) to End of Life (EOL), we remove the > branch but retain the git refs on a tag.[1] (e.g. > https://opendev.org/openstack/ironic/src/tag/stein-eol > > is the tag representing Ironic stable/stein at the point of EOL). > > Look for the `train-eol` tag on the projects you're struggling with, and > that should be the git ref you're looking for. Hopefully your tooling is > happy getting any git ref and not just a branch ref. > > 1: > https://docs.openstack.org/project-team-guide/stable-branches.html#end-of-life > > > Thanks, > Jay Faulkner > Ironic PTL > TC Vice-Chair > > > On Thu, May 11, 2023 at 3:20?PM Scott Little > wrote: > >> Hello OpenStack community >> >> I'm one of the members of the StarlingX community. We've had a lot of >> stability issues with our ability to compile both our development branch >> and supported release branches these last few weeks. It all traces back >> to the Train EOL. We weren't monitoring openstack mailing lists, and >> missed the EOL announcement. We are actively moving off of Train, but >> aren't yet ready. >> >> What's really causing us grief is that some sub-projects, e.g.heat and >> nova, have started deleting elements of Train, e.g. git branches. >> >> Now please don't take this wrong. Ending support for an old branch is a >> totally normal thing, and we accept that. If StarlingX customers need >> support in that area, we'll provide it. However I would plea to you to >> NOT delete the elements of Train that allow other projects to compile >> old openstack releases, e.g. your gits branches. >> >> Sincerely >> >> Scott Little on behalf of StarlingX >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Fri May 12 14:52:51 2023 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Fri, 12 May 2023 14:52:51 +0000 Subject: [release] Release countdown for week R-20, May 15-19 Message-ID: Development Focus ----------------- We are now past the Bobcat-1 milestone. Teams should now be focused on feature development. General Information ------------------- Our next milestone in this development cycle will be Bobcat-2, on July 6th, 2023. This milestone is when we freeze the list of deliverables that will be included in the 2023.2 Bobcat final release, so if you plan to introduce new deliverables in this release, please propose a change to add an empty deliverable file in the deliverables/bobcat directory of the openstack/releases repository. Now is also generally a good time to look at bugfixes that were introduced in the master branch that might make sense to be backported and released in a stable release. If you have any question around the OpenStack release process, feel free to ask on this mailing-list or on the #openstack-release channel on IRC. Upcoming Deadlines & Dates -------------------------- OpenInfra Summit & PTG in Vancouver: June 13-15, 2023 Bobcat-2 Milestone: July 6th, 2023 El?d Ill?s irc: elodilles @ #openstack-release -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From elod.illes at est.tech Fri May 12 15:30:57 2023 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Fri, 12 May 2023 15:30:57 +0000 Subject: Train EOL In-Reply-To: References: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> <451f9dd7-1a0f-5b77-de7d-205a0bf17ea2@windriver.com> Message-ID: Thanks Jay & Jeremy for all the answers you gave to Scott. Let me also emphasize (with my stable maintainer hat on ?) that Train is *not* End of Life yet [1], but some of the projects decided to remove some of their stable/train (and even newer) branches due to lack of maintainers and broken gates and to free up resources. (Note that, for example core projects have EOL'd their stable/rocky branches for quite a while now, and Rocky is about to EOL as a whole. Stein is in the very same situation (broken gates, long gone core components, etc.) and it is inevitable that Train will reach that point, too.) Nevertheless, we also discussed recently that perhaps it would be beneficial to keep old branches open instead of deleting them, but for different reasons teams were not favor of doing that. Jay had a very good point in his previous mail as an example why it is better to delete old branches. [1] https://releases.openstack.org/ Thanks, El?d Ill?s irc: elodilles @ #openstack-stable #openstack-release ________________________________ From: Jay Faulkner Sent: Friday, May 12, 2023 4:50 PM To: Scott Little Cc: openstack-discuss at lists.openstack.org Subject: Re: Train EOL Scott, I acknowledge and empathize with the pain -- I've operated OpenStack forks downstream at previous jobs and have had to push changes to swap from the branch to the EOL-tag. With that being said, I still wouldn't be in favor of that kind of change in policy, because the downsides -- such as it not being clear to contributors what branches were open for contribution -- are pretty high. We have an announcement list[1]. I don't think it's a bad idea to start publishing posts there when a branch changes support status. I'm curious what other contributors would think of this. Thanks, Jay Faulkner Ironic PTL TC Vice-Chair 1: https://lists.openstack.org/pipermail/openstack-announce/ On Fri, May 12, 2023 at 6:32?AM Scott Little > wrote: Thanks for your response Jay Yes the -eol tag is somewhat useful, but it doesn't appear to be created until the branch is removed. There is no way for a downstream consumer to to prepare for the forthcoming branch deletion. There is no way to avoid a period of breakage. I'd like to suggest the branches be 'frozen', as in accepting no further updates, and tagged with -eol, but otherwise allowed to continue to exist for several months if not a year. That would allow downstream projects a reasonable period to switch from branch to tag and avoid a period of breakage. Second point. Is there a separate mailing list to announce major events such as EOL of a branch? It's hard to pick such announcements out of a general mailing list. Scott On 2023-05-11 18:39, Jay Faulkner wrote: CAUTION: This email comes from a non Wind River email account! Do not click links or open attachments unless you recognize the sender and know the content is safe. Hey Scott, Good news! It's all still there, just not as branches. When a branch is moved from "Extended Maintenance" (EM) to End of Life (EOL), we remove the branch but retain the git refs on a tag.[1] (e.g. https://opendev.org/openstack/ironic/src/tag/stein-eol is the tag representing Ironic stable/stein at the point of EOL). 
Look for the `train-eol` tag on the projects you're struggling with, and that should be the git ref you're looking for. Hopefully your tooling is happy getting any git ref and not just a branch ref. 1: https://docs.openstack.org/project-team-guide/stable-branches.html#end-of-life Thanks, Jay Faulkner Ironic PTL TC Vice-Chair On Thu, May 11, 2023 at 3:20?PM Scott Little > wrote: Hello OpenStack community I'm one of the members of the StarlingX community. We've had a lot of stability issues with our ability to compile both our development branch and supported release branches these last few weeks. It all traces back to the Train EOL. We weren't monitoring openstack mailing lists, and missed the EOL announcement. We are actively moving off of Train, but aren't yet ready. What's really causing us grief is that some sub-projects, e.g.heat and nova, have started deleting elements of Train, e.g. git branches. Now please don't take this wrong. Ending support for an old branch is a totally normal thing, and we accept that. If StarlingX customers need support in that area, we'll provide it. However I would plea to you to NOT delete the elements of Train that allow other projects to compile old openstack releases, e.g. your gits branches. Sincerely Scott Little on behalf of StarlingX -------------- next part -------------- An HTML attachment was scrubbed... URL: From ozzzo at yahoo.com Fri May 12 15:58:58 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Fri, 12 May 2023 15:58:58 +0000 (UTC) Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> Message-ID: <1466896660.1488791.1683907138477@mail.yahoo.com> Controllers and haproxy are on KVM. We have 3 controllers. We used to have RMQ issues during KVM reboot but that stopped after we switched to durable queues. We expected to stop seeing customer impact after fixing the RMQ issues, but we still see it. On Thursday, May 11, 2023, 11:14:17 PM EDT, Satish Patel wrote: Are you running your stack on top of the kvm virtual machine? How many controller nodes do?you have? mostly rabbitMQ causing issues if you restart controller nodes.? On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: We have our haproxy and controller nodes on KVM hosts. When those KVM hosts are restarted, customers who are building or deleting VMs see impact. VMs may go into error status, fail to get DNS records, fail to delete, etc. The obvious reason is because traffic that is being routed to the haproxy on the restarting KVM is lost. If we manually fail over haproxy before restarting the KVM, will that be sufficient to stop traffic being lost, or do we also need to do something with the controller? -------------- next part -------------- An HTML attachment was scrubbed... URL: From haleyb.dev at gmail.com Fri May 12 18:13:59 2023 From: haleyb.dev at gmail.com (Brian Haley) Date: Fri, 12 May 2023 14:13:59 -0400 Subject: [neutron] Gate broken, please don't recheck Message-ID: <3bde1b10-01ff-c9aa-2d9e-0747fca12cf9@gmail.com> Hi, With the latest release of neutronclient version 10.0.0 the neutron gate pep8 job is now failing. Please hold your rechecks until a fix [0] merges. There's also some old unused devstack code that uses this neutron-debug code, will remove that in a follow-up. 
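If a local pep8/tox environment is hitting the same failure before that change lands, one possible stopgap (an assumption on my part, not the actual gate fix) is to temporarily cap the client:

   # temporary local workaround only; drop it once the fix merges
   pip install 'python-neutronclient<10.0.0'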
Thanks, -Brian [0] https://review.opendev.org/c/openstack/neutron/+/883081 From fadhel.bedda at gmail.com Fri May 12 18:31:47 2023 From: fadhel.bedda at gmail.com (BEDDA Fadhel) Date: Fri, 12 May 2023 20:31:47 +0200 Subject: Installation openstack Multi node Message-ID: Good morning, I am looking for a complete video or written procedure that walks through setting up an OpenStack multi-node test environment on VMware Workstation. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ozzzo at yahoo.com Fri May 12 18:56:57 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Fri, 12 May 2023 18:56:57 +0000 (UTC) Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> Message-ID: <1577432210.483872.1683917817875@mail.yahoo.com> We use keepalived and exabgp to manage failover for haproxy. That works but it takes a few minutes, and during those few minutes customers experience impact. We tell them to not build/delete VMs during patching, but they still do, and then complain about the failures. We're planning to experiment with adding a "manual" haproxy failover to our patching automation, but I'm wondering if there is anything on the controller that needs to be failed over or disabled before rebooting the KVM. I looked at the "remove from cluster" and "add to cluster" procedures but that seems unnecessarily cumbersome for rebooting the KVM. On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block wrote: Hi Albert, how is your haproxy placement controlled, something like pacemaker or similar? I would always do a failover when I'm aware of interruptions (maintenance window), that should speed things up for clients. We have a pacemaker controlled HA control plane, it takes more time until pacemaker realizes that the resource is gone if I just rebooted a server without failing over. I have no benchmarks though. There's always a risk of losing a couple of requests during the failover but we didn't have complaints yet, I believe most of the components try to resend the lost messages. In one of our customer's cluster with many resources (they also use terraform) I haven't seen issues during a regular maintenance window. When they had a DNS outage a few months back it resulted in a mess, manual cleaning was necessary, but the regular failovers seem to work just fine. And I don't see rabbitmq issues either after rebooting a server, usually the haproxy (and virtual IP) failover suffice to prevent interruptions. Regards, Eugen Zitat von Satish Patel : > Are you running your stack on top of the kvm virtual machine? How many > controller nodes do you have? mostly rabbitMQ causing issues if you restart > controller nodes. > > On Thu, May 11, 2023 at 8:34 AM Albert Braden wrote: > >> We have our haproxy and controller nodes on KVM hosts. When those KVM >> hosts are restarted, customers who are building or deleting VMs see impact. >> VMs may go into error status, fail to get DNS records, fail to delete, etc.
>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri May 12 20:51:03 2023 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 12 May 2023 16:51:03 -0400 Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: <1577432210.483872.1683917817875@mail.yahoo.com> References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> Message-ID: Don't expect zero issue when you reboot the controller. It won't be user transparent. your computer nodes and other services still hang on old connections (rabbitmq/amqp) etc and that takes some time to get settled. Curious why are you running control plane service on KVM and second question why do you need them reboot frequently? I have physical nodes for the control plane and we see strange issues whenever we shouldn't use one of the controllers for maintenance. On Fri, May 12, 2023 at 2:59?PM Albert Braden wrote: > We use keepalived and exabgp to manage failover for haproxy. That works > but it takes a few minutes, and during those few minutes customers > experience impact. We tell them to not build/delete VMs during patching, > but they still do, and then complain about the failures. > > We're planning to experiment with adding a "manual" haproxy failover to > our patching automation, but I'm wondering if there is anything on the > controller that needs to be failed over or disabled before rebooting the > KVM. I looked at the "remove from cluster" and "add to cluster" procedures > but that seems unnecessarily cumbersome for rebooting the KVM. > On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block > wrote: > > > Hi Albert, > > how is your haproxy placement controlled, something like pacemaker or > similar? I would always do a failover when I'm aware of interruptions > (maintenance window), that should speed things up for clients. We have > a pacemaker controlled HA control plane, it takes more time until > pacemaker realizes that the resource is gone if I just rebooted a > server without failing over. I have no benchmarks though. There's > always a risk of losing a couple of requests during the failover but > we didn't have complaints yet, I believe most of the components try to > resend the lost messages. In one of our customer's cluster with many > resources (they also use terraform) I haven't seen issues during a > regular maintenance window. When they had a DNS outage a few months > back it resulted in a mess, manual cleaning was necessary, but the > regular failovers seem to work just fine. > And I don't see rabbitmq issues either after rebooting a server, > usually the haproxy (and virtual IP) failover suffice to prevent > interruptions. > > Regards, > Eugen > > Zitat von Satish Patel : > > > Are you running your stack on top of the kvm virtual machine? How many > > controller nodes do you have? mostly rabbitMQ causing issues if you > restart > > controller nodes. > > > > On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: > > > >> We have our haproxy and controller nodes on KVM hosts. When those KVM > >> hosts are restarted, customers who are building or deleting VMs see > impact. > >> VMs may go into error status, fail to get DNS records, fail to delete, > etc. > >> The obvious reason is because traffic that is being routed to the > haproxy > >> on the restarting KVM is lost. 
If we manually fail over haproxy before > >> restarting the KVM, will that be sufficient to stop traffic being lost, > or > >> do we also need to do something with the controller? > >> > >> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri May 12 20:52:47 2023 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 12 May 2023 16:52:47 -0400 Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> Message-ID: My two cents, If you are still running Train (which is EOL) then please upgrade to the next or latest release, you never know what bug causing the issue. On Fri, May 12, 2023 at 4:51?PM Satish Patel wrote: > Don't expect zero issue when you reboot the controller. It won't be user > transparent. your computer nodes and other services still hang on old > connections (rabbitmq/amqp) etc and that takes some time to get settled. > > Curious why are you running control plane service on KVM and second > question why do you need them reboot frequently? > > I have physical nodes for the control plane and we see strange issues > whenever we shouldn't use one of the controllers for maintenance. > > On Fri, May 12, 2023 at 2:59?PM Albert Braden wrote: > >> We use keepalived and exabgp to manage failover for haproxy. That works >> but it takes a few minutes, and during those few minutes customers >> experience impact. We tell them to not build/delete VMs during patching, >> but they still do, and then complain about the failures. >> >> We're planning to experiment with adding a "manual" haproxy failover to >> our patching automation, but I'm wondering if there is anything on the >> controller that needs to be failed over or disabled before rebooting the >> KVM. I looked at the "remove from cluster" and "add to cluster" procedures >> but that seems unnecessarily cumbersome for rebooting the KVM. >> On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block >> wrote: >> >> >> Hi Albert, >> >> how is your haproxy placement controlled, something like pacemaker or >> similar? I would always do a failover when I'm aware of interruptions >> (maintenance window), that should speed things up for clients. We have >> a pacemaker controlled HA control plane, it takes more time until >> pacemaker realizes that the resource is gone if I just rebooted a >> server without failing over. I have no benchmarks though. There's >> always a risk of losing a couple of requests during the failover but >> we didn't have complaints yet, I believe most of the components try to >> resend the lost messages. In one of our customer's cluster with many >> resources (they also use terraform) I haven't seen issues during a >> regular maintenance window. When they had a DNS outage a few months >> back it resulted in a mess, manual cleaning was necessary, but the >> regular failovers seem to work just fine. >> And I don't see rabbitmq issues either after rebooting a server, >> usually the haproxy (and virtual IP) failover suffice to prevent >> interruptions. >> >> Regards, >> Eugen >> >> Zitat von Satish Patel : >> >> > Are you running your stack on top of the kvm virtual machine? How many >> > controller nodes do you have? mostly rabbitMQ causing issues if you >> restart >> > controller nodes. 
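One way to sanity-check the RabbitMQ side during such a maintenance - a
sketch, assuming kolla's standard rabbitmq container name - is to confirm the
cluster has re-formed after each controller returns, before touching the next
one:

    docker exec rabbitmq rabbitmqctl cluster_status
    docker exec rabbitmq rabbitmqctl list_queues name messages consumers | head

Durable queues (mentioned earlier in this thread) keep the queues across a
node restart, but clients still need some time to reconnect.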
>> > >> > On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: >> > >> >> We have our haproxy and controller nodes on KVM hosts. When those KVM >> >> hosts are restarted, customers who are building or deleting VMs see >> impact. >> >> VMs may go into error status, fail to get DNS records, fail to delete, >> etc. >> >> The obvious reason is because traffic that is being routed to the >> haproxy >> >> on the restarting KVM is lost. If we manually fail over haproxy before >> >> restarting the KVM, will that be sufficient to stop traffic being >> lost, or >> >> do we also need to do something with the controller? >> >> >> >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ozzzo at yahoo.com Fri May 12 23:21:28 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Fri, 12 May 2023 23:21:28 +0000 (UTC) Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> Message-ID: <1203772548.1669183.1683933688450@mail.yahoo.com> We reboot quarterly for patching. On Friday, May 12, 2023, 04:58:45 PM EDT, Satish Patel wrote: Don't expect zero issue when you reboot the controller. It won't be user transparent. your computer nodes and other services still hang on old connections (rabbitmq/amqp) etc and that takes some time?to get settled.? Curious why are you running control plane service on KVM and second question why do you need them reboot frequently??? I have physical nodes for the control plane and we see strange issues whenever we shouldn't use one of the controllers for maintenance.? On Fri, May 12, 2023 at 2:59?PM Albert Braden wrote: We use keepalived and exabgp to manage failover for haproxy. That works but it takes a few minutes, and during those few minutes customers experience impact. We tell them to not build/delete VMs during patching, but they still do, and then complain about the failures. We're planning to experiment with adding a "manual" haproxy failover to our patching automation, but I'm wondering if there is anything on the controller that needs to be failed over or disabled before rebooting the KVM. I looked at the "remove from cluster" and "add to cluster" procedures but that seems unnecessarily cumbersome for rebooting the KVM. On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block wrote: Hi Albert, how is your haproxy placement controlled, something like pacemaker or? similar? I would always do a failover when I'm aware of interruptions? (maintenance window), that should speed things up for clients. We have? a pacemaker controlled HA control plane, it takes more time until? pacemaker realizes that the resource is gone if I just rebooted a? server without failing over. I have no benchmarks though. There's? always a risk of losing a couple of requests during the failover but? we didn't have complaints yet, I believe most of the components try to? resend the lost messages. In one of our customer's cluster with many? resources (they also use terraform) I haven't seen issues during a? regular maintenance window. When they had a DNS outage a few months? back it resulted in a mess, manual cleaning was necessary, but the? regular failovers seem to work just fine. And I don't see rabbitmq issues either after rebooting a server,? usually the haproxy (and virtual IP) failover suffice to prevent? 
interruptions. Regards, Eugen Zitat von Satish Patel : > Are you running your stack on top of the kvm virtual machine? How many > controller nodes do you have? mostly rabbitMQ causing issues if you restart > controller nodes. > > On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: > >> We have our haproxy and controller nodes on KVM hosts. When those KVM >> hosts are restarted, customers who are building or deleting VMs see impact. >> VMs may go into error status, fail to get DNS records, fail to delete, etc. >> The obvious reason is because traffic that is being routed to the haproxy >> on the restarting KVM is lost. If we manually fail over haproxy before >> restarting the KVM, will that be sufficient to stop traffic being lost, or >> do we also need to do something with the controller? >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From adivya1.singh at gmail.com Sat May 13 18:13:02 2023 From: adivya1.singh at gmail.com (Adivya Singh) Date: Sat, 13 May 2023 23:43:02 +0530 Subject: (Openstack-glance) Image service Failing Message-ID: Hi Team, I have a scenario, where my glance nfs mount point are no longer require as the nfs external.server is down, i have remove all NFS variables from open stack user config file and user variable and delete the glance containet and try to bulid it again. But still it is trying to mount that path. Is therr any user variables to ignore this Regards Adivya Singh -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Sun May 14 01:01:14 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Sun, 14 May 2023 02:01:14 +0100 Subject: [kolla-ansible][yoga] fluentd problem with elasticsearch after update In-Reply-To: <4756FA70-F6AE-4317-B444-CEEA54D29B93@gmail.com> References: <2E93A6DB-2BA6-44A4-B1A0-97AA7205B791@gmail.com> <02C03BF9-E20D-4844-93B6-E472C1AD7E50@gmail.com> <4756FA70-F6AE-4317-B444-CEEA54D29B93@gmail.com> Message-ID: Hi; I did update another update of my platforme, and still having problems with fluentd container (it is still restarting) but with different error : > *Running command: '/usr/sbin/td-agent -o > /var/log/kolla/fluentd/fluentd.log'/opt/td-agent/lib/ruby/2.7.0/rubygems/specification.rb:2247:in > `raise_if_conflicts': Unable to activate fluent-plugin-elasticsearch-5.3.0, > because faraday-1.10.3 conflicts with faraday (>= 2.0.0), > faraday-excon-1.1.0 conflicts with faraday-excon (>= 2.0.0) > (Gem::ConflictError)* > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/specification.rb:1369:in `activate' > from /opt/td-agent/lib/ruby/2.7.0/rubygems.rb:217:in `rescue in > try_activate' > from /opt/td-agent/lib/ruby/2.7.0/rubygems.rb:210:in `try_activate' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:151:in > `rescue in require' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:147:in > `require' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-elasticsearch-5.3.0/lib/fluent/plugin/out_elasticsearch.rb:20:in > `' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:103:in > `block in search' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:100:in > `each' > from > 
/opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:100:in > `search' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:44:in > `lookup' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:169:in > `new_impl' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:114:in > `new_output' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:108:in > `block in configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:99:in > `each' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:99:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/out_copy.rb:39:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:187:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:132:in > `add_match' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:74:in > `block in configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:64:in > `each' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:64:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/root_agent.rb:149:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/engine.rb:105:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/engine.rb:80:in > `run_configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/supervisor.rb:571:in > `run_supervisor' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/command/fluentd.rb:352:in > `' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/bin/fluentd:15:in > `' > from /opt/td-agent/bin/fluentd:23:in `load' > from /opt/td-agent/bin/fluentd:23:in `' > from /usr/sbin/td-agent:15:in `load' > from /usr/sbin/td-agent:15:in `
' > */opt/td-agent/lib/ruby/2.7.0/rubygems/specification.rb:2247:in > `raise_if_conflicts': Unable to activate fluent-plugin-elasticsearch-5.3.0, > because faraday-1.10.3 conflicts with faraday (>= 2.0.0), > faraday-excon-1.1.0 conflicts with faraday-excon (>= 2.0.0) > (Gem::ConflictError)* > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/specification.rb:1369:in `activate' > from /opt/td-agent/lib/ruby/2.7.0/rubygems.rb:211:in `try_activate' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:151:in > `rescue in require' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:147:in > `require' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-elasticsearch-5.3.0/lib/fluent/plugin/out_elasticsearch.rb:20:in > `' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:103:in > `block in search' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:100:in > `each' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:100:in > `search' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:44:in > `lookup' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:169:in > `new_impl' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:114:in > `new_output' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:108:in > `block in configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:99:in > `each' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:99:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/out_copy.rb:39:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:187:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:132:in > `add_match' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:74:in > `block in configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:64:in > `each' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:64:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/root_agent.rb:149:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/engine.rb:105:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/engine.rb:80:in > `run_configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/supervisor.rb:571:in > `run_supervisor' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/command/fluentd.rb:352:in > `' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/bin/fluentd:15:in > `' > from /opt/td-agent/bin/fluentd:23:in `load' > from /opt/td-agent/bin/fluentd:23:in `' > from /usr/sbin/td-agent:15:in `load' > from /usr/sbin/td-agent:15:in `
' > > > */opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require': cannot load such file -- fluent/log-ext (LoadError) from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-elasticsearch-5.3.0/lib/fluent/plugin/out_elasticsearch.rb:20:in > `'* > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:103:in > `block in search' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:100:in > `each' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:100:in > `search' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/registry.rb:44:in > `lookup' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:169:in > `new_impl' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:114:in > `new_output' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:108:in > `block in configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:99:in > `each' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/multi_output.rb:99:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin/out_copy.rb:39:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/plugin.rb:187:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:132:in > `add_match' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:74:in > `block in configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:64:in > `each' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/agent.rb:64:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/root_agent.rb:149:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/engine.rb:105:in > `configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/engine.rb:80:in > `run_configure' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/supervisor.rb:571:in > `run_supervisor' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/lib/fluent/command/fluentd.rb:352:in > `' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/2.7.0/rubygems/core_ext/kernel_require.rb:83:in > `require' > from > /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.16.1/bin/fluentd:15:in > `' > from /opt/td-agent/bin/fluentd:23:in `load' > from /opt/td-agent/bin/fluentd:23:in `' > from /usr/sbin/td-agent:15:in `load' > from /usr/sbin/td-agent:15:in `
' > Regards. Virus-free.www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> Le lun. 8 mai 2023 ? 14:59, Micha? Nasiadka a ?crit : > Hello, > > Then probably the only solution is to rebuild the container image with > this patch: https://review.opendev.org/c/openstack/kolla/+/882289 > It will take a couple days before it?s merged upstream. > > Best regards, > Michal > > On 8 May 2023, at 14:06, wodel youchi wrote: > > Hi, > > I cannot access the fluentd container, it crashes almost instantly after > each restart. > Is there a workaround to make the container start and wait until the > commands are executed? > > Regards. > > Le jeu. 4 mai 2023 ? 14:12, Micha? Nasiadka a > ?crit : > >> Hello, >> >> Kolla-Ansible is not supporting both opensearch and elasticsearch running >> at the same time - so if you?re using cloudkitty - it?s better to stick for >> Elasticsearch for now (CK does not support OpenSearch yet). >> >> I started working on the bug - will let you know in the bug report when a >> fix will be merged and images published. >> In the meantime you can try to uninstall the too-new elasticsearch gems >> using td-agent-gem uninstall in your running container image. >> >> Best regards, >> Michal >> >> On 4 May 2023, at 14:33, wodel youchi wrote: >> >> Hi, >> >> I'll try to open a bug for this. >> >> I am using elasticsearch also with Cloudkitty : >> cloudkitty_storage_backend: "elasticsearch" instead of influxdb to get some >> HA. >> Will I still get the fluentd problem even if I migrate to Opensearch >> leaving Cloudkitty with elasticsearch??? >> >> Regards. >> >> >> >> Virus-free.www.avast.com >> >> >> Le jeu. 4 mai 2023 ? 07:42, Micha? Nasiadka a >> ?crit : >> >>> Hello, >>> >>> That probably is a Kolla bug - can you please raise a bug in >>> launchpad.net? >>> The other alternative is to migrate to OpenSearch (we?ve back ported >>> this functionality recently) - >>> https://docs.openstack.org/kolla-ansible/yoga/reference/logging-and-monitoring/central-logging-guide-opensearch.html#migration >>> >>> Best regards, >>> Michal >>> >>> On 3 May 2023, at 17:13, wodel youchi wrote: >>> >>> Hi, >>> >>> I have finished the update of my openstack platform with newer >>> containers. >>> >>> While verifying I noticed that fluentd container keeps restarting. >>> >>> In the log file I am having this : >>> >>>> 2023-05-03 16:07:59 +0100 [error]: #0 config error >>>> file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError >>>> error="Using Elasticsearch client 8.7.1 is not compatible for your >>>> Elasticsearch server. Please check your using elasticsearch gem version and >>>> Elasticsearch server." >>>> 2023-05-03 16:07:59 +0100 [error]: Worker 0 finished unexpectedly with >>>> status 2 >>>> 2023-05-03 16:07:59 +0100 [info]: Received graceful stop >>>> >>> >>> Those are the images I am using : >>> (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i elas >>> >>> 192.168.1.16:4000/openstack.kolla/centos-source-prometheus-elasticsearch-exporter >>> yoga030523 b48f63ed0072 12 hours ago 539MB >>> 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch >>> yoga030523 3558611b0cf4 12 hours ago 1.2GB >>> 192.168.1.16:4000/openstack.kolla/centos-source-elasticsearch-curator >>> yoga030523 83a6b48339ea 12 hours ago 637MB >>> >>> (yogavenv) [deployer at rscdeployer ~]$ sudo docker images | grep -i fluen >>> 192.168.1.16:4000/openstack.kolla/centos-source-fluentd >>> yoga030523 bf6596e139e2 12 hours ago 847MB >>> >>> Any ideas? >>> >>> Regards. 
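Until a rebuilt fluentd image is published, one possible way to inspect the
gem conflict - a sketch only, with the image name/tag and gem names taken from
the messages above, so adjust to your registry - is to start the image with a
shell instead of td-agent:

    docker run --rm -it 192.168.1.16:4000/openstack.kolla/centos-source-fluentd:yoga030523 bash
    td-agent-gem list | grep -Ei 'elasticsearch|faraday'
    # uninstall or downgrade the offending gems as needed, e.g.
    td-agent-gem uninstall elasticsearch
    td-agent-gem uninstall faraday faraday-excon

Changes made in a --rm container are throwaway; for a persistent fix, commit a
patched image (docker commit) or rebuild the image with the kolla patch
referenced above.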
>>> >>> >>> Virus-free.www.avast.com >>> >>> >>> >>> >> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From amonster369 at gmail.com Sun May 14 11:45:47 2023
From: amonster369 at gmail.com (A Monster)
Date: Sun, 14 May 2023 12:45:47 +0100
Subject: Destroy an Openstack deployment and Keep Cinder Volumes
Message-ID: 

I have a deployment of openstack using kolla-ansible and LVM as the cinder
backend. I want to run a kolla-ansible destroy to remove all openstack
components and redeploy from scratch, but I want to keep the LVM volumes that
back cinder so they can be reused by the fresh openstack deployment. I am
wondering whether kolla-ansible destroys the LVM volumes used by cinder or
not, and whether you can suggest a better alternative for what I'm trying to
do.

Thank you.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jamesleong123098 at gmail.com Sun May 14 22:32:32 2023
From: jamesleong123098 at gmail.com (James Leong)
Date: Sun, 14 May 2023 17:32:32 -0500
Subject: [keystone][horizon][kolla-ansible] user access specific domain
Message-ID: 

Hi all,

I am playing around with domains in the yoga version of OpenStack, using
kolla-ansible as the deployment tool. I have set up Globus as my
authentication tool. However, I am curious if it is possible to log in to an
existing OpenStack user account via federated login (based on Gmail).

In my case, first, I created a user named "James" in one of the domains,
called federated_login. When I attempt to log in, a new user is created in
the default domain instead of the federated_login domain. Below is a sample
of my globus.json.

[{"local": [
    {
      "user": {
        "name": "{0}",
        "email": "{2}"
      },
      "group": {
        "name": "federated_user",
        "domain": {"name": "{1}"}
      }
    }
  ],
  "remote": [
    { "type": "OIDC-name"},
    { "type": "OIDC-organization"},
    { "type": "OIDC-email"}
  ]
}]

Apart from the above question, is there another easier way of restricting
users from logging in via federation? For example, allow only existing users
on OpenStack with a specific email to access the OpenStack dashboard via
federated login.

Best Regards,
James

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nguyenhuukhoinw at gmail.com Mon May 15 03:03:13 2023
From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=)
Date: Mon, 15 May 2023 10:03:13 +0700
Subject: [keystone][horizon][kolla-ansible] user access specific domain
In-Reply-To: 
References: 
Message-ID: 

Hello.
This is my example.

{
  "local": [
    {
      "user": {
        "name": "{0}",
        "email": "{1}"
      },
      "group": {
        "name": "your keystone group",
        "domain": { "name": "Default" }
      }
    }
  ],
  "remote": [
    {
      "type": "OIDC-preferred_username",
      "any_one_of": [
        "xxx at gmail.com",
        "xxx1 at gmail.com"
      ]
    },
    { "type": "OIDC-preferred_username" },
    { "type": "OIDC-email" }
  ]
}

Nguyen Huu Khoi

On Mon, May 15, 2023 at 5:41 AM James Leong wrote:

> Hi all,
>
> I am playing around with the domain in the yoga version of OpenStack using
> kolla-ansible as the deployment tool. I have set up Globus as my
> authentication tool. However, I am curious if it is possible to log in to
> an existing OpenStack user account via federated login (based on Gmail)
>
> In my case, first, I created a user named "James" in one of the domains
> called federated_login. When I attempt to log in, a new user is created in
> the default domain instead of the federated_login domain. Below is a sample
> of my globus.json.
> > [{"local": [ > { > "user": { > "name":"{0}, > "email":"{2} > }, > "group":{ > "name": "federated_user", > "domain: {"name":"{1} > } > } > ], > "remote": [ > { "type":"OIDC-name"}, > { "type":"OIDC-organization"},{"type":"OIDC-email"} > ] > }] > > Apart from the above question, is there another easier way of restricting > users from login in via federated? For example, allow only existing users > on OpenStack with a specific email to access the OpenStack dashboard via > federated login. > > Best Regards, > James > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adivya1.singh at gmail.com Mon May 15 03:03:54 2023 From: adivya1.singh at gmail.com (Adivya Singh) Date: Mon, 15 May 2023 08:33:54 +0530 Subject: (Openstack-glance) Image service Failing In-Reply-To: References: Message-ID: Hi Team, Any input on this Regards Adivya Singh On Sat, May 13, 2023 at 11:43?PM Adivya Singh wrote: > Hi Team, > > I have a scenario, where my glance nfs mount point are no longer require > as the nfs external.server is down, i have remove all NFS variables from > open stack user config file and user variable and delete the glance > containet and try to bulid it again. But still it is trying to mount that > path. > > Is therr any user variables to ignore this > > Regards > Adivya Singh > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raghavendra-uddhav.tilay at hpe.com Mon May 15 05:14:51 2023 From: raghavendra-uddhav.tilay at hpe.com (U T, Raghavendra) Date: Mon, 15 May 2023 05:14:51 +0000 Subject: Installation openstack Multi node In-Reply-To: References: Message-ID: Hi, Kindly refer below documentation: https://docs.openstack.org/devstack/latest/guides/multinode-lab.html Although its not a video, it can be used as starting point. Regards, Raghavendra Tilay. From: BEDDA Fadhel Sent: Saturday, May 13, 2023 12:02 AM To: openstack-discuss at lists.openstack.org Subject: Installation openstack Multi node Good morning, I am looking for a complete video or digital procedure that allows me to set up an openstack multi node test environment on vamware workstation. THANKS -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon May 15 07:50:26 2023 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 15 May 2023 09:50:26 +0200 Subject: [largescale-sig] Next meeting: May 17, 15utc Message-ID: <478ed560-ed34-06db-0da9-242126cda3ee@openstack.org> Hi everyone, The Large Scale SIG will be meeting this Wednesday in #openstack-operators on OFTC IRC, at 15UTC, our EU+US-friendly time. Kristin will be chairing. You can doublecheck how that UTC time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20230517T15 Feel free to add topics to the agenda: https://etherpad.opendev.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From nurmatov.mamatisa at huawei.com Mon May 15 09:53:47 2023 From: nurmatov.mamatisa at huawei.com (Nurmatov Mamatisa) Date: Mon, 15 May 2023 09:53:47 +0000 Subject: [neutron] Bug Deputy Report May 8 - May 14 Message-ID: <7534dea807f6428cb53e69a7332a9a41@huawei.com> Hi, Below is the week summary of bug deputy report for last week. 2 bugs are not assigned. 
Details: Critical -------- - https://bugs.launchpad.net/neutron/+bug/2019097 - CI tests neutron-dynamic-routing broken - In progress: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/882815 - Assigned to Felix Huettner High ---- - https://bugs.launchpad.net/neutron/+bug/2018727 - [SRBAC] API policies for get_policy_*_rule are wrong - Fix released: https://review.opendev.org/c/openstack/neutron/+/882688 - Assigned to Slawek Kaplonski - https://bugs.launchpad.net/neutron/+bug/2019449 - Neutron server log file full of tracebacks in neutron-tempest-plugin-ovn job - Confirmed - Miguel Lavalle - https://bugs.launchpad.net/neutron/+bug/2019314 - On shutdown, linuxbridge agent queue on rabbitmq is not cleared - Confirmed - Unassigned Medium ------ - https://bugs.launchpad.net/neutron/+bug/2018967 - [fwaas] test_update_firewall_group fails randomly - Confirmed - Unassigned - https://bugs.launchpad.net/neutron/+bug/2018737 - neutron-dynamic-routing announces routes for disabled routers - In progress: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/882560 - Assigned to Felix Huettner - https://bugs.launchpad.net/neutron/+bug/2018989 - [SRBAC] FIP Port Forwarding policies should be available for PARENT_OWNER with proper role - Fix released: https://review.opendev.org/c/openstack/neutron/+/882691 - Assigned to Slawek Kaplonski - https://bugs.launchpad.net/neutron/+bug/2019119 - [sqlalchemy-20] Add the corresponding context to the upgrade checks methods - In progress: https://review.opendev.org/c/openstack/neutron/+/882865 - Assigned to Rodolfo Alonso - https://bugs.launchpad.net/neutron/+bug/2019132 - [OVN] ``OVNClient._add_router_ext_gw`` should use the context passed by the API call - In progress: https://review.opendev.org/c/openstack/neutron/+/882857 - Assigned to Rodolfo Alonso - https://bugs.launchpad.net/neutron/+bug/2019186 - [PostgreSQL] Issue with "get_scoped_floating_ips" query - In progress: https://review.opendev.org/c/openstack/neutron/+/882935 - Assigned to Rodolfo Alonso Low --- - https://bugs.launchpad.net/neutron/+bug/2019217 - [OVN] L3 port scheduler, defaults to use all chassis; stop this behavior - New - Assigned to Rodolfo Alonso Undecided --------- - https://bugs.launchpad.net/neutron/+bug/2019012 - When num of port mappings is substantial, the response time of List API so slow Best regards, Mamatisa Nurmatov Advanced Software Technology Lab / Cloud Technologies Research -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Mon May 15 10:26:11 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Mon, 15 May 2023 11:26:11 +0100 Subject: [Yoga][Magnum] change boot disk storage for COE VMs Message-ID: Hi, When creating Magnum clusters, two disks are created for each VM sda and sdb, the sda is the boot disk and the sdb is for docker images. By default the sda is created (at least in my deployment) in nova vms pool, as ephemeral disks, the second disk sdb is created in cinder volume. Is it possible to move the sda from nova vms to cinder volume? I tried with default_boot_volume_type, but it didn't work. default_boot_volume_type = MyCinderPool Regards. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Nahim.Souza at netapp.com Mon May 15 12:04:07 2023 From: Nahim.Souza at netapp.com (Souza, Nahim) Date: Mon, 15 May 2023 12:04:07 +0000 Subject: [cinder][dev] Add support in driver - Active/Active High Availability In-Reply-To: References: Message-ID: Hi, Raghavendra, Just sharing my experience, I started working on A/A support for NetApp NFS driver, and I followed the same steps you summarized. Besides that, I think the effort is to understand/test if any of the driver features might break in the A/A environment. If anyone knows about anything else we should test, I would be happy to know too. Regards, Nahim Souza. From: U T, Raghavendra Sent: Monday, May 8, 2023 09:18 To: openstack-discuss at lists.openstack.org Subject: [cinder][dev] Add support in driver - Active/Active High Availability NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. Hi, We wish to add Active/Active High Availability to: 1] HPE 3par driver - cinder/cinder/volume/drivers/hpe/hpe_3par_common.py 2] Nimble driver - cinder/cinder/volume/drivers/hpe/nimble.py Checked documentation at https://docs.openstack.org/cinder/latest/contributor/high_availability.html https://docs.openstack.org/cinder/latest/contributor/high_availability.html#cinder-volume https://docs.openstack.org/cinder/latest/contributor/high_availability.html#enabling-active-active-on-drivers Summary of steps: 1] In driver code, set SUPPORTS_ACTIVE_ACTIVE = True 2] Split the method failover_host() into two methods: failover() and failover_completed() 3] In cinder.conf, specify cluster name in [DEFAULT] section cluster = 4] Configure atleast two nodes in HA and perform testing Is this sufficient or anything else required ? Note: For Nimble driver, replication feature is not yet added. So can the above step 2 be skipped? Appreciate any suggestions / pointers. Regards, Raghavendra Tilay. -------------- next part -------------- An HTML attachment was scrubbed... URL: From haiwu.us at gmail.com Mon May 15 18:03:52 2023 From: haiwu.us at gmail.com (hai wu) Date: Mon, 15 May 2023 13:03:52 -0500 Subject: [nova] hw:numa_nodes question In-Reply-To: <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> Message-ID: > > Another question: Let's say a VM runs on one host's numa node #0. If > > we live-migrate this VM to another host, and that host's numa node #1 > > has more free memory, is it possible for this VM to land on the other > > host's numa node #1? > yes it is > on newer relsese we will prefer to balance the load across numa nodes > on older release nova woudl fill the first numa node then move to the second. About the above point, it seems even with the numa patch back ported and in place, the VM would be stuck in its existing numa node. Per my tests, after its live migration, the VM will end up on the other host's numa node #0, even if numa node#1 has more free memory. This is not the case for newly built VMs. Is this a design issue? On Thu, May 11, 2023 at 2:42?PM Sean Mooney wrote: > > On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > > Ok. Then I don't understand why 'hw:mem_page_size' is not made the > > default in case if hw:numa_node is set. 
There is a huge disadvantage > > if not having this one set (all existing VMs with hw:numa_node set > > will have to be taken down for resizing in order to get this one > > right). > there is an upgrade impact to changign the default. > its not impossibel to do but its complicated if we dont want to break exisitng deployments > we woudl need to recored a value for eveny current instance that was spawned before > this default was changed that had hw:numa_node without hw:mem_page_size so they kept the old behavior > and make sure that is cleared when the vm is next moved so it can have the new default > after a live migratoin. > > > > I could not find this point mentioned in any existing Openstack > > documentation: that we would have to set hw:mem_page_size explicitly > > if hw:numa_node is set. Also this slide at > > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind of > > indicates that hw:mem_page_size `Default to small pages`. > it defaults to unset. > that results in small pages by default but its not the same as hw:mem_page_size=small > or hw:mem_page_size=any. > > > > > > Another question: Let's say a VM runs on one host's numa node #0. If > > we live-migrate this VM to another host, and that host's numa node #1 > > has more free memory, is it possible for this VM to land on the other > > host's numa node #1? > yes it is > on newer relsese we will prefer to balance the load across numa nodes > on older release nova woudl fill the first numa node then move to the second. > > > > On Thu, May 11, 2023 at 4:25?AM Sean Mooney wrote: > > > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > > Is it possible to update something in the Openstack database for the > > > > relevant VMs in order to do the same, and then hard reboot the VM so > > > > that the VM would have this attribute? > > > not really adding the missing hw:mem_page_size requirement to the flavor chagnes the > > > requirements for node placement and numa affinity > > > so you really can only change this via resizing the vm to a new flavor > > > > > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney wrote: > > > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > > So there's no default value assumed/set for hw:mem_page_size for each > > > > > > flavor? > > > > > > > > > > > correct this is a known edgecase in the currnt design > > > > > hw:mem_page_size=any would be a resonable default but > > > > > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > > > > > > > > > in pratch if you try to do that you will almost always end up with vms > > > > > being killed due to OOM events. > > > > > > > > > > so from a api point of view it woudl be a change of behvior for use to default > > > > > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > > > > > in the long run. > > > > > > > > > > i could bring this up with the core team again but in the past we > > > > > decided to be conservitive and just warn peopel to alwasy set > > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > > > > > when using hw:numa_nodes=1. > > > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > > > > > missing some known test cases? It would be very helpful to have a test > > > > > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > > > > > set, but without 'hw:mem_page_size' being set. 
> > > > > > > > > > > > How to ensure this one for existing vms already running with > > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > > > > > you unfortuletly need to resize the instance. > > > > > tehre are some image porpeties you can set on an instance via nova-manage > > > > > but you cannot use nova-mange to update the enbedd flavor and set this. > > > > > > > > > > so you need to define a new flavour and resize. > > > > > > > > > > this is the main reason we have not changed the default as it may requrie you to > > > > > move instnace around if there placement is now invalid now that per numa node memory > > > > > allocatons are correctly being accounted for. > > > > > > > > > > if it was simple to change the default without any enduser or operator impact we would. > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > > > > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > > > > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > > > > > but the avaible memory on the host numa node will not be taken into account > > > > > > > > > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > > > > > in the kernel. > > > > > > > > > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > > > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > > > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > > > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > > > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > > > > > disables memory oversubsctipion. > > > > > > > > > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > > > > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > > > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > > > > > if you are using numa affinity. > > > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > > and > > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > > > are also good to read > > > > > > > > > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > > > > > hw_mem_page_size set in the image then that vm is not configure properly. 
> > > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > > Another good resource =) > > > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > > > > > ~~~ > > > > > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > > > > > where overriden by hw:numa_mem.NN. > > > > > > > > > ~~~ > > > > > > > > > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > > > > > > need to respond during a time that is not convenient for you.* > > > > > > > > > ---------------------------------------------------------- > > > > > > > > > Great people talk about ideas, > > > > > > > > > ordinary people talk about things, > > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From smooney at redhat.com Mon May 15 19:31:29 2023 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 May 2023 20:31:29 +0100 Subject: [nova] hw:numa_nodes question In-Reply-To: References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> Message-ID: <8acd0ffb7bb09de4b48c5c69f849659d805134c5.camel@redhat.com> On Mon, 2023-05-15 at 13:03 -0500, hai wu wrote: > > > Another question: Let's say a VM runs on one host's numa node #0. If > > > we live-migrate this VM to another host, and that host's numa node #1 > > > has more free memory, is it possible for this VM to land on the other > > > host's numa node #1? > > yes it is > > on newer relsese we will prefer to balance the load across numa nodes > > on older release nova woudl fill the first numa node then move to the second. > > About the above point, it seems even with the numa patch back ported > and in place, the VM would be stuck in its existing numa node. Per my > tests, after its live migration, the VM will end up on the other > host's numa node #0, even if numa node#1 has more free memory. This is > not the case for newly built VMs. > > Is this a design issue? 
if you are using a release that supprot numa live migration (train +) https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/numa-aware-live-migration.html then the numa affintiy is recalulated on live migration however numa node 0 is prefered. as of xena [compute]/packing_host_numa_cells_allocation_strategy has been added to contol how vms are balanced acros numa nodes in zed the default was changed form packing vms per host numa node to balancing vms between host numa nodes https://docs.openstack.org/releasenotes/nova/zed.html#relnotes-26-0-0-stable-zed-upgrade-notes even without the enhanchemt in xena and zed it was possible for the scheduler to select a numa node if you dont enable memory or cpu aware numa placment with hw:mem_page_size or hw:cpu_policy=dedicated then it will always select numa 0 if you do not request cpu pinnign or a specifc page size the sechudler cant properly select the host nuam node and will alwasy use numa node 0. That is one of the reason i said that if hw:numa_nodes is set then hw:mem_page_size shoudl be set. from a nova point of view using numa_nodes without mem_page_size is logically incorrect as you asked for a vm to be affinites to n host numa nodes but did not enable numa aware memory scheduling. we unfortnally cant provent this in the nova api without breaking upgrades for everyone who has made this mistake. we woudl need to force them to resize all affected instances which means guest downtime. the other issue si multiple numa nodes are supproted by Hyper-V but they do not supprot mem_page_size we have tried to document this in the past but never agreed on how becasuse it subtel and requries alot of context. the tl;dr is if the instace has a numa toplogy it should have mem_page_size set in the image or flavor but we never foudn a good place to capture that. > > On Thu, May 11, 2023 at 2:42?PM Sean Mooney wrote: > > > > On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > > > Ok. Then I don't understand why 'hw:mem_page_size' is not made the > > > default in case if hw:numa_node is set. There is a huge disadvantage > > > if not having this one set (all existing VMs with hw:numa_node set > > > will have to be taken down for resizing in order to get this one > > > right). > > there is an upgrade impact to changign the default. > > its not impossibel to do but its complicated if we dont want to break exisitng deployments > > we woudl need to recored a value for eveny current instance that was spawned before > > this default was changed that had hw:numa_node without hw:mem_page_size so they kept the old behavior > > and make sure that is cleared when the vm is next moved so it can have the new default > > after a live migratoin. > > > > > > I could not find this point mentioned in any existing Openstack > > > documentation: that we would have to set hw:mem_page_size explicitly > > > if hw:numa_node is set. Also this slide at > > > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind of > > > indicates that hw:mem_page_size `Default to small pages`. > > it defaults to unset. > > that results in small pages by default but its not the same as hw:mem_page_size=small > > or hw:mem_page_size=any. > > > > > > > > > > Another question: Let's say a VM runs on one host's numa node #0. If > > > we live-migrate this VM to another host, and that host's numa node #1 > > > has more free memory, is it possible for this VM to land on the other > > > host's numa node #1? 
> > yes it is > > on newer relsese we will prefer to balance the load across numa nodes > > on older release nova woudl fill the first numa node then move to the second. > > > > > > On Thu, May 11, 2023 at 4:25?AM Sean Mooney wrote: > > > > > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > > > Is it possible to update something in the Openstack database for the > > > > > relevant VMs in order to do the same, and then hard reboot the VM so > > > > > that the VM would have this attribute? > > > > not really adding the missing hw:mem_page_size requirement to the flavor chagnes the > > > > requirements for node placement and numa affinity > > > > so you really can only change this via resizing the vm to a new flavor > > > > > > > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney wrote: > > > > > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > > > So there's no default value assumed/set for hw:mem_page_size for each > > > > > > > flavor? > > > > > > > > > > > > > correct this is a known edgecase in the currnt design > > > > > > hw:mem_page_size=any would be a resonable default but > > > > > > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > > > > > > > > > > > in pratch if you try to do that you will almost always end up with vms > > > > > > being killed due to OOM events. > > > > > > > > > > > > so from a api point of view it woudl be a change of behvior for use to default > > > > > > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > > > > > > in the long run. > > > > > > > > > > > > i could bring this up with the core team again but in the past we > > > > > > decided to be conservitive and just warn peopel to alwasy set > > > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > > > > > ?Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > > > > > > when using hw:numa_nodes=1. > > > > > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > > > > > > missing some known test cases? It would be very helpful to have a test > > > > > > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > > > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > > > > > > > How to ensure this one for existing vms already running with > > > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > > > > > > you unfortuletly need to resize the instance. > > > > > > tehre are some image porpeties you can set on an instance via nova-manage > > > > > > but you cannot use nova-mange to update the enbedd flavor and set this. > > > > > > > > > > > > so you need to define a new flavour and resize. > > > > > > > > > > > > this is the main reason we have not changed the default as it may requrie you to > > > > > > move instnace around if there placement is now invalid now that per numa node memory > > > > > > allocatons are correctly being accounted for. > > > > > > > > > > > > if it was simple to change the default without any enduser or operator impact we would. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > > > > > > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > > > > > > then hw:mem_page_size shoudl also be defiend on the falvor. 
> > > > > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > > > > > > but the avaible memory on the host numa node will not be taken into account > > > > > > > > > > > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > > > > > > in the kernel. > > > > > > > > > > > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > > > > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > > > > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > > > > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > > > > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > > > > > > disables memory oversubsctipion. > > > > > > > > > > > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > > > > > > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > > > > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > > > > > > if you are using numa affinity. > > > > > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > > > and > > > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > > > > > are also good to read > > > > > > > > > > > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > > > > > > hw_mem_page_size set in the image then that vm is not configure properly. > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > > > Another good resource =) > > > > > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > > > > > > > ~~~ > > > > > > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > > > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > > > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > > > > > > where overriden by hw:numa_mem.NN. 
> > > > > > > > > > ~~~ > > > > > > > > > > > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > > > > > > > need to respond during a time that is not convenient for you.* > > > > > > > > > > ---------------------------------------------------------- > > > > > > > > > > Great people talk about ideas, > > > > > > > > > > ordinary people talk about things, > > > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From haiwu.us at gmail.com Mon May 15 19:46:29 2023 From: haiwu.us at gmail.com (hai wu) Date: Mon, 15 May 2023 14:46:29 -0500 Subject: [nova] hw:numa_nodes question In-Reply-To: <8acd0ffb7bb09de4b48c5c69f849659d805134c5.camel@redhat.com> References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> <8acd0ffb7bb09de4b48c5c69f849659d805134c5.camel@redhat.com> Message-ID: This patch was backported: https://review.opendev.org/c/openstack/nova/+/805649. Once this is in place, new VMs always get assigned correctly to the numa node with more free memory. But when existing VMs (created with vm flavor with hw:numa_node=1 set) already running on numa node #0 got live migrated, it would always be stuck on numa node #0 after live migration. So it seems we would also need to set hw:mem_page_size=small on the vm flavor, so that new VMs created from that flavor would be able to land on different numa node other than node#0 after its live migration? On Mon, May 15, 2023 at 2:33?PM Sean Mooney wrote: > > On Mon, 2023-05-15 at 13:03 -0500, hai wu wrote: > > > > Another question: Let's say a VM runs on one host's numa node #0. If > > > > we live-migrate this VM to another host, and that host's numa node #1 > > > > has more free memory, is it possible for this VM to land on the other > > > > host's numa node #1? > > > yes it is > > > on newer relsese we will prefer to balance the load across numa nodes > > > on older release nova woudl fill the first numa node then move to the second. > > > > About the above point, it seems even with the numa patch back ported > > and in place, the VM would be stuck in its existing numa node. Per my > > tests, after its live migration, the VM will end up on the other > > host's numa node #0, even if numa node#1 has more free memory. This is > > not the case for newly built VMs. > > > > Is this a design issue? 
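A rough sketch of the resize path that keeps coming up in this thread, for an existing guest built with only hw:numa_nodes=1 (the instance name is hypothetical, and numa1.small is the made-up flavor from the earlier example):

    # resizing implies guest downtime, as noted in the thread
    openstack server resize --flavor numa1.small my-instance
    # once the guest looks healthy on the new flavor:
    openstack server resize confirm my-instance
    # (older python-openstackclient releases use: openstack server resize --confirm my-instance)

Only after the instance is on a flavor that actually claims per-NUMA memory (or pinned CPUs) does the scheduler have per-NUMA usage data with which to balance it across host NUMA cells.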
> if you are using a release that supprot numa live migration (train +) > https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/numa-aware-live-migration.html > then the numa affintiy is recalulated on live migration however numa node 0 is prefered. > > as of xena [compute]/packing_host_numa_cells_allocation_strategy has been added to contol how vms are balanced acros numa nodes > in zed the default was changed form packing vms per host numa node to balancing vms between host numa nodes > https://docs.openstack.org/releasenotes/nova/zed.html#relnotes-26-0-0-stable-zed-upgrade-notes > > even without the enhanchemt in xena and zed it was possible for the scheduler to select a numa node > > if you dont enable memory or cpu aware numa placment with > hw:mem_page_size or hw:cpu_policy=dedicated then it will always select numa 0 > > if you do not request cpu pinnign or a specifc page size the sechudler cant properly select the host nuam node > and will alwasy use numa node 0. That is one of the reason i said that if hw:numa_nodes is set then hw:mem_page_size shoudl be set. > > from a nova point of view using numa_nodes without mem_page_size is logically incorrect as you asked for > a vm to be affinites to n host numa nodes but did not enable numa aware memory scheduling. > > we unfortnally cant provent this in the nova api without breaking upgrades for everyone who has made this mistake. > we woudl need to force them to resize all affected instances which means guest downtime. > the other issue si multiple numa nodes are supproted by Hyper-V but they do not supprot mem_page_size > > we have tried to document this in the past but never agreed on how becasuse it subtel and requries alot of context. > the tl;dr is if the instace has a numa toplogy it should have mem_page_size set in the image or flavor but > we never foudn a good place to capture that. > > > > > On Thu, May 11, 2023 at 2:42?PM Sean Mooney wrote: > > > > > > On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > > > > Ok. Then I don't understand why 'hw:mem_page_size' is not made the > > > > default in case if hw:numa_node is set. There is a huge disadvantage > > > > if not having this one set (all existing VMs with hw:numa_node set > > > > will have to be taken down for resizing in order to get this one > > > > right). > > > there is an upgrade impact to changign the default. > > > its not impossibel to do but its complicated if we dont want to break exisitng deployments > > > we woudl need to recored a value for eveny current instance that was spawned before > > > this default was changed that had hw:numa_node without hw:mem_page_size so they kept the old behavior > > > and make sure that is cleared when the vm is next moved so it can have the new default > > > after a live migratoin. > > > > > > > > I could not find this point mentioned in any existing Openstack > > > > documentation: that we would have to set hw:mem_page_size explicitly > > > > if hw:numa_node is set. Also this slide at > > > > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind of > > > > indicates that hw:mem_page_size `Default to small pages`. > > > it defaults to unset. > > > that results in small pages by default but its not the same as hw:mem_page_size=small > > > or hw:mem_page_size=any. > > > > > > > > > > > > > > Another question: Let's say a VM runs on one host's numa node #0. 
If > > > > we live-migrate this VM to another host, and that host's numa node #1 > > > > has more free memory, is it possible for this VM to land on the other > > > > host's numa node #1? > > > yes it is > > > on newer relsese we will prefer to balance the load across numa nodes > > > on older release nova woudl fill the first numa node then move to the second. > > > > > > > > On Thu, May 11, 2023 at 4:25?AM Sean Mooney wrote: > > > > > > > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > > > > Is it possible to update something in the Openstack database for the > > > > > > relevant VMs in order to do the same, and then hard reboot the VM so > > > > > > that the VM would have this attribute? > > > > > not really adding the missing hw:mem_page_size requirement to the flavor chagnes the > > > > > requirements for node placement and numa affinity > > > > > so you really can only change this via resizing the vm to a new flavor > > > > > > > > > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney wrote: > > > > > > > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > > > > So there's no default value assumed/set for hw:mem_page_size for each > > > > > > > > flavor? > > > > > > > > > > > > > > > correct this is a known edgecase in the currnt design > > > > > > > hw:mem_page_size=any would be a resonable default but > > > > > > > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > > > > > > > > > > > > > in pratch if you try to do that you will almost always end up with vms > > > > > > > being killed due to OOM events. > > > > > > > > > > > > > > so from a api point of view it woudl be a change of behvior for use to default > > > > > > > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > > > > > > > in the long run. > > > > > > > > > > > > > > i could bring this up with the core team again but in the past we > > > > > > > decided to be conservitive and just warn peopel to alwasy set > > > > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > > > > > > > when using hw:numa_nodes=1. > > > > > > > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > > > > > > > missing some known test cases? It would be very helpful to have a test > > > > > > > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > > > > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > > > > > > > > > How to ensure this one for existing vms already running with > > > > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > > > > > > > you unfortuletly need to resize the instance. > > > > > > > tehre are some image porpeties you can set on an instance via nova-manage > > > > > > > but you cannot use nova-mange to update the enbedd flavor and set this. > > > > > > > > > > > > > > so you need to define a new flavour and resize. > > > > > > > > > > > > > > this is the main reason we have not changed the default as it may requrie you to > > > > > > > move instnace around if there placement is now invalid now that per numa node memory > > > > > > > allocatons are correctly being accounted for. > > > > > > > > > > > > > > if it was simple to change the default without any enduser or operator impact we would. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > > > > > > > > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > > > > > > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > > > > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > > > > > > > but the avaible memory on the host numa node will not be taken into account > > > > > > > > > > > > > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > > > > > > > in the kernel. > > > > > > > > > > > > > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > > > > > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > > > > > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > > > > > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > > > > > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > > > > > > > disables memory oversubsctipion. > > > > > > > > > > > > > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > > > > > > > > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > > > > > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > > > > > > > if you are using numa affinity. > > > > > > > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > > > > and > > > > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > > > > > > > are also good to read > > > > > > > > > > > > > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > > > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > > > > > > > hw_mem_page_size set in the image then that vm is not configure properly. > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > > > > Another good resource =) > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > > > > > > > > > ~~~ > > > > > > > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > > > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > > > > > > > nodes. 
When a NUMA policy is in effect, it is mandatory for the instance's > > > > > > > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > > > > > > > where overriden by hw:numa_mem.NN. > > > > > > > > > > > ~~~ > > > > > > > > > > > > > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > > > > > > > > need to respond during a time that is not convenient for you.* > > > > > > > > > > > ---------------------------------------------------------- > > > > > > > > > > > Great people talk about ideas, > > > > > > > > > > > ordinary people talk about things, > > > > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From maksim.malchuk at gmail.com Mon May 15 20:07:52 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Mon, 15 May 2023 23:07:52 +0300 Subject: [nova] hw:numa_nodes question In-Reply-To: References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> <8acd0ffb7bb09de4b48c5c69f849659d805134c5.camel@redhat.com> Message-ID: There is 6 month this backport without review, since Sean Mooney gives +2. The next rebase needed to solve merge conflict have cleaned +2 from review. On Mon, May 15, 2023 at 10:53?PM hai wu wrote: > This patch was backported: > https://review.opendev.org/c/openstack/nova/+/805649. Once this is in > place, new VMs always get assigned correctly to the numa node with > more free memory. But when existing VMs (created with vm flavor with > hw:numa_node=1 set) already running on numa node #0 got live migrated, > it would always be stuck on numa node #0 after live migration. > > So it seems we would also need to set hw:mem_page_size=small on the vm > flavor, so that new VMs created from that flavor would be able to land > on different numa node other than node#0 after its live migration? > > On Mon, May 15, 2023 at 2:33?PM Sean Mooney wrote: > > > > On Mon, 2023-05-15 at 13:03 -0500, hai wu wrote: > > > > > Another question: Let's say a VM runs on one host's numa node #0. > If > > > > > we live-migrate this VM to another host, and that host's numa node > #1 > > > > > has more free memory, is it possible for this VM to land on the > other > > > > > host's numa node #1? 
> > > > yes it is > > > > on newer relsese we will prefer to balance the load across numa nodes > > > > on older release nova woudl fill the first numa node then move to > the second. > > > > > > About the above point, it seems even with the numa patch back ported > > > and in place, the VM would be stuck in its existing numa node. Per my > > > tests, after its live migration, the VM will end up on the other > > > host's numa node #0, even if numa node#1 has more free memory. This is > > > not the case for newly built VMs. > > > > > > Is this a design issue? > > if you are using a release that supprot numa live migration (train +) > > > https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/numa-aware-live-migration.html > > then the numa affintiy is recalulated on live migration however numa > node 0 is prefered. > > > > as of xena [compute]/packing_host_numa_cells_allocation_strategy has > been added to contol how vms are balanced acros numa nodes > > in zed the default was changed form packing vms per host numa node to > balancing vms between host numa nodes > > > https://docs.openstack.org/releasenotes/nova/zed.html#relnotes-26-0-0-stable-zed-upgrade-notes > > > > even without the enhanchemt in xena and zed it was possible for the > scheduler to select a numa node > > > > if you dont enable memory or cpu aware numa placment with > > hw:mem_page_size or hw:cpu_policy=dedicated then it will always select > numa 0 > > > > if you do not request cpu pinnign or a specifc page size the sechudler > cant properly select the host nuam node > > and will alwasy use numa node 0. That is one of the reason i said that > if hw:numa_nodes is set then hw:mem_page_size shoudl be set. > > > > from a nova point of view using numa_nodes without mem_page_size is > logically incorrect as you asked for > > a vm to be affinites to n host numa nodes but did not enable numa aware > memory scheduling. > > > > we unfortnally cant provent this in the nova api without breaking > upgrades for everyone who has made this mistake. > > we woudl need to force them to resize all affected instances which means > guest downtime. > > the other issue si multiple numa nodes are supproted by Hyper-V but they > do not supprot mem_page_size > > > > we have tried to document this in the past but never agreed on how > becasuse it subtel and requries alot of context. > > the tl;dr is if the instace has a numa toplogy it should have > mem_page_size set in the image or flavor but > > we never foudn a good place to capture that. > > > > > > > > On Thu, May 11, 2023 at 2:42?PM Sean Mooney > wrote: > > > > > > > > On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > > > > > Ok. Then I don't understand why 'hw:mem_page_size' is not made the > > > > > default in case if hw:numa_node is set. There is a huge > disadvantage > > > > > if not having this one set (all existing VMs with hw:numa_node set > > > > > will have to be taken down for resizing in order to get this one > > > > > right). > > > > there is an upgrade impact to changign the default. > > > > its not impossibel to do but its complicated if we dont want to > break exisitng deployments > > > > we woudl need to recored a value for eveny current instance that was > spawned before > > > > this default was changed that had hw:numa_node without > hw:mem_page_size so they kept the old behavior > > > > and make sure that is cleared when the vm is next moved so it can > have the new default > > > > after a live migratoin. 
> > > > > > > > > > I could not find this point mentioned in any existing Openstack > > > > > documentation: that we would have to set hw:mem_page_size > explicitly > > > > > if hw:numa_node is set. Also this slide at > > > > > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind > of > > > > > indicates that hw:mem_page_size `Default to small pages`. > > > > it defaults to unset. > > > > that results in small pages by default but its not the same as > hw:mem_page_size=small > > > > or hw:mem_page_size=any. > > > > > > > > > > > > > > > > > > Another question: Let's say a VM runs on one host's numa node #0. > If > > > > > we live-migrate this VM to another host, and that host's numa node > #1 > > > > > has more free memory, is it possible for this VM to land on the > other > > > > > host's numa node #1? > > > > yes it is > > > > on newer relsese we will prefer to balance the load across numa nodes > > > > on older release nova woudl fill the first numa node then move to > the second. > > > > > > > > > > On Thu, May 11, 2023 at 4:25?AM Sean Mooney > wrote: > > > > > > > > > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > > > > > Is it possible to update something in the Openstack database > for the > > > > > > > relevant VMs in order to do the same, and then hard reboot the > VM so > > > > > > > that the VM would have this attribute? > > > > > > not really adding the missing hw:mem_page_size requirement to > the flavor chagnes the > > > > > > requirements for node placement and numa affinity > > > > > > so you really can only change this via resizing the vm to a new > flavor > > > > > > > > > > > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney < > smooney at redhat.com> wrote: > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > > > > > So there's no default value assumed/set for > hw:mem_page_size for each > > > > > > > > > flavor? > > > > > > > > > > > > > > > > > correct this is a known edgecase in the currnt design > > > > > > > > hw:mem_page_size=any would be a resonable default but > > > > > > > > techinially if just set hw:numa_nodes=1 nova allow memory > over subscription > > > > > > > > > > > > > > > > in pratch if you try to do that you will almost always end > up with vms > > > > > > > > being killed due to OOM events. > > > > > > > > > > > > > > > > so from a api point of view it woudl be a change of behvior > for use to default > > > > > > > > to hw:mem_page_size=any but i think it would be the correct > thign to do for operators > > > > > > > > in the long run. > > > > > > > > > > > > > > > > i could bring this up with the core team again but in the > past we > > > > > > > > decided to be conservitive and just warn peopel to alwasy set > > > > > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > > > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is > critical > > > > > > > > > when using hw:numa_nodes=1. > > > > > > > > > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, > maybe I am > > > > > > > > > missing some known test cases? It would be very helpful to > have a test > > > > > > > > > case where I could reproduce this issue with > 'hw:numa_nodes=1' being > > > > > > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > > > > > > > > > > > How to ensure this one for existing vms already running > with > > > > > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being > set? > > > > > > > > you unfortuletly need to resize the instance. 
> > > > > > > > tehre are some image porpeties you can set on an instance > via nova-manage > > > > > > > > but you cannot use nova-mange to update the enbedd flavor > and set this. > > > > > > > > > > > > > > > > so you need to define a new flavour and resize. > > > > > > > > > > > > > > > > this is the main reason we have not changed the default as > it may requrie you to > > > > > > > > move instnace around if there placement is now invalid now > that per numa node memory > > > > > > > > allocatons are correctly being accounted for. > > > > > > > > > > > > > > > > if it was simple to change the default without any enduser > or operator impact we would. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney < > smooney at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > if you set hw:numa_nodes there are two things you should > keep in mind > > > > > > > > > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing > hw:numa_nodes=1 > > > > > > > > > > then hw:mem_page_size shoudl also be defiend on the > falvor. > > > > > > > > > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be > pinned to a host numa node > > > > > > > > > > but the avaible memory on the host numa node will not be > taken into account > > > > > > > > > > > > > > > > > > > > only the total free memory on the host so this almost > always results in VMs being killed by the OOM reaper > > > > > > > > > > in the kernel. > > > > > > > > > > > > > > > > > > > > i recomend setting hw:mem_page_size=small > hw:mem_page_size=large or hw:mem_page_size=any > > > > > > > > > > small will use your kernels default page size for guest > memory, typically this is 4k pages > > > > > > > > > > large will use any pages size other then the smallest > that is avaiable (i.e. this will use hugepages) > > > > > > > > > > and any will use small pages but allow the guest to > request hugepages via the hw_page_size image property. > > > > > > > > > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result > but generally i recommend using hw:mem_page_size=small > > > > > > > > > > and having a seperate flavor for hugepages. its really > up to you. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa > toplolig8ies including hw:numa_nodes=1 > > > > > > > > > > disables memory oversubsctipion. > > > > > > > > > > > > > > > > > > > > so you will not be able ot oversubscibe the memory on > the host. > > > > > > > > > > > > > > > > > > > > in general its better to avoid memory oversubscribtion > anyway but jsut keep that in mind. > > > > > > > > > > you cant jsut allocate a buch of swap space and run vms > at a 2:1 or higher memory over subscription ratio > > > > > > > > > > if you are using numa affinity. > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > > > > > and > > > > > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > > > > > > > > > are also good to read > > > > > > > > > > > > > > > > > > > > i do not think stephen has a dedicated block on the > memory aspect > > > > > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers > some of the probelem that only setting > > > > > > > > > > hw:numa_nodes=1 will casue. 
> > > > > > > > > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not > have hw:mem_page_size set in the falvor or > > > > > > > > > > hw_mem_page_size set in the image then that vm is not > configure properly. > > > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > > > > > Another good resource =) > > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto < > alsotoes at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > > > > > > > > > > > ~~~ > > > > > > > > > > > > The most common case will be that the admin only > sets hw:numa_nodes and > > > > > > > > > > > > then the flavor vCPUs and memory will be divided > equally across the NUMA > > > > > > > > > > > > nodes. When a NUMA policy is in effect, it is > mandatory for the instance's > > > > > > > > > > > > memory allocations to come from the NUMA nodes to > which it is bound except > > > > > > > > > > > > where overriden by hw:numa_mem.NN. > > > > > > > > > > > > ~~~ > > > > > > > > > > > > > > > > > > > > > > > > Here are the implementation documents since Juno > release: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu < > haiwu.us at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' > on all flavors, as > > > > > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. > Please do not feel the > > > > > > > > > > > > need to respond during a time that is not convenient > for you.* > > > > > > > > > > > > > ---------------------------------------------------------- > > > > > > > > > > > > Great people talk about ideas, > > > > > > > > > > > > ordinary people talk about things, > > > > > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From smooney at redhat.com Mon May 15 20:18:53 2023 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 May 2023 21:18:53 +0100 Subject: [nova] hw:numa_nodes question In-Reply-To: References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> <8acd0ffb7bb09de4b48c5c69f849659d805134c5.camel@redhat.com> Message-ID: <63f8c407de908e1bbe8589ad9400e7751c4b4d44.camel@redhat.com>

On Mon, 2023-05-15 at 14:46 -0500, hai wu wrote: > This patch was backported: > https://review.opendev.org/c/openstack/nova/+/805649. Once this is in > place, new VMs always get assigned correctly to the numa node with > more free memory. But when existing VMs (created with vm flavor with > hw:numa_node=1 set) already running on numa node #0 got live migrated, > it would always be stuck on numa node #0 after live migration.

If the VM only has hw:numa_nodes=1, https://review.opendev.org/c/openstack/nova/+/805649 won't help, because we never claim any mempages or CPUs in the host NUMA topology blob. As such, the sorting based on usage to balance the nodes won't work, since there is never any usage recorded for VMs with just hw:numa_nodes=1 and nothing else set.

> > So it seems we would also need to set hw:mem_page_size=small on the vm > flavor, so that new VMs created from that flavor would be able to land > on different numa node other than node#0 after its live migration?

Yes. Again, because mem_page_size is not set, there is no usage recorded in the host NUMA topology blob, so as far as the scheduler/resource tracker is concerned all NUMA nodes are equally used, and it will always select NUMA node 0 by default since the scheduling algorithm is deterministic.

> > On Mon, May 15, 2023 at 2:33 PM Sean Mooney wrote: > > > > On Mon, 2023-05-15 at 13:03 -0500, hai wu wrote: > > > > > Another question: Let's say a VM runs on one host's numa node #0. If > > > > > we live-migrate this VM to another host, and that host's numa node #1 > > > > > has more free memory, is it possible for this VM to land on the other > > > > > host's numa node #1?
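For reference, the NUMA balancing knob discussed in this thread, [compute]/packing_host_numa_cells_allocation_strategy, is set in nova.conf on each compute node. A sketch follows; the behaviour summarized in the comments is taken from this thread and only applies to guests whose flavor or image actually claims per-NUMA resources such as hw:mem_page_size or pinned CPUs:

    [compute]
    # True  = pack guests onto host NUMA cells, filling the lowest-numbered cell first
    #         (the default when the option was introduced in Xena)
    # False = spread guests across host NUMA cells (the default since Zed)
    packing_host_numa_cells_allocation_strategy = False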
> > > > as of xena [compute]/packing_host_numa_cells_allocation_strategy has been added to contol how vms are balanced acros numa nodes > > in zed the default was changed form packing vms per host numa node to balancing vms between host numa nodes > > https://docs.openstack.org/releasenotes/nova/zed.html#relnotes-26-0-0-stable-zed-upgrade-notes > > > > even without the enhanchemt in xena and zed it was possible for the scheduler to select a numa node > > > > if you dont enable memory or cpu aware numa placment with > > hw:mem_page_size or hw:cpu_policy=dedicated then it will always select numa 0 > > > > if you do not request cpu pinnign or a specifc page size the sechudler cant properly select the host nuam node > > and will alwasy use numa node 0. That is one of the reason i said that if hw:numa_nodes is set then hw:mem_page_size shoudl be set. > > > > from a nova point of view using numa_nodes without mem_page_size is logically incorrect as you asked for > > a vm to be affinites to n host numa nodes but did not enable numa aware memory scheduling. > > > > we unfortnally cant provent this in the nova api without breaking upgrades for everyone who has made this mistake. > > we woudl need to force them to resize all affected instances which means guest downtime. > > the other issue si multiple numa nodes are supproted by Hyper-V but they do not supprot mem_page_size > > > > we have tried to document this in the past but never agreed on how becasuse it subtel and requries alot of context. > > the tl;dr is if the instace has a numa toplogy it should have mem_page_size set in the image or flavor but > > we never foudn a good place to capture that. > > > > > > > > On Thu, May 11, 2023 at 2:42?PM Sean Mooney wrote: > > > > > > > > On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > > > > > Ok. Then I don't understand why 'hw:mem_page_size' is not made the > > > > > default in case if hw:numa_node is set. There is a huge disadvantage > > > > > if not having this one set (all existing VMs with hw:numa_node set > > > > > will have to be taken down for resizing in order to get this one > > > > > right). > > > > there is an upgrade impact to changign the default. > > > > its not impossibel to do but its complicated if we dont want to break exisitng deployments > > > > we woudl need to recored a value for eveny current instance that was spawned before > > > > this default was changed that had hw:numa_node without hw:mem_page_size so they kept the old behavior > > > > and make sure that is cleared when the vm is next moved so it can have the new default > > > > after a live migratoin. > > > > > > > > > > I could not find this point mentioned in any existing Openstack > > > > > documentation: that we would have to set hw:mem_page_size explicitly > > > > > if hw:numa_node is set. Also this slide at > > > > > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind of > > > > > indicates that hw:mem_page_size `Default to small pages`. > > > > it defaults to unset. > > > > that results in small pages by default but its not the same as hw:mem_page_size=small > > > > or hw:mem_page_size=any. > > > > > > > > > > > > > > > > > > Another question: Let's say a VM runs on one host's numa node #0. If > > > > > we live-migrate this VM to another host, and that host's numa node #1 > > > > > has more free memory, is it possible for this VM to land on the other > > > > > host's numa node #1? 
> > > > yes it is > > > > on newer relsese we will prefer to balance the load across numa nodes > > > > on older release nova woudl fill the first numa node then move to the second. > > > > > > > > > > On Thu, May 11, 2023 at 4:25?AM Sean Mooney wrote: > > > > > > > > > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > > > > > Is it possible to update something in the Openstack database for the > > > > > > > relevant VMs in order to do the same, and then hard reboot the VM so > > > > > > > that the VM would have this attribute? > > > > > > not really adding the missing hw:mem_page_size requirement to the flavor chagnes the > > > > > > requirements for node placement and numa affinity > > > > > > so you really can only change this via resizing the vm to a new flavor > > > > > > > > > > > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney wrote: > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > > > > > So there's no default value assumed/set for hw:mem_page_size for each > > > > > > > > > flavor? > > > > > > > > > > > > > > > > > correct this is a known edgecase in the currnt design > > > > > > > > hw:mem_page_size=any would be a resonable default but > > > > > > > > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > > > > > > > > > > > > > > > in pratch if you try to do that you will almost always end up with vms > > > > > > > > being killed due to OOM events. > > > > > > > > > > > > > > > > so from a api point of view it woudl be a change of behvior for use to default > > > > > > > > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > > > > > > > > in the long run. > > > > > > > > > > > > > > > > i could bring this up with the core team again but in the past we > > > > > > > > decided to be conservitive and just warn peopel to alwasy set > > > > > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > > > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > > > > > > > > when using hw:numa_nodes=1. > > > > > > > > > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > > > > > > > > missing some known test cases? It would be very helpful to have a test > > > > > > > > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > > > > > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > > > > > > > > > > > How to ensure this one for existing vms already running with > > > > > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > > > > > > > > you unfortuletly need to resize the instance. > > > > > > > > tehre are some image porpeties you can set on an instance via nova-manage > > > > > > > > but you cannot use nova-mange to update the enbedd flavor and set this. > > > > > > > > > > > > > > > > so you need to define a new flavour and resize. > > > > > > > > > > > > > > > > this is the main reason we have not changed the default as it may requrie you to > > > > > > > > move instnace around if there placement is now invalid now that per numa node memory > > > > > > > > allocatons are correctly being accounted for. > > > > > > > > > > > > > > > > if it was simple to change the default without any enduser or operator impact we would. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > > > > > > > > > > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > > > > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > > > > > > > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > > > > > > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > > > > > > > > but the avaible memory on the host numa node will not be taken into account > > > > > > > > > > > > > > > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > > > > > > > > in the kernel. > > > > > > > > > > > > > > > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > > > > > > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > > > > > > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > > > > > > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > > > > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > > > > > > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > > > > > > > > disables memory oversubsctipion. > > > > > > > > > > > > > > > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > > > > > > > > > > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > > > > > > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > > > > > > > > if you are using numa affinity. > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > > > > > and > > > > > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > > > > > > > > > are also good to read > > > > > > > > > > > > > > > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > > > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > > > > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > > > > > > > > hw_mem_page_size set in the image then that vm is not configure properly. > > > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > > > > > Another good resource =) > > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > > > > > > > > > > > > > > > I don't think so. 
> > > > > > > > > > > > > > > > > > > > > > > > ~~~ > > > > > > > > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > > > > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > > > > > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > > > > > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > > > > > > > > where overriden by hw:numa_mem.NN. > > > > > > > > > > > > ~~~ > > > > > > > > > > > > > > > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > > > > > > > > > need to respond during a time that is not convenient for you.* > > > > > > > > > > > > ---------------------------------------------------------- > > > > > > > > > > > > Great people talk about ideas, > > > > > > > > > > > > ordinary people talk about things, > > > > > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From smooney at redhat.com Mon May 15 20:25:35 2023 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 May 2023 21:25:35 +0100 Subject: [nova] hw:numa_nodes question In-Reply-To: References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> <8acd0ffb7bb09de4b48c5c69f849659d805134c5.camel@redhat.com> Message-ID: On Mon, 2023-05-15 at 23:07 +0300, Maksim Malchuk wrote: > There is 6 month this backport without review, since Sean Mooney gives +2. > The next rebase needed to solve merge conflict have cleaned +2 from review. yes it was blocked on a question regarding does this confirm to stable backprot policy. we do not backport features and while this was considreed a bugfix on master it was also acknoladged that it is also a little featureish we discussed this last weak in the nova team meeting and agree it could proceed. but as i noted in my last reply this will have no effect if you jsut have hw:numa_nodes=1 without hw:cpu_policy=dedicated or hw:mem_page_size with enableing cpu pinnign or explcit page size we do not track per numa node cpu or memory usage in the host numa toplogy object for a given compute node. as such without any usage informational there is noting to we the numa nodes with. 
So packing_host_numa_cells_allocation_strategy=false will not make VMs that request a NUMA topology without NUMA resources be balanced between the NUMA nodes. You still need to resize the instance to a flavor that actually requests memory or CPU pinning.

> > On Mon, May 15, 2023 at 10:53 PM hai wu wrote: > > > This patch was backported: > > https://review.opendev.org/c/openstack/nova/+/805649. Once this is in > > place, new VMs always get assigned correctly to the numa node with > > more free memory. But when existing VMs (created with vm flavor with > > hw:numa_node=1 set) already running on numa node #0 got live migrated, > > it would always be stuck on numa node #0 after live migration. > > > > So it seems we would also need to set hw:mem_page_size=small on the vm > > flavor, so that new VMs created from that flavor would be able to land > > on different numa node other than node#0 after its live migration? > > > > On Mon, May 15, 2023 at 2:33 PM Sean Mooney wrote: > > > > > > On Mon, 2023-05-15 at 13:03 -0500, hai wu wrote: > > > > > > Another question: Let's say a VM runs on one host's numa node #0. > > If > > > > > > we live-migrate this VM to another host, and that host's numa node > > #1 > > > > > > has more free memory, is it possible for this VM to land on the > > other > > > > > > host's numa node #1? > > > > > yes it is > > > > > on newer relsese we will prefer to balance the load across numa nodes > > > > > on older release nova woudl fill the first numa node then move to > > the second. > > > > > > > > About the above point, it seems even with the numa patch back ported > > > > and in place, the VM would be stuck in its existing numa node. Per my > > > > tests, after its live migration, the VM will end up on the other > > > > host's numa node #0, even if numa node#1 has more free memory. This is > > > > not the case for newly built VMs. > > > > > > > > Is this a design issue? > > > if you are using a release that supprot numa live migration (train +) > > > > > https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/numa-aware-live-migration.html > > > then the numa affintiy is recalulated on live migration however numa > > node 0 is prefered. > > > > > > as of xena [compute]/packing_host_numa_cells_allocation_strategy has > > been added to contol how vms are balanced acros numa nodes > > > in zed the default was changed form packing vms per host numa node to > > balancing vms between host numa nodes > > > > > https://docs.openstack.org/releasenotes/nova/zed.html#relnotes-26-0-0-stable-zed-upgrade-notes > > > > > > even without the enhanchemt in xena and zed it was possible for the > > scheduler to select a numa node > > > > > > if you dont enable memory or cpu aware numa placment with > > > hw:mem_page_size or hw:cpu_policy=dedicated then it will always select > > numa 0 > > > > > > if you do not request cpu pinnign or a specifc page size the sechudler > > cant properly select the host nuam node > > > and will alwasy use numa node 0. That is one of the reason i said that > > if hw:numa_nodes is set then hw:mem_page_size shoudl be set. > > > > > > from a nova point of view using numa_nodes without mem_page_size is > > logically incorrect as you asked for > > > a vm to be affinites to n host numa nodes but did not enable numa aware > > memory scheduling. > > > > > > we unfortnally cant provent this in the nova api without breaking > > upgrades for everyone who has made this mistake.
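As a concrete illustration of "a flavor that actually requests memory or CPU pinning" from the top of this message (the flavor names here are made up), either of the following would give the scheduler per-NUMA usage to weigh:

    # NUMA-affine flavor whose memory is accounted per host NUMA cell
    openstack flavor set --property hw:numa_nodes=1 --property hw:mem_page_size=small m1.numa
    # or a pinned-CPU flavor
    openstack flavor set --property hw:cpu_policy=dedicated m1.pinned

A flavor that only sets hw:numa_nodes=1, with neither of the above, leaves the host NUMA topology usage empty, which is why such guests keep landing on host NUMA node 0 regardless of the packing strategy setting.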
> > > we woudl need to force them to resize all affected instances which means > > guest downtime. > > > the other issue si multiple numa nodes are supproted by Hyper-V but they > > do not supprot mem_page_size > > > > > > we have tried to document this in the past but never agreed on how > > becasuse it subtel and requries alot of context. > > > the tl;dr is if the instace has a numa toplogy it should have > > mem_page_size set in the image or flavor but > > > we never foudn a good place to capture that. > > > > > > > > > > > On Thu, May 11, 2023 at 2:42?PM Sean Mooney > > wrote: > > > > > > > > > > On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > > > > > > Ok. Then I don't understand why 'hw:mem_page_size' is not made the > > > > > > default in case if hw:numa_node is set. There is a huge > > disadvantage > > > > > > if not having this one set (all existing VMs with hw:numa_node set > > > > > > will have to be taken down for resizing in order to get this one > > > > > > right). > > > > > there is an upgrade impact to changign the default. > > > > > its not impossibel to do but its complicated if we dont want to > > break exisitng deployments > > > > > we woudl need to recored a value for eveny current instance that was > > spawned before > > > > > this default was changed that had hw:numa_node without > > hw:mem_page_size so they kept the old behavior > > > > > and make sure that is cleared when the vm is next moved so it can > > have the new default > > > > > after a live migratoin. > > > > > > > > > > > > I could not find this point mentioned in any existing Openstack > > > > > > documentation: that we would have to set hw:mem_page_size > > explicitly > > > > > > if hw:numa_node is set. Also this slide at > > > > > > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind > > of > > > > > > indicates that hw:mem_page_size `Default to small pages`. > > > > > it defaults to unset. > > > > > that results in small pages by default but its not the same as > > hw:mem_page_size=small > > > > > or hw:mem_page_size=any. > > > > > > > > > > > > > > > > > > > > > > Another question: Let's say a VM runs on one host's numa node #0. > > If > > > > > > we live-migrate this VM to another host, and that host's numa node > > #1 > > > > > > has more free memory, is it possible for this VM to land on the > > other > > > > > > host's numa node #1? > > > > > yes it is > > > > > on newer relsese we will prefer to balance the load across numa nodes > > > > > on older release nova woudl fill the first numa node then move to > > the second. > > > > > > > > > > > > On Thu, May 11, 2023 at 4:25?AM Sean Mooney > > wrote: > > > > > > > > > > > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > > > > > > Is it possible to update something in the Openstack database > > for the > > > > > > > > relevant VMs in order to do the same, and then hard reboot the > > VM so > > > > > > > > that the VM would have this attribute? > > > > > > > not really adding the missing hw:mem_page_size requirement to > > the flavor chagnes the > > > > > > > requirements for node placement and numa affinity > > > > > > > so you really can only change this via resizing the vm to a new > > flavor > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney < > > smooney at redhat.com> wrote: > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > > > > > > So there's no default value assumed/set for > > hw:mem_page_size for each > > > > > > > > > > flavor? 
> > > > > > > > > > > > > > > > > > > correct this is a known edgecase in the currnt design > > > > > > > > > hw:mem_page_size=any would be a resonable default but > > > > > > > > > techinially if just set hw:numa_nodes=1 nova allow memory > > over subscription > > > > > > > > > > > > > > > > > > in pratch if you try to do that you will almost always end > > up with vms > > > > > > > > > being killed due to OOM events. > > > > > > > > > > > > > > > > > > so from a api point of view it woudl be a change of behvior > > for use to default > > > > > > > > > to hw:mem_page_size=any but i think it would be the correct > > thign to do for operators > > > > > > > > > in the long run. > > > > > > > > > > > > > > > > > > i could bring this up with the core team again but in the > > past we > > > > > > > > > decided to be conservitive and just warn peopel to alwasy set > > > > > > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > > > > > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is > > critical > > > > > > > > > > when using hw:numa_nodes=1. > > > > > > > > > > > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, > > maybe I am > > > > > > > > > > missing some known test cases? It would be very helpful to > > have a test > > > > > > > > > > case where I could reproduce this issue with > > 'hw:numa_nodes=1' being > > > > > > > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > > > > > > > > > > > > > How to ensure this one for existing vms already running > > with > > > > > > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being > > set? > > > > > > > > > you unfortuletly need to resize the instance. > > > > > > > > > tehre are some image porpeties you can set on an instance > > via nova-manage > > > > > > > > > but you cannot use nova-mange to update the enbedd flavor > > and set this. > > > > > > > > > > > > > > > > > > so you need to define a new flavour and resize. > > > > > > > > > > > > > > > > > > this is the main reason we have not changed the default as > > it may requrie you to > > > > > > > > > move instnace around if there placement is now invalid now > > that per numa node memory > > > > > > > > > allocatons are correctly being accounted for. > > > > > > > > > > > > > > > > > > if it was simple to change the default without any enduser > > or operator impact we would. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney < > > smooney at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > if you set hw:numa_nodes there are two things you should > > keep in mind > > > > > > > > > > > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing > > hw:numa_nodes=1 > > > > > > > > > > > then hw:mem_page_size shoudl also be defiend on the > > falvor. > > > > > > > > > > > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be > > pinned to a host numa node > > > > > > > > > > > but the avaible memory on the host numa node will not be > > taken into account > > > > > > > > > > > > > > > > > > > > > > only the total free memory on the host so this almost > > always results in VMs being killed by the OOM reaper > > > > > > > > > > > in the kernel. 
> > > > > > > > > > > > > > > > > > > > > > i recomend setting hw:mem_page_size=small > > hw:mem_page_size=large or hw:mem_page_size=any > > > > > > > > > > > small will use your kernels default page size for guest > > memory, typically this is 4k pages > > > > > > > > > > > large will use any pages size other then the smallest > > that is avaiable (i.e. this will use hugepages) > > > > > > > > > > > and any will use small pages but allow the guest to > > request hugepages via the hw_page_size image property. > > > > > > > > > > > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result > > but generally i recommend using hw:mem_page_size=small > > > > > > > > > > > and having a seperate flavor for hugepages. its really > > up to you. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa > > toplolig8ies including hw:numa_nodes=1 > > > > > > > > > > > disables memory oversubsctipion. > > > > > > > > > > > > > > > > > > > > > > so you will not be able ot oversubscibe the memory on > > the host. > > > > > > > > > > > > > > > > > > > > > > in general its better to avoid memory oversubscribtion > > anyway but jsut keep that in mind. > > > > > > > > > > > you cant jsut allocate a buch of swap space and run vms > > at a 2:1 or higher memory over subscription ratio > > > > > > > > > > > if you are using numa affinity. > > > > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > > > > > > and > > > > > > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > > > > > > > > > > > are also good to read > > > > > > > > > > > > > > > > > > > > > > i do not think stephen has a dedicated block on the > > memory aspect > > > > > > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers > > some of the probelem that only setting > > > > > > > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not > > have hw:mem_page_size set in the falvor or > > > > > > > > > > > hw_mem_page_size set in the image then that vm is not > > configure properly. > > > > > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > > > > > > Another good resource =) > > > > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto < > > alsotoes at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > > > > > > > > > > > > > ~~~ > > > > > > > > > > > > > The most common case will be that the admin only > > sets hw:numa_nodes and > > > > > > > > > > > > > then the flavor vCPUs and memory will be divided > > equally across the NUMA > > > > > > > > > > > > > nodes. When a NUMA policy is in effect, it is > > mandatory for the instance's > > > > > > > > > > > > > memory allocations to come from the NUMA nodes to > > which it is bound except > > > > > > > > > > > > > where overriden by hw:numa_mem.NN. 
> > > > > > > > > > > > > ~~~ > > > > > > > > > > > > > > > > > > > > > > > > > > Here are the implementation documents since Juno > > release: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu < > > haiwu.us at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' > > on all flavors, as > > > > > > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. > > Please do not feel the > > > > > > > > > > > > > need to respond during a time that is not convenient > > for you.* > > > > > > > > > > > > > > > ---------------------------------------------------------- > > > > > > > > > > > > > Great people talk about ideas, > > > > > > > > > > > > > ordinary people talk about things, > > > > > > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From maksim.malchuk at gmail.com Mon May 15 20:32:59 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Mon, 15 May 2023 23:32:59 +0300 Subject: [nova] hw:numa_nodes question In-Reply-To: References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> <8acd0ffb7bb09de4b48c5c69f849659d805134c5.camel@redhat.com> Message-ID: Good news, we waiting for it in Xena almost an year. On Mon, May 15, 2023 at 11:27?PM Sean Mooney wrote: > On Mon, 2023-05-15 at 23:07 +0300, Maksim Malchuk wrote: > > There is 6 month this backport without review, since Sean Mooney gives > +2. > > The next rebase needed to solve merge conflict have cleaned +2 from > review. > > yes it was blocked on a question regarding does this confirm to stable > backprot policy. > > we do not backport features and while this was considreed a bugfix on > master it was also > acknoladged that it is also a little featureish > > we discussed this last weak in the nova team meeting and agree it could > proceed. > > but as i noted in my last reply this will have no effect if you jsut have > hw:numa_nodes=1 > > without hw:cpu_policy=dedicated or hw:mem_page_size > > > with enableing cpu pinnign or explcit page size we do not track per numa > node cpu or memory usage > in the host numa toplogy object for a given compute node. > as such without any usage informational there is noting to we the numa > nodes with. > > so packing_host_numa_cells_allocation_strategy=false will not make vms > that request a numa > toplogy without numa resouce be balanced between the numa nodes. 
> > you still need to resize the instance to a flavor that actully properly > request memory or cpu pinging. > > > > > On Mon, May 15, 2023 at 10:53?PM hai wu wrote: > > > > > This patch was backported: > > > https://review.opendev.org/c/openstack/nova/+/805649. Once this is in > > > place, new VMs always get assigned correctly to the numa node with > > > more free memory. But when existing VMs (created with vm flavor with > > > hw:numa_node=1 set) already running on numa node #0 got live migrated, > > > it would always be stuck on numa node #0 after live migration. > > > > > > So it seems we would also need to set hw:mem_page_size=small on the vm > > > flavor, so that new VMs created from that flavor would be able to land > > > on different numa node other than node#0 after its live migration? > > > > > > On Mon, May 15, 2023 at 2:33?PM Sean Mooney > wrote: > > > > > > > > On Mon, 2023-05-15 at 13:03 -0500, hai wu wrote: > > > > > > > Another question: Let's say a VM runs on one host's numa node > #0. > > > If > > > > > > > we live-migrate this VM to another host, and that host's numa > node > > > #1 > > > > > > > has more free memory, is it possible for this VM to land on the > > > other > > > > > > > host's numa node #1? > > > > > > yes it is > > > > > > on newer relsese we will prefer to balance the load across numa > nodes > > > > > > on older release nova woudl fill the first numa node then move to > > > the second. > > > > > > > > > > About the above point, it seems even with the numa patch back > ported > > > > > and in place, the VM would be stuck in its existing numa node. Per > my > > > > > tests, after its live migration, the VM will end up on the other > > > > > host's numa node #0, even if numa node#1 has more free memory. > This is > > > > > not the case for newly built VMs. > > > > > > > > > > Is this a design issue? > > > > if you are using a release that supprot numa live migration (train +) > > > > > > > > https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/numa-aware-live-migration.html > > > > then the numa affintiy is recalulated on live migration however numa > > > node 0 is prefered. > > > > > > > > as of xena [compute]/packing_host_numa_cells_allocation_strategy has > > > been added to contol how vms are balanced acros numa nodes > > > > in zed the default was changed form packing vms per host numa node to > > > balancing vms between host numa nodes > > > > > > > > https://docs.openstack.org/releasenotes/nova/zed.html#relnotes-26-0-0-stable-zed-upgrade-notes > > > > > > > > even without the enhanchemt in xena and zed it was possible for the > > > scheduler to select a numa node > > > > > > > > if you dont enable memory or cpu aware numa placment with > > > > hw:mem_page_size or hw:cpu_policy=dedicated then it will always > select > > > numa 0 > > > > > > > > if you do not request cpu pinnign or a specifc page size the > sechudler > > > cant properly select the host nuam node > > > > and will alwasy use numa node 0. That is one of the reason i said > that > > > if hw:numa_nodes is set then hw:mem_page_size shoudl be set. > > > > > > > > from a nova point of view using numa_nodes without mem_page_size is > > > logically incorrect as you asked for > > > > a vm to be affinites to n host numa nodes but did not enable numa > aware > > > memory scheduling. > > > > > > > > we unfortnally cant provent this in the nova api without breaking > > > upgrades for everyone who has made this mistake. 
> > > > we woudl need to force them to resize all affected instances which > means > > > guest downtime. > > > > the other issue si multiple numa nodes are supproted by Hyper-V but > they > > > do not supprot mem_page_size > > > > > > > > we have tried to document this in the past but never agreed on how > > > becasuse it subtel and requries alot of context. > > > > the tl;dr is if the instace has a numa toplogy it should have > > > mem_page_size set in the image or flavor but > > > > we never foudn a good place to capture that. > > > > > > > > > > > > > > On Thu, May 11, 2023 at 2:42?PM Sean Mooney > > > wrote: > > > > > > > > > > > > On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > > > > > > > Ok. Then I don't understand why 'hw:mem_page_size' is not made > the > > > > > > > default in case if hw:numa_node is set. There is a huge > > > disadvantage > > > > > > > if not having this one set (all existing VMs with hw:numa_node > set > > > > > > > will have to be taken down for resizing in order to get this > one > > > > > > > right). > > > > > > there is an upgrade impact to changign the default. > > > > > > its not impossibel to do but its complicated if we dont want to > > > break exisitng deployments > > > > > > we woudl need to recored a value for eveny current instance that > was > > > spawned before > > > > > > this default was changed that had hw:numa_node without > > > hw:mem_page_size so they kept the old behavior > > > > > > and make sure that is cleared when the vm is next moved so it can > > > have the new default > > > > > > after a live migratoin. > > > > > > > > > > > > > > I could not find this point mentioned in any existing Openstack > > > > > > > documentation: that we would have to set hw:mem_page_size > > > explicitly > > > > > > > if hw:numa_node is set. Also this slide at > > > > > > > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf > kind > > > of > > > > > > > indicates that hw:mem_page_size `Default to small pages`. > > > > > > it defaults to unset. > > > > > > that results in small pages by default but its not the same as > > > hw:mem_page_size=small > > > > > > or hw:mem_page_size=any. > > > > > > > > > > > > > > > > > > > > > > > > > > Another question: Let's say a VM runs on one host's numa node > #0. > > > If > > > > > > > we live-migrate this VM to another host, and that host's numa > node > > > #1 > > > > > > > has more free memory, is it possible for this VM to land on the > > > other > > > > > > > host's numa node #1? > > > > > > yes it is > > > > > > on newer relsese we will prefer to balance the load across numa > nodes > > > > > > on older release nova woudl fill the first numa node then move to > > > the second. > > > > > > > > > > > > > > On Thu, May 11, 2023 at 4:25?AM Sean Mooney < > smooney at redhat.com> > > > wrote: > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > > > > > > > Is it possible to update something in the Openstack > database > > > for the > > > > > > > > > relevant VMs in order to do the same, and then hard reboot > the > > > VM so > > > > > > > > > that the VM would have this attribute? 
> > > > > > > > not really adding the missing hw:mem_page_size requirement to > > > the flavor chagnes the > > > > > > > > requirements for node placement and numa affinity > > > > > > > > so you really can only change this via resizing the vm to a > new > > > flavor > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney < > > > smooney at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > > > > > > > So there's no default value assumed/set for > > > hw:mem_page_size for each > > > > > > > > > > > flavor? > > > > > > > > > > > > > > > > > > > > > correct this is a known edgecase in the currnt design > > > > > > > > > > hw:mem_page_size=any would be a resonable default but > > > > > > > > > > techinially if just set hw:numa_nodes=1 nova allow memory > > > over subscription > > > > > > > > > > > > > > > > > > > > in pratch if you try to do that you will almost always > end > > > up with vms > > > > > > > > > > being killed due to OOM events. > > > > > > > > > > > > > > > > > > > > so from a api point of view it woudl be a change of > behvior > > > for use to default > > > > > > > > > > to hw:mem_page_size=any but i think it would be the > correct > > > thign to do for operators > > > > > > > > > > in the long run. > > > > > > > > > > > > > > > > > > > > i could bring this up with the core team again but in the > > > past we > > > > > > > > > > decided to be conservitive and just warn peopel to > alwasy set > > > > > > > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > > > > > > > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is > > > critical > > > > > > > > > > > when using hw:numa_nodes=1. > > > > > > > > > > > > > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, > > > maybe I am > > > > > > > > > > > missing some known test cases? It would be very > helpful to > > > have a test > > > > > > > > > > > case where I could reproduce this issue with > > > 'hw:numa_nodes=1' being > > > > > > > > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > > > > > > > > > > > > > > > How to ensure this one for existing vms already running > > > with > > > > > > > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being > > > set? > > > > > > > > > > you unfortuletly need to resize the instance. > > > > > > > > > > tehre are some image porpeties you can set on an instance > > > via nova-manage > > > > > > > > > > but you cannot use nova-mange to update the enbedd flavor > > > and set this. > > > > > > > > > > > > > > > > > > > > so you need to define a new flavour and resize. > > > > > > > > > > > > > > > > > > > > this is the main reason we have not changed the default > as > > > it may requrie you to > > > > > > > > > > move instnace around if there placement is now invalid > now > > > that per numa node memory > > > > > > > > > > allocatons are correctly being accounted for. > > > > > > > > > > > > > > > > > > > > if it was simple to change the default without any > enduser > > > or operator impact we would. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney < > > > smooney at redhat.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > if you set hw:numa_nodes there are two things you > should > > > keep in mind > > > > > > > > > > > > > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing > > > hw:numa_nodes=1 > > > > > > > > > > > > then hw:mem_page_size shoudl also be defiend on the > > > falvor. > > > > > > > > > > > > > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be > > > pinned to a host numa node > > > > > > > > > > > > but the avaible memory on the host numa node will > not be > > > taken into account > > > > > > > > > > > > > > > > > > > > > > > > only the total free memory on the host so this almost > > > always results in VMs being killed by the OOM reaper > > > > > > > > > > > > in the kernel. > > > > > > > > > > > > > > > > > > > > > > > > i recomend setting hw:mem_page_size=small > > > hw:mem_page_size=large or hw:mem_page_size=any > > > > > > > > > > > > small will use your kernels default page size for > guest > > > memory, typically this is 4k pages > > > > > > > > > > > > large will use any pages size other then the smallest > > > that is avaiable (i.e. this will use hugepages) > > > > > > > > > > > > and any will use small pages but allow the guest to > > > request hugepages via the hw_page_size image property. > > > > > > > > > > > > > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result > > > but generally i recommend using hw:mem_page_size=small > > > > > > > > > > > > and having a seperate flavor for hugepages. its > really > > > up to you. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa > > > toplolig8ies including hw:numa_nodes=1 > > > > > > > > > > > > disables memory oversubsctipion. > > > > > > > > > > > > > > > > > > > > > > > > so you will not be able ot oversubscibe the memory on > > > the host. > > > > > > > > > > > > > > > > > > > > > > > > in general its better to avoid memory > oversubscribtion > > > anyway but jsut keep that in mind. > > > > > > > > > > > > you cant jsut allocate a buch of swap space and run > vms > > > at a 2:1 or higher memory over subscription ratio > > > > > > > > > > > > if you are using numa affinity. > > > > > > > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > > > > > > > and > > > > > > > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > > > > > > > > > > > > > are also good to read > > > > > > > > > > > > > > > > > > > > > > > > i do not think stephen has a dedicated block on the > > > memory aspect > > > > > > > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 > covers > > > some of the probelem that only setting > > > > > > > > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > > > > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do > not > > > have hw:mem_page_size set in the falvor or > > > > > > > > > > > > hw_mem_page_size set in the image then that vm is not > > > configure properly. 
> > > > > > > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > > > > > > > Another good resource =) > > > > > > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto < > > > alsotoes at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > > > > > > > > > > > > > > > ~~~ > > > > > > > > > > > > > > The most common case will be that the admin only > > > sets hw:numa_nodes and > > > > > > > > > > > > > > then the flavor vCPUs and memory will be divided > > > equally across the NUMA > > > > > > > > > > > > > > nodes. When a NUMA policy is in effect, it is > > > mandatory for the instance's > > > > > > > > > > > > > > memory allocations to come from the NUMA nodes to > > > which it is bound except > > > > > > > > > > > > > > where overriden by hw:numa_mem.NN. > > > > > > > > > > > > > > ~~~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > Here are the implementation documents since Juno > > > release: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu < > > > haiwu.us at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Is there any concern to enable > 'hw:numa_nodes=1' > > > on all flavors, as > > > > > > > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. > > > Please do not feel the > > > > > > > > > > > > > > need to respond during a time that is not > convenient > > > for you.* > > > > > > > > > > > > > > > > > ---------------------------------------------------------- > > > > > > > > > > > > > > Great people talk about ideas, > > > > > > > > > > > > > > ordinary people talk about things, > > > > > > > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... 
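To make the advice above concrete, here is a minimal sketch of a NUMA-aware flavor and of moving an existing instance onto it. The flavor name and sizes are only illustrative and not taken from the thread; the key point is pairing hw:numa_nodes with hw:mem_page_size so the scheduler can account for per-NUMA-node memory:

    # create a flavor that requests one guest NUMA node *and* NUMA-aware memory accounting
    $ openstack flavor create --vcpus 4 --ram 8192 --disk 40 m1.numa-aware
    $ openstack flavor set m1.numa-aware \
          --property hw:numa_nodes=1 \
          --property hw:mem_page_size=small

    # existing instances only pick up the new extra specs through a resize
    $ openstack server resize --flavor m1.numa-aware <server-uuid>
    $ openstack server resize confirm <server-uuid>

The packing_host_numa_cells_allocation_strategy option discussed above lives in the [compute] section of nova.conf on the compute nodes; setting it to false asks the scheduler to spread rather than pack instances across host NUMA cells, but, as noted in the thread, it only has an effect for instances whose flavor or image actually requests pinned CPUs or an explicit page size.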
URL: From haiwu.us at gmail.com Mon May 15 20:46:10 2023 From: haiwu.us at gmail.com (hai wu) Date: Mon, 15 May 2023 15:46:10 -0500 Subject: [nova] hw:numa_nodes question In-Reply-To: <63f8c407de908e1bbe8589ad9400e7751c4b4d44.camel@redhat.com> References: <713262656198f4e0330a086b906daa8a1cb3e40c.camel@redhat.com> <1a05b92654acb6309bc52fac14c9ae79242ab40e.camel@redhat.com> <1705b563fb0e936a2aa8356f6adccddd948b69bf.camel@redhat.com> <56ffe1e6cabcc54920b6f8a3a255d13bd7407628.camel@redhat.com> <8acd0ffb7bb09de4b48c5c69f849659d805134c5.camel@redhat.com> <63f8c407de908e1bbe8589ad9400e7751c4b4d44.camel@redhat.com> Message-ID: Hmm, regarding this: `if the vm only has hw:numa_node=1 https://review.opendev.org/c/openstack/nova/+/805649 wont help`. Per my recent numerous tests, if the vm only has hw:numa_node=1 https://review.opendev.org/c/openstack/nova/+/805649 will actually help, but only for newly built VMs, it works pretty well only for newly built VMs. On Mon, May 15, 2023 at 3:21?PM Sean Mooney wrote: > > On Mon, 2023-05-15 at 14:46 -0500, hai wu wrote: > > This patch was backported: > > https://review.opendev.org/c/openstack/nova/+/805649. Once this is in > > place, new VMs always get assigned correctly to the numa node with > > more free memory. But when existing VMs (created with vm flavor with > > hw:numa_node=1 set) already running on numa node #0 got live migrated, > > it would always be stuck on numa node #0 after live migration. > if the vm only has hw:numa_node=1 https://review.opendev.org/c/openstack/nova/+/805649 wont help > > because we never claim any mempages or cpus in the host numa toplogy blob > as such the sorting based on usage to balance the nodes wont work since there is never any usage recored > for vms with just hw:numa_node=1 and nothign else set. > > > > So it seems we would also need to set hw:mem_page_size=small on the vm > > flavor, so that new VMs created from that flavor would be able to land > > on different numa node other than node#0 after its live migration? > yes again becasue mem_page_size there is no usage in the host numa toplogy blob so as far as the schduler/resouces > tracker is concerned all numa nodes are equally used. > > so it will always select nuam 0 by default since the scheduling algortim is deterministic. > > > > On Mon, May 15, 2023 at 2:33?PM Sean Mooney wrote: > > > > > > On Mon, 2023-05-15 at 13:03 -0500, hai wu wrote: > > > > > > Another question: Let's say a VM runs on one host's numa node #0. If > > > > > > we live-migrate this VM to another host, and that host's numa node #1 > > > > > > has more free memory, is it possible for this VM to land on the other > > > > > > host's numa node #1? > > > > > yes it is > > > > > on newer relsese we will prefer to balance the load across numa nodes > > > > > on older release nova woudl fill the first numa node then move to the second. > > > > > > > > About the above point, it seems even with the numa patch back ported > > > > and in place, the VM would be stuck in its existing numa node. Per my > > > > tests, after its live migration, the VM will end up on the other > > > > host's numa node #0, even if numa node#1 has more free memory. This is > > > > not the case for newly built VMs. > > > > > > > > Is this a design issue? 
> > > if you are using a release that supprot numa live migration (train +) > > > https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/numa-aware-live-migration.html > > > then the numa affintiy is recalulated on live migration however numa node 0 is prefered. > > > > > > as of xena [compute]/packing_host_numa_cells_allocation_strategy has been added to contol how vms are balanced acros numa nodes > > > in zed the default was changed form packing vms per host numa node to balancing vms between host numa nodes > > > https://docs.openstack.org/releasenotes/nova/zed.html#relnotes-26-0-0-stable-zed-upgrade-notes > > > > > > even without the enhanchemt in xena and zed it was possible for the scheduler to select a numa node > > > > > > if you dont enable memory or cpu aware numa placment with > > > hw:mem_page_size or hw:cpu_policy=dedicated then it will always select numa 0 > > > > > > if you do not request cpu pinnign or a specifc page size the sechudler cant properly select the host nuam node > > > and will alwasy use numa node 0. That is one of the reason i said that if hw:numa_nodes is set then hw:mem_page_size shoudl be set. > > > > > > from a nova point of view using numa_nodes without mem_page_size is logically incorrect as you asked for > > > a vm to be affinites to n host numa nodes but did not enable numa aware memory scheduling. > > > > > > we unfortnally cant provent this in the nova api without breaking upgrades for everyone who has made this mistake. > > > we woudl need to force them to resize all affected instances which means guest downtime. > > > the other issue si multiple numa nodes are supproted by Hyper-V but they do not supprot mem_page_size > > > > > > we have tried to document this in the past but never agreed on how becasuse it subtel and requries alot of context. > > > the tl;dr is if the instace has a numa toplogy it should have mem_page_size set in the image or flavor but > > > we never foudn a good place to capture that. > > > > > > > > > > > On Thu, May 11, 2023 at 2:42?PM Sean Mooney wrote: > > > > > > > > > > On Thu, 2023-05-11 at 08:40 -0500, hai wu wrote: > > > > > > Ok. Then I don't understand why 'hw:mem_page_size' is not made the > > > > > > default in case if hw:numa_node is set. There is a huge disadvantage > > > > > > if not having this one set (all existing VMs with hw:numa_node set > > > > > > will have to be taken down for resizing in order to get this one > > > > > > right). > > > > > there is an upgrade impact to changign the default. > > > > > its not impossibel to do but its complicated if we dont want to break exisitng deployments > > > > > we woudl need to recored a value for eveny current instance that was spawned before > > > > > this default was changed that had hw:numa_node without hw:mem_page_size so they kept the old behavior > > > > > and make sure that is cleared when the vm is next moved so it can have the new default > > > > > after a live migratoin. > > > > > > > > > > > > I could not find this point mentioned in any existing Openstack > > > > > > documentation: that we would have to set hw:mem_page_size explicitly > > > > > > if hw:numa_node is set. Also this slide at > > > > > > https://www.linux-kvm.org/images/0/0b/03x03-Openstackpdf.pdf kind of > > > > > > indicates that hw:mem_page_size `Default to small pages`. > > > > > it defaults to unset. > > > > > that results in small pages by default but its not the same as hw:mem_page_size=small > > > > > or hw:mem_page_size=any. 
> > > > > > > > > > > > > > > > > > > > > > Another question: Let's say a VM runs on one host's numa node #0. If > > > > > > we live-migrate this VM to another host, and that host's numa node #1 > > > > > > has more free memory, is it possible for this VM to land on the other > > > > > > host's numa node #1? > > > > > yes it is > > > > > on newer relsese we will prefer to balance the load across numa nodes > > > > > on older release nova woudl fill the first numa node then move to the second. > > > > > > > > > > > > On Thu, May 11, 2023 at 4:25?AM Sean Mooney wrote: > > > > > > > > > > > > > > On Wed, 2023-05-10 at 15:06 -0500, hai wu wrote: > > > > > > > > Is it possible to update something in the Openstack database for the > > > > > > > > relevant VMs in order to do the same, and then hard reboot the VM so > > > > > > > > that the VM would have this attribute? > > > > > > > not really adding the missing hw:mem_page_size requirement to the flavor chagnes the > > > > > > > requirements for node placement and numa affinity > > > > > > > so you really can only change this via resizing the vm to a new flavor > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 2:47?PM Sean Mooney wrote: > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 14:22 -0500, hai wu wrote: > > > > > > > > > > So there's no default value assumed/set for hw:mem_page_size for each > > > > > > > > > > flavor? > > > > > > > > > > > > > > > > > > > correct this is a known edgecase in the currnt design > > > > > > > > > hw:mem_page_size=any would be a resonable default but > > > > > > > > > techinially if just set hw:numa_nodes=1 nova allow memory over subscription > > > > > > > > > > > > > > > > > > in pratch if you try to do that you will almost always end up with vms > > > > > > > > > being killed due to OOM events. > > > > > > > > > > > > > > > > > > so from a api point of view it woudl be a change of behvior for use to default > > > > > > > > > to hw:mem_page_size=any but i think it would be the correct thign to do for operators > > > > > > > > > in the long run. > > > > > > > > > > > > > > > > > > i could bring this up with the core team again but in the past we > > > > > > > > > decided to be conservitive and just warn peopel to alwasy set > > > > > > > > > hw:mem_page_size if using numa affinity. > > > > > > > > > > > > > > > > > > > Yes https://bugs.launchpad.net/nova/+bug/1893121 is critical > > > > > > > > > > when using hw:numa_nodes=1. > > > > > > > > > > > > > > > > > > > > I did not hit an issue with 'hw:mem_page_size' not set, maybe I am > > > > > > > > > > missing some known test cases? It would be very helpful to have a test > > > > > > > > > > case where I could reproduce this issue with 'hw:numa_nodes=1' being > > > > > > > > > > set, but without 'hw:mem_page_size' being set. > > > > > > > > > > > > > > > > > > > > How to ensure this one for existing vms already running with > > > > > > > > > > 'hw:numa_nodes=1', but without 'hw:mem_page_size' being set? > > > > > > > > > you unfortuletly need to resize the instance. > > > > > > > > > tehre are some image porpeties you can set on an instance via nova-manage > > > > > > > > > but you cannot use nova-mange to update the enbedd flavor and set this. > > > > > > > > > > > > > > > > > > so you need to define a new flavour and resize. 
> > > > > > > > > > > > > > > > > > this is the main reason we have not changed the default as it may requrie you to > > > > > > > > > move instnace around if there placement is now invalid now that per numa node memory > > > > > > > > > allocatons are correctly being accounted for. > > > > > > > > > > > > > > > > > > if it was simple to change the default without any enduser or operator impact we would. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 1:47?PM Sean Mooney wrote: > > > > > > > > > > > > > > > > > > > > > > if you set hw:numa_nodes there are two things you should keep in mind > > > > > > > > > > > > > > > > > > > > > > first if hw:numa_nodes si set to any value incluing hw:numa_nodes=1 > > > > > > > > > > > then hw:mem_page_size shoudl also be defiend on the falvor. > > > > > > > > > > > > > > > > > > > > > > if you dont set hw:mem_page_size then the vam will be pinned to a host numa node > > > > > > > > > > > but the avaible memory on the host numa node will not be taken into account > > > > > > > > > > > > > > > > > > > > > > only the total free memory on the host so this almost always results in VMs being killed by the OOM reaper > > > > > > > > > > > in the kernel. > > > > > > > > > > > > > > > > > > > > > > i recomend setting hw:mem_page_size=small hw:mem_page_size=large or hw:mem_page_size=any > > > > > > > > > > > small will use your kernels default page size for guest memory, typically this is 4k pages > > > > > > > > > > > large will use any pages size other then the smallest that is avaiable (i.e. this will use hugepages) > > > > > > > > > > > and any will use small pages but allow the guest to request hugepages via the hw_page_size image property. > > > > > > > > > > > > > > > > > > > > > > hw:mem_page_size=any is the most flexable as a result but generally i recommend using hw:mem_page_size=small > > > > > > > > > > > and having a seperate flavor for hugepages. its really up to you. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the second thing to keep in mind is using expict numa toplolig8ies including hw:numa_nodes=1 > > > > > > > > > > > disables memory oversubsctipion. > > > > > > > > > > > > > > > > > > > > > > so you will not be able ot oversubscibe the memory on the host. > > > > > > > > > > > > > > > > > > > > > > in general its better to avoid memory oversubscribtion anyway but jsut keep that in mind. > > > > > > > > > > > you cant jsut allocate a buch of swap space and run vms at a 2:1 or higher memory over subscription ratio > > > > > > > > > > > if you are using numa affinity. > > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/the-numa-scheduling-story-in-nova/ > > > > > > > > > > > and > > > > > > > > > > > https://that.guru/blog/cpu-resources-redux/ > > > > > > > > > > > > > > > > > > > > > > are also good to read > > > > > > > > > > > > > > > > > > > > > > i do not think stephen has a dedicated block on the memory aspect > > > > > > > > > > > but https://bugs.launchpad.net/nova/+bug/1893121 covers some of the probelem that only setting > > > > > > > > > > > hw:numa_nodes=1 will casue. > > > > > > > > > > > > > > > > > > > > > > if you have vms with hw:numa_nodes=1 set and you do not have hw:mem_page_size set in the falvor or > > > > > > > > > > > hw_mem_page_size set in the image then that vm is not configure properly. 
> > > > > > > > > > > > > > > > > > > > > > On Wed, 2023-05-10 at 11:52 -0600, Alvaro Soto wrote: > > > > > > > > > > > > Another good resource =) > > > > > > > > > > > > > > > > > > > > > > > > https://that.guru/blog/cpu-resources/ > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:50?AM Alvaro Soto wrote: > > > > > > > > > > > > > > > > > > > > > > > > > I don't think so. > > > > > > > > > > > > > > > > > > > > > > > > > > ~~~ > > > > > > > > > > > > > The most common case will be that the admin only sets hw:numa_nodes and > > > > > > > > > > > > > then the flavor vCPUs and memory will be divided equally across the NUMA > > > > > > > > > > > > > nodes. When a NUMA policy is in effect, it is mandatory for the instance's > > > > > > > > > > > > > memory allocations to come from the NUMA nodes to which it is bound except > > > > > > > > > > > > > where overriden by hw:numa_mem.NN. > > > > > > > > > > > > > ~~~ > > > > > > > > > > > > > > > > > > > > > > > > > > Here are the implementation documents since Juno release: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/src/branch/master/specs/juno/implemented/virt-driver-numa-placement.rst > > > > > > > > > > > > > > > > > > > > > > > > > > https://opendev.org/openstack/nova-specs/commit/45252df4c54674d2ac71cd88154af476c4d510e1 > > > > > > > > > > > > > ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, May 10, 2023 at 11:31?AM hai wu wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Is there any concern to enable 'hw:numa_nodes=1' on all flavors, as > > > > > > > > > > > > > > long as that flavor can fit into one numa node? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > Alvaro Soto > > > > > > > > > > > > > > > > > > > > > > > > > > *Note: My work hours may not be your work hours. Please do not feel the > > > > > > > > > > > > > need to respond during a time that is not convenient for you.* > > > > > > > > > > > > > ---------------------------------------------------------- > > > > > > > > > > > > > Great people talk about ideas, > > > > > > > > > > > > > ordinary people talk about things, > > > > > > > > > > > > > small people talk... about other people. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From jamesleong123098 at gmail.com Mon May 15 22:04:20 2023 From: jamesleong123098 at gmail.com (James Leong) Date: Mon, 15 May 2023 17:04:20 -0500 Subject: [keystone][horizon][kolla-ansible] user access specific domain In-Reply-To: References: Message-ID: Thanks! I have also tried your example, it works the same as mine, except that it checked the user's email. However, I am curious if it is possible to login to an existing user on openstack via federated login. Best, James. On Sun, May 14, 2023 at 10:03?PM Nguy?n H?u Kh?i wrote: > Hello. This is my example. 
> > { > "local": [ > { > "user": { > "name": "{0}", > "email": "{1}" > }, > "group": { > "name": "your keystone group", > "domain": { > "name": "Default" > } > } > } > ], > "remote": [ > { > "type": "OIDC-preferred_username", > "any_one_of": [ > "xxx at gmail.com", > "xxx1 at gmail.com > ] > }, > { > "type": "OIDC-preferred_username" > }, > { > "type": "OIDC-email" > } > ] > } > > > Nguyen Huu Khoi > > > On Mon, May 15, 2023 at 5:41?AM James Leong > wrote: > >> Hi all, >> >> I am playing around with the domain in the yoga version of OpenStack >> using kolla-ansible as the deployment tool. I have set up Globus as my >> authentication tool. However, I am curious if it is possible to log in to >> an existing OpenStack user account via federated login (based on Gmail) >> >> In my case, first, I created a user named "James" in one of the domains >> called federated_login. When I attempt to log in, a new user is created in >> the default domain instead of the federated_login domain. Below is a sample >> of my globus.json. >> >> [{"local": [ >> { >> "user": { >> "name":"{0}, >> "email":"{2} >> }, >> "group":{ >> "name": "federated_user", >> "domain: {"name":"{1} >> } >> } >> ], >> "remote": [ >> { "type":"OIDC-name"}, >> { "type":"OIDC-organization"},{"type":"OIDC-email"} >> ] >> }] >> >> Apart from the above question, is there another easier way of restricting >> users from login in via federated? For example, allow only existing users >> on OpenStack with a specific email to access the OpenStack dashboard via >> federated login. >> >> Best Regards, >> James >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Tue May 16 07:41:07 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Tue, 16 May 2023 07:41:07 +0000 Subject: [tc] Technical Committee next weekly meeting on May 16, 2023 Message-ID: Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held on Tuesday, May 16, 2023 at 1800 UTC on #openstack-tc on OFTC IRC. I?m out of office this week, therefore this week?s meeting will be chaired by Jay Faulkner. Thank you Jay! The meeting?s agenda can be found on the wiki page at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting Thank you, Kristi Nikolla From skaplons at redhat.com Tue May 16 08:13:25 2023 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 16 May 2023 10:13:25 +0200 Subject: [neutron] CI meeting cancelled this week Message-ID: <2639132.cph77YBjOU@p1> Hi, I can't attend and chair today's Neutron CI meeting so we decided that we will cancel it if there will be no anything urgent. I think that CI in overall is running fine this week. Here's short summary of what I found: * Stadium projects - it seems that many projects may be affected by [1] - @Lajos, are You aware of those? * Grafana - it looks good this week * Rechecks - all good this week * Periodic jobs - looks good, only neutron-functional-with-sqlalchemy-master is still failing See You on the neutron team meeting today and on CI meeting next week. [1] https://review.opendev.org/c/openstack/neutron/+/879827 -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. 
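One note on the earlier federated-login question: the Keystone mapping engine can also map an assertion onto an existing local user instead of creating an ephemeral one, by setting "type": "local" and naming the user's domain. A rough sketch only, assuming the account was pre-created in the federated_login domain and that the OIDC preferred_username matches the Keystone user name (the attribute names mirror the examples above; everything else is illustrative):

    [
        {
            "local": [
                {
                    "user": {
                        "type": "local",
                        "name": "{0}",
                        "domain": { "name": "federated_login" }
                    }
                }
            ],
            "remote": [
                { "type": "OIDC-preferred_username" }
            ]
        }
    ]

Because a "local" user must already exist for the mapping to succeed, this also acts as the restriction asked about: identities with no matching pre-created user are rejected at login.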
From katonalala at gmail.com Tue May 16 08:28:12 2023 From: katonalala at gmail.com (Lajos Katona) Date: Tue, 16 May 2023 10:28:12 +0200 Subject: [neutron] CI meeting cancelled this week In-Reply-To: <2639132.cph77YBjOU@p1> References: <2639132.cph77YBjOU@p1> Message-ID: Hi, For the Neutron stadium projects, I started to push a series of patches for the recent issues; check this topic: https://review.opendev.org/q/topic:bug%252F2019097 Lajos (lajoskatona) Slawek Kaplonski wrote (on Tue, 16 May 2023 at 10:13): > Hi, > > I can't attend and chair today's Neutron CI meeting so we decided that we > will cancel it if there will be no anything urgent. > > I think that CI in overall is running fine this week. Here's short summary > of what I found: > > * *Stadium projects *- it seems that many projects may be affected by [1] > - @Lajos, are You aware of those? > > * *Grafana* - it looks good this week > > * *Rechecks *- all good this week > > * *Periodic jobs* - looks good, only *neutron-functional-with-sqlalchemy-master *is > still failing > > See You on the neutron team meeting today and on CI meeting next week. > > [1] https://review.opendev.org/c/openstack/neutron/+/879827 > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Tue May 16 08:45:12 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Tue, 16 May 2023 09:45:12 +0100 Subject: [Kolla] how to build kolla containers with kolla-build for specific branch Message-ID: Hi, Could you help me understand the process of building kolla containers from source? I am not a developer, but I want to be able to build containers quickly, especially when there is an urgent patch. I am using the Yoga branch, which is the 14th version; do I need to use the same version of kolla-build to build the containers, or does the version not matter? Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From maksim.malchuk at gmail.com Tue May 16 10:16:03 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Tue, 16 May 2023 13:16:03 +0300 Subject: [Kolla] how to build kolla containers with kolla-build for specific branch In-Reply-To: References: Message-ID: Hi Wodel, In your case, don't use a kolla version older than 14. Use the same or a newer version to build the containers. Use the official documentation: https://docs.openstack.org/kolla/latest/admin/image-building.html On Tue, May 16, 2023 at 11:52 AM wodel youchi wrote: > Hi, > > Could you help me understand the process of building kolla containers from > source? > > I am not a developer, but I want to be able to build containers quickly, > especially when there is an urgent patch. > > I am using the Yoga branch, which is the 14th version; do I need to use the > same version of kolla-build to build the containers, or does the version not matter? > > > > Regards. > -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed...
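To expand on that with a hedged example: one way to be certain the images are built from the Yoga series is to install the builder itself from the stable/yoga branch (kolla 14.x), so that its default source locations already point at Yoga, and then tag the images accordingly. Roughly, in a throwaway virtualenv:

    $ python3 -m venv ~/kolla-venv && source ~/kolla-venv/bin/activate
    $ git clone https://opendev.org/openstack/kolla -b stable/yoga
    $ pip install ./kolla

    # build one or more source-type images and tag them as yoga
    $ kolla-build --base ubuntu --type source --tag yoga horizon nova-api

The image list, base distro and registry options are deployment-specific; the point is that the branch of the kolla checkout (or the 14.x release from PyPI) is what determines which OpenStack release the source images default to.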
URL: From rlandy at redhat.com Tue May 16 10:50:23 2023 From: rlandy at redhat.com (Ronelle Landy) Date: Tue, 16 May 2023 06:50:23 -0400 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines Message-ID: Hello All, Removal of TripleO Zed Integration and Component Lines Per the decision to not maintain TripleO after the Zed release [1], the Zed integration and component lines are being removed in the following patches: https://review.rdoproject.org/r/c/config/+/48073 https://review.rdoproject.org/r/c/config/+/48074 https://review.rdoproject.org/r/c/rdo-jobs/+/48075 To be clear, please note that following these changes, the gate for stable/zed TripleO repos will no longer be updated or maintained. Per the earlier communications, there should be no more patches submitted for stable/zed TripleO repos, and any backports will go to stable/wallaby or stable/train. The last promoted release of Zed through TripleO is: https://trunk.rdoproject.org/centos9-zed/current-tripleo/delorean.repo (hash:61828177e94d5f179ee0885cf3eee102), which was promoted on 05/15/2023. The Ceph promotion lines related to Zed are also removed in the above patches. Check/gate testing for the master branch is in process of being removed as well (https://review.opendev.org/c/openstack/tripleo-ci/+/882759). [1] https://review.opendev.org/c/openstack/governance/+/878799 -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Tue May 16 12:56:44 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Tue, 16 May 2023 13:56:44 +0100 Subject: [Kolla] how to build kolla containers with kolla-build for specific branch In-Reply-To: References: Message-ID: Thanks, How can I be sure to be building containers for the right branch? should I use *--openstack-release yoga* with the command? Regards. Le mar. 16 mai 2023 ? 11:16, Maksim Malchuk a ?crit : > Hi Wodel, > > In your case don't use kolla<14 version. Use the same version or newer > version to build containers. > Use the official documentation: > https://docs.openstack.org/kolla/latest/admin/image-building.html > > > On Tue, May 16, 2023 at 11:52?AM wodel youchi > wrote: > >> Hi, >> >> Could you help me understand the process of building kolla containers >> from source? >> >> I am not a developer but I want to be to build containers rapidly >> especially when there is an urgent patch. >> >> I am using Yoga branch which is the 14th version, do I need to use the >> same version of kolla-build to build containers or it matters not? >> >> >> >> Regards. >> > > > -- > Regards, > Maksim Malchuk > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paoloemilio.mazzon at unipd.it Tue May 16 10:00:34 2023 From: paoloemilio.mazzon at unipd.it (Paolo Emilio Mazzon) Date: Tue, 16 May 2023 12:00:34 +0200 Subject: [neutron] policy rules: filter on name field Message-ID: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> Hello, I'm trying to understand if this is feasible: I would like to avoid a regular user from tampering the "default" security group of a project. Specifically I would like to prevent him from deleting sg rules *from the default sg only* I can wite a policy.yaml like this # Delete a security group rule # DELETE /security-group-rules/{id} # Intended scope(s): project "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s" but this is sub-optimal since the regular member can still *add* rules... 
Is it possible to create a rule like "sg_is_default" : ...the sg group whose name is 'default' so I can write "delete_security_group_rule": "not rule:sg_is_default" ? Thanks! Paolo -- Paolo Emilio Mazzon System and Network Administrator paoloemilio.mazzon[at]unipd.it PNC - Padova Neuroscience Center https://www.pnc.unipd.it Via Orus 2/B - 35131 Padova, Italy +39 049 821 2624 From mwatkins at linuxfoundation.org Tue May 16 10:31:28 2023 From: mwatkins at linuxfoundation.org (Matt Watkins) Date: Tue, 16 May 2023 11:31:28 +0100 Subject: Devstack and SQLAlchemy (Problems with stack.sh) Message-ID: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> Folks, I?m having problems running the devstack install script (stack.sh) under an Ubuntu-22.04.2 VM. The same point in the script is causing the install to abort, but the exact error depends on some Python dependencies. Anybody have a possible solution to this; it looks like some kind of Python dependency issue? Error with SQLAlchemy==1.4.48 +lib/keystone:init_keystone:489 /usr/local/bin/keystone-manage --config-file /etc/keystone/keystone.conf db_sync CRITICAL keystone [-] Unhandled error: sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter ERROR keystone File "/home/devstack/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 343, in load ERROR keystone raise exc.NoSuchModuleError( ERROR keystone sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter Error with SQLAlchemy==2.0.12 +lib/keystone:init_keystone:489 /usr/local/bin/keystone-manage --config-file /etc/keystone/keystone.conf db_sync Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/migrate/changeset/databases/sqlite.py", line 13, in from sqlalchemy.databases import sqlite as sa_base ModuleNotFoundError: No module named 'sqlalchemy.databases' Thanks in advance, - Matt From fungi at yuggoth.org Tue May 16 14:04:45 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 16 May 2023 14:04:45 +0000 Subject: [dev][qa] Devstack and SQLAlchemy (Problems with stack.sh) In-Reply-To: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> References: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> Message-ID: <20230516140444.kxepai4chfthmntd@yuggoth.org> [I'm keeping your address in Cc since you don't appear to be subscribed, but please reply to the list rather than to me directly.] On 2023-05-16 11:31:28 +0100 (+0100), Matt Watkins wrote: [...] > Anybody have a possible solution to this; it looks like some kind > of Python dependency issue? > > Error with SQLAlchemy==1.4.48 [...] It looks like the constraints file isn't being applied when pip is deciding what version of SQLA to install, since we currently pin it to 1.4.41 for master branches: https://opendev.org/openstack/requirements/src/commit/19eb2d2/upper-constraints.txt#L156 I don't know if that's the reason for the exception you quoted, but it's probably a good place to start looking. When I check example output from one of our CI jobs, I see it's passing "-c /opt/stack/requirements/upper-constraints.txt" in pip install commands, including when it installs keystone, which (for the master branch of devstack anyway) currently results in selecting SQLAlchemy-1.4.41-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl to satisfy the requirement. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From stephenfin at redhat.com Tue May 16 14:05:44 2023 From: stephenfin at redhat.com (Stephen Finucane) Date: Tue, 16 May 2023 15:05:44 +0100 Subject: Devstack and SQLAlchemy (Problems with stack.sh) In-Reply-To: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> References: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> Message-ID: <45115d1113e2f292b06625360d45d503d7ba6fdc.camel@redhat.com> On Tue, 2023-05-16 at 11:31 +0100, Matt Watkins wrote: > Folks, > > I?m having problems running the devstack install script (stack.sh) under an Ubuntu-22.04.2 VM. > > The same point in the script is causing the install to abort, but the exact error depends on some Python dependencies. > > Anybody have a possible solution to this; it looks like some kind of Python dependency issue? > > Error with SQLAlchemy==1.4.48 > > +lib/keystone:init_keystone:489 /usr/local/bin/keystone-manage --config-file /etc/keystone/keystone.conf db_sync > CRITICAL keystone [-] Unhandled error: sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter > ERROR keystone File "/home/devstack/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 343, in load > ERROR keystone raise exc.NoSuchModuleError( > ERROR keystone sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter > > Error with SQLAlchemy==2.0.12 > > +lib/keystone:init_keystone:489 /usr/local/bin/keystone-manage --config-file /etc/keystone/keystone.conf db_sync > Traceback (most recent call last): > File "/usr/local/lib/python3.10/dist-packages/migrate/changeset/databases/sqlite.py", line 13, in > from sqlalchemy.databases import sqlite as sa_base > ModuleNotFoundError: No module named 'sqlalchemy.databases' > > Thanks in advance, > > - Matt > OpenStack as a whole doesn't support SQLAlchemy 2.x yet, so that's the reason for the second failure. Regarding the former (the dbcounter plugin error), has the plugin been installed (you should see it in 'pip freeze')? We're not seeing this in the gate so my guess would be that this is something environmental. Stephen From dms at danplanet.com Tue May 16 14:08:13 2023 From: dms at danplanet.com (Dan Smith) Date: Tue, 16 May 2023 07:08:13 -0700 Subject: Devstack and SQLAlchemy (Problems with stack.sh) In-Reply-To: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> References: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> Message-ID: <325C0543-F49C-4C55-977A-78E72385823C@danplanet.com> > Error with SQLAlchemy==1.4.48 > > +lib/keystone:init_keystone:489 /usr/local/bin/keystone-manage --config-file /etc/keystone/keystone.conf db_sync > CRITICAL keystone [-] Unhandled error: sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter > ERROR keystone File "/home/devstack/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 343, in load > ERROR keystone raise exc.NoSuchModuleError( > ERROR keystone sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter This is an internal-to-devstack python module.
Not sure why it?s not installing for you, but the easy button is to just disable its use in your localrc: MYSQL_GATHER_PERFORMANCE=False -?Dan From fungi at yuggoth.org Tue May 16 14:15:15 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 16 May 2023 14:15:15 +0000 Subject: [dev][qa] Devstack and SQLAlchemy (Problems with stack.sh) In-Reply-To: <325C0543-F49C-4C55-977A-78E72385823C@danplanet.com> References: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> <325C0543-F49C-4C55-977A-78E72385823C@danplanet.com> Message-ID: <20230516141515.myf5qayqmforhoip@yuggoth.org> On 2023-05-16 07:08:13 -0700 (-0700), Dan Smith wrote: > > Error with SQLAlchemy==1.4.48 > > > > +lib/keystone:init_keystone:489 /usr/local/bin/keystone-manage --config-file /etc/keystone/keystone.conf db_sync > > CRITICAL keystone [-] Unhandled error: sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter > > ERROR keystone File "/home/devstack/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 343, in load > > ERROR keystone raise exc.NoSuchModuleError( > > ERROR keystone sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter > > This is an internal-to-devstack python module. Not sure why it?s > not installing for you, but the easy button is to just disable its > use in your localrc: > > MYSQL_GATHER_PERFORMANCE=False Also https://review.opendev.org/876601 purports to fix that exception under SQLAlchemy 2.x, but is only applied on the master branch (Matt didn't tell us what branch this occurred for). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From skaplons at redhat.com Tue May 16 14:25:52 2023 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 16 May 2023 16:25:52 +0200 Subject: [neutron] policy rules: filter on name field In-Reply-To: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> References: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> Message-ID: <2957175.lYqTQsUpk7@p1> Hi, Dnia wtorek, 16 maja 2023 12:00:34 CEST Paolo Emilio Mazzon pisze: > Hello, > > I'm trying to understand if this is feasible: I would like to avoid a regular user from > tampering the "default" security group of a project. Specifically I would like to prevent > him from deleting sg rules *from the default sg only* > > I can wite a policy.yaml like this > > # Delete a security group rule > # DELETE /security-group-rules/{id} > # Intended scope(s): project > "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s" > > but this is sub-optimal since the regular member can still *add* rules... > > Is it possible to create a rule like > > "sg_is_default" : ...the sg group whose name is 'default' > > so I can write > > "delete_security_group_rule": "not rule:sg_is_default" ? > > Thanks! I'm not sure but I will try to check it later today or tomorrow morning and will let You know if that is possible or not. > > Paolo > > -- > Paolo Emilio Mazzon > System and Network Administrator > > paoloemilio.mazzon[at]unipd.it > > PNC - Padova Neuroscience Center > https://www.pnc.unipd.it > Via Orus 2/B - 35131 Padova, Italy > +39 049 821 2624 > > -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. 
URL: From Tony.Saad at dell.com Tue May 16 15:11:56 2023 From: Tony.Saad at dell.com (Saad, Tony) Date: Tue, 16 May 2023 15:11:56 +0000 Subject: Discuss Fix for Bug #2003179 Message-ID: Hello, I am reaching out to start a discussion about Bug #2003179 https://bugs.launchpad.net/cinder/+bug/2003179 The password is getting leaked in plain text from https://opendev.org/openstack/oslo.privsep/src/commit/9c026804de74ae23a60ab3c9565d0c689b2b4579/oslo_privsep/daemon.py#L501. This logger line does not always contain a password so using mask_password() and mask_dict_password() from https://docs.openstack.org/oslo.utils/latest/reference/strutils.html is probably not the best solution. Anyone have any thoughts on how to stop the password from appearing in plain text? Thanks, Tony Internal Use - Confidential -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Tue May 16 16:07:20 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 16 May 2023 12:07:20 -0400 Subject: [Kolla] how to build kolla containers with kolla-build for specific branch In-Reply-To: References: Message-ID: This is what you should do. someone correct me if I'm wrong. ## If you looking for stable release then use following, otherwise you can checkout specific tag also like 14.x.x $ git clone https://opendev.org/openstack/kolla -b stable/zed ## Example to build multiple images or specific. $ kolla-build --registry 10.10.1.100:4000 -b ubuntu -t source -T 16 --tag zed fluentd kolla-toolbox cron chrony memcached mariadb rabbitmq dnsmasq keepalived haproxy ## If you want to customize image do following, Create template-overrides.j2 file and add following (In my example its horizon) {% extends parent_template %} # Horizon {% block horizon_ubuntu_source_setup %} RUN apt update -y RUN apt install -y net-tools vim RUN touch /root/foobar.txt {% endblock %} ## Run to build a custom horizon image. $ kolla-build --registry 10.10.1.100:4000 -b ubuntu -t source --tag zed-1 --template-override template-overrides.j2 horizon On Tue, May 16, 2023 at 8:59?AM wodel youchi wrote: > Thanks, > > How can I be sure to be building containers for the right branch? should I > use *--openstack-release yoga* with the command? > > Regards. > > Le mar. 16 mai 2023 ? 11:16, Maksim Malchuk a > ?crit : > >> Hi Wodel, >> >> In your case don't use kolla<14 version. Use the same version or newer >> version to build containers. >> Use the official documentation: >> https://docs.openstack.org/kolla/latest/admin/image-building.html >> >> >> On Tue, May 16, 2023 at 11:52?AM wodel youchi >> wrote: >> >>> Hi, >>> >>> Could you help me understand the process of building kolla containers >>> from source? >>> >>> I am not a developer but I want to be to build containers rapidly >>> especially when there is an urgent patch. >>> >>> I am using Yoga branch which is the 14th version, do I need to use the >>> same version of kolla-build to build containers or it matters not? >>> >>> >>> >>> Regards. >>> >> >> >> -- >> Regards, >> Maksim Malchuk >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Tue May 16 16:46:42 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Tue, 16 May 2023 17:46:42 +0100 Subject: [Kolla] how to build kolla containers with kolla-build for specific branch In-Reply-To: References: Message-ID: Hi, Thanks for all these details. 
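For reference, a quick way to double-check which release the build tooling itself corresponds to before kicking off a build (a small sketch, assuming kolla was pip-installed into the virtualenv and/or the repo was cloned with -b stable/yoga, the Yoga equivalent of the stable/zed example above):

$ pip show kolla                        ## 14.x.y is the Yoga series
$ git -C kolla branch --show-current    ## should print stable/yoga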
I am using a virtual environment for kolla-build in my home directory, where should I execute the git command in the root path of my home dir or inside the venv? And if in the venv, where? Regards. On Tue, May 16, 2023, 17:07 Satish Patel wrote: > This is what you should do. someone correct me if I'm wrong. > > ## If you looking for stable release then use following, otherwise you can > checkout specific tag also like 14.x.x > $ git clone https://opendev.org/openstack/kolla -b stable/zed > > ## Example to build multiple images or specific. > $ kolla-build --registry 10.10.1.100:4000 -b ubuntu -t source -T 16 --tag > zed fluentd kolla-toolbox cron chrony memcached mariadb rabbitmq dnsmasq > keepalived haproxy > > ## If you want to customize image do following, Create > template-overrides.j2 file and add following (In my example its horizon) > > {% extends parent_template %} > > # Horizon > {% block horizon_ubuntu_source_setup %} > RUN apt update -y > RUN apt install -y net-tools vim > RUN touch /root/foobar.txt > {% endblock %} > > ## Run to build a custom horizon image. > $ kolla-build --registry 10.10.1.100:4000 -b ubuntu -t source --tag zed-1 > --template-override template-overrides.j2 horizon > > On Tue, May 16, 2023 at 8:59?AM wodel youchi > wrote: > >> Thanks, >> >> How can I be sure to be building containers for the right branch? should >> I use *--openstack-release yoga* with the command? >> >> Regards. >> >> Le mar. 16 mai 2023 ? 11:16, Maksim Malchuk a >> ?crit : >> >>> Hi Wodel, >>> >>> In your case don't use kolla<14 version. Use the same version or newer >>> version to build containers. >>> Use the official documentation: >>> https://docs.openstack.org/kolla/latest/admin/image-building.html >>> >>> >>> On Tue, May 16, 2023 at 11:52?AM wodel youchi >>> wrote: >>> >>>> Hi, >>>> >>>> Could you help me understand the process of building kolla containers >>>> from source? >>>> >>>> I am not a developer but I want to be to build containers rapidly >>>> especially when there is an urgent patch. >>>> >>>> I am using Yoga branch which is the 14th version, do I need to use the >>>> same version of kolla-build to build containers or it matters not? >>>> >>>> >>>> >>>> Regards. >>>> >>> >>> >>> -- >>> Regards, >>> Maksim Malchuk >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Tue May 16 17:11:02 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 16 May 2023 13:11:02 -0400 Subject: [Kolla] how to build kolla containers with kolla-build for specific branch In-Reply-To: References: Message-ID: You can run it anywhere you want, it's just a checkout of the kolla-build binary. I did in /opt $ git clone https://opendev.org/openstack/kolla -b stable/zed $ cd /opt/kolla/tools $ ./build.py --registry 10.10.1.100:4000 -b ubuntu -t source --tag zed-1 --template-override template-overrides.j2 horizon On Tue, May 16, 2023 at 12:46?PM wodel youchi wrote: > Hi, > > Thanks for all these details. > I am using a virtual environment for kolla-build in my home directory, > where should I execute the git command in the root path of my home dir or > inside the venv? > And if in the venv, where? > > Regards. > > On Tue, May 16, 2023, 17:07 Satish Patel wrote: > >> This is what you should do. someone correct me if I'm wrong. 
>> >> ## If you looking for stable release then use following, otherwise you >> can checkout specific tag also like 14.x.x >> $ git clone https://opendev.org/openstack/kolla -b stable/zed >> >> ## Example to build multiple images or specific. >> $ kolla-build --registry 10.10.1.100:4000 -b ubuntu -t source -T 16 >> --tag zed fluentd kolla-toolbox cron chrony memcached mariadb rabbitmq >> dnsmasq keepalived haproxy >> >> ## If you want to customize image do following, Create >> template-overrides.j2 file and add following (In my example its horizon) >> >> {% extends parent_template %} >> >> # Horizon >> {% block horizon_ubuntu_source_setup %} >> RUN apt update -y >> RUN apt install -y net-tools vim >> RUN touch /root/foobar.txt >> {% endblock %} >> >> ## Run to build a custom horizon image. >> $ kolla-build --registry 10.10.1.100:4000 -b ubuntu -t source --tag >> zed-1 --template-override template-overrides.j2 horizon >> >> On Tue, May 16, 2023 at 8:59?AM wodel youchi >> wrote: >> >>> Thanks, >>> >>> How can I be sure to be building containers for the right branch? should >>> I use *--openstack-release yoga* with the command? >>> >>> Regards. >>> >>> Le mar. 16 mai 2023 ? 11:16, Maksim Malchuk >>> a ?crit : >>> >>>> Hi Wodel, >>>> >>>> In your case don't use kolla<14 version. Use the same version or newer >>>> version to build containers. >>>> Use the official documentation: >>>> https://docs.openstack.org/kolla/latest/admin/image-building.html >>>> >>>> >>>> On Tue, May 16, 2023 at 11:52?AM wodel youchi >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> Could you help me understand the process of building kolla containers >>>>> from source? >>>>> >>>>> I am not a developer but I want to be to build containers rapidly >>>>> especially when there is an urgent patch. >>>>> >>>>> I am using Yoga branch which is the 14th version, do I need to use the >>>>> same version of kolla-build to build containers or it matters not? >>>>> >>>>> >>>>> >>>>> Regards. >>>>> >>>> >>>> >>>> -- >>>> Regards, >>>> Maksim Malchuk >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From lokendrarathour at gmail.com Tue May 16 17:11:27 2023 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Tue, 16 May 2023 22:41:27 +0530 Subject: [Port Creation failed] - openstack Wallaby Message-ID: Hi All, Was trying to create OpenStack VM in OpenStack wallaby release, not able to create VM, it is failing because of Port not getting created. The error that we are getting: nova-compute.log: 2023-05-16 18:15:35.495 7 INFO nova.compute.provider_config [req-faaf38e7-b5ee-43d1-9303-d508285f5ab7 - - - - -] No provider configs found in /etc/nova/provider_config. If files are present, ensure the Nova process has access. 
2023-05-16 18:15:35.549 7 ERROR nova.cmd.common [req-8842f11c-fe5a-4ad3-92ea-a6898f482bf0 - - - - -] No db access allowed in nova-compute: File "/usr/bin/nova-compute", line 10, in sys.exit(main()) File "/usr/lib/python3.6/site-packages/nova/cmd/compute.py", line 59, in main topic=compute_rpcapi.RPC_TOPIC) File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, in create utils.raise_if_old_compute() File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1068, in raise_if_old_compute ctxt, ['nova-compute']) File "/usr/lib/python3.6/site-packages/nova/objects/service.py", line 563, in get_minimum_version_all_cells binaries) File "/usr/lib/python3.6/site-packages/nova/context.py", line 544, in scatter_gather_all_cells fn, *args, **kwargs) File "/usr/lib/python3.6/site-packages/nova/context.py", line 432, in scatter_gather_cells with target_cell(context, cell_mapping) as cctxt: File "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__ return next(self.gen) neutron/ovn-metadata-agent.log 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command 2023-05-16 22:36:41.876 45204 ERROR ovsdbapp.backend.ovs_idl.transaction [-] Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 131, in run txn.results.put(txn.do_commit()) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 93, in do_commit command.run_idl(txn) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 172, in run_idl record = self.api.lookup(self.table, self.record) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 208, in lookup return self._lookup(table, record) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 268, in _lookup row = idlutils.row_by_value(self, rl.table, rl.column, record) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 114, in row_by_value raise RowNotFound(table=table, col=column, match=match) ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 any input to help get this issue fixed would be of great help. thanks -- ~ Lokendra skype: lokendrarathour -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Tue May 16 17:15:23 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Tue, 16 May 2023 19:15:23 +0200 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines In-Reply-To: References: Message-ID: Hi Ronelle, It's actually not what was discussed and decided. After Zed, means starting from Antelope, which is the release after Zed. Zed release should still be able to accept patches and interested parties are allowed to contribute to the branch until it goes to the Extended Maintenance according to the release schedule [1]. So there can be no active contributions to the Zed release, but CI or gating should not be dropped on purpose to prevent any interested party on contribute to the branch. [1] https://releases.openstack.org/ ??, 16 ??? 
2023 ?., 12:53 Ronelle Landy : > Hello All, > > Removal of TripleO Zed Integration and Component Lines > > Per the decision to not maintain TripleO after the Zed release [1], the > Zed integration and component lines are being removed in the following > patches: > > https://review.rdoproject.org/r/c/config/+/48073 > > https://review.rdoproject.org/r/c/config/+/48074 > > https://review.rdoproject.org/r/c/rdo-jobs/+/48075 > > To be clear, please note that following these changes, the gate for > stable/zed TripleO repos will no longer be updated or maintained. Per the > earlier communications, there should be no more patches submitted for > stable/zed TripleO repos, and any backports will go to stable/wallaby or > stable/train. > > The last promoted release of Zed through TripleO is: > https://trunk.rdoproject.org/centos9-zed/current-tripleo/delorean.repo > (hash:61828177e94d5f179ee0885cf3eee102), > which was promoted on 05/15/2023. > > The Ceph promotion lines related to Zed are also removed in the above > patches. > > Check/gate testing for the master branch is in process of being removed as > well (https://review.opendev.org/c/openstack/tripleo-ci/+/882759). > > > [1] https://review.opendev.org/c/openstack/governance/+/878799 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ken at jots.org Tue May 16 17:21:47 2023 From: ken at jots.org (Ken D'Ambrosio) Date: Tue, 16 May 2023 13:21:47 -0400 Subject: RMQ + Juno Message-ID: <6fa256068486ac9a2d2d760142552eda@jots.org> Yeah, I know. 173 versions out of date. But we've got an old cloud kicking around and it's having some weird problems, and we're trying to eliminate things. One thing is RMQ -- we were wondering if anyone knows what would happen if we flushed all the queues; we'd kinda like to start that from a clean slate. Is that a Bad Idea(tm), or would stuff simply re-populate organically? Thanks! -Ken From ozzzo at yahoo.com Tue May 16 17:29:04 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Tue, 16 May 2023 17:29:04 +0000 (UTC) Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: <1577432210.483872.1683917817875@mail.yahoo.com> References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> Message-ID: <1979459451.2557820.1684258144482@mail.yahoo.com> What's the recommended method for rebooting controllers? Do we need to use the "remove from cluster" and "add to cluster" procedures or is there a better way? https://docs.openstack.org/kolla-ansible/train/user/adding-and-removing-hosts.html On Friday, May 12, 2023, 03:04:26 PM EDT, Albert Braden wrote: We use keepalived and exabgp to manage failover for haproxy. That works but it takes a few minutes, and during those few minutes customers experience impact. We tell them to not build/delete VMs during patching, but they still do, and then complain about the failures. We're planning to experiment with adding a "manual" haproxy failover to our patching automation, but I'm wondering if there is anything on the controller that needs to be failed over or disabled before rebooting the KVM. I looked at the "remove from cluster" and "add to cluster" procedures but that seems unnecessarily cumbersome for rebooting the KVM. On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block wrote: Hi Albert, how is your haproxy placement controlled, something like pacemaker or? 
similar? I would always do a failover when I'm aware of interruptions? (maintenance window), that should speed things up for clients. We have? a pacemaker controlled HA control plane, it takes more time until? pacemaker realizes that the resource is gone if I just rebooted a? server without failing over. I have no benchmarks though. There's? always a risk of losing a couple of requests during the failover but? we didn't have complaints yet, I believe most of the components try to? resend the lost messages. In one of our customer's cluster with many? resources (they also use terraform) I haven't seen issues during a? regular maintenance window. When they had a DNS outage a few months? back it resulted in a mess, manual cleaning was necessary, but the? regular failovers seem to work just fine. And I don't see rabbitmq issues either after rebooting a server,? usually the haproxy (and virtual IP) failover suffice to prevent? interruptions. Regards, Eugen Zitat von Satish Patel : > Are you running your stack on top of the kvm virtual machine? How many > controller nodes do you have? mostly rabbitMQ causing issues if you restart > controller nodes. > > On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: > >> We have our haproxy and controller nodes on KVM hosts. When those KVM >> hosts are restarted, customers who are building or deleting VMs see impact. >> VMs may go into error status, fail to get DNS records, fail to delete, etc. >> The obvious reason is because traffic that is being routed to the haproxy >> on the restarting KVM is lost. If we manually fail over haproxy before >> restarting the KVM, will that be sufficient to stop traffic being lost, or >> do we also need to do something with the controller? >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue May 16 18:05:51 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 16 May 2023 18:05:51 +0000 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines In-Reply-To: References: Message-ID: <20230516180550.6truc7mechk2xffv@yuggoth.org> On 2023-05-16 19:15:23 +0200 (+0200), Dmitriy Rabotyagov wrote: > It's actually not what was discussed and decided. > > After Zed, means starting from Antelope, which is the release after Zed. > Zed release should still be able to accept patches and interested parties > are allowed to contribute to the branch until it goes to the Extended > Maintenance according to the release schedule [1]. > > So there can be no active contributions to the Zed release, but CI or > gating should not be dropped on purpose to prevent any interested party on > contribute to the branch. > > [1] https://releases.openstack.org/ [...] Just to be clear, the concern you're raising is over the tripleo-ci change[*] that's removing jobs in the upstream CI for the stable/zed branch? Also worth noting, it looks like these jobs skipped multiple releases previously, since they're currently only run for stable/train, stable/wallaby and stable/zed. [*] https://review.opendev.org/882759 -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From crazy+kolla at geeks.pl Tue May 16 18:16:12 2023 From: crazy+kolla at geeks.pl (Ryszard Mielcarek) Date: Tue, 16 May 2023 20:16:12 +0200 Subject: RMQ + Juno In-Reply-To: <6fa256068486ac9a2d2d760142552eda@jots.org> References: <6fa256068486ac9a2d2d760142552eda@jots.org> Message-ID: <20230516201545.3ab64a54@brain> Dnia 2023-05-16, o godz. 13:21:47 Ken D'Ambrosio napisa?(a): > Yeah, I know. 173 versions out of date. But we've got an old cloud > kicking around and it's having some weird problems, and we're trying > to eliminate things. One thing is RMQ -- we were wondering if anyone > knows what would happen if we flushed all the queues; we'd kinda like > to start that from a clean slate. Is that a Bad Idea(tm), or would > stuff simply re-populate organically? Heyo, not sure about Juno and your exact configuration, but in general - it's OK to flush everything with stopping rabbit cluster - I did it many times in different clusters/releases. Only please make sure you are not in the middle of creating/changing anything - these operations probably will fail. hint: if you are not sure and want to test it before, check i.e. kolla-ansible and all-in-one deployment in VM or something similar just for devel/test mini-cluster. crazik From eblock at nde.ag Tue May 16 18:17:03 2023 From: eblock at nde.ag (Eugen Block) Date: Tue, 16 May 2023 18:17:03 +0000 Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: <1979459451.2557820.1684258144482@mail.yahoo.com> References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> <1979459451.2557820.1684258144482@mail.yahoo.com> Message-ID: <20230516181703.Horde.xQASX_23dzWCwbxddBcHuLa@webmail.nde.ag> Hi Albert, sorry, I'm swamped with different stuff right now. I just took a glance at the docs you mentioned and it seems way too much for something simple as a controller restart to actually remove hosts, that should definitely not be necessary. I'm not familiar with kolla or exabgp, but can you describe what exactly takes that long to failover? Maybe that could be improved? And can you limit the failing requests to a specific service (volumes, network ports, etc.) or do they all fail? Maybe rabbitmq should be considered after all, you could share your rabbitmq settings from the different openstack services and I will collect mine to compare. And then also the rabbitmq config (policies, vhosts, queues). Regards, Eugen Zitat von Albert Braden : > What's the recommended method for rebooting controllers? Do we need > to use the "remove from cluster" and "add to cluster" procedures or > is there a better way? > > https://docs.openstack.org/kolla-ansible/train/user/adding-and-removing-hosts.html > On Friday, May 12, 2023, 03:04:26 PM EDT, Albert Braden > wrote: > > We use keepalived and exabgp to manage failover for haproxy. That > works but it takes a few minutes, and during those few minutes > customers experience impact. We tell them to not build/delete VMs > during patching, but they still do, and then complain about the > failures. > > We're planning to experiment with adding a "manual" haproxy failover > to our patching automation, but I'm wondering if there is anything > on the controller that needs to be failed over or disabled before > rebooting the KVM. 
I looked at the "remove from cluster" and "add to > cluster" procedures but that seems unnecessarily cumbersome for > rebooting the KVM. > On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block > wrote: > > Hi Albert, > > how is your haproxy placement controlled, something like pacemaker or? > similar? I would always do a failover when I'm aware of interruptions? > (maintenance window), that should speed things up for clients. We have? > a pacemaker controlled HA control plane, it takes more time until? > pacemaker realizes that the resource is gone if I just rebooted a? > server without failing over. I have no benchmarks though. There's? > always a risk of losing a couple of requests during the failover but? > we didn't have complaints yet, I believe most of the components try to? > resend the lost messages. In one of our customer's cluster with many? > resources (they also use terraform) I haven't seen issues during a? > regular maintenance window. When they had a DNS outage a few months? > back it resulted in a mess, manual cleaning was necessary, but the? > regular failovers seem to work just fine. > And I don't see rabbitmq issues either after rebooting a server,? > usually the haproxy (and virtual IP) failover suffice to prevent? > interruptions. > > Regards, > Eugen > > Zitat von Satish Patel : > >> Are you running your stack on top of the kvm virtual machine? How many >> controller nodes do you have? mostly rabbitMQ causing issues if you restart >> controller nodes. >> >> On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: >> >>> We have our haproxy and controller nodes on KVM hosts. When those KVM >>> hosts are restarted, customers who are building or deleting VMs see impact. >>> VMs may go into error status, fail to get DNS records, fail to delete, etc. >>> The obvious reason is because traffic that is being routed to the haproxy >>> on the restarting KVM is lost. If we manually fail over haproxy before >>> restarting the KVM, will that be sufficient to stop traffic being lost, or >>> do we also need to do something with the controller? >>> >>> From smooney at redhat.com Tue May 16 18:31:30 2023 From: smooney at redhat.com (Sean Mooney) Date: Tue, 16 May 2023 19:31:30 +0100 Subject: Discuss Fix for Bug #2003179 In-Reply-To: References: Message-ID: <818b05618c97e49b8707aa738e381fc7f0db8619.camel@redhat.com> i would proably fix thei the way we did in nova we instaled a log filter that prevents the preives deams logs at debug level form being logged. https://github.com/openstack/nova/blob/master/nova/config.py#L78-L80 https://github.com/openstack/nova/commit/86a8aac0d76fa149b5e43c73b31227fbcf427278 cinder should also insatll a log filter to only log privsep log at info by default On Tue, 2023-05-16 at 15:11 +0000, Saad, Tony wrote: > Hello, > > I am reaching out to start a discussion about Bug #2003179 https://bugs.launchpad.net/cinder/+bug/2003179 > > The password is getting leaked in plain text from https://opendev.org/openstack/oslo.privsep/src/commit/9c026804de74ae23a60ab3c9565d0c689b2b4579/oslo_privsep/daemon.py#L501. This logger line does not always contain a password so using mask_password() and mask_dict_password() from https://docs.openstack.org/oslo.utils/latest/reference/strutils.html is probably not the best solution. > Anyone have any thoughts on how to stop the password from appearing in plain text? 
> > Thanks, > Tony > > > Internal Use - Confidential From smooney at redhat.com Tue May 16 18:44:00 2023 From: smooney at redhat.com (Sean Mooney) Date: Tue, 16 May 2023 19:44:00 +0100 Subject: RMQ + Juno In-Reply-To: <20230516201545.3ab64a54@brain> References: <6fa256068486ac9a2d2d760142552eda@jots.org> <20230516201545.3ab64a54@brain> Message-ID: <776b07a7964539621975dc458a580a099796fddb.camel@redhat.com> On Tue, 2023-05-16 at 20:16 +0200, Ryszard Mielcarek wrote: > Dnia 2023-05-16, o godz. 13:21:47 > Ken D'Ambrosio napisa?(a): > > > Yeah, I know. 173 versions out of date. But we've got an old cloud > > kicking around and it's having some weird problems, and we're trying > > to eliminate things. One thing is RMQ -- we were wondering if anyone > > knows what would happen if we flushed all the queues; we'd kinda like > > to start that from a clean slate. Is that a Bad Idea(tm), or would > > stuff simply re-populate organically? > > Heyo, > not sure about Juno and your exact configuration, but in general - it's > OK to flush everything with stopping rabbit cluster - I did it many > times in different clusters/releases. flushing the queue will cause any inflight request to be lost and will break any operations that are happening on cloud resouces. i.e. level vm in booting for ever or volumes in attaching so its really only ok to do if the opensstack service are stopped or you are sure there are no rpcs in progress openstack is not built to tollerage flush the queue while its running > Only please make sure you are not in the middle of creating/changing > anything - these operations probably will fail. > > hint: if you are not sure and want to test it before, check i.e. > kolla-ansible and all-in-one deployment in VM or something similar just > for devel/test mini-cluster. > > crazik > > From satish.txt at gmail.com Tue May 16 18:51:11 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 16 May 2023 14:51:11 -0400 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend Message-ID: Folks, I have ceph storage for my openstack and configure cinder-volume and cinder-backup service for my disaster solution. I am trying to use the cinder-backup incremental option to save storage space but somehow It doesn't work the way it should work. Whenever I take incremental backup it shows a similar size of original volume. Technically It should be smaller. Question is does ceph support incremental backup with cinder? I am running a Yoga release. 
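For what it's worth, the Size column on a backup is the size of the source volume in GiB, not the space actually stored, so a full and an incremental of a 10G volume will both report 10. The space really consumed by the chain can be checked on the ceph side; a small sketch, assuming the ceph backup driver with the default "backups" pool name:

$ rbd du -p backups                                  ## provisioned vs used per image/snapshot
$ openstack volume backup show spatel-vol-backup-1   ## check the is_incremental field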
$ openstack volume list +--------------------------------------+------------+------------+------+-------------------------------------+ | ID | Name | Status | Size | Attached to | +--------------------------------------+------------+------------+------+-------------------------------------+ | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 | Attached to spatel-foo on /dev/sdc | +--------------------------------------+------------+------------+------+-------------------------------------+ ### Create full backup $ openstack volume backup create --name spatel-vol-backup spatel-vol --force +-------+--------------------------------------+ | Field | Value | +-------+--------------------------------------+ | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | | name | spatel-vol-backup | +-------+--------------------------------------+ ### Create incremental $ openstack volume backup create --name spatel-vol-backup-1 --incremental --force spatel-vol +-------+--------------------------------------+ | Field | Value | +-------+--------------------------------------+ | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | | name | spatel-vol-backup-1 | +-------+--------------------------------------+ $ openstack volume backup list +--------------------------------------+---------------------+-------------+-----------+------+ | ID | Name | Description | Status | Size | +--------------------------------------+---------------------+-------------+-----------+------+ | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None | available | 10 | | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None | available | 10 | +--------------------------------------+---------------------+-------------+-----------+------+ My incremental backup still shows 10G size which should be lower compared to the first backup. -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.rohmann at inovex.de Tue May 16 20:12:03 2023 From: christian.rohmann at inovex.de (Christian Rohmann) Date: Tue, 16 May 2023 22:12:03 +0200 Subject: [Storyboard] How to reach the Storyboard operators / maintainers? Message-ID: <0e206466-3cf9-40ca-ceea-508996445182@inovex.de> Hello, a while back I opened a story / bug on storyboard (https://storyboard.openstack.org/#!/story/2010689) about improving email deliver-ability. I am uncertain if that was the right place to raise this issue or if there is another channel. In any case, could someone kindly point me or this suggestion into the right direction? Regards Christian From james.slagle at gmail.com Tue May 16 20:31:50 2023 From: james.slagle at gmail.com (James Slagle) Date: Tue, 16 May 2023 16:31:50 -0400 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines In-Reply-To: References: Message-ID: On Tue, May 16, 2023 at 1:18?PM Dmitriy Rabotyagov wrote: > > Hi Ronelle, > > It's actually not what was discussed and decided. > > After Zed, means starting from Antelope, which is the release after Zed. > Zed release should still be able to accept patches and interested parties are allowed to contribute to the branch until it goes to the Extended Maintenance according to the release schedule [1]. > > So there can be no active contributions to the Zed release, but CI or gating should not be dropped on purpose to prevent any interested party on contribute to the branch. I feel the wording "on purpose to prevent..." is a mis-characterization of the intent. 
The discussion resulted in no volunteers or contributors willing to maintain TripleO Zed. The outcome was to consider it 'supported but no maintainers'[1]. Now, I can't really describe how that works in actual practice. Who is supporting it if there are no maintainers? Are there a group of individuals somewhere that consider themselves the supporters of TripleO Zed, but not the maintainers? To the best of my knowledge, no, there is not. To be clear, this patch https://review.opendev.org/c/openstack/tripleo-ci/+/882759 is for zed jobs on branchless CI repos. As there are no maintainers for Zed, if they start failing, we will need to mark them non-voting if we don't remove them. The intent with removing them was to signal the reality of the situation with what is actually still being maintained. And by maintained, I mean people actually working on it. If you would instead prefer that those jobs continue to exist, then ok. I don't understand the status of them at this time, and I don't expect them to stay passing (if they are). I also disagree that the absence of those jobs prevents anyone from submitting a patch to TripleO Zed. The integration lines promoted through RDO results in repositories of rpm's. When those are torn down (via the rdoproject.org patches), the content will still exist, but it will not be updated in a way that TripleO typically expects. The CI and gate jobs that run on TripleO patches proposed to stable/zed of branched TripleO repos may or may not continue to pass as a result of stale content. [1] https://lists.openstack.org/pipermail/openstack-discuss/2023-March/032663.html > > [1] https://releases.openstack.org/ > > > ??, 16 ??? 2023 ?., 12:53 Ronelle Landy : >> >> Hello All, >> >> Removal of TripleO Zed Integration and Component Lines >> >> Per the decision to not maintain TripleO after the Zed release [1], the Zed integration and component lines are being removed in the following patches: >> >> https://review.rdoproject.org/r/c/config/+/48073 >> >> https://review.rdoproject.org/r/c/config/+/48074 >> >> https://review.rdoproject.org/r/c/rdo-jobs/+/48075 >> >> To be clear, please note that following these changes, the gate for stable/zed TripleO repos will no longer be updated or maintained. Per the earlier communications, there should be no more patches submitted for stable/zed TripleO repos, and any backports will go to stable/wallaby or stable/train. >> >> The last promoted release of Zed through TripleO is: >> https://trunk.rdoproject.org/centos9-zed/current-tripleo/delorean.repo (hash:61828177e94d5f179ee0885cf3eee102), >> which was promoted on 05/15/2023. >> >> The Ceph promotion lines related to Zed are also removed in the above patches. >> >> Check/gate testing for the master branch is in process of being removed as well (https://review.opendev.org/c/openstack/tripleo-ci/+/882759). >> >> >> [1] https://review.opendev.org/c/openstack/governance/+/878799 -- -- James Slagle -- From rlandy at redhat.com Tue May 16 20:44:56 2023 From: rlandy at redhat.com (Ronelle Landy) Date: Tue, 16 May 2023 16:44:56 -0400 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines In-Reply-To: References: Message-ID: On Tue, May 16, 2023 at 4:32?PM James Slagle wrote: > On Tue, May 16, 2023 at 1:18?PM Dmitriy Rabotyagov > wrote: > > > > Hi Ronelle, > > > > It's actually not what was discussed and decided. > > > > After Zed, means starting from Antelope, which is the release after Zed. 
> > Zed release should still be able to accept patches and interested > parties are allowed to contribute to the branch until it goes to the > Extended Maintenance according to the release schedule [1]. > > > > So there can be no active contributions to the Zed release, but CI or > gating should not be dropped on purpose to prevent any interested party on > contribute to the branch. > > I feel the wording "on purpose to prevent..." is a > mis-characterization of the intent. > > The discussion resulted in no volunteers or contributors willing to > maintain TripleO Zed. The outcome was to consider it 'supported > but no maintainers'[1]. Now, I can't really describe how that works in > actual practice. Who is supporting it if there are no maintainers? Are > there a group of individuals somewhere that consider themselves the > supporters of TripleO Zed, but not the maintainers? To the best of my > knowledge, no, there is not. > > To be clear, this patch > https://review.opendev.org/c/openstack/tripleo-ci/+/882759 is for zed > jobs on branchless CI repos. As there are no maintainers for Zed, if > they start failing, we will need to mark them non-voting if we don't > remove them. The intent with removing them was to signal the reality > of the situation with what is actually still being maintained. And by > maintained, I mean people actually working on it. > > If you would instead prefer that those jobs continue to exist, then > ok. I don't understand the status of them at this time, and I don't > expect them to stay passing (if they are). I also disagree that the > absence of those jobs prevents anyone from submitting a patch to > TripleO Zed. > > The integration lines promoted through RDO results in repositories of > rpm's. When those are torn down (via the rdoproject.org patches), the > content will still exist, but it will not be updated in a way that > TripleO typically expects. The CI and gate jobs that run on TripleO > patches proposed to stable/zed of branched TripleO repos may or may > not continue to pass as a result of stale content. > > [1] > https://lists.openstack.org/pipermail/openstack-discuss/2023-March/032663.html James, thanks for the fuller explanation. As we understand, removing CI branchful jobs should not prevent anyone committing patches to the stable/zed branch. We did not even remove any testing from TripleO related repos that have a stable/zed branch - for example: https://github.com/openstack/tripleo-heat-templates/blob/stable/zed/zuul.d/layout.yaml stil has templates included. What we are proposing to remove is just forcing stable/zed branch testing on changes to CI repos - which are branchless. > > > > > > [1] https://releases.openstack.org/ > > > > > > ??, 16 ??? 2023 ?., 12:53 Ronelle Landy : > >> > >> Hello All, > >> > >> Removal of TripleO Zed Integration and Component Lines > >> > >> Per the decision to not maintain TripleO after the Zed release [1], the > Zed integration and component lines are being removed in the following > patches: > >> > >> https://review.rdoproject.org/r/c/config/+/48073 > >> > >> https://review.rdoproject.org/r/c/config/+/48074 > >> > >> https://review.rdoproject.org/r/c/rdo-jobs/+/48075 > >> > >> To be clear, please note that following these changes, the gate for > stable/zed TripleO repos will no longer be updated or maintained. Per the > earlier communications, there should be no more patches submitted for > stable/zed TripleO repos, and any backports will go to stable/wallaby or > stable/train. 
> >> > >> The last promoted release of Zed through TripleO is: > >> https://trunk.rdoproject.org/centos9-zed/current-tripleo/delorean.repo > (hash:61828177e94d5f179ee0885cf3eee102), > >> which was promoted on 05/15/2023. > >> > >> The Ceph promotion lines related to Zed are also removed in the above > patches. > >> > >> Check/gate testing for the master branch is in process of being removed > as well (https://review.opendev.org/c/openstack/tripleo-ci/+/882759). > >> > >> > >> [1] https://review.opendev.org/c/openstack/governance/+/878799 > > > > -- > -- James Slagle > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue May 16 20:52:54 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 16 May 2023 20:52:54 +0000 Subject: [Storyboard][tact-sig] How to reach the Storyboard operators / maintainers? In-Reply-To: <0e206466-3cf9-40ca-ceea-508996445182@inovex.de> References: <0e206466-3cf9-40ca-ceea-508996445182@inovex.de> Message-ID: <20230516205253.n6sqbh2jr7hkirji@yuggoth.org> On 2023-05-16 22:12:03 +0200 (+0200), Christian Rohmann wrote: > a while back I opened a story / bug on storyboard > (https://storyboard.openstack.org/#!/story/2010689) about > improving email deliver-ability. I am uncertain if that was the > right place to raise this issue or if there is another channel. > > In any case, could someone kindly point me or this suggestion into > the right direction? I guess there's a bit of an unclear distinction which could be made. Responsibility for keeping the storyboard.openstack.org service running is sort of shared between the OpenStack TaCT SIG and the OpenDev Collaboratory. The former is best reached on this mailing list with a [tact-sig] subject tag or in the #openstack-infra channel on the OFTC IRC network. The latter is best reached on the service-discuss at lists.opendev.org mailing list or the #opendev channel on the OFTC IRC network. (There's a third group, the StoryBoard developers, in the #storyboard IRC channel on OFTC, but that seems to be orthogonal to the story you're asking about.) Ultimately, the few people involved in each of those groups (myself included) are pretty much all the same sets of folks any more as loss of interest over time has substantially distilled the pool of people why have any bandwidth to continue caring about it. For integral things like how messages are being formatted by the underlying software, we're in a bit of a bind since we're not deploying the latest version of StoryBoard and need help redoing all of the configuration management and deployment automation for it before we can attempt to upgrade. If you're looking to assist with maintenance of the storyboard.openstack.org service, we're always happy to have more help. I'll see what I can do about the CSS listing, but it looks like it's because of other tenants sharing the same IP space in the cloud provider donating the server resources so SpamHaus may or may not grant us an exclusion for that flagged netblock. As for the DKIM/SPF concerns, we currently don't do that for other significant services we run either: particularly Gerrit code review and Mailman mailing lists (including this one). 
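For anyone who wants to see what a given sending domain currently publishes, a minimal check looks something like this (the DKIM selector name below is purely a guess, selectors are deployment-specific):

$ dig +short TXT lists.openstack.org
$ dig +short TXT default._domainkey.lists.openstack.org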
There's been some past consensus among our sysadmin collective that DMARC is a calculated landgrab by freemail providers aimed at thinning the market and stamping out private E-mail operators under the guise of improving spam filtering, so we'd prefer to find ways to improve the reputations of the services we run in other ways rather than caving to monopolistic bullying tactics (and yes, I realize it's probably a futile exercise of shouting into the wind, but I'm not personally ready to admit defeat on that front). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Tue May 16 21:41:46 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 16 May 2023 21:41:46 +0000 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines In-Reply-To: References: Message-ID: <20230516214145.nhjp3rdkcbiqaj6h@yuggoth.org> On 2023-05-16 16:31:50 -0400 (-0400), James Slagle wrote: [...] > I feel the wording "on purpose to prevent..." is a > mis-characterization of the intent. [...] On further discussion in #openstack-tc, it appears some of the concern was due to the phrase "there should be no more patches submitted for stable/zed TripleO repos" and the "should" there was interpreted as implying that the prior contributors were doing this in order to disallow participation by future contributors who might want to do so. It seems like this was not actually the intent you were trying to convey, and was an unfortunate misreading. If you had said that further patches for that branch are not expected, or that you personally wouldn't be submitting patches for that branch, I doubt the reaction would have been so strong. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Tue May 16 21:52:39 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 16 May 2023 14:52:39 -0700 Subject: [neutron] policy rules: filter on name field In-Reply-To: <2957175.lYqTQsUpk7@p1> References: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> <2957175.lYqTQsUpk7@p1> Message-ID: <188268d12bd.1227636f51301451.1653718688720363876@ghanshyammann.com> ---- On Tue, 16 May 2023 07:25:52 -0700 Slawek Kaplonski wrote --- > Hi, > > Dnia wtorek, 16 maja 2023 12:00:34 CEST Paolo Emilio Mazzon pisze: > > Hello, > > > > I'm trying to understand if this is feasible: I would like to avoid a regular user from > > tampering the "default" security group of a project. Specifically I would like to prevent > > him from deleting sg rules *from the default sg only* > > > > I can wite a policy.yaml like this > > > > # Delete a security group rule > > # DELETE /security-group-rules/{id} > > # Intended scope(s): project > > "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s" > > > > but this is sub-optimal since the regular member can still *add* rules... > > > > Is it possible to create a rule like > > > > "sg_is_default" : ...the sg group whose name is 'default' > > > > so I can write > > > > "delete_security_group_rule": "not rule:sg_is_default" ? > > > > Thanks! > > I'm not sure but I will try to check it later today or tomorrow morning and will let You know if that is possible or not. 'not' operator is supported in oslo policy. 
I think the below one should work which allows admin to delete the default SG and manager role can delete only non-default SG. NOTE: I have not tested this, may be you can check while trying other combinations. "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s and not 'default':%(name)s or 'default':%(name)s and role:admin" -gmann > > > > > ????Paolo > > > > -- > > Paolo Emilio Mazzon > > System and Network Administrator > > > > paoloemilio.mazzon[at]unipd.it > > > > PNC - Padova Neuroscience Center > > https://www.pnc.unipd.it > > Via Orus 2/B - 35131 Padova, Italy > > +39 049 821 2624 > > > > > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat From james.slagle at gmail.com Tue May 16 22:27:47 2023 From: james.slagle at gmail.com (James Slagle) Date: Tue, 16 May 2023 18:27:47 -0400 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines In-Reply-To: <20230516214145.nhjp3rdkcbiqaj6h@yuggoth.org> References: <20230516214145.nhjp3rdkcbiqaj6h@yuggoth.org> Message-ID: On Tue, May 16, 2023 at 5:45?PM Jeremy Stanley wrote: > > On 2023-05-16 16:31:50 -0400 (-0400), James Slagle wrote: > [...] > > I feel the wording "on purpose to prevent..." is a > > mis-characterization of the intent. > [...] > > On further discussion in #openstack-tc, it appears some of the > concern was due to the phrase "there should be no more patches > submitted for stable/zed TripleO repos" and the "should" there was > interpreted as implying that the prior contributors were doing this > in order to disallow participation by future contributors who might > want to do so. It seems like this was not actually the intent you > were trying to convey, and was an unfortunate misreading. If you had > said that further patches for that branch are not expected, or that > you personally wouldn't be submitting patches for that branch, I > doubt the reaction would have been so strong. Fair enough. Given the circumstances, I wouldn't personally encourage anyone to contribute patches to Zed given there are no reviewers, degraded CI, and degraded content. To me, that is a pretty big signal that you probably don't want to submit code there as it's not going to end up how you expect. I certainly defer to however the TC wants to phrase it for these future contributors. -- -- James Slagle -- From fungi at yuggoth.org Tue May 16 22:35:48 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 16 May 2023 22:35:48 +0000 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines In-Reply-To: References: <20230516214145.nhjp3rdkcbiqaj6h@yuggoth.org> Message-ID: <20230516223547.via4ivz633frqbgt@yuggoth.org> On 2023-05-16 18:27:47 -0400 (-0400), James Slagle wrote: [...] > Fair enough. Given the circumstances, I wouldn't personally encourage > anyone to contribute patches to Zed given there are no reviewers, > degraded CI, and degraded content. To me, that is a pretty big signal > that you probably don't want to submit code there as it's not going to > end up how you expect. > > I certainly defer to however the TC wants to phrase it for these > future contributors. Well yes, obviously (to me at any rate) random drive-by contributions would be pointless, and any group of people interested in resuming actual development on it would need to be ready to also take over maintenance responsibilities, do code review, fix CI jobs, et cetera. Odds of that happening are slim, but it can't be ruled out. After all, it happened before... 
TripleO was originally started by HPCloud dev teams. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From nguyenhuukhoinw at gmail.com Wed May 17 00:48:43 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Wed, 17 May 2023 07:48:43 +0700 Subject: [keystone][horizon][kolla-ansible] user access specific domain In-Reply-To: References: Message-ID: Hello. I doest try this. Nguyen Huu Khoi On Tue, May 16, 2023 at 5:04?AM James Leong wrote: > Thanks! I have also tried your example, it works the same as mine, except > that it checked the user's email. However, I am curious if it is possible > to login to an existing user on openstack via federated login. > > Best, > James. > > On Sun, May 14, 2023 at 10:03?PM Nguy?n H?u Kh?i < > nguyenhuukhoinw at gmail.com> wrote: > >> Hello. This is my example. >> >> { >> "local": [ >> { >> "user": { >> "name": "{0}", >> "email": "{1}" >> }, >> "group": { >> "name": "your keystone group", >> "domain": { >> "name": "Default" >> } >> } >> } >> ], >> "remote": [ >> { >> "type": "OIDC-preferred_username", >> "any_one_of": [ >> "xxx at gmail.com", >> "xxx1 at gmail.com >> ] >> }, >> { >> "type": "OIDC-preferred_username" >> }, >> { >> "type": "OIDC-email" >> } >> ] >> } >> >> >> Nguyen Huu Khoi >> >> >> On Mon, May 15, 2023 at 5:41?AM James Leong >> wrote: >> >>> Hi all, >>> >>> I am playing around with the domain in the yoga version of OpenStack >>> using kolla-ansible as the deployment tool. I have set up Globus as my >>> authentication tool. However, I am curious if it is possible to log in to >>> an existing OpenStack user account via federated login (based on Gmail) >>> >>> In my case, first, I created a user named "James" in one of the domains >>> called federated_login. When I attempt to log in, a new user is created in >>> the default domain instead of the federated_login domain. Below is a sample >>> of my globus.json. >>> >>> [{"local": [ >>> { >>> "user": { >>> "name":"{0}, >>> "email":"{2} >>> }, >>> "group":{ >>> "name": "federated_user", >>> "domain: {"name":"{1} >>> } >>> } >>> ], >>> "remote": [ >>> { "type":"OIDC-name"}, >>> { "type":"OIDC-organization"},{"type":"OIDC-email"} >>> ] >>> }] >>> >>> Apart from the above question, is there another easier way of >>> restricting users from login in via federated? For example, allow only >>> existing users on OpenStack with a specific email to access the OpenStack >>> dashboard via federated login. >>> >>> Best Regards, >>> James >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed May 17 03:50:53 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 16 May 2023 20:50:53 -0700 Subject: =?UTF-8?Q?Re:_=E7=AD=94=E5=A4=8D:_=E7=AD=94=E5=A4=8D:_[ptl]_Need_PTL_v?= =?UTF-8?Q?olunteer_for_OpenStack_Sahara?= In-Reply-To: <4d3b952157f84886b52c7475b6bae68e@inspur.com> References: <18776962f8e.b5be9467498529.2443601353157879154@ghanshyammann.com> <4d3b952157f84886b52c7475b6bae68e@inspur.com> Message-ID: <18827d50cb8.cfee5b2e1305648.2964607450323554781@ghanshyammann.com> ---- On Tue, 16 May 2023 19:05:50 -0700 Jerry Zhou (???) wrote --- > Hi Gmann, > > The governance patch to PTL appointment already merged into. 
> > [1] https://review.opendev.org/c/openstack/governance/+/881186 > > But there is no management authority for the project, and the review patch does not have +2 authority. > > I can take on the responsibility of PTL, maintain the project gate, and ensure that zuul can be executed normally; review and merge the commits submitted on the project. Could you add me to the sahara-core group. It makes sense to add you to the group but I would like Sahara's core member if anyone is around to add you if no one is active then TC can add you there. -gmann > > > -----????----- > ???: Ghanshyam Mann gmann at ghanshyammann.com> > ????: 2023?4?13? 1:49 > ???: Brin Zhang(???) zhangbailin at inspur.com>; Jerry Zhou (???) zhouxinyong at inspur.com> > ??: Tom Guo(??2810) guotao.bj at inspur.com>; Weiting Kong (???) kongweiting at inspur.com>; Alex Song (???) songwenping at inspur.com>; openstack-discuss openstack-discuss at lists.openstack.org> > ??: Re: ??: [ptl] Need PTL volunteer for OpenStack Sahara > > Hi Jerry, > > We discussed it in the TC meeting[1], and below is the summary and next step: > > * Sahara has been almost unmaintained in the last cycle, even though we had PTL volunteering to lead/maintain it. > > * From the PTL role, we expect the project to keep its testing/gate green, merge the incoming review requests, and do the releases on time. For example, if no maintainer is left in any project (which is the Sahara case), then its PTL's responsibility is to get all those project's basic maintenance done (by onboarding new maintainers or by themself) otherwise, inform TC about it so we can decide on the project status on time. > > * Next step: > Thanks for volunteering and helping to maintain the Sahara project. Please propose the governance patch to PTL appointment, and TC will review it. Example: https://review.opendev.org/c/openstack/governance/+/878286 > > [1] https://meetings.opendev.org/meetings/tc/2023/tc.2023-04-11-17.59.log.html#l-83 > > -gmann > > ---- On Mon, 10 Apr 2023 09:09:37 -0700 Ghanshyam Mann wrote --- > Thanks Jerry, Brin for the updates and showing interest in Sahara project. > > > > In PTG discussion, there were some concern on Sahara maintenance even > we had PTL volunteer to lead this project but not much things got merged and > gate was broken many times. But let me discuss it in TC meeting tomorrow and > update here. > > > > -gmann > > > > ---- On Mon, 10 Apr 2023 00:22:08 -0700 Brin Zhang(???) wrote --- > > Hi Gmann, > > ????Jerry Zhou works the same company as qiujunting (Qiujunting has out of the office, so he cannot response us, I am so sorry), and he will be involved in Sahara. > > > ????Hope TC can add him in Sahara team, let him gradually complete the maintenance of the Sahara project. > > > ????Thanks. > > > > > > brinzhang > > > > > > -----????----- > > > ???: Jerry Zhou (???) zhouxinyong at inspur.com> > > ????: 2023?4?7? 11:10 > > ???: gmann at ghanshyammann.com; Juntingqiu Qiujunting (???) qiujunting at inspur.com> > > ??: openstack-discuss at lists.openstack.org; Tom Guo(??2810) guotao.bj at inspur.com>; Brin Zhang(???) zhangbailin at inspur.com>; Weiting Kong (???) kongweiting at inspur.com> > > ??: ??: [ptl] Need PTL volunteer for OpenStack Sahara > > > > Hi Gmann, > > > I can lead this project. > > > > > > -----????----- > > > ???: Ghanshyam Mann gmann at ghanshyammann.com> > > ????: 2023?3?23? 
2:35 > > ???: Qiujunting qiujunting at inspur.com>; openstack-discuss openstack-discuss at lists.openstack.org> > > > ??: [ptl] Need PTL volunteer for OpenStack Sahara > > > > Hi Qiu, > > > > I am reaching out to you as you were PTL for OpenStack Sahara project in the last cycle. > > > > > > There is no PTL candidate for the next cycle (2023.2), and it is on the leaderless project list. Please check if you or anyone you know would like to lead this project. > > > > > > - https://etherpad.opendev.org/p/2023.2-leaderless > > > > > > Also, if anyone else would like to help leading this project, this is time to let TC knows. > > > > > > -gmann > > > > > > > > > From pdeore at redhat.com Wed May 17 04:44:04 2023 From: pdeore at redhat.com (Pranali Deore) Date: Wed, 17 May 2023 10:14:04 +0530 Subject: [Glance] Weekly Meeting Cancelled Message-ID: Hello, As per the discussion [1], glance weekly meeting is cancelled for this week as most of the team members will not be around. For next week - 25th May, we will have the meeting if anything comes urgent, otherwise we'll directly meet on Thursday 1st June. [1]: https://meetings.opendev.org/meetings/glance/2023/glance.2023-05-11-13.59.log.html#l-168 Thanks & Regards, Pranali Deore -------------- next part -------------- An HTML attachment was scrubbed... URL: From ozzzo at yahoo.com Wed May 17 05:09:18 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Wed, 17 May 2023 05:09:18 +0000 (UTC) Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: <20230516181703.Horde.xQASX_23dzWCwbxddBcHuLa@webmail.nde.ag> References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> <1979459451.2557820.1684258144482@mail.yahoo.com> <20230516181703.Horde.xQASX_23dzWCwbxddBcHuLa@webmail.nde.ag> Message-ID: <393615542.2867360.1684300158715@mail.yahoo.com> Before we switched to durable queues we were seeing RMQ issues after a restart. Now RMQ is fine after restart, but operations in progress will fail. VMs will fail to build, or not get DNS records. Volumes don't get attached or detached. It looks like haproxy is the issue now; connections continue going to the down node. I think we can fix that by failing over haproxy before rebooting. The problem is, I'm not sure that haproxy is the only issue. All 3 controllers are doing stuff, and when I reboot one, whatever it is doing is likely to fail. Is there an orderly way to stop work from being done on a controller without ruining work that is already in progress, besides removing it from the cluster? Would "kolla-ansible stop" do it? On Tuesday, May 16, 2023, 02:23:59 PM EDT, Eugen Block wrote: Hi Albert, sorry, I'm swamped with different stuff right now. I just took a? glance at the docs you mentioned and it seems way too much for? something simple as a controller restart to actually remove hosts,? that should definitely not be necessary. I'm not familiar with kolla or exabgp, but can you describe what? exactly takes that long to failover? Maybe that could be improved? And? can you limit the failing requests to a specific service (volumes,? network ports, etc.) or do they all fail? Maybe rabbitmq should be? considered after all, you could share your rabbitmq settings from the? different openstack services and I will collect mine to compare. And? then also the rabbitmq config (policies, vhosts, queues). 
Regards, Eugen Zitat von Albert Braden : > What's the recommended method for rebooting controllers? Do we need? > to use the "remove from cluster" and "add to cluster" procedures or? > is there a better way? > > https://docs.openstack.org/kolla-ansible/train/user/adding-and-removing-hosts.html >? ? ? On Friday, May 12, 2023, 03:04:26 PM EDT, Albert Braden? > wrote: > >? We use keepalived and exabgp to manage failover for haproxy. That? > works but it takes a few minutes, and during those few minutes? > customers experience impact. We tell them to not build/delete VMs? > during patching, but they still do, and then complain about the? > failures. > > We're planning to experiment with adding a "manual" haproxy failover? > to our patching automation, but I'm wondering if there is anything? > on the controller that needs to be failed over or disabled before? > rebooting the KVM. I looked at the "remove from cluster" and "add to? > cluster" procedures but that seems unnecessarily cumbersome for? > rebooting the KVM. >? ? ? On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block? > wrote: > >? Hi Albert, > > how is your haproxy placement controlled, something like pacemaker or? > similar? I would always do a failover when I'm aware of interruptions? > (maintenance window), that should speed things up for clients. We have? > a pacemaker controlled HA control plane, it takes more time until? > pacemaker realizes that the resource is gone if I just rebooted a? > server without failing over. I have no benchmarks though. There's? > always a risk of losing a couple of requests during the failover but? > we didn't have complaints yet, I believe most of the components try to? > resend the lost messages. In one of our customer's cluster with many? > resources (they also use terraform) I haven't seen issues during a? > regular maintenance window. When they had a DNS outage a few months? > back it resulted in a mess, manual cleaning was necessary, but the? > regular failovers seem to work just fine. > And I don't see rabbitmq issues either after rebooting a server,? > usually the haproxy (and virtual IP) failover suffice to prevent? > interruptions. > > Regards, > Eugen > > Zitat von Satish Patel : > >> Are you running your stack on top of the kvm virtual machine? How many >> controller nodes do you have? mostly rabbitMQ causing issues if you restart >> controller nodes. >> >> On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: >> >>> We have our haproxy and controller nodes on KVM hosts. When those KVM >>> hosts are restarted, customers who are building or deleting VMs see impact. >>> VMs may go into error status, fail to get DNS records, fail to delete, etc. >>> The obvious reason is because traffic that is being routed to the haproxy >>> on the restarting KVM is lost. If we manually fail over haproxy before >>> restarting the KVM, will that be sufficient to stop traffic being lost, or >>> do we also need to do something with the controller? >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From marios at redhat.com Wed May 17 06:33:52 2023 From: marios at redhat.com (Marios Andreou) Date: Wed, 17 May 2023 09:33:52 +0300 Subject: [TripleO] Removal of TripleO Zed Integration and Component Lines In-Reply-To: <20230516214145.nhjp3rdkcbiqaj6h@yuggoth.org> References: <20230516214145.nhjp3rdkcbiqaj6h@yuggoth.org> Message-ID: On Wed, May 17, 2023 at 12:47?AM Jeremy Stanley wrote: > > On 2023-05-16 16:31:50 -0400 (-0400), James Slagle wrote: > [...] 
> > I feel the wording "on purpose to prevent..." is a > > mis-characterization of the intent. > [...] > > On further discussion in #openstack-tc, it appears some of the > concern was due to the phrase "there should be no more patches > submitted for stable/zed TripleO repos" and the "should" there was > interpreted as implying that the prior contributors were doing this > in order to disallow participation by future contributors who might > want to do so. It seems like this was not actually the intent you > were trying to convey, and was an unfortunate misreading. If you had > said that further patches for that branch are not expected, or that > you personally wouldn't be submitting patches for that branch, I > doubt the reaction would have been so strong. This has been well explained by James and Ronelle, but having worked with Ronelle on that announcement and since the sentence in question was added on my suggestion ("there should be no more patches submitted..." ) I feel a responsibility to clarify the intent behind it. The intent is to signal that after this point there will be degraded CI for all TripleO repos' stable/zed branch because it will no longer be maintained by the team that has been doing so until now. The gate will break and your patches will be blocked from merging unless you skip jobs or start fixing them (we have not removed the job definitions or zuul layouts but they are going to fall into disrepair). For this reason, "there should be no more patches submitted for stable/zed TripleO repos". thanks all for helping to clarify and my apologies for the concerns this raised - in retrospect the wording could been expanded a little to make it clearer, regards, marios > -- > Jeremy Stanley From skaplons at redhat.com Wed May 17 07:55:47 2023 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 17 May 2023 09:55:47 +0200 Subject: [neutron] policy rules: filter on name field In-Reply-To: <188268d12bd.1227636f51301451.1653718688720363876@ghanshyammann.com> References: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> <2957175.lYqTQsUpk7@p1> <188268d12bd.1227636f51301451.1653718688720363876@ghanshyammann.com> Message-ID: <2521426.hzxb4AvoNt@p1> Hi, Dnia wtorek, 16 maja 2023 23:52:39 CEST Ghanshyam Mann pisze: > > ---- On Tue, 16 May 2023 07:25:52 -0700 Slawek Kaplonski wrote --- > > Hi, > > > > Dnia wtorek, 16 maja 2023 12:00:34 CEST Paolo Emilio Mazzon pisze: > > > Hello, > > > > > > I'm trying to understand if this is feasible: I would like to avoid a regular user from > > > tampering the "default" security group of a project. Specifically I would like to prevent > > > him from deleting sg rules *from the default sg only* > > > > > > I can wite a policy.yaml like this > > > > > > # Delete a security group rule > > > # DELETE /security-group-rules/{id} > > > # Intended scope(s): project > > > "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s" > > > > > > but this is sub-optimal since the regular member can still *add* rules... > > > > > > Is it possible to create a rule like > > > > > > "sg_is_default" : ...the sg group whose name is 'default' > > > > > > so I can write > > > > > > "delete_security_group_rule": "not rule:sg_is_default" ? > > > > > > Thanks! > > > > I'm not sure but I will try to check it later today or tomorrow morning and will let You know if that is possible or not. > > 'not' operator is supported in oslo policy. 
I think the below one should work which allows admin to delete the default SG and manager role > can delete only non-default SG. > > NOTE: I have not tested this, may be you can check while trying other combinations. > > "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s and not 'default':%(name)s or 'default':%(name)s and role:admin" > > -gmann > > > > > > > > > Paolo > > > > > > -- > > > Paolo Emilio Mazzon > > > System and Network Administrator > > > > > > paoloemilio.mazzon[at]unipd.it > > > > > > PNC - Padova Neuroscience Center > > > https://www.pnc.unipd.it > > > Via Orus 2/B - 35131 Padova, Italy > > > +39 049 821 2624 > > > > > > > > > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat > > I checked it today and it can be done like: "sg_is_default": "field:security_groups:name=default", "delete_security_group": "(role:member and project_id:%(project_id)s and not rule:sg_is_default) or role:admin" for Security Group. But it won't work like that for security group rules as You want to rely Your policy on the value of the attribute which belongs to parent resource (name of the Security group when doing API call for SG rule). We had similar problem for the "network:shared" field - see [1] and it was fixed with [2] but that fix is specific for this special field ("network:shared" only). Maybe we would need to add such special handling for the default security group as well. If You would like to have something like that, please open LP bug for it and we can investigate that deeper. [1] https://bugs.launchpad.net/neutron/+bug/1808112 [2] https://review.opendev.org/c/openstack/neutron/+/652636 -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From rdhasman at redhat.com Wed May 17 08:40:16 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Wed, 17 May 2023 14:10:16 +0530 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: References: Message-ID: Hi Satish, Did you check the size of the actual backup file in ceph storage? It should be created in the *backups* pool[1]. Cinder shows the same size of incremental backup as a normal backup but file size should be different from the size shown in cinder DB records. Also file size of incremental backup should not be the same as the file size of full backup. [1] https://github.com/openstack/devstack/blob/34afa91fc9f830fc8e1fdc4d76e7aa6d4248eaaa/lib/cinder_backups/ceph#L22 Thanks Rajat Dhasmana On Wed, May 17, 2023 at 12:25?AM Satish Patel wrote: > Folks, > > I have ceph storage for my openstack and configure cinder-volume and > cinder-backup service for my disaster solution. I am trying to use the > cinder-backup incremental option to save storage space but somehow It > doesn't work the way it should work. > > Whenever I take incremental backup it shows a similar size of original > volume. Technically It should be smaller. Question is does ceph > support incremental backup with cinder? > > I am running a Yoga release. 
> > $ openstack volume list > +--------------------------------------+------------+------------+------+-------------------------------------+ > | ID | Name | Status | Size | Attached to | > +--------------------------------------+------------+------------+------+-------------------------------------+ > | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 | Attached to spatel-foo on /dev/sdc | > +--------------------------------------+------------+------------+------+-------------------------------------+ > > ### Create full backup > $ openstack volume backup create --name spatel-vol-backup spatel-vol --force > +-------+--------------------------------------+ > | Field | Value | > +-------+--------------------------------------+ > | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | > | name | spatel-vol-backup | > +-------+--------------------------------------+ > > ### Create incremental > $ openstack volume backup create --name spatel-vol-backup-1 --incremental --force spatel-vol > +-------+--------------------------------------+ > | Field | Value | > +-------+--------------------------------------+ > | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | > | name | spatel-vol-backup-1 | > +-------+--------------------------------------+ > > $ openstack volume backup list > +--------------------------------------+---------------------+-------------+-----------+------+ > | ID | Name | Description | Status | Size | > +--------------------------------------+---------------------+-------------+-----------+------+ > | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None | available | 10 | > | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None | available | 10 | > +--------------------------------------+---------------------+-------------+-----------+------+ > > > My incremental backup still shows 10G size which should be lower compared to the first backup. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From crazy+kolla at geeks.pl Wed May 17 09:01:41 2023 From: crazy+kolla at geeks.pl (Ryszard Mielcarek) Date: Wed, 17 May 2023 11:01:41 +0200 Subject: Transition to kolla Message-ID: <20230517110141.7acc03dc@brain> Hello, is there anyone who already did transition on existing cluster, from package-version to kolla-ansible managed installation? Looking for guides, tips, your experience on that. BR, crazik From piotrmisiak1984 at gmail.com Wed May 17 09:53:11 2023 From: piotrmisiak1984 at gmail.com (Piotr Misiak) Date: Wed, 17 May 2023 11:53:11 +0200 Subject: Transition to kolla In-Reply-To: <20230517110141.7acc03dc@brain> References: <20230517110141.7acc03dc@brain> Message-ID: Hello, I haven?t done such migration but in theory it should be possible. It depends on how your storage and networking looks like on the current env. Have you thought about creating a new env and migrating workload in stages from the old one to the new one? You can start with a small new env and after freeing compute and storage hardware after first migration you can move them to the new env to make more space for a next migration stage. BR, Piotr On Wed, 17 May 2023 at 11:02 Ryszard Mielcarek wrote: > Hello, > is there anyone who already did transition on existing cluster, > from package-version to kolla-ansible managed installation? > Looking for guides, tips, your experience on that. > > BR, > crazik > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From crazy+kolla at geeks.pl Wed May 17 10:03:56 2023 From: crazy+kolla at geeks.pl (Ryszard Mielcarek) Date: Wed, 17 May 2023 12:03:56 +0200 Subject: Transition to kolla In-Reply-To: References: <20230517110141.7acc03dc@brain> Message-ID: <20230517120356.559a554d@brain> Dnia 2023-05-17, o godz. 11:53:11 Piotr Misiak napisa?(a): > Hello, > > I haven?t done such migration but in theory it should be possible. It > depends on how your storage and networking looks like on the current > env. Have you thought about creating a new env and migrating workload > in stages from the old one to the new one? You can start with a small > new env and after freeing compute and storage hardware after first > migration you can move them to the new env to make more space for a > next migration stage. Hey Piotr, thanks for the idea, but my priority is minimal impact for the VMs (I want only live-migrate them during takeover). With new env I will end up with downtime for each VM and IP change. BR, crazik > > BR, > Piotr > > On Wed, 17 May 2023 at 11:02 Ryszard Mielcarek > wrote: > > > Hello, > > is there anyone who already did transition on existing cluster, > > from package-version to kolla-ansible managed installation? > > Looking for guides, tips, your experience on that. > > > > BR, > > crazik > > > > From eblock at nde.ag Wed May 17 11:17:30 2023 From: eblock at nde.ag (Eugen Block) Date: Wed, 17 May 2023 11:17:30 +0000 Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: <393615542.2867360.1684300158715@mail.yahoo.com> References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> <1979459451.2557820.1684258144482@mail.yahoo.com> <20230516181703.Horde.xQASX_23dzWCwbxddBcHuLa@webmail.nde.ag> <393615542.2867360.1684300158715@mail.yahoo.com> Message-ID: <20230517111730.Horde.ao5h0-n-fH5zG3Yj_Nnt7oT@webmail.nde.ag> Hi, I found this [1] reference, it recommends to reduce the kernel option for tcp_retries to reduce the impact of a service interruption: # /etc/kolla/globals.yml haproxy_host_ipv4_tcp_retries2: 6 Apparently, this option was introduced in Victoria [2], it states: > Added a new haproxy configuration variable, > haproxy_host_ipv4_tcp_retries2, which allows users to modify this > kernel option. This option sets maximum number of times a TCP packet > is retransmitted in established state before giving up. The default > kernel value is 15, which corresponds to a duration of approximately > between 13 to 30 minutes, depending on the retransmission timeout. > This variable can be used to mitigate an issue with stuck > connections in case of VIP failover, see bug 1917068 for details. It reads like exactly what you're describing. If I remember correctly, you're still on Train? In that case you'll probably have to configure that setting manually (scripted maybe), it is this value: /proc/sys/net/ipv4/tcp_retries2 The solution in [3] even talks about setting it to 3 for HA deployments. # sysctl -a | grep net.ipv4.tcp_retries2 net.ipv4.tcp_retries2 = 15 Regards, Eugen [1] https://docs.openstack.org/kolla-ansible/latest/reference/high-availability/haproxy-guide.html [2] https://docs.openstack.org/releasenotes/kolla-ansible/victoria.html [3] https://access.redhat.com/solutions/726753 Zitat von Albert Braden : > Before we switched to durable queues we were seeing RMQ issues after > a restart. 
Now RMQ is fine after restart, but operations in progress > will fail. VMs will fail to build, or not get DNS records. Volumes > don't get attached or detached. It looks like haproxy is the issue > now; connections continue going to the down node. I think we can fix > that by failing over haproxy before rebooting. > > The problem is, I'm not sure that haproxy is the only issue. All 3 > controllers are doing stuff, and when I reboot one, whatever it is > doing is likely to fail. Is there an orderly way to stop work from > being done on a controller without ruining work that is already in > progress, besides removing it from the cluster? Would "kolla-ansible > stop" do it? > On Tuesday, May 16, 2023, 02:23:59 PM EDT, Eugen Block > wrote: > > Hi Albert, > > sorry, I'm swamped with different stuff right now. I just took a? > glance at the docs you mentioned and it seems way too much for? > something simple as a controller restart to actually remove hosts,? > that should definitely not be necessary. > I'm not familiar with kolla or exabgp, but can you describe what? > exactly takes that long to failover? Maybe that could be improved? And? > can you limit the failing requests to a specific service (volumes,? > network ports, etc.) or do they all fail? Maybe rabbitmq should be? > considered after all, you could share your rabbitmq settings from the? > different openstack services and I will collect mine to compare. And? > then also the rabbitmq config (policies, vhosts, queues). > > Regards, > Eugen > > Zitat von Albert Braden : > >> What's the recommended method for rebooting controllers? Do we need? >> to use the "remove from cluster" and "add to cluster" procedures or? >> is there a better way? >> >> https://docs.openstack.org/kolla-ansible/train/user/adding-and-removing-hosts.html >> ? ? ? On Friday, May 12, 2023, 03:04:26 PM EDT, Albert Braden? >> wrote: >> >> ? We use keepalived and exabgp to manage failover for haproxy. That? >> works but it takes a few minutes, and during those few minutes? >> customers experience impact. We tell them to not build/delete VMs? >> during patching, but they still do, and then complain about the? >> failures. >> >> We're planning to experiment with adding a "manual" haproxy failover? >> to our patching automation, but I'm wondering if there is anything? >> on the controller that needs to be failed over or disabled before? >> rebooting the KVM. I looked at the "remove from cluster" and "add to? >> cluster" procedures but that seems unnecessarily cumbersome for? >> rebooting the KVM. >> ? ? ? On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block? >> wrote: >> >> ? Hi Albert, >> >> how is your haproxy placement controlled, something like pacemaker or? >> similar? I would always do a failover when I'm aware of interruptions? >> (maintenance window), that should speed things up for clients. We have? >> a pacemaker controlled HA control plane, it takes more time until? >> pacemaker realizes that the resource is gone if I just rebooted a? >> server without failing over. I have no benchmarks though. There's? >> always a risk of losing a couple of requests during the failover but? >> we didn't have complaints yet, I believe most of the components try to? >> resend the lost messages. In one of our customer's cluster with many? >> resources (they also use terraform) I haven't seen issues during a? >> regular maintenance window. When they had a DNS outage a few months? >> back it resulted in a mess, manual cleaning was necessary, but the? 
>> regular failovers seem to work just fine. >> And I don't see rabbitmq issues either after rebooting a server,? >> usually the haproxy (and virtual IP) failover suffice to prevent? >> interruptions. >> >> Regards, >> Eugen >> >> Zitat von Satish Patel : >> >>> Are you running your stack on top of the kvm virtual machine? How many >>> controller nodes do you have? mostly rabbitMQ causing issues if you restart >>> controller nodes. >>> >>> On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: >>> >>>> We have our haproxy and controller nodes on KVM hosts. When those KVM >>>> hosts are restarted, customers who are building or deleting VMs >>>> see impact. >>>> VMs may go into error status, fail to get DNS records, fail to >>>> delete, etc. >>>> The obvious reason is because traffic that is being routed to the haproxy >>>> on the restarting KVM is lost. If we manually fail over haproxy before >>>> restarting the KVM, will that be sufficient to stop traffic being lost, or >>>> do we also need to do something with the controller? >>>> >>>> From tony at bakeyournoodle.com Wed May 17 11:41:57 2023 From: tony at bakeyournoodle.com (Tony Breeds) Date: Wed, 17 May 2023 21:41:57 +1000 Subject: Train EOL In-Reply-To: <451f9dd7-1a0f-5b77-de7d-205a0bf17ea2@windriver.com> References: <0802ccc5-4adc-f523-aff3-6c62af0d0aca@windriver.com> <451f9dd7-1a0f-5b77-de7d-205a0bf17ea2@windriver.com> Message-ID: On Sat, 13 May 2023 at 00:35, Scott Little wrote: > > Thanks for your response Jay > > Yes the -eol tag is somewhat useful, but it doesn't appear to be created until the branch is removed. There is no way for a downstream consumer to to prepare for the forthcoming branch deletion. There is no way to avoid a period of breakage. This doesn't help you for train, but may save you some headaches in the future? You can keep an eye out for -em tags arriving. They're created when a project moves into Extended Maintenance mode. That's a pretty good signal that downstream consumers should expect the branch to translation to -eol "soon". Soon is hard to define but essentially it's a 6 month warning. Tony. From noonedeadpunk at gmail.com Wed May 17 12:47:52 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Wed, 17 May 2023 14:47:52 +0200 Subject: Transition to kolla In-Reply-To: <20230517120356.559a554d@brain> References: <20230517110141.7acc03dc@brain> <20230517120356.559a554d@brain> Message-ID: Hey, We did such migration from RDO to the OSA deployment and it was quite resilient for the end users. But we've played with LB quite a lot, and had new hardware for net nodes and control plane, but other then that - it's completely possible to achieve. I think as a first step we've spawned new rabbit and mariadb cluster, though it was easier for us as we ran galera before, so we were able to just scale up galera cluster and point LB to the new IP. And then it was just about setup of new services with osa and pointing haproxy to the new set of the backends. As long as you're deploying the same major version it should be fine to do so. When it comes to net nodes, we migrated l3 and DHCP agents to the freshly provisioned nodes with quite simple script, that list all agents for agent A, removes agent it from A and adds to B afterwards. For compute nodes, I think we didn't do even live migrates - just stopped old service and provisioned new in place. As hostname remains the same, it was just picked up and placement wasn't complaining. But yeah, live migration should just work as long as you have same os version. 
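For the net node step above, a rough sketch of that kind of agent move (illustrative only and untested; OLD_L3 and NEW_L3 are placeholders for the agent IDs shown by "openstack network agent list --agent-type l3", and admin credentials are assumed):

    OLD_L3=<l3 agent id on the old net node>
    NEW_L3=<l3 agent id on the new net node>
    # move every router hosted by the old L3 agent to the new one
    for R in $(openstack router list --agent "$OLD_L3" -f value -c ID); do
        openstack network agent remove router --l3 "$OLD_L3" "$R"
        openstack network agent add router --l3 "$NEW_L3" "$R"
    done

DHCP agents can be moved the same way per network with "openstack network agent remove network --dhcp" and "openstack network agent add network --dhcp".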
I think this all should work nicely for kolla as well. ??, 17 ??? 2023 ?., 12:12 Ryszard Mielcarek : > Dnia 2023-05-17, o godz. 11:53:11 > Piotr Misiak napisa?(a): > > > Hello, > > > > I haven?t done such migration but in theory it should be possible. It > > depends on how your storage and networking looks like on the current > > env. Have you thought about creating a new env and migrating workload > > in stages from the old one to the new one? You can start with a small > > new env and after freeing compute and storage hardware after first > > migration you can move them to the new env to make more space for a > > next migration stage. > > Hey Piotr, > thanks for the idea, but my priority is minimal impact for > the VMs (I want only live-migrate them during takeover). > With new env I will end up with downtime for each VM and IP change. > > BR, > crazik > > > > > BR, > > Piotr > > > > On Wed, 17 May 2023 at 11:02 Ryszard Mielcarek > > wrote: > > > > > Hello, > > > is there anyone who already did transition on existing cluster, > > > from package-version to kolla-ansible managed installation? > > > Looking for guides, tips, your experience on that. > > > > > > BR, > > > crazik > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adivya1.singh at gmail.com Wed May 17 12:56:07 2023 From: adivya1.singh at gmail.com (Adivya Singh) Date: Wed, 17 May 2023 18:26:07 +0530 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: References: Message-ID: Hi, I am not sure, some body have tested this solution before, But still come from a background of backup, i strongly believe that for a Software to understand the difference between incremental and Full it needs to have a agent at the client side to do a Journalling based on backup objects, I do not see thai feature is there in Ceph Regards Adivya Singh On Wed, May 17, 2023 at 2:16?PM Rajat Dhasmana wrote: > Hi Satish, > > Did you check the size of the actual backup file in ceph storage? It > should be created in the *backups* pool[1]. > Cinder shows the same size of incremental backup as a normal backup but > file size should be different from > the size shown in cinder DB records. Also file size of incremental backup > should not be the same as the file size of full backup. > > [1] > https://github.com/openstack/devstack/blob/34afa91fc9f830fc8e1fdc4d76e7aa6d4248eaaa/lib/cinder_backups/ceph#L22 > > Thanks > Rajat Dhasmana > > On Wed, May 17, 2023 at 12:25?AM Satish Patel > wrote: > >> Folks, >> >> I have ceph storage for my openstack and configure cinder-volume and >> cinder-backup service for my disaster solution. I am trying to use the >> cinder-backup incremental option to save storage space but somehow It >> doesn't work the way it should work. >> >> Whenever I take incremental backup it shows a similar size of original >> volume. Technically It should be smaller. Question is does ceph >> support incremental backup with cinder? >> >> I am running a Yoga release. 
>> >> $ openstack volume list >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> | ID | Name | Status | Size | Attached to | >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 | Attached to spatel-foo on /dev/sdc | >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >> ### Create full backup >> $ openstack volume backup create --name spatel-vol-backup spatel-vol --force >> +-------+--------------------------------------+ >> | Field | Value | >> +-------+--------------------------------------+ >> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | >> | name | spatel-vol-backup | >> +-------+--------------------------------------+ >> >> ### Create incremental >> $ openstack volume backup create --name spatel-vol-backup-1 --incremental --force spatel-vol >> +-------+--------------------------------------+ >> | Field | Value | >> +-------+--------------------------------------+ >> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | >> | name | spatel-vol-backup-1 | >> +-------+--------------------------------------+ >> >> $ openstack volume backup list >> +--------------------------------------+---------------------+-------------+-----------+------+ >> | ID | Name | Description | Status | Size | >> +--------------------------------------+---------------------+-------------+-----------+------+ >> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None | available | 10 | >> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None | available | 10 | >> +--------------------------------------+---------------------+-------------+-----------+------+ >> >> >> My incremental backup still shows 10G size which should be lower compared to the first backup. >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Wed May 17 13:23:04 2023 From: eblock at nde.ag (Eugen Block) Date: Wed, 17 May 2023 13:23:04 +0000 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: References: Message-ID: <20230517132304.Horde.rvjONnn7BX0feZwASy3ZmP8@webmail.nde.ag> Hi, just to visualize Rajat's response, ceph creates copy-on-write snapshots so the incremental backup doesn't really use much space. On a Victoria cloud I created one full backup of an almost empty volume (made an ext4 filesystem and mounted it, create one tiny file), then created a second tiny file and then made another incremental backup, this is what ceph sees: ceph01:~ # rbd du /volume-6662f50a-a74c-47a4-8abd-a49069f3614c NAME PROVISIONED USED volume-6662f50a-a74c-47a4-8abd-a49069f3614c at backup.650a4f8f-7b61-447e-9eb9-767c74b15342.snap.1684329174.8419683 5 GiB 192 MiB volume-6662f50a-a74c-47a4-8abd-a49069f3614c at backup.1d358548-5d1d-4e03-9728-bb863c717910.snap.1684329450.9599462 5 GiB 16 MiB volume-6662f50a-a74c-47a4-8abd-a49069f3614c 5 GiB 0 B backup.650a4f8f-7b61-447e-9eb9-767c74b15342 (using 192 MiB) is the full backup, backup.1d358548-5d1d-4e03-9728-bb863c717910 is the incremental backup (using 16 MiB). Zitat von Rajat Dhasmana : > Hi Satish, > > Did you check the size of the actual backup file in ceph storage? It should > be created in the *backups* pool[1]. 
> Cinder shows the same size of incremental backup as a normal backup but > file size should be different from > the size shown in cinder DB records. Also file size of incremental backup > should not be the same as the file size of full backup. > > [1] > https://github.com/openstack/devstack/blob/34afa91fc9f830fc8e1fdc4d76e7aa6d4248eaaa/lib/cinder_backups/ceph#L22 > > Thanks > Rajat Dhasmana > > On Wed, May 17, 2023 at 12:25?AM Satish Patel wrote: > >> Folks, >> >> I have ceph storage for my openstack and configure cinder-volume and >> cinder-backup service for my disaster solution. I am trying to use the >> cinder-backup incremental option to save storage space but somehow It >> doesn't work the way it should work. >> >> Whenever I take incremental backup it shows a similar size of original >> volume. Technically It should be smaller. Question is does ceph >> support incremental backup with cinder? >> >> I am running a Yoga release. >> >> $ openstack volume list >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> | ID | Name | Status | >> Size | Attached to | >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | >> 10 | Attached to spatel-foo on /dev/sdc | >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >> ### Create full backup >> $ openstack volume backup create --name spatel-vol-backup spatel-vol --force >> +-------+--------------------------------------+ >> | Field | Value | >> +-------+--------------------------------------+ >> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | >> | name | spatel-vol-backup | >> +-------+--------------------------------------+ >> >> ### Create incremental >> $ openstack volume backup create --name spatel-vol-backup-1 >> --incremental --force spatel-vol >> +-------+--------------------------------------+ >> | Field | Value | >> +-------+--------------------------------------+ >> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | >> | name | spatel-vol-backup-1 | >> +-------+--------------------------------------+ >> >> $ openstack volume backup list >> +--------------------------------------+---------------------+-------------+-----------+------+ >> | ID | Name | >> Description | Status | Size | >> +--------------------------------------+---------------------+-------------+-----------+------+ >> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None >> | available | 10 | >> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None >> | available | 10 | >> +--------------------------------------+---------------------+-------------+-----------+------+ >> >> >> My incremental backup still shows 10G size which should be lower >> compared to the first backup. >> >> >> From satish.txt at gmail.com Wed May 17 13:41:58 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 17 May 2023 09:41:58 -0400 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: <20230517132304.Horde.rvjONnn7BX0feZwASy3ZmP8@webmail.nde.ag> References: <20230517132304.Horde.rvjONnn7BX0feZwASy3ZmP8@webmail.nde.ag> Message-ID: Thank you Eugen, I am noticing similar thing what you noticed. That means cinder lying to use or doesn't know how to calculate size of copy-on-write. 
One more question, I have created 1G file using dd command and took incremental backup and found ceph only showing 28 MiB size of backup. Does that sound right? root at ceph1:~# rbd -p backups du NAME PROVISIONED USED volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc.snap.1684260707.1682937 10 GiB 68 MiB volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.294b58af-771b-4a9f-bb7b-c37a4f84d678.snap.1684260787.943873 10 GiB 36 MiB volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.c9652662-36bd-4e74-b822-f7ae10eb7246.snap.1684330702.6955926 10 GiB 28 MiB volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc 10 GiB 0 B On Wed, May 17, 2023 at 9:26?AM Eugen Block wrote: > Hi, > > just to visualize Rajat's response, ceph creates copy-on-write > snapshots so the incremental backup doesn't really use much space. On > a Victoria cloud I created one full backup of an almost empty volume > (made an ext4 filesystem and mounted it, create one tiny file), then > created a second tiny file and then made another incremental backup, > this is what ceph sees: > > ceph01:~ # rbd du /volume-6662f50a-a74c-47a4-8abd-a49069f3614c > NAME > PROVISIONED USED > > volume-6662f50a-a74c-47a4-8abd-a49069f3614c at backup.650a4f8f-7b61-447e-9eb9-767c74b15342.snap.1684329174.8419683 > 5 GiB 192 > MiB > > volume-6662f50a-a74c-47a4-8abd-a49069f3614c at backup.1d358548-5d1d-4e03-9728-bb863c717910.snap.1684329450.9599462 > 5 GiB 16 > MiB > volume-6662f50a-a74c-47a4-8abd-a49069f3614c > 5 GiB 0 B > > > backup.650a4f8f-7b61-447e-9eb9-767c74b15342 (using 192 MiB) is the > full backup, backup.1d358548-5d1d-4e03-9728-bb863c717910 is the > incremental backup (using 16 MiB). > > Zitat von Rajat Dhasmana : > > > Hi Satish, > > > > Did you check the size of the actual backup file in ceph storage? It > should > > be created in the *backups* pool[1]. > > Cinder shows the same size of incremental backup as a normal backup but > > file size should be different from > > the size shown in cinder DB records. Also file size of incremental backup > > should not be the same as the file size of full backup. > > > > [1] > > > https://github.com/openstack/devstack/blob/34afa91fc9f830fc8e1fdc4d76e7aa6d4248eaaa/lib/cinder_backups/ceph#L22 > > > > Thanks > > Rajat Dhasmana > > > > On Wed, May 17, 2023 at 12:25?AM Satish Patel > wrote: > > > >> Folks, > >> > >> I have ceph storage for my openstack and configure cinder-volume and > >> cinder-backup service for my disaster solution. I am trying to use the > >> cinder-backup incremental option to save storage space but somehow It > >> doesn't work the way it should work. > >> > >> Whenever I take incremental backup it shows a similar size of original > >> volume. Technically It should be smaller. Question is does ceph > >> support incremental backup with cinder? > >> > >> I am running a Yoga release. 
> >> > >> $ openstack volume list > >> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >> | ID | Name | Status | > >> Size | Attached to | > >> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | > >> 10 | Attached to spatel-foo on /dev/sdc | > >> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >> > >> ### Create full backup > >> $ openstack volume backup create --name spatel-vol-backup spatel-vol > --force > >> +-------+--------------------------------------+ > >> | Field | Value | > >> +-------+--------------------------------------+ > >> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | > >> | name | spatel-vol-backup | > >> +-------+--------------------------------------+ > >> > >> ### Create incremental > >> $ openstack volume backup create --name spatel-vol-backup-1 > >> --incremental --force spatel-vol > >> +-------+--------------------------------------+ > >> | Field | Value | > >> +-------+--------------------------------------+ > >> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | > >> | name | spatel-vol-backup-1 | > >> +-------+--------------------------------------+ > >> > >> $ openstack volume backup list > >> > +--------------------------------------+---------------------+-------------+-----------+------+ > >> | ID | Name | > >> Description | Status | Size | > >> > +--------------------------------------+---------------------+-------------+-----------+------+ > >> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None > >> | available | 10 | > >> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None > >> | available | 10 | > >> > +--------------------------------------+---------------------+-------------+-----------+------+ > >> > >> > >> My incremental backup still shows 10G size which should be lower > >> compared to the first backup. > >> > >> > >> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanguangyu2 at gmail.com Wed May 17 13:47:06 2023 From: hanguangyu2 at gmail.com (=?UTF-8?B?6Z+p5YWJ5a6H?=) Date: Wed, 17 May 2023 13:47:06 +0000 Subject: [neutron] floating ip port forwarding and dvr Message-ID: Hello, I know that "For instances with a floating IPv4 address, routing between self-service and provider networks resides completely on the compute nodes"[1] when enable neutron dvr. This can eliminate performance issues with network nodes. But it need fix floating ip to vm. I want to ask when employing floating IP port forwarding, is it feasible to enable direct access to the compute nodes for traffic? I may encounter a scenario where tens of thousands of ports are forwarded from a floating IP to different ports on different VMs. This would result in a significant concentration of traffic at the network node. I would like to inquire if there are any methods available to alleviate the pressure on the network node or to directly distribute the traffic across the compute nodes. Appreciate for any assistance or suggestions. 
Best regards, Han Guangyu [1] https://docs.openstack.org/neutron/train/admin/deploy-ovs-ha-dvr.html#deploy-ovs-ha-dvr From eblock at nde.ag Wed May 17 13:49:39 2023 From: eblock at nde.ag (Eugen Block) Date: Wed, 17 May 2023 13:49:39 +0000 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: References: <20230517132304.Horde.rvjONnn7BX0feZwASy3ZmP8@webmail.nde.ag> Message-ID: <20230517134939.Horde.rzsanBNjTZ0omzur-U1mcNx@webmail.nde.ag> I don't think cinder is lying. Firstly, the "backup create" dialog states: > Backups will be the same size as the volume they originate from. Secondly, I believe it highly depends on the actual storage backend and its implementation how to create a backup. On a different storage backend it might look completely different. We're on ceph from the beginning so I can't comment on alternatives. As for your question, how exactly did your dd command look like? Did you fill up the file with zeroes (dd if=/dev/zero)? In that case the low used bytes number would make sense, if you filled it up randomly the du size should be higher. Zitat von Satish Patel : > Thank you Eugen, > > I am noticing similar thing what you noticed. That means cinder lying to > use or doesn't know how to calculate size of copy-on-write. > > One more question, I have created 1G file using dd command and took > incremental backup and found ceph only showing 28 MiB size of backup. Does > that sound right? > > > root at ceph1:~# rbd -p backups du NAME PROVISIONED USED > volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc.snap.1684260707.1682937 > 10 GiB 68 MiB > volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.294b58af-771b-4a9f-bb7b-c37a4f84d678.snap.1684260787.943873 > 10 GiB 36 MiB > volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.c9652662-36bd-4e74-b822-f7ae10eb7246.snap.1684330702.6955926 > 10 GiB 28 MiB > volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc > 10 GiB 0 B > > On Wed, May 17, 2023 at 9:26?AM Eugen Block wrote: > >> Hi, >> >> just to visualize Rajat's response, ceph creates copy-on-write >> snapshots so the incremental backup doesn't really use much space. On >> a Victoria cloud I created one full backup of an almost empty volume >> (made an ext4 filesystem and mounted it, create one tiny file), then >> created a second tiny file and then made another incremental backup, >> this is what ceph sees: >> >> ceph01:~ # rbd du /volume-6662f50a-a74c-47a4-8abd-a49069f3614c >> NAME >> PROVISIONED USED >> >> volume-6662f50a-a74c-47a4-8abd-a49069f3614c at backup.650a4f8f-7b61-447e-9eb9-767c74b15342.snap.1684329174.8419683 >> 5 GiB 192 >> MiB >> >> volume-6662f50a-a74c-47a4-8abd-a49069f3614c at backup.1d358548-5d1d-4e03-9728-bb863c717910.snap.1684329450.9599462 >> 5 GiB 16 >> MiB >> volume-6662f50a-a74c-47a4-8abd-a49069f3614c >> 5 GiB 0 B >> >> >> backup.650a4f8f-7b61-447e-9eb9-767c74b15342 (using 192 MiB) is the >> full backup, backup.1d358548-5d1d-4e03-9728-bb863c717910 is the >> incremental backup (using 16 MiB). >> >> Zitat von Rajat Dhasmana : >> >> > Hi Satish, >> > >> > Did you check the size of the actual backup file in ceph storage? It >> should >> > be created in the *backups* pool[1]. 
>> > Cinder shows the same size of incremental backup as a normal backup but >> > file size should be different from >> > the size shown in cinder DB records. Also file size of incremental backup >> > should not be the same as the file size of full backup. >> > >> > [1] >> > >> https://github.com/openstack/devstack/blob/34afa91fc9f830fc8e1fdc4d76e7aa6d4248eaaa/lib/cinder_backups/ceph#L22 >> > >> > Thanks >> > Rajat Dhasmana >> > >> > On Wed, May 17, 2023 at 12:25?AM Satish Patel >> wrote: >> > >> >> Folks, >> >> >> >> I have ceph storage for my openstack and configure cinder-volume and >> >> cinder-backup service for my disaster solution. I am trying to use the >> >> cinder-backup incremental option to save storage space but somehow It >> >> doesn't work the way it should work. >> >> >> >> Whenever I take incremental backup it shows a similar size of original >> >> volume. Technically It should be smaller. Question is does ceph >> >> support incremental backup with cinder? >> >> >> >> I am running a Yoga release. >> >> >> >> $ openstack volume list >> >> >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >> | ID | Name | Status | >> >> Size | Attached to | >> >> >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | >> >> 10 | Attached to spatel-foo on /dev/sdc | >> >> >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >> >> >> ### Create full backup >> >> $ openstack volume backup create --name spatel-vol-backup spatel-vol >> --force >> >> +-------+--------------------------------------+ >> >> | Field | Value | >> >> +-------+--------------------------------------+ >> >> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | >> >> | name | spatel-vol-backup | >> >> +-------+--------------------------------------+ >> >> >> >> ### Create incremental >> >> $ openstack volume backup create --name spatel-vol-backup-1 >> >> --incremental --force spatel-vol >> >> +-------+--------------------------------------+ >> >> | Field | Value | >> >> +-------+--------------------------------------+ >> >> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | >> >> | name | spatel-vol-backup-1 | >> >> +-------+--------------------------------------+ >> >> >> >> $ openstack volume backup list >> >> >> +--------------------------------------+---------------------+-------------+-----------+------+ >> >> | ID | Name | >> >> Description | Status | Size | >> >> >> +--------------------------------------+---------------------+-------------+-----------+------+ >> >> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None >> >> | available | 10 | >> >> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None >> >> | available | 10 | >> >> >> +--------------------------------------+---------------------+-------------+-----------+------+ >> >> >> >> >> >> My incremental backup still shows 10G size which should be lower >> >> compared to the first backup. 
>> >> >> >> >> >> >> >> >> >> >> From zhouxinyong at inspur.com Wed May 17 02:05:50 2023 From: zhouxinyong at inspur.com (=?utf-8?B?SmVycnkgWmhvdSAo5ZGo6ZGr5YuHKQ==?=) Date: Wed, 17 May 2023 02:05:50 +0000 Subject: =?utf-8?B?562U5aSNOiDnrZTlpI06IFtwdGxdIE5lZWQgUFRMIHZvbHVudGVlciBmb3Ig?= =?utf-8?Q?OpenStack_Sahara?= In-Reply-To: <18776962f8e.b5be9467498529.2443601353157879154@ghanshyammann.com> References: <18776962f8e.b5be9467498529.2443601353157879154@ghanshyammann.com> Message-ID: <4d3b952157f84886b52c7475b6bae68e@inspur.com> Hi Gmann, The governance patch to PTL appointment already merged into. [1] https://review.opendev.org/c/openstack/governance/+/881186 But there is no management authority for the project, and the review patch does not have +2 authority. I can take on the responsibility of PTL, maintain the project gate, and ensure that zuul can be executed normally; review and merge the commits submitted on the project. Could you add me to the sahara-core group. -----????----- ???: Ghanshyam Mann ????: 2023?4?13? 1:49 ???: Brin Zhang(???) ; Jerry Zhou (???) ??: Tom Guo(??2810) ; Weiting Kong (???) ; Alex Song (???) ; openstack-discuss ??: Re: ??: [ptl] Need PTL volunteer for OpenStack Sahara Hi Jerry, We discussed it in the TC meeting[1], and below is the summary and next step: * Sahara has been almost unmaintained in the last cycle, even though we had PTL volunteering to lead/maintain it. * From the PTL role, we expect the project to keep its testing/gate green, merge the incoming review requests, and do the releases on time. For example, if no maintainer is left in any project (which is the Sahara case), then its PTL's responsibility is to get all those project's basic maintenance done (by onboarding new maintainers or by themself) otherwise, inform TC about it so we can decide on the project status on time. * Next step: Thanks for volunteering and helping to maintain the Sahara project. Please propose the governance patch to PTL appointment, and TC will review it. Example: https://review.opendev.org/c/openstack/governance/+/878286 [1] https://meetings.opendev.org/meetings/tc/2023/tc.2023-04-11-17.59.log.html#l-83 -gmann ---- On Mon, 10 Apr 2023 09:09:37 -0700 Ghanshyam Mann wrote --- > Thanks Jerry, Brin for the updates and showing interest in Sahara project. > > In PTG discussion, there were some concern on Sahara maintenance even > we had PTL volunteer to lead this project but not much things got merged and > gate was broken many times. But let me discuss it in TC meeting tomorrow and > update here. > > -gmann > > ---- On Mon, 10 Apr 2023 00:22:08 -0700 Brin Zhang(???) wrote --- > > Hi Gmann, > > ????Jerry Zhou works the same company as qiujunting (Qiujunting has out of the office, so he cannot response us, I am so sorry), and he will be involved in Sahara. > > ????Hope TC can add him in Sahara team, let him gradually complete the maintenance of the Sahara project. > > ????Thanks. > > > > brinzhang > > > > -----????----- > > ???: Jerry Zhou (???) zhouxinyong at inspur.com> > > ????: 2023?4?7? 11:10 > > ???: gmann at ghanshyammann.com; Juntingqiu Qiujunting (???) qiujunting at inspur.com> > > ??: openstack-discuss at lists.openstack.org; Tom Guo(??2810) guotao.bj at inspur.com>; Brin Zhang(???) zhangbailin at inspur.com>; Weiting Kong (???) kongweiting at inspur.com> > > ??: ??: [ptl] Need PTL volunteer for OpenStack Sahara > > > > Hi Gmann, > > I can lead this project. > > > > -----????----- > > ???: Ghanshyam Mann gmann at ghanshyammann.com> > > ????: 2023?3?23? 
2:35 > > ???: Qiujunting qiujunting at inspur.com>; openstack-discuss openstack-discuss at lists.openstack.org> > > ??: [ptl] Need PTL volunteer for OpenStack Sahara > > > > Hi Qiu, > > > > I am reaching out to you as you were PTL for OpenStack Sahara project in the last cycle. > > > > There is no PTL candidate for the next cycle (2023.2), and it is on the leaderless project list. Please check if you or anyone you know would like to lead this project. > > > > - https://etherpad.opendev.org/p/2023.2-leaderless > > > > Also, if anyone else would like to help leading this project, this is time to let TC knows. > > > > -gmann > > > > > From paoloemilio.mazzon at unipd.it Wed May 17 12:42:23 2023 From: paoloemilio.mazzon at unipd.it (Paolo Emilio Mazzon) Date: Wed, 17 May 2023 14:42:23 +0200 Subject: [neutron] policy rules: filter on name field In-Reply-To: <2521426.hzxb4AvoNt@p1> References: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> <2957175.lYqTQsUpk7@p1> <188268d12bd.1227636f51301451.1653718688720363876@ghanshyammann.com> <2521426.hzxb4AvoNt@p1> Message-ID: <5dc6282d-3946-be68-5333-f2225e132f75@unipd.it> Thank you all for investigating this. I came to the same conclusion while messing with the policy file: what Ghanshyam proposed, in fact, prevents the deletion also from any user created SG... As far as I understand there is no concept of "which SG group I'm dealing with" in the security group *rules* API, right? Anyway: I will file a bug report. Thank you, Paolo On 17/05/23 09:55, Slawek Kaplonski wrote: > [...] > > > > > 'not' operator is supported in oslo policy. I think the below one should work which > allows admin to delete the default SG and manager role > > > can delete only non-default SG. > > > > > > NOTE: I have not tested this, may be you can check while trying other combinations. > > > > > > "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s and > not 'default':%(name)s or 'default':%(name)s and role:admin" > > > > > > -gmann > > > > I checked it today and it can be done like: > > > ??? "sg_is_default": "field:security_groups:name=default", > ??? "delete_security_group": "(role:member and project_id:%(project_id)s and not > rule:sg_is_default) or role:admin" > > for *Security Group*. > > But it *won't work*?like that *for security group rules*?as You want to rely Your policy > on the value of the attribute which belongs to parent resource (name of the Security group > when doing API call for SG rule). We had similar problem for the "network:shared" field - > see [1] and it was fixed with [2] but that fix is specific for this special field > ("network:shared" only). Maybe we would need to add such special handling for the default > security group as well. If You would like to have something like that, please open LP bug > for it and we can investigate that deeper. 
> > > [1] https://bugs.launchpad.net/neutron/+bug/1808112 > > > [2] https://review.opendev.org/c/openstack/neutron/+/652636 > > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat > From mwatkins at linuxfoundation.org Wed May 17 08:47:01 2023 From: mwatkins at linuxfoundation.org (Matt Watkins) Date: Wed, 17 May 2023 09:47:01 +0100 Subject: Devstack and SQLAlchemy (Problems with stack.sh) In-Reply-To: <325C0543-F49C-4C55-977A-78E72385823C@danplanet.com> References: <7F187725-B0B8-4170-85FF-C66FD5D10D87@linuxfoundation.org> <325C0543-F49C-4C55-977A-78E72385823C@danplanet.com> Message-ID: Thank you - this change/suggestion has got things moving to completion on my system; many thanks. - Matt > On 16 May 2023, at 15:08, Dan Smith wrote: > >> Error with SQLAlchemy==1.4.48 >> >> +lib/keystone:init_keystone:489 /usr/local/bin/keystone-manage --config-file /etc/keystone/keystone.conf db_sync >> CRITICAL keystone [-] Unhandled error: sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter >> ERROR keystone File "/home/devstack/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 343, in load >> ERROR keystone raise exc.NoSuchModuleError( >> ERROR keystone sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.plugins:dbcounter > > This is an internal-to-devstack python module. Not sure why it?s not installing for you, but the easy button is to just disable its use in your localrc: > > MYSQL_GATHER_PERFORMANCE=False > > -?Dan > From senrique at redhat.com Wed May 17 14:00:38 2023 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 17 May 2023 15:00:38 +0100 Subject: Cinder Bug Report 2023-05-17 Message-ID: Hello Argonauts, Cinder Bug Meeting Etherpad *Medium* - [RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) - *Status:* Unassigned. *Low* - Volume upload to glance as image,use compression to accelerate gzip. Occasionally, there may be errors. - *Status:* Fix proposed to master . - [DELL Unity] Image volume creation fails in Unity. - *Status: *Unassigned. Cheers -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed May 17 14:04:19 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 17 May 2023 10:04:19 -0400 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: <20230517134939.Horde.rzsanBNjTZ0omzur-U1mcNx@webmail.nde.ag> References: <20230517132304.Horde.rvjONnn7BX0feZwASy3ZmP8@webmail.nde.ag> <20230517134939.Horde.rzsanBNjTZ0omzur-U1mcNx@webmail.nde.ag> Message-ID: You are goddamn right :) I was using /dev/zero before. After using /dev/random I can see the correct representation 1.1 GiB size of the last incremental backup. 
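(Side note for anyone reproducing this: the outcome depends on whether the test data can be stored compactly. A rough sketch, with a made-up mount point:

# all-zero data -- can end up stored as almost nothing, so the next incremental looks tiny
dd if=/dev/zero of=/mnt/test/zeros.bin bs=1M count=1024 conv=fsync
# pseudo-random data -- forces ~1 GiB of real changed blocks into the next incremental
dd if=/dev/urandom of=/mnt/test/random.bin bs=1M count=1024 conv=fsync

Take an incremental backup after each write and compare the USED column of rbd -p backups du, as below.)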
root at ceph1:~# rbd -p backups du NAME PROVISIONED USED volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc.snap.1684260707.1682937 10 GiB 68 MiB volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.294b58af-771b-4a9f-bb7b-c37a4f84d678.snap.1684260787.943873 10 GiB 36 MiB volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.c9652662-36bd-4e74-b822-f7ae10eb7246.snap.1684330702.6955926 10 GiB 28 MiB volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.7e9a48db-b513-40d8-8018-f73ef52cb025.snap.1684331929.0653753 10 GiB 1.1 GiB This is just confusion when I look at the backup list, They all say 10G. I wish it would bring actual numbers from ceph instead of 10G. I can understand but it's hard to explain that to customers :( # openstack volume backup list +--------------------------------------+---------------------+-------------+-----------+------+ | ID | Name | Description | Status | Size | +--------------------------------------+---------------------+-------------+-----------+------+ | ec141929-cc74-459a-b8e7-03f016df9cec | spatel-vol-backup-4 | None | available | 10 | | 7e9a48db-b513-40d8-8018-f73ef52cb025 | spatel-vol-backup-3 | None | available | 10 | | c9652662-36bd-4e74-b822-f7ae10eb7246 | spatel-vol-backup-2 | None | available | 10 | | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None | available | 10 | | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None | available | 10 | +--------------------------------------+---------------------+-------------+-----------+------+ On Wed, May 17, 2023 at 9:49?AM Eugen Block wrote: > I don't think cinder is lying. Firstly, the "backup create" dialog states: > > > Backups will be the same size as the volume they originate from. > > Secondly, I believe it highly depends on the actual storage backend > and its implementation how to create a backup. On a different storage > backend it might look completely different. We're on ceph from the > beginning so I can't comment on alternatives. > > As for your question, how exactly did your dd command look like? Did > you fill up the file with zeroes (dd if=/dev/zero)? In that case the > low used bytes number would make sense, if you filled it up randomly > the du size should be higher. > > Zitat von Satish Patel : > > > Thank you Eugen, > > > > I am noticing similar thing what you noticed. That means cinder lying to > > use or doesn't know how to calculate size of copy-on-write. > > > > One more question, I have created 1G file using dd command and took > > incremental backup and found ceph only showing 28 MiB size of backup. > Does > > that sound right? 
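(On the Cinder side, the only hint of the real payload is in the backup record itself rather than the list output -- a sketch, the ID is a placeholder:

openstack volume backup show <backup-id> -c size -c object_count -c is_incremental

"size" always echoes the logical volume size in GiB, so for Ceph-backed backups the rbd du USED column stays the authoritative number; object_count is often 0 or very small for the RBD diff path.)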
> > > > > > root at ceph1:~# rbd -p backups du NAME PROVISIONED USED > > > volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc.snap.1684260707.1682937 > > 10 GiB 68 MiB > > > volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.294b58af-771b-4a9f-bb7b-c37a4f84d678.snap.1684260787.943873 > > 10 GiB 36 MiB > > > volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc at backup.c9652662-36bd-4e74-b822-f7ae10eb7246.snap.1684330702.6955926 > > 10 GiB 28 MiB > > > volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.4351d9d3-85fa-4cd5-b21d-619b3385aefc > > 10 GiB 0 B > > > > On Wed, May 17, 2023 at 9:26?AM Eugen Block wrote: > > > >> Hi, > >> > >> just to visualize Rajat's response, ceph creates copy-on-write > >> snapshots so the incremental backup doesn't really use much space. On > >> a Victoria cloud I created one full backup of an almost empty volume > >> (made an ext4 filesystem and mounted it, create one tiny file), then > >> created a second tiny file and then made another incremental backup, > >> this is what ceph sees: > >> > >> ceph01:~ # rbd du > /volume-6662f50a-a74c-47a4-8abd-a49069f3614c > >> NAME > >> PROVISIONED USED > >> > >> > volume-6662f50a-a74c-47a4-8abd-a49069f3614c at backup.650a4f8f-7b61-447e-9eb9-767c74b15342.snap.1684329174.8419683 > >> 5 GiB 192 > >> MiB > >> > >> > volume-6662f50a-a74c-47a4-8abd-a49069f3614c at backup.1d358548-5d1d-4e03-9728-bb863c717910.snap.1684329450.9599462 > >> 5 GiB 16 > >> MiB > >> volume-6662f50a-a74c-47a4-8abd-a49069f3614c > >> 5 GiB 0 B > >> > >> > >> backup.650a4f8f-7b61-447e-9eb9-767c74b15342 (using 192 MiB) is the > >> full backup, backup.1d358548-5d1d-4e03-9728-bb863c717910 is the > >> incremental backup (using 16 MiB). > >> > >> Zitat von Rajat Dhasmana : > >> > >> > Hi Satish, > >> > > >> > Did you check the size of the actual backup file in ceph storage? It > >> should > >> > be created in the *backups* pool[1]. > >> > Cinder shows the same size of incremental backup as a normal backup > but > >> > file size should be different from > >> > the size shown in cinder DB records. Also file size of incremental > backup > >> > should not be the same as the file size of full backup. > >> > > >> > [1] > >> > > >> > https://github.com/openstack/devstack/blob/34afa91fc9f830fc8e1fdc4d76e7aa6d4248eaaa/lib/cinder_backups/ceph#L22 > >> > > >> > Thanks > >> > Rajat Dhasmana > >> > > >> > On Wed, May 17, 2023 at 12:25?AM Satish Patel > >> wrote: > >> > > >> >> Folks, > >> >> > >> >> I have ceph storage for my openstack and configure cinder-volume and > >> >> cinder-backup service for my disaster solution. I am trying to use > the > >> >> cinder-backup incremental option to save storage space but somehow It > >> >> doesn't work the way it should work. > >> >> > >> >> Whenever I take incremental backup it shows a similar size of > original > >> >> volume. Technically It should be smaller. Question is does ceph > >> >> support incremental backup with cinder? > >> >> > >> >> I am running a Yoga release. 
> >> >> > >> >> $ openstack volume list > >> >> > >> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >> >> | ID | Name | Status | > >> >> Size | Attached to | > >> >> > >> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >> >> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | > >> >> 10 | Attached to spatel-foo on /dev/sdc | > >> >> > >> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >> >> > >> >> ### Create full backup > >> >> $ openstack volume backup create --name spatel-vol-backup spatel-vol > >> --force > >> >> +-------+--------------------------------------+ > >> >> | Field | Value | > >> >> +-------+--------------------------------------+ > >> >> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | > >> >> | name | spatel-vol-backup | > >> >> +-------+--------------------------------------+ > >> >> > >> >> ### Create incremental > >> >> $ openstack volume backup create --name spatel-vol-backup-1 > >> >> --incremental --force spatel-vol > >> >> +-------+--------------------------------------+ > >> >> | Field | Value | > >> >> +-------+--------------------------------------+ > >> >> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | > >> >> | name | spatel-vol-backup-1 | > >> >> +-------+--------------------------------------+ > >> >> > >> >> $ openstack volume backup list > >> >> > >> > +--------------------------------------+---------------------+-------------+-----------+------+ > >> >> | ID | Name | > >> >> Description | Status | Size | > >> >> > >> > +--------------------------------------+---------------------+-------------+-----------+------+ > >> >> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None > >> >> | available | 10 | > >> >> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None > >> >> | available | 10 | > >> >> > >> > +--------------------------------------+---------------------+-------------+-----------+------+ > >> >> > >> >> > >> >> My incremental backup still shows 10G size which should be lower > >> >> compared to the first backup. > >> >> > >> >> > >> >> > >> > >> > >> > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Wed May 17 14:31:25 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Wed, 17 May 2023 07:31:25 -0700 Subject: [ironic] RFC: In-Person-only PTG in Vancouver Message-ID: Hey all, It was brought to my attention that there will not be much opportunity for remote participation in the PTG. We would have to manage the A/V ourselves in a shared room which would likely be extremely disruptive and ineffective. So I'm left with a few questions: should we continue forward with a PTG session for Ironic in Vancouver? What topics would be suitable with only a limited number of core reviewers available? I propose that the Ironic team uses the PTG venue for an informal collaboration and hacking space. It doesn't seem right for us to make any in-depth plans with only a subset of the core team. What do you all think? -- Jay Faulkner Ironic PTL TC Vice-Chair -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kennelson11 at gmail.com Wed May 17 15:00:18 2023 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 17 May 2023 10:00:18 -0500 Subject: [ironic] RFC: In-Person-only PTG in Vancouver In-Reply-To: References: Message-ID: Hey Jay, One other option we could provide is a dedicated room for the discussions you need others to be involved in that are not physically present. There should be two of them available. -Kendall On Wed, May 17, 2023 at 9:32?AM Jay Faulkner wrote: > Hey all, > > It was brought to my attention that there will not be much opportunity for > remote participation in the PTG. We would have to manage the A/V ourselves > in a shared room which would likely be extremely disruptive and ineffective. > > So I'm left with a few questions: should we continue forward with a PTG > session for Ironic in Vancouver? What topics would be suitable with only a > limited number of core reviewers available? > > I propose that the Ironic team uses the PTG venue for an informal > collaboration and hacking space. It doesn't seem right for us to make any > in-depth plans with only a subset of the core team. > > What do you all think? > > -- > Jay Faulkner > Ironic PTL > TC Vice-Chair > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Wed May 17 15:13:58 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Wed, 17 May 2023 17:13:58 +0200 Subject: [ironic] RFC: In-Person-only PTG in Vancouver In-Reply-To: References: Message-ID: Oh, that would help us a lot as well, since some highly valuable core contributors to OpenStack-Ansible won't be present in Vancouver despite good half of active contributors will be on site. Though I see how we all might start fighting for these rooms :D ??, 17 ??? 2023 ?., 17:03 Kendall Nelson : > Hey Jay, > > One other option we could provide is a dedicated room for the discussions > you need others to be involved in that are not physically present. There > should be two of them available. > > -Kendall > > On Wed, May 17, 2023 at 9:32?AM Jay Faulkner wrote: > >> Hey all, >> >> It was brought to my attention that there will not be much opportunity >> for remote participation in the PTG. We would have to manage the A/V >> ourselves in a shared room which would likely be extremely disruptive and >> ineffective. >> >> So I'm left with a few questions: should we continue forward with a PTG >> session for Ironic in Vancouver? What topics would be suitable with only a >> limited number of core reviewers available? >> >> I propose that the Ironic team uses the PTG venue for an informal >> collaboration and hacking space. It doesn't seem right for us to make any >> in-depth plans with only a subset of the core team. >> >> What do you all think? >> >> -- >> Jay Faulkner >> Ironic PTL >> TC Vice-Chair >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Wed May 17 15:38:04 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 17 May 2023 08:38:04 -0700 Subject: [neutron] policy rules: filter on name field In-Reply-To: <2521426.hzxb4AvoNt@p1> References: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> <2957175.lYqTQsUpk7@p1> <188268d12bd.1227636f51301451.1653718688720363876@ghanshyammann.com> <2521426.hzxb4AvoNt@p1> Message-ID: <1882a5c7bf9.f222e63172103.7817044410310448905@ghanshyammann.com> ---- On Wed, 17 May 2023 00:55:47 -0700 Slawek Kaplonski wrote --- > Hi, > > Dnia wtorek, 16 maja 2023 23:52:39 CEST Ghanshyam Mann pisze: > > > >? ---- On Tue, 16 May 2023 07:25:52 -0700? Slawek Kaplonski? wrote --- > >? > Hi, > >? > > >? > Dnia wtorek, 16 maja 2023 12:00:34 CEST Paolo Emilio Mazzon pisze: > >? > > Hello, > >? > > > >? > > I'm trying to understand if this is feasible: I would like to avoid a regular user from > >? > > tampering the "default" security group of a project. Specifically I would like to prevent > >? > > him from deleting sg rules *from the default sg only* > >? > > > >? > > I can wite a policy.yaml like this > >? > > > >? > > # Delete a security group rule > >? > > # DELETE? /security-group-rules/{id} > >? > > # Intended scope(s): project > >? > > "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s" > >? > > > >? > > but this is sub-optimal since the regular member can still *add* rules... > >? > > > >? > > Is it possible to create a rule like > >? > > > >? > > "sg_is_default" : ...the sg group whose name is 'default' > >? > > > >? > > so I can write > >? > > > >? > > "delete_security_group_rule": "not rule:sg_is_default" ? > >? > > > >? > > Thanks! > >? > > >? > I'm not sure but I will try to check it later today or tomorrow morning and will let You know if that is possible or not. > > > > 'not' operator is supported in oslo policy. I think the below one should work which allows admin to delete the default SG and manager role > > can delete only non-default SG. > > > > NOTE: I have not tested this, may be you can check while trying other combinations. > > > > "delete_security_group_rule": "role:project_manager and project_id:%(project_id)s and not 'default':%(name)s or 'default':%(name)s and role:admin" > > > > -gmann > > > >? > > >? > > > >? > > ????Paolo > >? > > > >? > > -- > >? > >?? Paolo Emilio Mazzon > >? > >?? System and Network Administrator > >? > > > >? > >?? paoloemilio.mazzon[at]unipd.it > >? > > > >? > >?? PNC - Padova Neuroscience Center > >? > >?? https://www.pnc.unipd.it > >? > >?? Via Orus 2/B - 35131 Padova, Italy > >? > >?? +39 049 821 2624 > >? > > > >? > > > >? > > >? > > >? > -- > >? > Slawek Kaplonski > >? > Principal Software Engineer > >? > Red Hat > > > > > > I checked it today and it can be done like: > > ??? "sg_is_default": "field:security_groups:name=default", > ??? "delete_security_group": "(role:member and project_id:%(project_id)s and not rule:sg_is_default) or role:admin" > > for Security Group. > But it won't work?like that for security group rules?as You want to rely Your policy on the value of the attribute which belongs to parent resource (name of the Security group when doing API call for SG rule). We had similar problem for the "network:shared" field - see [1] and it was fixed with [2] but that fix is specific for this special field ("network:shared" only). Maybe we would need to add such special handling for the default security group as well. 
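(A quick way to dry-run combinations like the ones proposed above is oslo.policy's checker utility -- a rough sketch from memory, so double-check `oslopolicy-checker --help`; the token file is a placeholder:

oslopolicy-checker --policy /etc/neutron/policy.yaml --access member-token.json --rule delete_security_group_rule

where member-token.json is a saved keystone token response for a test user with the role being tested.)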
If You would like to have something like that, please open LP bug for it and we can investigate that deeper. ++, default SG being a special case here, I agree on handling this case in code instead of making the configuration more complex. May be a separate policy for default SG can also make sense. -gmann > > [1] https://bugs.launchpad.net/neutron/+bug/1808112 > [2] https://review.opendev.org/c/openstack/neutron/+/652636 > > > -- > Slawek Kaplonski > Principal Software Engineer > Red Hat > From juliaashleykreger at gmail.com Wed May 17 15:41:28 2023 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 17 May 2023 08:41:28 -0700 Subject: [ironic] RFC: In-Person-only PTG in Vancouver In-Reply-To: References: Message-ID: If we can determine a list of items and time blocks to move to that room, then I suspect we could all make it work. With the virtual ptgs over the last few years with contributors all over the world, we have surely gotten better at time boxing high bandwidth discussions. I hope. ;) -Julia On Wed, May 17, 2023 at 8:17?AM Dmitriy Rabotyagov wrote: > Oh, that would help us a lot as well, since some highly valuable core > contributors to OpenStack-Ansible won't be present in Vancouver despite > good half of active contributors will be on site. > > Though I see how we all might start fighting for these rooms :D > > ??, 17 ??? 2023 ?., 17:03 Kendall Nelson : > >> Hey Jay, >> >> One other option we could provide is a dedicated room for the discussions >> you need others to be involved in that are not physically present. There >> should be two of them available. >> >> -Kendall >> >> On Wed, May 17, 2023 at 9:32?AM Jay Faulkner wrote: >> >>> Hey all, >>> >>> It was brought to my attention that there will not be much opportunity >>> for remote participation in the PTG. We would have to manage the A/V >>> ourselves in a shared room which would likely be extremely disruptive and >>> ineffective. >>> >>> So I'm left with a few questions: should we continue forward with a PTG >>> session for Ironic in Vancouver? What topics would be suitable with only a >>> limited number of core reviewers available? >>> >>> I propose that the Ironic team uses the PTG venue for an informal >>> collaboration and hacking space. It doesn't seem right for us to make any >>> in-depth plans with only a subset of the core team. >>> >>> What do you all think? >>> >>> -- >>> Jay Faulkner >>> Ironic PTL >>> TC Vice-Chair >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregory.orange at pawsey.org.au Wed May 17 15:50:47 2023 From: gregory.orange at pawsey.org.au (Gregory Orange) Date: Wed, 17 May 2023 23:50:47 +0800 Subject: Transition to kolla In-Reply-To: <20230517110141.7acc03dc@brain> References: <20230517110141.7acc03dc@brain> Message-ID: <47e195fd-8247-90f1-7689-3c7b3e721a9d@pawsey.org.au> Hi, On 17/5/23 17:01, Ryszard Mielcarek wrote: > is there anyone who already did transition on existing cluster, > from package-version to kolla-ansible managed installation? > Looking for guides, tips, your experience on that. We have just completed doing exactly that, from Puppet OpenStack (train) with each service in its own VM, to a straightforward Kolla Ansible (train-eol) config on 3 control plane nodes and same compute nodes with staged migration of the instances. This page gave us great inspiration and structure to the approach: https://www.stackhpc.com/migrating-to-kolla.html Our only outage was to migrate databases, haproxy, keepalived and rabbitmq. 
It may have been possible to avoid that outage, but we decided to simplify those steps a little. Compute nodes ended up having trouble with mismatching vcpu features and security drivers, so we scheduled and cold migrated the instances across to kolla compute nodes. I am very pleased with the experience so far, and grateful to the Kolla Ansible community past and present. Next up, upgrades! Greg. From lokendrarathour at gmail.com Wed May 17 16:12:55 2023 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Wed, 17 May 2023 21:42:55 +0530 Subject: [Port Creation failed] - openstack Wallaby In-Reply-To: References: Message-ID: Hi Swogat, Thanks for the inputs, it was showing a similar issue but somehow the issue is not getting resolved. we are trying to explore more around it. getting the error in ovn-metadata-agent.log Cannot find Chassis_Private with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c detailed: 2023-05-17 19:26:31.984 45317 INFO oslo.privsep.daemon [-] privsep daemon running as pid 45317 2023-05-17 19:26:32.712 44735 ERROR ovsdbapp.backend.ovs_idl.transaction [-] Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 131, in run txn.results.put(txn.do_commit()) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 93, in do_commit command.run_idl(txn) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 172, in run_idl record = self.api.lookup(self.table, self.record) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 208, in lookup return self._lookup(table, record) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 268, in _lookup row = idlutils.row_by_value(self, rl.table, rl.column, record) File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 114, in row_by_value raise RowNotFound(table=table, col=column, match=match) ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command [-] Error executing command (DbAddCommand): ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command Traceback (most recent call last): 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 42, in execute waiting for your always helpful inputs. On Tue, May 16, 2023 at 10:47?PM Swogat Pradhan wrote: > Hi > I am not sure if this will help, but i faced something similar. > You might need to check the ovn database entries. > http://www.jimmdenton.com/neutron-ovn-private-chassis/ > > Or maybe try restarting the ovn service from pcs, sometimes issue comes up > when ovn doesn't sync up. > > Again m not sure if this will be of any help to you. > > With regards, > Swogat Pradhan > > On Tue, 16 May 2023, 10:41 pm Lokendra Rathour, > wrote: > >> Hi All, >> Was trying to create OpenStack VM in OpenStack wallaby release, not able >> to create VM, it is failing because of Port not getting created. >> >> The error that we are getting: >> nova-compute.log: >> >> 2023-05-16 18:15:35.495 7 INFO nova.compute.provider_config >> [req-faaf38e7-b5ee-43d1-9303-d508285f5ab7 - - - - -] No provider configs >> found in /etc/nova/provider_config. 
If files are present, ensure the Nova >> process has access. >> 2023-05-16 18:15:35.549 7 ERROR nova.cmd.common >> [req-8842f11c-fe5a-4ad3-92ea-a6898f482bf0 - - - - -] No db access allowed >> in nova-compute: File "/usr/bin/nova-compute", line 10, in >> sys.exit(main()) >> File "/usr/lib/python3.6/site-packages/nova/cmd/compute.py", line 59, >> in main >> topic=compute_rpcapi.RPC_TOPIC) >> File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, in >> create >> utils.raise_if_old_compute() >> File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1068, in >> raise_if_old_compute >> ctxt, ['nova-compute']) >> File "/usr/lib/python3.6/site-packages/nova/objects/service.py", line >> 563, in get_minimum_version_all_cells >> binaries) >> File "/usr/lib/python3.6/site-packages/nova/context.py", line 544, in >> scatter_gather_all_cells >> fn, *args, **kwargs) >> File "/usr/lib/python3.6/site-packages/nova/context.py", line 432, in >> scatter_gather_cells >> with target_cell(context, cell_mapping) as cctxt: >> File "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__ >> return next(self.gen) >> >> >> neutron/ovn-metadata-agent.log >> >> 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command >> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private >> with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 >> 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command >> 2023-05-16 22:36:41.876 45204 ERROR ovsdbapp.backend.ovs_idl.transaction >> [-] Traceback (most recent call last): >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", >> line 131, in run >> txn.results.put(txn.do_commit()) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", >> line 93, in do_commit >> command.run_idl(txn) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", >> line 172, in run_idl >> record = self.api.lookup(self.table, self.record) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >> line 208, in lookup >> return self._lookup(table, record) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >> line 268, in _lookup >> row = idlutils.row_by_value(self, rl.table, rl.column, record) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", >> line 114, in row_by_value >> raise RowNotFound(table=table, col=column, match=match) >> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find >> Chassis_Private with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 >> >> any input to help get this issue fixed would be of great help. >> thanks >> -- >> ~ Lokendra >> skype: lokendrarathour >> >> >> -- ~ Lokendra skype: lokendrarathour -------------- next part -------------- An HTML attachment was scrubbed... URL: From ltoscano at redhat.com Wed May 17 16:13:52 2023 From: ltoscano at redhat.com (Luigi Toscano) Date: Wed, 17 May 2023 18:13:52 +0200 Subject: =?UTF-8?B?562U5aSNOiDnrZTlpI06?= [ptl] Need PTL volunteer for OpenStack Sahara In-Reply-To: <18827d50cb8.cfee5b2e1305648.2964607450323554781@ghanshyammann.com> References: <4d3b952157f84886b52c7475b6bae68e@inspur.com> <18827d50cb8.cfee5b2e1305648.2964607450323554781@ghanshyammann.com> Message-ID: <5675627.DvuYhMxLoT@whitebase> On Wednesday, 17 May 2023 05:50:53 CEST Ghanshyam Mann wrote: > ---- On Tue, 16 May 2023 19:05:50 -0700 Jerry Zhou (???) wrote --- > > > Hi Gmann, > > > > The governance patch to PTL appointment already merged into. 
> > > > [1] https://review.opendev.org/c/openstack/governance/+/881186 > > > > But there is no management authority for the project, and the review > > patch does not have +2 authority. > > > > I can take on the responsibility of PTL, maintain the project gate, and > > ensure that zuul can be executed normally; review and merge the commits > > submitted on the project. Could you add me to the sahara-core group. > > It makes sense to add you to the group but I would like Sahara's core member > if anyone is around to add you if no one is active then TC can add you > there. I've added Jerry to the following groups: sahara-core sahara-stable-maint All the other relevant sahara groups include one of those two groups, with the exception of the sahara-release group, which I don't have access to and I would need to help of the TC to change it. Ciao -- Luigi From satish.txt at gmail.com Wed May 17 16:21:49 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 17 May 2023 12:21:49 -0400 Subject: [kolla] How to patch images during build Message-ID: Folks, I'm using kolla-build to build all images and push them to the local repo. So far all good but let's say If in future for some reason I want to patch some bug and rebuild the image in that case how do i patch kolla images? I am reading at [1] and didn't see any example to patch Image for any bug. Should I be downloading tarball and patch it and then use type=local to build the image from local source or add patch to Dockerfile? What people use for best practice for this kind of thing? [1] https://docs.openstack.org/kolla/latest/admin/image-building.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed May 17 16:28:13 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 17 May 2023 09:28:13 -0700 Subject: =?UTF-8?Q?Re:_=E7=AD=94=E5=A4=8D:_=E7=AD=94=E5=A4=8D:_[ptl]_Need_PTL_v?= =?UTF-8?Q?olunteer_for_OpenStack_Sahara?= In-Reply-To: <5675627.DvuYhMxLoT@whitebase> References: <4d3b952157f84886b52c7475b6bae68e@inspur.com> <18827d50cb8.cfee5b2e1305648.2964607450323554781@ghanshyammann.com> <5675627.DvuYhMxLoT@whitebase> Message-ID: <1882a8a67ac.efce01a976022.2515819052572941367@ghanshyammann.com> ---- On Wed, 17 May 2023 09:13:52 -0700 Luigi Toscano wrote --- > On Wednesday, 17 May 2023 05:50:53 CEST Ghanshyam Mann wrote: > > ---- On Tue, 16 May 2023 19:05:50 -0700 Jerry Zhou (???) wrote --- > > > > > Hi Gmann, > > > > > > The governance patch to PTL appointment already merged into. > > > > > > [1] https://review.opendev.org/c/openstack/governance/+/881186 > > > > > > But there is no management authority for the project, and the review > > > patch does not have +2 authority. > > > > > > I can take on the responsibility of PTL, maintain the project gate, and > > > ensure that zuul can be executed normally; review and merge the commits > > > submitted on the project. Could you add me to the sahara-core group. > > > > It makes sense to add you to the group but I would like Sahara's core member > > if anyone is around to add you if no one is active then TC can add you > > there. > > I've added Jerry to the following groups: > > sahara-core > sahara-stable-maint > > All the other relevant sahara groups include one of those two groups, with the > exception of the sahara-release group, which I don't have access to and I > would need to help of the TC to change it. Thanks a lot tosky for helping here. 
-gmann > > Ciao > -- > Luigi > > > > From thierry at openstack.org Wed May 17 16:41:43 2023 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 17 May 2023 18:41:43 +0200 Subject: [largescale-sig] Next meeting: May 17, 15utc In-Reply-To: <478ed560-ed34-06db-0da9-242126cda3ee@openstack.org> References: <478ed560-ed34-06db-0da9-242126cda3ee@openstack.org> Message-ID: <2d6a7ea8-dca8-f50f-5899-62cc512dbdcd@openstack.org> Here is the summary of our SIG meeting today. We discussed our next OpenInfra Live episode, with a tentative date set to Sept 21. We are considering two topics: one around Galera, and the other around a deep dive in a APAC public cloud deployment. You can read the detailed meeting logs at: https://meetings.opendev.org/meetings/large_scale_sig/2023/large_scale_sig.2023-05-17-15.00.html Our next IRC meeting will be May 31, 8:00UTC on #openstack-operators on OFTC. Regards, -- Thierry Carrez (ttx) From fungi at yuggoth.org Wed May 17 16:51:16 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 17 May 2023 16:51:16 +0000 Subject: =?utf-8?B?562U5aSNOiDnrZTlpI0=?= =?utf-8?Q?=3A?= [ptl] Need PTL volunteer for OpenStack Sahara In-Reply-To: <5675627.DvuYhMxLoT@whitebase> References: <4d3b952157f84886b52c7475b6bae68e@inspur.com> <18827d50cb8.cfee5b2e1305648.2964607450323554781@ghanshyammann.com> <5675627.DvuYhMxLoT@whitebase> Message-ID: <20230517165115.3mugi3vusyan4i6w@yuggoth.org> On 2023-05-17 18:13:52 +0200 (+0200), Luigi Toscano wrote: [...] > All the other relevant sahara groups include one of those two > groups, with the exception of the sahara-release group, which I > don't have access to and I would need to help of the TC to change > it. According to Clark's comment in the #openstack-tc channel, that group isn't used by any Gerrit ACLs, so should be safe to just ignore it as abandoned cruft. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From maksim.malchuk at gmail.com Wed May 17 17:43:42 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Wed, 17 May 2023 20:43:42 +0300 Subject: [kolla] How to patch images during build In-Reply-To: References: Message-ID: Hi Satish, The correct way to apply bugfix is build the image from the git source. Commit bugfix changes to the git, and build from the bugfix branch for example. On Wed, May 17, 2023 at 7:30?PM Satish Patel wrote: > Folks, > > I'm using kolla-build to build all images and push them to the local repo. > So far all good but let's say If in future for some reason I want to patch > some bug and rebuild the image in that case how do i patch kolla images? > > I am reading at [1] and didn't see any example to patch Image for any bug. > > Should I be downloading tarball and patch it and then use type=local to > build the image from local source or add patch to Dockerfile? What people > use for best practice for this kind of thing? > > [1] https://docs.openstack.org/kolla/latest/admin/image-building.html > -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed May 17 18:00:50 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 17 May 2023 14:00:50 -0400 Subject: [kolla] How to patch images during build In-Reply-To: References: Message-ID: Thank Maksim, Like the following example? For reference, can I pass commit hash? 
*[keystone-base]* type = git location = https://opendev.org/openstack/keystone reference = stable/mitaka On Wed, May 17, 2023 at 1:43?PM Maksim Malchuk wrote: > Hi Satish, > > The correct way to apply bugfix is build the image from the git source. > Commit bugfix changes to the git, and build from the bugfix branch for > example. > > On Wed, May 17, 2023 at 7:30?PM Satish Patel wrote: > >> Folks, >> >> I'm using kolla-build to build all images and push them to the local >> repo. So far all good but let's say If in future for some reason I want to >> patch some bug and rebuild the image in that case how do i patch kolla >> images? >> >> I am reading at [1] and didn't see any example to patch Image for any >> bug. >> >> Should I be downloading tarball and patch it and then use type=local to >> build the image from local source or add patch to Dockerfile? What people >> use for best practice for this kind of thing? >> >> [1] https://docs.openstack.org/kolla/latest/admin/image-building.html >> > > > -- > Regards, > Maksim Malchuk > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maksim.malchuk at gmail.com Wed May 17 18:02:02 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Wed, 17 May 2023 21:02:02 +0300 Subject: [kolla] How to patch images during build In-Reply-To: References: Message-ID: Yes, you can do that, but note bene mitaka not supported. On Wed, May 17, 2023 at 9:01?PM Satish Patel wrote: > Thank Maksim, > > Like the following example? For reference, can I pass commit hash? > > *[keystone-base]* > > type = git > > location = https://opendev.org/openstack/keystone > > reference = stable/mitaka > > On Wed, May 17, 2023 at 1:43?PM Maksim Malchuk > wrote: > >> Hi Satish, >> >> The correct way to apply bugfix is build the image from the git source. >> Commit bugfix changes to the git, and build from the bugfix branch for >> example. >> >> On Wed, May 17, 2023 at 7:30?PM Satish Patel >> wrote: >> >>> Folks, >>> >>> I'm using kolla-build to build all images and push them to the local >>> repo. So far all good but let's say If in future for some reason I want to >>> patch some bug and rebuild the image in that case how do i patch kolla >>> images? >>> >>> I am reading at [1] and didn't see any example to patch Image for any >>> bug. >>> >>> Should I be downloading tarball and patch it and then use type=local to >>> build the image from local source or add patch to Dockerfile? What people >>> use for best practice for this kind of thing? >>> >>> [1] https://docs.openstack.org/kolla/latest/admin/image-building.html >>> >> >> >> -- >> Regards, >> Maksim Malchuk >> >> -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed May 17 18:09:00 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 17 May 2023 14:09:00 -0400 Subject: [kolla] How to patch images during build In-Reply-To: References: Message-ID: That was example, I took from here - https://docs.openstack.org/kolla/latest/admin/image-building.html On Wed, May 17, 2023 at 2:02?PM Maksim Malchuk wrote: > Yes, you can do that, but note bene mitaka not supported. > > On Wed, May 17, 2023 at 9:01?PM Satish Patel wrote: > >> Thank Maksim, >> >> Like the following example? For reference, can I pass commit hash? 
>> >> *[keystone-base]* >> >> type = git >> >> location = https://opendev.org/openstack/keystone >> >> reference = stable/mitaka >> >> On Wed, May 17, 2023 at 1:43?PM Maksim Malchuk >> wrote: >> >>> Hi Satish, >>> >>> The correct way to apply bugfix is build the image from the git source. >>> Commit bugfix changes to the git, and build from the bugfix branch for >>> example. >>> >>> On Wed, May 17, 2023 at 7:30?PM Satish Patel >>> wrote: >>> >>>> Folks, >>>> >>>> I'm using kolla-build to build all images and push them to the local >>>> repo. So far all good but let's say If in future for some reason I want to >>>> patch some bug and rebuild the image in that case how do i patch kolla >>>> images? >>>> >>>> I am reading at [1] and didn't see any example to patch Image for any >>>> bug. >>>> >>>> Should I be downloading tarball and patch it and then use type=local to >>>> build the image from local source or add patch to Dockerfile? What people >>>> use for best practice for this kind of thing? >>>> >>>> [1] https://docs.openstack.org/kolla/latest/admin/image-building.html >>>> >>> >>> >>> -- >>> Regards, >>> Maksim Malchuk >>> >>> > > -- > Regards, > Maksim Malchuk > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Wed May 17 18:15:01 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 17 May 2023 18:15:01 +0000 Subject: [kolla] How to patch images during build In-Reply-To: References: Message-ID: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: > Yes, you can do that, but note bene mitaka not supported. [...] Not only unsupported, but the stable/mitaka branch of openstack/keystone was deleted when it reached EOL in 2017. You may instead want to specify `reference = mitaka-eol` (assuming Git tags also work there). That should get you the final state of the stable/mitaka branch prior to its deletion. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From noonedeadpunk at gmail.com Wed May 17 18:33:29 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Wed, 17 May 2023 20:33:29 +0200 Subject: Installation openstack Multi node In-Reply-To: References: Message-ID: We have also docs for that, but it's based on libvirt rather then VMware: https://opendev.org/openstack/openstack-ansible-ops/src/branch/master/multi-node-aio ??, 15 ??? 2023 ?., 08:09 U T, Raghavendra : > Hi, > > > > Kindly refer below documentation: > > https://docs.openstack.org/devstack/latest/guides/multinode-lab.html > > > > Although its not a video, it can be used as starting point. > > > > Regards, > > Raghavendra Tilay. > > > > > > *From:* BEDDA Fadhel > *Sent:* Saturday, May 13, 2023 12:02 AM > *To:* openstack-discuss at lists.openstack.org > *Subject:* Installation openstack Multi node > > > > Good morning, > > I am looking for a complete video or digital procedure that allows me to set up an openstack multi node test environment on vamware workstation. > > THANKS > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliver.weinmann at me.com Wed May 17 20:02:31 2023 From: oliver.weinmann at me.com (Oliver Weinmann) Date: Wed, 17 May 2023 22:02:31 +0200 Subject: Installation openstack Multi node In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
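(To make the multi-node pointers above concrete: the devstack multinode-lab guide linked earlier boils down to one local.conf per node -- an abridged, untested sketch; IPs and passwords are placeholders, so double-check against the guide itself:

# controller local.conf
[[local|localrc]]
HOST_IP=192.168.42.11
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD

# compute node local.conf
[[local|localrc]]
HOST_IP=192.168.42.12
SERVICE_HOST=192.168.42.11
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST
GLANCE_HOSTPORT=$SERVICE_HOST:9292
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
ENABLED_SERVICES=n-cpu,c-vol,placement-client,ovn-controller,ovs-vswitchd,ovsdb-server,q-ovn-metadata-agent

Run stack.sh on the controller first, then on each compute node; the same layout should work in VMware Workstation VMs as long as they share a host-only or NAT network.)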
URL: From jorgevisentini at gmail.com Thu May 18 01:16:47 2023 From: jorgevisentini at gmail.com (Jorge Visentini) Date: Wed, 17 May 2023 22:16:47 -0300 Subject: Opentack + FCP Storages Message-ID: Hello. Today in our environment we only use FCP 3PAR Storages. Is there a "friendly" way to use FCP Storages with Openstack? I know and I've already tested Ceph, so I can say that it's the best storage integration for Openstack, but it's not my case hehe All the best! -- Att, Jorge Visentini +55 55 98432-9868 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jichenjc at cn.ibm.com Thu May 18 01:34:37 2023 From: jichenjc at cn.ibm.com (Chen CH Ji) Date: Thu, 18 May 2023 01:34:37 +0000 Subject: Opentack + FCP Storages In-Reply-To: References: Message-ID: We use FCP backend storage while the backend is IBM FLASH/DS8K , works fine so not sure the `friendly` you are saying, cinder/nova/os-brick support FCP well so far, FYI ________________________________ From: Jorge Visentini Sent: Thursday, May 18, 2023 9:16 AM To: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] Opentack + FCP Storages Hello. Today in our environment we only use FCP 3PAR Storages. Is there a "friendly" way to use FCP Storages with Openstack? I know and I've already tested Ceph, so I can say that it's the best storage integration for Openstack, ZjQcmQRYFpfptBannerStart This Message Is From an Untrusted Sender You have not previously corresponded with this sender. ZjQcmQRYFpfptBannerEnd Hello. Today in our environment we only use FCP 3PAR Storages. Is there a "friendly" way to use FCP Storages with Openstack? I know and I've already tested Ceph, so I can say that it's the best storage integration for Openstack, but it's not my case hehe All the best! -- Att, Jorge Visentini +55 55 98432-9868 -------------- next part -------------- An HTML attachment was scrubbed... URL: From masayuki.igawa at gmail.com Thu May 18 02:56:03 2023 From: masayuki.igawa at gmail.com (Masayuki Igawa) Date: Thu, 18 May 2023 11:56:03 +0900 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: References: Message-ID: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> Hi Satish, > Whenever I take incremental backup it shows a similar size of original > volume. Technically It should be smaller. Question is does ceph support > incremental backup with cinder? IIUC, it would be expected behavior. According to the API Doc[1], "size" is "The size of the volume, in gibibytes (GiB)." So, it's not the actual size of the snapshot itself. What about the "object_count" of "openstack volume backup show" output? The incremental's one should be zero or less than the full backup at least? [1] https://docs.openstack.org/api-ref/block-storage/v3/?expanded=show-backup-detail-detail,list-backups-with-detail-detail#id428 -- Masayuki Igawa On Wed, May 17, 2023, at 03:51, Satish Patel wrote: > Folks, > > I have ceph storage for my openstack and configure cinder-volume and > cinder-backup service for my disaster solution. I am trying to use the > cinder-backup incremental option to save storage space but somehow It > doesn't work the way it should work. > > Whenever I take incremental backup it shows a similar size of original > volume. Technically It should be smaller. Question is does ceph support > incremental backup with cinder? > > I am running a Yoga release. 
> > $ openstack volume list > +--------------------------------------+------------+------------+------+-------------------------------------+ > | ID | Name | Status | Size > | Attached to | > +--------------------------------------+------------+------------+------+-------------------------------------+ > | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 > | Attached to spatel-foo on /dev/sdc | > +--------------------------------------+------------+------------+------+-------------------------------------+ > > ### Create full backup > $ openstack volume backup create --name spatel-vol-backup spatel-vol --force > +-------+--------------------------------------+ > | Field | Value | > +-------+--------------------------------------+ > | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | > | name | spatel-vol-backup | > +-------+--------------------------------------+ > > ### Create incremental > $ openstack volume backup create --name spatel-vol-backup-1 > --incremental --force spatel-vol > +-------+--------------------------------------+ > | Field | Value | > +-------+--------------------------------------+ > | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | > | name | spatel-vol-backup-1 | > +-------+--------------------------------------+ > > $ openstack volume backup list > +--------------------------------------+---------------------+-------------+-----------+------+ > | ID | Name | > Description | Status | Size | > +--------------------------------------+---------------------+-------------+-----------+------+ > | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None > | available | 10 | > | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None > | available | 10 | > +--------------------------------------+---------------------+-------------+-----------+------+ > My incremental backup still shows 10G size which should be lower > compared to the first backup. From dale at catalystcloud.nz Thu May 18 03:19:22 2023 From: dale at catalystcloud.nz (Dale Smith) Date: Thu, 18 May 2023 15:19:22 +1200 Subject: [Yoga][Magnum] change boot disk storage for COE VMs In-Reply-To: References: Message-ID: <47724db3-f575-d106-18c8-ba67aee445a5@catalystcloud.nz> Yes, you can change various boot volume types in `magnum.conf` under section `cinder`. Is it possible you've missed the section name? Example below: [cinder] default_boot_volume_type=MyCinderPool default_docker_volume_type=MyCinderPool default_etcd_volume_type=MyCinderPool default_boot_volume_size=20 On 15/05/23 22:26, wodel youchi wrote: > Hi, > > When creating Magnum clusters, two disks are created for each VM sda > and sdb, the sda is the boot disk and the sdb is for docker images. > By default the sda is created (at least in my deployment) in nova vms > pool, as ephemeral disks, the second disk sdb is created in cinder volume. > > Is it possible to move the sda from nova vms to cinder volume? > > I tried with default_boot_volume_type, but it didn't work. > default_boot_volume_type = MyCinderPool > > Regards. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: OpenPGP_signature Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From zhouxinyong at inspur.com Thu May 18 06:25:26 2023 From: zhouxinyong at inspur.com (=?utf-8?B?SmVycnkgWmhvdSAo5ZGo6ZGr5YuHKQ==?=) Date: Thu, 18 May 2023 06:25:26 +0000 Subject: =?utf-8?B?562U5aSNOiDnrZTlpI06IOetlOWkjTogW3B0bF0gTmVlZCBQVEwgdm9sdW50?= =?utf-8?Q?eer_for_OpenStack_Sahara?= In-Reply-To: <1882a8a67ac.efce01a976022.2515819052572941367@ghanshyammann.com> References: <4d3b952157f84886b52c7475b6bae68e@inspur.com> <18827d50cb8.cfee5b2e1305648.2964607450323554781@ghanshyammann.com> <5675627.DvuYhMxLoT@whitebase> <1882a8a67ac.efce01a976022.2515819052572941367@ghanshyammann.com> Message-ID: <055f798dcfad4cd9a94097b9f59a680b@inspur.com> I especially appreciate you taking the time and effort to help me. I have been added to the group. Once again, thank you for your help and support. -----????----- ???: Ghanshyam Mann ????: 2023?5?18? 0:28 ???: Luigi Toscano ??: Jerry Zhou (???) ; openstack-discuss ; Brin Zhang(???) ; Tom Guo(??2810) ; Weiting Kong (???) ; Alex Song (???) ??: Re: ??: ??: [ptl] Need PTL volunteer for OpenStack Sahara ---- On Wed, 17 May 2023 09:13:52 -0700 Luigi Toscano wrote --- > On Wednesday, 17 May 2023 05:50:53 CEST Ghanshyam Mann wrote: > > ---- On Tue, 16 May 2023 19:05:50 -0700 Jerry Zhou (???) wrote --- > > > > > Hi Gmann, > > > > > > The governance patch to PTL appointment already merged into. > > > > > > [1] https://review.opendev.org/c/openstack/governance/+/881186 > > > > > > But there is no management authority for the project, and the review > > > patch does not have +2 authority. > > > > > > I can take on the responsibility of PTL, maintain the project gate, and > > > ensure that zuul can be executed normally; review and merge the commits > > > submitted on the project. Could you add me to the sahara-core group. > > > > It makes sense to add you to the group but I would like Sahara's core member > > if anyone is around to add you if no one is active then TC can add you > > there. > > I've added Jerry to the following groups: > > sahara-core > sahara-stable-maint > > All the other relevant sahara groups include one of those two groups, with the > exception of the sahara-release group, which I don't have access to and I > would need to help of the TC to change it. Thanks a lot tosky for helping here. -gmann > > Ciao > -- > Luigi > > > > From ozzzo at yahoo.com Thu May 18 12:19:08 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Thu, 18 May 2023 12:19:08 +0000 (UTC) Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: <20230517111730.Horde.ao5h0-n-fH5zG3Yj_Nnt7oT@webmail.nde.ag> References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> <1979459451.2557820.1684258144482@mail.yahoo.com> <20230516181703.Horde.xQASX_23dzWCwbxddBcHuLa@webmail.nde.ag> <393615542.2867360.1684300158715@mail.yahoo.com> <20230517111730.Horde.ao5h0-n-fH5zG3Yj_Nnt7oT@webmail.nde.ag> Message-ID: <30747802.3897418.1684412348929@mail.yahoo.com> There must be a way to stop traffic from being sent to a controller, so that it can be rebooted in an orderly fashion. If that's not possible, then reducing the period of disruption with network settings would be my second choice. Can someone from the kolla team give advice about this? 
What is the recommended method for rebooting a kolla-ansible controller in an orderly fashion? Do I need to use the "remove from cluster" and "add to cluster" procedures, or is there a better way? On Wednesday, May 17, 2023, 07:25:34 AM EDT, Eugen Block wrote: Hi, I found this [1] reference, it recommends to reduce the kernel option? for tcp_retries to reduce the impact of a service interruption: # /etc/kolla/globals.yml haproxy_host_ipv4_tcp_retries2: 6 Apparently, this option was introduced in Victoria [2], it states: > Added a new haproxy configuration variable,? > haproxy_host_ipv4_tcp_retries2, which allows users to modify this? > kernel option. This option sets maximum number of times a TCP packet? > is retransmitted in established state before giving up. The default? > kernel value is 15, which corresponds to a duration of approximately? > between 13 to 30 minutes, depending on the retransmission timeout.? > This variable can be used to mitigate an issue with stuck? > connections in case of VIP failover, see bug 1917068 for details. It reads like exactly what you're describing. If I remember correctly,? you're still on Train? In that case you'll probably have to configure? that setting manually (scripted maybe), it is this value:? /proc/sys/net/ipv4/tcp_retries2 The solution in [3] even talks about setting it to 3 for HA deployments. # sysctl -a | grep net.ipv4.tcp_retries2 net.ipv4.tcp_retries2 = 15 Regards, Eugen [1]? https://docs.openstack.org/kolla-ansible/latest/reference/high-availability/haproxy-guide.html [2] https://docs.openstack.org/releasenotes/kolla-ansible/victoria.html [3] https://access.redhat.com/solutions/726753 Zitat von Albert Braden : > Before we switched to durable queues we were seeing RMQ issues after? > a restart. Now RMQ is fine after restart, but operations in progress? > will fail. VMs will fail to build, or not get DNS records. Volumes? > don't get attached or detached. It looks like haproxy is the issue? > now; connections continue going to the down node. I think we can fix? > that by failing over haproxy before rebooting. > > The problem is, I'm not sure that haproxy is the only issue. All 3? > controllers are doing stuff, and when I reboot one, whatever it is? > doing is likely to fail. Is there an orderly way to stop work from? > being done on a controller without ruining work that is already in? > progress, besides removing it from the cluster? Would "kolla-ansible? > stop" do it? >? ? ? On Tuesday, May 16, 2023, 02:23:59 PM EDT, Eugen Block? > wrote: > >? Hi Albert, > > sorry, I'm swamped with different stuff right now. I just took a? > glance at the docs you mentioned and it seems way too much for? > something simple as a controller restart to actually remove hosts,? > that should definitely not be necessary. > I'm not familiar with kolla or exabgp, but can you describe what? > exactly takes that long to failover? Maybe that could be improved? And? > can you limit the failing requests to a specific service (volumes,? > network ports, etc.) or do they all fail? Maybe rabbitmq should be? > considered after all, you could share your rabbitmq settings from the? > different openstack services and I will collect mine to compare. And? > then also the rabbitmq config (policies, vhosts, queues). > > Regards, > Eugen > > Zitat von Albert Braden : > >> What's the recommended method for rebooting controllers? Do we need? >> to use the "remove from cluster" and "add to cluster" procedures or? >> is there a better way? 
>> >> https://docs.openstack.org/kolla-ansible/train/user/adding-and-removing-hosts.html >> ? ? ? On Friday, May 12, 2023, 03:04:26 PM EDT, Albert Braden? >> wrote: >> >> ? We use keepalived and exabgp to manage failover for haproxy. That? >> works but it takes a few minutes, and during those few minutes? >> customers experience impact. We tell them to not build/delete VMs? >> during patching, but they still do, and then complain about the? >> failures. >> >> We're planning to experiment with adding a "manual" haproxy failover? >> to our patching automation, but I'm wondering if there is anything? >> on the controller that needs to be failed over or disabled before? >> rebooting the KVM. I looked at the "remove from cluster" and "add to? >> cluster" procedures but that seems unnecessarily cumbersome for? >> rebooting the KVM. >> ? ? ? On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block? >> wrote: >> >> ? Hi Albert, >> >> how is your haproxy placement controlled, something like pacemaker or? >> similar? I would always do a failover when I'm aware of interruptions? >> (maintenance window), that should speed things up for clients. We have? >> a pacemaker controlled HA control plane, it takes more time until? >> pacemaker realizes that the resource is gone if I just rebooted a? >> server without failing over. I have no benchmarks though. There's? >> always a risk of losing a couple of requests during the failover but? >> we didn't have complaints yet, I believe most of the components try to? >> resend the lost messages. In one of our customer's cluster with many? >> resources (they also use terraform) I haven't seen issues during a? >> regular maintenance window. When they had a DNS outage a few months? >> back it resulted in a mess, manual cleaning was necessary, but the? >> regular failovers seem to work just fine. >> And I don't see rabbitmq issues either after rebooting a server,? >> usually the haproxy (and virtual IP) failover suffice to prevent? >> interruptions. >> >> Regards, >> Eugen >> >> Zitat von Satish Patel : >> >>> Are you running your stack on top of the kvm virtual machine? How many >>> controller nodes do you have? mostly rabbitMQ causing issues if you restart >>> controller nodes. >>> >>> On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: >>> >>>> We have our haproxy and controller nodes on KVM hosts. When those KVM >>>> hosts are restarted, customers who are building or deleting VMs? >>>> see impact. >>>> VMs may go into error status, fail to get DNS records, fail to? >>>> delete, etc. >>>> The obvious reason is because traffic that is being routed to the haproxy >>>> on the restarting KVM is lost. If we manually fail over haproxy before >>>> restarting the KVM, will that be sufficient to stop traffic being lost, or >>>> do we also need to do something with the controller? >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Thu May 18 13:09:21 2023 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 18 May 2023 09:09:21 -0400 Subject: [kolla] [train] haproxy and controller restart causes user impact In-Reply-To: <30747802.3897418.1684412348929@mail.yahoo.com> References: <314427760.909657.1683808321773.ref@mail.yahoo.com> <314427760.909657.1683808321773@mail.yahoo.com> <20230512073327.Horde.nPa9c_1UYY_XW_n-sc1pyQM@webmail.nde.ag> <1577432210.483872.1683917817875@mail.yahoo.com> <1979459451.2557820.1684258144482@mail.yahoo.com> <20230516181703.Horde.xQASX_23dzWCwbxddBcHuLa@webmail.nde.ag> <393615542.2867360.1684300158715@mail.yahoo.com> <20230517111730.Horde.ao5h0-n-fH5zG3Yj_Nnt7oT@webmail.nde.ag> <30747802.3897418.1684412348929@mail.yahoo.com> Message-ID: We have rebooted the controller many times one by one without removing it from the cluster and things survived. We have 200 compute nodes in the cluster and the cluster is quite chatty too. If they are not, that means you have something wrong either in your configuration of the openstack version you are running. In my experience RabbitMQ is the biggest culprit when it comes to reboot controller nodes. This is the two bit I have configure for RabbitMQ # /etc/kolla/config/global.conf [oslo_messaging_rabbit] kombu_reconnect_delay=0.5 rabbit_transient_queues_ttl=60 # /etc/kolla/global.yml om_enable_rabbitmq_high_availability: True I encountered one bug in amqp during ocata release which caused lots of issues but they are all fixed. I highly doubt it's related to HAProxy. On Thu, May 18, 2023 at 8:23?AM Albert Braden wrote: > There must be a way to stop traffic from being sent to a controller, so > that it can be rebooted in an orderly fashion. If that's not possible, then > reducing the period of disruption with network settings would be my second > choice. > > Can someone from the kolla team give advice about this? What is the > recommended method for rebooting a kolla-ansible controller in an orderly > fashion? Do I need to use the "remove from cluster" and "add to cluster" > procedures, or is there a better way? > On Wednesday, May 17, 2023, 07:25:34 AM EDT, Eugen Block > wrote: > > > Hi, > I found this [1] reference, it recommends to reduce the kernel option > for tcp_retries to reduce the impact of a service interruption: > > # /etc/kolla/globals.yml > haproxy_host_ipv4_tcp_retries2: 6 > > Apparently, this option was introduced in Victoria [2], it states: > > > Added a new haproxy configuration variable, > > haproxy_host_ipv4_tcp_retries2, which allows users to modify this > > kernel option. This option sets maximum number of times a TCP packet > > is retransmitted in established state before giving up. The default > > kernel value is 15, which corresponds to a duration of approximately > > between 13 to 30 minutes, depending on the retransmission timeout. > > This variable can be used to mitigate an issue with stuck > > connections in case of VIP failover, see bug 1917068 for details. > > It reads like exactly what you're describing. If I remember correctly, > you're still on Train? In that case you'll probably have to configure > that setting manually (scripted maybe), it is this value: > /proc/sys/net/ipv4/tcp_retries2 > The solution in [3] even talks about setting it to 3 for HA deployments. 
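For reference, a minimal sketch of applying that kernel option by hand on a Train-era controller, where the haproxy_host_ipv4_tcp_retries2 variable is not available; the value 6 simply mirrors the globals.yml example quoted above (the Red Hat solution in [3] suggests 3 for HA setups), so pick whatever suits your environment:

# apply immediately
sysctl -w net.ipv4.tcp_retries2=6
# persist across reboots
echo "net.ipv4.tcp_retries2 = 6" > /etc/sysctl.d/90-tcp-retries2.conf
sysctl --system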
> > # sysctl -a | grep net.ipv4.tcp_retries2 > net.ipv4.tcp_retries2 = 15 > > Regards, > Eugen > > [1] > > https://docs.openstack.org/kolla-ansible/latest/reference/high-availability/haproxy-guide.html > [2] https://docs.openstack.org/releasenotes/kolla-ansible/victoria.html > [3] https://access.redhat.com/solutions/726753 > > Zitat von Albert Braden : > > > Before we switched to durable queues we were seeing RMQ issues after > > a restart. Now RMQ is fine after restart, but operations in progress > > will fail. VMs will fail to build, or not get DNS records. Volumes > > don't get attached or detached. It looks like haproxy is the issue > > now; connections continue going to the down node. I think we can fix > > that by failing over haproxy before rebooting. > > > > The problem is, I'm not sure that haproxy is the only issue. All 3 > > controllers are doing stuff, and when I reboot one, whatever it is > > doing is likely to fail. Is there an orderly way to stop work from > > being done on a controller without ruining work that is already in > > progress, besides removing it from the cluster? Would "kolla-ansible > > stop" do it? > > On Tuesday, May 16, 2023, 02:23:59 PM EDT, Eugen Block > > wrote: > > > > Hi Albert, > > > > sorry, I'm swamped with different stuff right now. I just took a > > glance at the docs you mentioned and it seems way too much for > > something simple as a controller restart to actually remove hosts, > > that should definitely not be necessary. > > I'm not familiar with kolla or exabgp, but can you describe what > > exactly takes that long to failover? Maybe that could be improved? And > > can you limit the failing requests to a specific service (volumes, > > network ports, etc.) or do they all fail? Maybe rabbitmq should be > > considered after all, you could share your rabbitmq settings from the > > different openstack services and I will collect mine to compare. And > > then also the rabbitmq config (policies, vhosts, queues). > > > > Regards, > > Eugen > > > > Zitat von Albert Braden : > > > >> What's the recommended method for rebooting controllers? Do we need > >> to use the "remove from cluster" and "add to cluster" procedures or > >> is there a better way? > >> > >> > https://docs.openstack.org/kolla-ansible/train/user/adding-and-removing-hosts.html > >> On Friday, May 12, 2023, 03:04:26 PM EDT, Albert Braden > >> wrote: > >> > >> We use keepalived and exabgp to manage failover for haproxy. That > >> works but it takes a few minutes, and during those few minutes > >> customers experience impact. We tell them to not build/delete VMs > >> during patching, but they still do, and then complain about the > >> failures. > >> > >> We're planning to experiment with adding a "manual" haproxy failover > >> to our patching automation, but I'm wondering if there is anything > >> on the controller that needs to be failed over or disabled before > >> rebooting the KVM. I looked at the "remove from cluster" and "add to > >> cluster" procedures but that seems unnecessarily cumbersome for > >> rebooting the KVM. > >> On Friday, May 12, 2023, 03:42:42 AM EDT, Eugen Block > >> wrote: > >> > >> Hi Albert, > >> > >> how is your haproxy placement controlled, something like pacemaker or > >> similar? I would always do a failover when I'm aware of interruptions > >> (maintenance window), that should speed things up for clients. 
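On the manual-failover idea discussed in this thread: with a stock kolla-ansible haproxy/keepalived layout, one way to move the VIP off a controller before rebooting it is to stop its keepalived container first. This is only a sketch, not a tested procedure, and the container names below are the kolla defaults, which may differ in customized deployments:

docker stop keepalived    # VIP should fail over to another controller
docker stop haproxy       # optionally also drain haproxy on this node
# after the reboot (manually stopped containers stay stopped)
docker start keepalived haproxy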
We have > >> a pacemaker controlled HA control plane, it takes more time until > >> pacemaker realizes that the resource is gone if I just rebooted a > >> server without failing over. I have no benchmarks though. There's > >> always a risk of losing a couple of requests during the failover but > >> we didn't have complaints yet, I believe most of the components try to > >> resend the lost messages. In one of our customer's cluster with many > >> resources (they also use terraform) I haven't seen issues during a > >> regular maintenance window. When they had a DNS outage a few months > >> back it resulted in a mess, manual cleaning was necessary, but the > >> regular failovers seem to work just fine. > >> And I don't see rabbitmq issues either after rebooting a server, > >> usually the haproxy (and virtual IP) failover suffice to prevent > >> interruptions. > >> > >> Regards, > >> Eugen > >> > >> Zitat von Satish Patel : > >> > >>> Are you running your stack on top of the kvm virtual machine? How many > >>> controller nodes do you have? mostly rabbitMQ causing issues if you > restart > >>> controller nodes. > >>> > >>> On Thu, May 11, 2023 at 8:34?AM Albert Braden wrote: > >>> > >>>> We have our haproxy and controller nodes on KVM hosts. When those KVM > >>>> hosts are restarted, customers who are building or deleting VMs > >>>> see impact. > >>>> VMs may go into error status, fail to get DNS records, fail to > >>>> delete, etc. > >>>> The obvious reason is because traffic that is being routed to the > haproxy > >>>> on the restarting KVM is lost. If we manually fail over haproxy before > >>>> restarting the KVM, will that be sufficient to stop traffic being > lost, or > >>>> do we also need to do something with the controller? > >>>> > >>>> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Thu May 18 15:09:34 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Thu, 18 May 2023 17:09:34 +0200 Subject: [Port Creation failed] - openstack Wallaby In-Reply-To: References: Message-ID: Hello Lokendra: Did you check the version of the ovn-controller service in the compute nodes and the ovn services in the controller nodes? The services should be in sync. What is the "openstack network agent list" output? Do you see the OVN controller, OVN gateways and OVN metadata entries corresponding to the compute and controller nodes you have? And did you check the sanity of your OVN SB database? What is the list of "Chassis" and "Chassis_Private" registers? Each "Chassis_Private" register must have a "Chassis" register associated. Regards. On Wed, May 17, 2023 at 6:14?PM Lokendra Rathour wrote: > Hi Swogat, > Thanks for the inputs, it was showing a similar issue but somehow the > issue is not getting resolved. > we are trying to explore more around it. 
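A quick way to eyeball the registers Rodolfo refers to, assuming the OVN southbound DB is reachable from where the commands are run (add --db=tcp:<sb-db-ip>:6642 to ovn-sbctl if needed for your deployment):

openstack network agent list
ovn-sbctl list Chassis
ovn-sbctl list Chassis_Private

Once things are in sync, the chassis name shown in the "Cannot find Chassis_Private" error below should appear in both tables, with each Chassis_Private row matching a Chassis row of the same name.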
> > getting the error in > ovn-metadata-agent.log > Cannot find Chassis_Private with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c > > detailed: > 2023-05-17 19:26:31.984 45317 INFO oslo.privsep.daemon [-] privsep daemon > running as pid 45317 > 2023-05-17 19:26:32.712 44735 ERROR ovsdbapp.backend.ovs_idl.transaction > [-] Traceback (most recent call last): > File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", > line 131, in run > txn.results.put(txn.do_commit()) > File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", > line 93, in do_commit > command.run_idl(txn) > File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", > line 172, in run_idl > record = self.api.lookup(self.table, self.record) > File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", > line 208, in lookup > return self._lookup(table, record) > File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", > line 268, in _lookup > row = idlutils.row_by_value(self, rl.table, rl.column, record) > File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", > line 114, in row_by_value > raise RowNotFound(table=table, col=column, match=match) > ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private > with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c > > 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command [-] > Error executing command (DbAddCommand): > ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private > with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c > 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command > Traceback (most recent call last): > 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command > File > "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", > line 42, in execute > > waiting for your always helpful inputs. > > On Tue, May 16, 2023 at 10:47?PM Swogat Pradhan > wrote: > >> Hi >> I am not sure if this will help, but i faced something similar. >> You might need to check the ovn database entries. >> http://www.jimmdenton.com/neutron-ovn-private-chassis/ >> >> Or maybe try restarting the ovn service from pcs, sometimes issue comes >> up when ovn doesn't sync up. >> >> Again m not sure if this will be of any help to you. >> >> With regards, >> Swogat Pradhan >> >> On Tue, 16 May 2023, 10:41 pm Lokendra Rathour, < >> lokendrarathour at gmail.com> wrote: >> >>> Hi All, >>> Was trying to create OpenStack VM in OpenStack wallaby release, not able >>> to create VM, it is failing because of Port not getting created. >>> >>> The error that we are getting: >>> nova-compute.log: >>> >>> 2023-05-16 18:15:35.495 7 INFO nova.compute.provider_config >>> [req-faaf38e7-b5ee-43d1-9303-d508285f5ab7 - - - - -] No provider configs >>> found in /etc/nova/provider_config. If files are present, ensure the Nova >>> process has access. 
>>> 2023-05-16 18:15:35.549 7 ERROR nova.cmd.common >>> [req-8842f11c-fe5a-4ad3-92ea-a6898f482bf0 - - - - -] No db access allowed >>> in nova-compute: File "/usr/bin/nova-compute", line 10, in >>> sys.exit(main()) >>> File "/usr/lib/python3.6/site-packages/nova/cmd/compute.py", line 59, >>> in main >>> topic=compute_rpcapi.RPC_TOPIC) >>> File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, in >>> create >>> utils.raise_if_old_compute() >>> File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1068, in >>> raise_if_old_compute >>> ctxt, ['nova-compute']) >>> File "/usr/lib/python3.6/site-packages/nova/objects/service.py", line >>> 563, in get_minimum_version_all_cells >>> binaries) >>> File "/usr/lib/python3.6/site-packages/nova/context.py", line 544, in >>> scatter_gather_all_cells >>> fn, *args, **kwargs) >>> File "/usr/lib/python3.6/site-packages/nova/context.py", line 432, in >>> scatter_gather_cells >>> with target_cell(context, cell_mapping) as cctxt: >>> File "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__ >>> return next(self.gen) >>> >>> >>> neutron/ovn-metadata-agent.log >>> >>> 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command >>> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private >>> with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 >>> 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command >>> 2023-05-16 22:36:41.876 45204 ERROR ovsdbapp.backend.ovs_idl.transaction >>> [-] Traceback (most recent call last): >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", >>> line 131, in run >>> txn.results.put(txn.do_commit()) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", >>> line 93, in do_commit >>> command.run_idl(txn) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", >>> line 172, in run_idl >>> record = self.api.lookup(self.table, self.record) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >>> line 208, in lookup >>> return self._lookup(table, record) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >>> line 268, in _lookup >>> row = idlutils.row_by_value(self, rl.table, rl.column, record) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", >>> line 114, in row_by_value >>> raise RowNotFound(table=table, col=column, match=match) >>> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find >>> Chassis_Private with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 >>> >>> any input to help get this issue fixed would be of great help. >>> thanks >>> -- >>> ~ Lokendra >>> skype: lokendrarathour >>> >>> >>> > > -- > ~ Lokendra > skype: lokendrarathour > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Thu May 18 16:06:37 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Thu, 18 May 2023 09:06:37 -0700 Subject: [ironic] Core team updates Message-ID: Hey all, I wanted to let you know that Shivanand Tendulker let me know they will no longer be working on Ironic. I've removed them from the ironic-core group as a result. As usual, with former Ironic core reviewers, Shivanand will be welcomed back quickly if they return to work on Ironic. Thank you for years of helping us keep things maintained! -- Jay Faulkner Ironic PTL TC Vice-Chair -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From finarffin at gmail.com Thu May 18 16:11:01 2023 From: finarffin at gmail.com (Jan Wasilewski) Date: Thu, 18 May 2023 18:11:01 +0200 Subject: [manila] Architecture: separation between share and ctl nodes Message-ID: Hi, I would like to integrate Manila with our current OpenStack cloud. In the architecture and configuration instructions, it is mentioned that the share node and controller node should be located on separate nodes. However, in our environment, each service is located on a separate VM, so I'm wondering if it would be reasonable to integrate the controller node and share node on a single node. Especially since we are using Huawei Dorado as our backend, and from our tests, we have not observed significant traffic generated by it. Therefore, I would like to know if it is a sensible approach to have everything located on a single node or if we may have missed something important in this integration. Thank you in advance for your insights and advice. Best regards, Jan -------------- next part -------------- An HTML attachment was scrubbed... URL: From haiwu.us at gmail.com Thu May 18 17:30:06 2023 From: haiwu.us at gmail.com (hai wu) Date: Thu, 18 May 2023 12:30:06 -0500 Subject: [oslo.metrics][nova][oslo][oslo.messaging] oslo.metrics patch not working Message-ID: I tried to backport this particular patch in order to add support for 'oslo.metrics' on openstack controller node first: https://github.com/openstack/oslo.messaging/commit/bdbb6d62ee20bfd5ffc59f8772a5a0e60614ba90. But it does not work. After adding the following configs in /etc/nova/nova.conf: [DEFAULT] default_log_levels = oslo.messaging=DEBUG [oslo_messaging_metrics] metrics_enabled = True I could only see messages related to 'oslo.messaging._drivers' in nova log files, and there's nothing related to 'oslo.messaging._metrics', or any other ones under 'oslo.messaging'. What else is needed for this back-ported patch to work on the controller? From ralonsoh at redhat.com Fri May 19 09:20:28 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 19 May 2023 11:20:28 +0200 Subject: [neutron] Drivers meeting Message-ID: Hello Neutrinos: Remember this afternoon we have the drivers meeting at 14UTC. The agenda has one topic: https://wiki.openstack.org/wiki/Meetings/NeutronDrivers As commented during the last team meeting, I also want you to review the open specs. Please spend some minutes on them: * https://review.opendev.org/q/project:openstack%252Fneutron-specs+status:open * https://review.opendev.org/c/openstack/nova-specs/+/859290 Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Fri May 19 09:23:34 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Fri, 19 May 2023 14:53:34 +0530 Subject: [cinder] festival of feature reviews 19 May 2023 Message-ID: Hello Argonauts, We will be having our monthly festival of reviews today i.e. 19th May (Friday) from 1400-1600 UTC. Following are some additional details: Date: 19th May, 2023 Time: 1400-1600 UTC Meeting link: will be shared in #openstack-cinder around 1400 UTC etherpad: https://etherpad.opendev.org/p/cinder-festival-of-reviews Thanks Rajat Dhasmana -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralonsoh at redhat.com Fri May 19 09:57:12 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 19 May 2023 11:57:12 +0200 Subject: [neutron] policy rules: filter on name field In-Reply-To: <1882a5c7bf9.f222e63172103.7817044410310448905@ghanshyammann.com> References: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> <2957175.lYqTQsUpk7@p1> <188268d12bd.1227636f51301451.1653718688720363876@ghanshyammann.com> <2521426.hzxb4AvoNt@p1> <1882a5c7bf9.f222e63172103.7817044410310448905@ghanshyammann.com> Message-ID: I've added https://bugs.launchpad.net/neutron/+bug/2019960 to the Neutron drivers meeting agenda (today at 14UTC). It will be discussed if we need to create new rules for the default SG and its rules (or any other proposal). On Wed, May 17, 2023 at 5:38?PM Ghanshyam Mann wrote: > ---- On Wed, 17 May 2023 00:55:47 -0700 Slawek Kaplonski wrote --- > > Hi, > > > > Dnia wtorek, 16 maja 2023 23:52:39 CEST Ghanshyam Mann pisze: > > > > > > ---- On Tue, 16 May 2023 07:25:52 -0700 Slawek Kaplonski wrote --- > > > > Hi, > > > > > > > > Dnia wtorek, 16 maja 2023 12:00:34 CEST Paolo Emilio Mazzon pisze: > > > > > Hello, > > > > > > > > > > I'm trying to understand if this is feasible: I would like to > avoid a regular user from > > > > > tampering the "default" security group of a project. > Specifically I would like to prevent > > > > > him from deleting sg rules *from the default sg only* > > > > > > > > > > I can wite a policy.yaml like this > > > > > > > > > > # Delete a security group rule > > > > > # DELETE /security-group-rules/{id} > > > > > # Intended scope(s): project > > > > > "delete_security_group_rule": "role:project_manager and > project_id:%(project_id)s" > > > > > > > > > > but this is sub-optimal since the regular member can still *add* > rules... > > > > > > > > > > Is it possible to create a rule like > > > > > > > > > > "sg_is_default" : ...the sg group whose name is 'default' > > > > > > > > > > so I can write > > > > > > > > > > "delete_security_group_rule": "not rule:sg_is_default" ? > > > > > > > > > > Thanks! > > > > > > > > I'm not sure but I will try to check it later today or tomorrow > morning and will let You know if that is possible or not. > > > > > > 'not' operator is supported in oslo policy. I think the below one > should work which allows admin to delete the default SG and manager role > > > can delete only non-default SG. > > > > > > NOTE: I have not tested this, may be you can check while trying other > combinations. > > > > > > "delete_security_group_rule": "role:project_manager and > project_id:%(project_id)s and not 'default':%(name)s or 'default':%(name)s > and role:admin" > > > > > > -gmann > > > > > > > > > > > > > > > > > Paolo > > > > > > > > > > -- > > > > > Paolo Emilio Mazzon > > > > > System and Network Administrator > > > > > > > > > > paoloemilio.mazzon[at]unipd.it > > > > > > > > > > PNC - Padova Neuroscience Center > > > > > https://www.pnc.unipd.it > > > > > Via Orus 2/B - 35131 Padova, Italy > > > > > +39 049 821 2624 > > > > > > > > > > > > > > > > > > > > > > -- > > > > Slawek Kaplonski > > > > Principal Software Engineer > > > > Red Hat > > > > > > > > > > I checked it today and it can be done like: > > > > "sg_is_default": "field:security_groups:name=default", > > "delete_security_group": "(role:member and > project_id:%(project_id)s and not rule:sg_is_default) or role:admin" > > > > for Security Group. 
> > But it won't work like that for security group rules as You want to > rely Your policy on the value of the attribute which belongs to parent > resource (name of the Security group when doing API call for SG rule). We > had similar problem for the "network:shared" field - see [1] and it was > fixed with [2] but that fix is specific for this special field > ("network:shared" only). Maybe we would need to add such special handling > for the default security group as well. If You would like to have something > like that, please open LP bug for it and we can investigate that deeper. > > ++, default SG being a special case here, I agree on handling this case in > code instead of making the configuration more complex. > May be a separate policy for default SG can also make sense. > > -gmann > > > > > [1] https://bugs.launchpad.net/neutron/+bug/1808112 > > [2] https://review.opendev.org/c/openstack/neutron/+/652636 > > > > > > -- > > Slawek Kaplonski > > Principal Software Engineer > > Red Hat > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Fri May 19 10:53:11 2023 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 19 May 2023 12:53:11 +0200 Subject: [neutron] policy rules: filter on name field In-Reply-To: References: <57436bfb-0acb-dffa-bcdb-2bef2c8b8472@unipd.it> <1882a5c7bf9.f222e63172103.7817044410310448905@ghanshyammann.com> Message-ID: <2292178.ErOD2KgLaG@p1> Hi, Dnia pi?tek, 19 maja 2023 11:57:12 CEST Rodolfo Alonso Hernandez pisze: > I've added https://bugs.launchpad.net/neutron/+bug/2019960 to the Neutron > drivers meeting agenda (today at 14UTC). It will be discussed if we need to > create new rules for the default SG and its rules (or any other proposal). Thx Rodolfo. > > On Wed, May 17, 2023 at 5:38?PM Ghanshyam Mann > wrote: > > > ---- On Wed, 17 May 2023 00:55:47 -0700 Slawek Kaplonski wrote --- > > > Hi, > > > > > > Dnia wtorek, 16 maja 2023 23:52:39 CEST Ghanshyam Mann pisze: > > > > > > > > ---- On Tue, 16 May 2023 07:25:52 -0700 Slawek Kaplonski wrote --- > > > > > Hi, > > > > > > > > > > Dnia wtorek, 16 maja 2023 12:00:34 CEST Paolo Emilio Mazzon pisze: > > > > > > Hello, > > > > > > > > > > > > I'm trying to understand if this is feasible: I would like to > > avoid a regular user from > > > > > > tampering the "default" security group of a project. > > Specifically I would like to prevent > > > > > > him from deleting sg rules *from the default sg only* > > > > > > > > > > > > I can wite a policy.yaml like this > > > > > > > > > > > > # Delete a security group rule > > > > > > # DELETE /security-group-rules/{id} > > > > > > # Intended scope(s): project > > > > > > "delete_security_group_rule": "role:project_manager and > > project_id:%(project_id)s" > > > > > > > > > > > > but this is sub-optimal since the regular member can still *add* > > rules... > > > > > > > > > > > > Is it possible to create a rule like > > > > > > > > > > > > "sg_is_default" : ...the sg group whose name is 'default' > > > > > > > > > > > > so I can write > > > > > > > > > > > > "delete_security_group_rule": "not rule:sg_is_default" ? > > > > > > > > > > > > Thanks! > > > > > > > > > > I'm not sure but I will try to check it later today or tomorrow > > morning and will let You know if that is possible or not. > > > > > > > > 'not' operator is supported in oslo policy. I think the below one > > should work which allows admin to delete the default SG and manager role > > > > can delete only non-default SG. 
> > > > > > > > NOTE: I have not tested this, may be you can check while trying other > > combinations. > > > > > > > > "delete_security_group_rule": "role:project_manager and > > project_id:%(project_id)s and not 'default':%(name)s or 'default':%(name)s > > and role:admin" > > > > > > > > -gmann > > > > > > > > > > > > > > > > > > > > > Paolo > > > > > > > > > > > > -- > > > > > > Paolo Emilio Mazzon > > > > > > System and Network Administrator > > > > > > > > > > > > paoloemilio.mazzon[at]unipd.it > > > > > > > > > > > > PNC - Padova Neuroscience Center > > > > > > https://www.pnc.unipd.it > > > > > > Via Orus 2/B - 35131 Padova, Italy > > > > > > +39 049 821 2624 > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Slawek Kaplonski > > > > > Principal Software Engineer > > > > > Red Hat > > > > > > > > > > > > > > I checked it today and it can be done like: > > > > > > "sg_is_default": "field:security_groups:name=default", > > > "delete_security_group": "(role:member and > > project_id:%(project_id)s and not rule:sg_is_default) or role:admin" > > > > > > for Security Group. > > > But it won't work like that for security group rules as You want to > > rely Your policy on the value of the attribute which belongs to parent > > resource (name of the Security group when doing API call for SG rule). We > > had similar problem for the "network:shared" field - see [1] and it was > > fixed with [2] but that fix is specific for this special field > > ("network:shared" only). Maybe we would need to add such special handling > > for the default security group as well. If You would like to have something > > like that, please open LP bug for it and we can investigate that deeper. > > > > ++, default SG being a special case here, I agree on handling this case in > > code instead of making the configuration more complex. > > May be a separate policy for default SG can also make sense. > > > > -gmann > > > > > > > > [1] https://bugs.launchpad.net/neutron/+bug/1808112 > > > [2] https://review.opendev.org/c/openstack/neutron/+/652636 > > > > > > > > > -- > > > Slawek Kaplonski > > > Principal Software Engineer > > > Red Hat > > > > > > > > -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From bram.kranendonk at nl.team.blue Fri May 19 12:17:22 2023 From: bram.kranendonk at nl.team.blue (Bram Kranendonk) Date: Fri, 19 May 2023 12:17:22 +0000 Subject: [kolla-ansible] Adding a USB device to a Docker container Message-ID: <70326d857c4546a8b38894b1d7241da1@nl.team.blue> Hi OpenStack Discuss, I'm looking for a way to mount an USB device to a kolla-ansible Docker container. Is there a way to achieve this using the kolla-ansible configuration global vars? I couldn't find a way to do so. Thanks in advance, . Bram Kranendonk System Engineer Oostmaaslaan 71 (15e etage) 3063 AN Rotterdam The Netherlands [team.blue logo] -------------- next part -------------- An HTML attachment was scrubbed... URL: From ozzzo at yahoo.com Fri May 19 13:29:17 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Fri, 19 May 2023 13:29:17 +0000 (UTC) Subject: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit References: <1696731980.1287315.1684502957871.ref@mail.yahoo.com> Message-ID: <1696731980.1287315.1684502957871@mail.yahoo.com> We have 200 groups in our LDAP server. 
We recently started getting an error when we try to list groups: $ os group list --domain AUTH.OURDOMAIN.COM Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. (HTTP 500) I read the "Additional LDAP integration settings" section in [1] and then tried setting various values of page_size (10, 100, 1000) in the [ldap] section of keystone.conf but that didn't make a difference. What am I missing? [1] https://docs.openstack.org/keystone/train/admin/configuration.html#identity-ldap-server-set-up Here's the stack trace: 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application [req-198741c6-58b2-46b1-8622-bae1fc5c5280 d64c83e1ea954c368e9fe08a5d8450a1 47dc15c280c9436fadac4d41f1d54a64 - default default] Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator.: keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application Traceback (most recent call last): 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 996, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application attrlist, attrsonly) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 689, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return func(self, conn, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 824, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application attrsonly) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 870, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.search_ext_s(base,scope,filterstr,attrlist,attrsonly,None,None,timeout=self.timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 1286, in search_ext_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self._apply_method_s(SimpleLDAPObject.search_ext_s,*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 1224, in _apply_method_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return func(self,*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 864, in search_ext_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.result(msgid,all=1,timeout=timeout)[1] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 756, in result 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp_type, resp_data, resp_msgid = self.result2(msgid,all,timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 760, in result2 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp_type, 
resp_data, resp_msgid, resp_ctrls = self.result3(msgid,all,timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 767, in result3 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp_ctrl_classes=resp_ctrl_classes 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 774, in result4 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application ldap_result = self._ldap_call(self._l.result4,msgid,all,timeout,add_ctrls,add_intermediates,add_extop) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 340, in _ldap_call 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application reraise(exc_type, exc_value, exc_traceback) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/compat.py", line 46, in reraise 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application raise exc_value 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 324, in _ldap_call 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application result = func(*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application ldap.SIZELIMIT_EXCEEDED: {'msgtype': 100, 'msgid': 2, 'result': 4, 'desc': 'Size limit exceeded', 'ctrls': []} 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application During handling of the above exception, another exception occurred: 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application Traceback (most recent call last): 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application rv = self.dispatch_request() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.view_functions[rule.endpoint](**req.view_args) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line 480, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp = resource(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask/views.py", line 88, in view 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.dispatch_request(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line 595, in dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp = meth(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/api/groups.py", line 59, in get 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self._list_groups() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File 
"/usr/lib/python3.6/site-packages/keystone/api/groups.py", line 86, in _list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application hints=hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/common/manager.py", line 116, in wrapped 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application __ret_val = __f(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 414, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return f(self, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 424, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return f(self, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 1329, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application ref_list = driver.list_groups(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 116, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.group.get_all_filtered(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 474, in get_all_filtered 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application for group in self.get_all(query, hints)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1647, in get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application for x in self._ldap_get_all(hints, ldap_filter)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/common/driver_hints.py", line 42, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return f(self, hints, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1600, in _ldap_get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application attrs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 998, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application raise exception.LDAPSizeLimitExceeded() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 
2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application From satish.txt at gmail.com Fri May 19 15:53:01 2023 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 19 May 2023 11:53:01 -0400 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> References: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> Message-ID: Thank you Masayuki, Are there any API for ceph which I can use to get real usage from ceph directly related to incremental backup usage? Do I need to configure RGW service to obtain that level of information from ceph using API? On Wed, May 17, 2023 at 10:58?PM Masayuki Igawa wrote: > Hi Satish, > > > Whenever I take incremental backup it shows a similar size of original > > volume. Technically It should be smaller. Question is does ceph support > > incremental backup with cinder? > > IIUC, it would be expected behavior. According to the API Doc[1], > "size" is "The size of the volume, in gibibytes (GiB)." > So, it's not the actual size of the snapshot itself. > > What about the "object_count" of "openstack volume backup show" output? > The incremental's one should be zero or less than the full backup at least? > > [1] > https://docs.openstack.org/api-ref/block-storage/v3/?expanded=show-backup-detail-detail,list-backups-with-detail-detail#id428 > > -- Masayuki Igawa > > On Wed, May 17, 2023, at 03:51, Satish Patel wrote: > > Folks, > > > > I have ceph storage for my openstack and configure cinder-volume and > > cinder-backup service for my disaster solution. I am trying to use the > > cinder-backup incremental option to save storage space but somehow It > > doesn't work the way it should work. > > > > Whenever I take incremental backup it shows a similar size of original > > volume. Technically It should be smaller. Question is does ceph support > > incremental backup with cinder? > > > > I am running a Yoga release. 
> > > > $ openstack volume list > > > +--------------------------------------+------------+------------+------+-------------------------------------+ > > | ID | Name | Status | Size > > | Attached to | > > > +--------------------------------------+------------+------------+------+-------------------------------------+ > > | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 > > | Attached to spatel-foo on /dev/sdc | > > > +--------------------------------------+------------+------------+------+-------------------------------------+ > > > > ### Create full backup > > $ openstack volume backup create --name spatel-vol-backup spatel-vol > --force > > +-------+--------------------------------------+ > > | Field | Value | > > +-------+--------------------------------------+ > > | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | > > | name | spatel-vol-backup | > > +-------+--------------------------------------+ > > > > ### Create incremental > > $ openstack volume backup create --name spatel-vol-backup-1 > > --incremental --force spatel-vol > > +-------+--------------------------------------+ > > | Field | Value | > > +-------+--------------------------------------+ > > | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | > > | name | spatel-vol-backup-1 | > > +-------+--------------------------------------+ > > > > $ openstack volume backup list > > > +--------------------------------------+---------------------+-------------+-----------+------+ > > | ID | Name | > > Description | Status | Size | > > > +--------------------------------------+---------------------+-------------+-----------+------+ > > | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None > > | available | 10 | > > | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None > > | available | 10 | > > > +--------------------------------------+---------------------+-------------+-----------+------+ > > My incremental backup still shows 10G size which should be lower > > compared to the first backup. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From murilo at evocorp.com.br Fri May 19 23:14:47 2023 From: murilo at evocorp.com.br (Murilo Morais) Date: Fri, 19 May 2023 20:14:47 -0300 Subject: [OSA][NEUTRON] Doubts about network setup Message-ID: Good evening everyone! I'm trying to set up a lab for testing using Openstack Ansible (OSA) and I'm having a lot of trouble understanding/setting up the network. I'm trying something similar to AIO (All-in-one) but with customizations (2 compute node). I'm using Debian 11 as OS. My problem is that I need the instances to communicate through VLANs that are being delivered directly to the interface of each compute node, as I need the same instances to participate in an existing network. I have a lot of doubts about this type of setup and how the configuration of provider_networks would be. Thanks in advance! -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Sat May 20 04:08:20 2023 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Fri, 19 May 2023 21:08:20 -0700 Subject: [manila] Architecture: separation between share and ctl nodes In-Reply-To: References: Message-ID: Hi, Yes; that?s totally fine; we should be clarifying in that doc that that limitation applies to using the Generic driver. Your approach should be fine for any other external storage systems. 
Thanks, Goutham On Thu, May 18, 2023 at 9:11 AM Jan Wasilewski wrote: > Hi, > > I would like to integrate Manila with our current OpenStack cloud. In the > architecture and configuration instructions, it is mentioned that the share > node and controller node should be located on separate nodes. However, in > our environment, each service is located on a separate VM, so I'm wondering > if it would be reasonable to integrate the controller node and share node > on a single node. > > Especially since we are using Huawei Dorado as our backend, and from our > tests, we have not observed significant traffic generated by it. Therefore, > I would like to know if it is a sensible approach to have everything > located on a single node or if we may have missed something important in > this integration. > > Thank you in advance for your insights and advice. > > Best regards, > Jan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Sat May 20 06:06:23 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Sat, 20 May 2023 08:06:23 +0200 Subject: [OSA][NEUTRON] Doubts about network setup In-Reply-To: References: Message-ID: Hi! So, what are your doubts? This kind of setup is totally possible to do. At very least while using ml2.ovs/lxb as a network driver. Assuming, that your interface, that you're going to use for VLANs is named bond0, provider_networks can look like that then: provider_networks: ... - network: container_bridge: "br-vlan" container_type: "veth" network_interface: "bond0" net_name: "vlan-net" type: "vlan" range: "200:1200" group_binds: - neutron_openvswitch_agent With that config you don't need to create a br-vlan bridge anywhere, just having bond0 interface consistently across all compute and network nodes is enough. After that in neutron you can create a network like that: openstack network create --provider-network-type vlan --provider-physical-network vlan-net --provider-segment 200 vlan-200 You can check more docs on OVS setup here: https://docs.openstack.org/openstack-ansible-os_neutron/latest/app-openvswitch.html#openstack-ansible-user-variables https://docs.openstack.org/openstack-ansible/latest/user/network-arch/example.html But keep in mind that vxlans are used more commonly and are a recommended way to connect VMs between compute nodes and with neutron l3 routers for floating IPs functionality. I'm not very familiar with ml2.ovn though to answer how to setup VLANs to pass directly to computes there, as it might be slightly different in terms of group binding at least. But according to doc https://docs.openstack.org/neutron/latest/install/ovn/manual_install.html it should be pretty much same otherwise. Hope this helps. ??, 20 ??? 2023 ?., 01:23 Murilo Morais : > Good evening everyone! > > I'm trying to set up a lab for testing using Openstack Ansible (OSA) and > I'm having a lot of trouble understanding/setting up the network. > > I'm trying something similar to AIO (All-in-one) but with customizations > (2 compute node). > > I'm using Debian 11 as OS. > > My problem is that I need the instances to communicate through VLANs that > are being delivered directly to the interface of each compute node, as I > need the same instances to participate in an existing network. > > I have a lot of doubts about this type of setup and how the configuration > of provider_networks would be. > > Thanks in advance! > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From soheil.amiri at live.com Sat May 20 14:05:49 2023 From: soheil.amiri at live.com (Soheil D.Amiri) Date: Sat, 20 May 2023 14:05:49 +0000 Subject: OVS-DPDK poor performance with Intel 82599 Message-ID: Dear satish About your topic on openstack.org by subject of "OVS-DPDK poor performance with Intel 82599". I have the same problem. I could not get good performance on my compute node. Did you solve your problem ? Would you please guide me about this problem https://lists.openstack.org/pipermail/openstack-discuss/2020-November/019120.html Best Regards Soheil Amiri -------------- next part -------------- An HTML attachment was scrubbed... URL: From jamesleong123098 at gmail.com Sun May 21 00:35:51 2023 From: jamesleong123098 at gmail.com (James Leong) Date: Sat, 20 May 2023 19:35:51 -0500 Subject: [kolla-ansible] Plugin pulling from other repositories Message-ID: Hi everyone, How can I change the location where the repositories are being git pulled? I am using kolla-ansible as my deployment tool for deploying the yoga version of OpenStack. It seems like all plugins, such as zun_ui, blazar_dashboard, etc., are all being pulled from this path " https://opendev.org/openstack." I have attempted to change the variable kolla_dev_repos_git in globals.yml from "https://opendev.org/openstack" to point to my GitHub repository. However, it did not work. Is there a place where I can override the default path to pull plugins from my repository? Thanks for your help, James -------------- next part -------------- An HTML attachment was scrubbed... URL: From maksim.malchuk at gmail.com Sun May 21 16:52:12 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Sun, 21 May 2023 19:52:12 +0300 Subject: [kolla-ansible] Plugin pulling from other repositories In-Reply-To: References: Message-ID: This is 'dev mode', to use it you should set (enable) to True variables like *_dev_mode. For example to enable nova you should set nova_dev_mode: True and repo would be cloned from the URL set by nova_git_repository variable which is defaults to "{{ kolla_dev_repos_git }}/{{ project_name }}" in the nova-cell role. Note: the 'dev mode' is only for development. Please read the documentation: https://docs.openstack.org/kolla-ansible/latest/contributor/kolla-for-openstack-development.html On Sun, May 21, 2023 at 3:45?AM James Leong wrote: > Hi everyone, > > How can I change the location where the repositories are being git pulled? > I am using kolla-ansible as my deployment tool for deploying the yoga > version of OpenStack. It seems like all plugins, such as zun_ui, > blazar_dashboard, etc., are all being pulled from this path " > https://opendev.org/openstack." I have attempted to change the variable > kolla_dev_repos_git in globals.yml from "https://opendev.org/openstack" > to point to my GitHub repository. However, it did not work. Is there a > place where I can override the default path to pull plugins from my > repository? > > Thanks for your help, > James > -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... 
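To make the dev-mode mechanics above concrete, a globals.yml sketch based on the variables mentioned in this thread, using nova as the example project; the GitHub URL is only a placeholder for your own fork:

# /etc/kolla/globals.yml
kolla_dev_repos_git: "https://github.com/yourorg"
nova_dev_mode: true
# or point a single project at its own repository:
# nova_git_repository: "https://github.com/yourorg/nova"

As noted above, this only applies to development deployments; outside dev mode the code comes from the kolla images themselves.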
URL: From masayuki.igawa at gmail.com Mon May 22 00:26:48 2023 From: masayuki.igawa at gmail.com (Masayuki Igawa) Date: Mon, 22 May 2023 09:26:48 +0900 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: References: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> Message-ID: Hi, > Are there any API for ceph which I can use to get real usage from ceph > directly related to incremental backup usage? Do I need to configure > RGW service to obtain that level of information from ceph using API? AFAIK, we don't have it like that in OpenStack API because API users shouldn't know its backend. If you need like that level of information, I think you need to like that if "object_count" is not sufficient for your usage. Best Regards, -- Masayuki Igawa On Sat, May 20, 2023, at 00:53, Satish Patel wrote: > Thank you Masayuki, > > Are there any API for ceph which I can use to get real usage from ceph > directly related to incremental backup usage? Do I need to configure > RGW service to obtain that level of information from ceph using API? > > On Wed, May 17, 2023 at 10:58?PM Masayuki Igawa > wrote: >> Hi Satish, >> >> > Whenever I take incremental backup it shows a similar size of original >> > volume. Technically It should be smaller. Question is does ceph support >> > incremental backup with cinder? >> >> IIUC, it would be expected behavior. According to the API Doc[1], >> "size" is "The size of the volume, in gibibytes (GiB)." >> So, it's not the actual size of the snapshot itself. >> >> What about the "object_count" of "openstack volume backup show" output? >> The incremental's one should be zero or less than the full backup at least? >> >> [1] https://docs.openstack.org/api-ref/block-storage/v3/?expanded=show-backup-detail-detail,list-backups-with-detail-detail#id428 >> >> -- Masayuki Igawa >> >> On Wed, May 17, 2023, at 03:51, Satish Patel wrote: >> > Folks, >> > >> > I have ceph storage for my openstack and configure cinder-volume and >> > cinder-backup service for my disaster solution. I am trying to use the >> > cinder-backup incremental option to save storage space but somehow It >> > doesn't work the way it should work. >> > >> > Whenever I take incremental backup it shows a similar size of original >> > volume. Technically It should be smaller. Question is does ceph support >> > incremental backup with cinder? >> > >> > I am running a Yoga release. 
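On the backend-level question: the space the backups really consume can be inspected with Ceph's own tooling rather than through the OpenStack API, for example (the pool name "backups" is the usual default for the RBD backup driver and may differ in your deployment):

rbd -p backups ls
rbd -p backups du

rbd du reports provisioned versus actually used space per image and per snapshot, so an incremental backup should show up there at roughly the size of the changed data rather than the full 10 GiB reported by the volume backup list.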
>> > >> > $ openstack volume list >> > +--------------------------------------+------------+------------+------+-------------------------------------+ >> > | ID | Name | Status | Size >> > | Attached to | >> > +--------------------------------------+------------+------------+------+-------------------------------------+ >> > | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 >> > | Attached to spatel-foo on /dev/sdc | >> > +--------------------------------------+------------+------------+------+-------------------------------------+ >> > >> > ### Create full backup >> > $ openstack volume backup create --name spatel-vol-backup spatel-vol --force >> > +-------+--------------------------------------+ >> > | Field | Value | >> > +-------+--------------------------------------+ >> > | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | >> > | name | spatel-vol-backup | >> > +-------+--------------------------------------+ >> > >> > ### Create incremental >> > $ openstack volume backup create --name spatel-vol-backup-1 >> > --incremental --force spatel-vol >> > +-------+--------------------------------------+ >> > | Field | Value | >> > +-------+--------------------------------------+ >> > | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | >> > | name | spatel-vol-backup-1 | >> > +-------+--------------------------------------+ >> > >> > $ openstack volume backup list >> > +--------------------------------------+---------------------+-------------+-----------+------+ >> > | ID | Name | >> > Description | Status | Size | >> > +--------------------------------------+---------------------+-------------+-----------+------+ >> > | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None >> > | available | 10 | >> > | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None >> > | available | 10 | >> > +--------------------------------------+---------------------+-------------+-----------+------+ >> > My incremental backup still shows 10G size which should be lower >> > compared to the first backup. >> From egarciar at redhat.com Mon May 22 08:58:53 2023 From: egarciar at redhat.com (Elvira Garcia Ruiz) Date: Mon, 22 May 2023 10:58:53 +0200 Subject: [neutron] Bug Deputy Report May 15 - May 21 Message-ID: Hi everyone! I was the bug deputy last week. There are two critical bugs reported, both have fixes ready for review. Find the full report below: Critical --------- - https://bugs.launchpad.net/neutron/+bug/2019802/ - [master] Not all fullstack tests running in CI Fix proposed by yatin: https://review.opendev.org/c/openstack/neutron/+/883120 - https://bugs.launchpad.net/neutron/+bug/2019946 - [S-RBAC] context.elevated() method from neutron-lib should ensure all required roles are set in context object Fix: https://review.opendev.org/c/openstack/neutron-lib/+/883345 Assigned to Slawek High ------ https://bugs.launchpad.net/neutron/+bug/2020050 - [sqlalchemy-20] Use a ``TextClause`` object for the empty strings in the DB model definitions Assigned to Rodolfo https://bugs.launchpad.net/neutron/+bug/2020195 - [ovn-octavia-provider] functional test intermittently fail with DB error: Cursor needed to be reset because of commit/rollback and can no longer be fetched from. 
Related commit: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/883662 Assigned to Fernando - https://bugs.launchpad.net/neutron/+bug/2020215 - ml2/ovn refuses to bind port due to dead agent randomly in the nova-live-migrate ci job Fix: https://review.opendev.org/c/openstack/neutron/+/883687 Assigned to Sean Mooney Medium ----------- - https://bugs.launchpad.net/neutron/+bug/2019887 - [OVN] Log object deletion fails if related security group was simultaneously modified Fix: https://review.opendev.org/c/openstack/neutron/+/883102 Assigned to Elvira - https://bugs.launchpad.net/neutron/+bug/2019859 - stable/xena regression - NotImplementedError: Operator 'getitem' is not supported on this expression Fix: https://review.opendev.org/c/openstack/neutron/+/883288 Assigned to Rodolfo - https://bugs.launchpad.net/neutron/+bug/2019948 - [alembic] Alembic operations require keywords only arguments Fix: https://review.opendev.org/c/openstack/neutron/+/883340 Assigned to Rodolfo - https://bugs.launchpad.net/neutron/+bug/2020058 - [OVN][OVN-BGP-AGENT] Expose port hosting information for virtual ports In Progress Assigned to Lucas Alvares - https://bugs.launchpad.net/neutron/+bug/2020114 - [sqlalchemy-20] ``Engine`` class no longer inherits from ``Connectable`` Fix proposed: https://review.opendev.org/c/openstack/neutron/+/883521 Assigned to Rodolfo Wishlist ----------- - https://bugs.launchpad.net/neutron/+bug/2019960 - [RFE] Can't protect the "default" security group from regular users Unassigned Undecided -------------- - https://bugs.launchpad.net/neutron/+bug/2020001 - Neutron Dynamic Routing : vip is not advertised via BGP Marked incomplete by Jens Harbott so I decided to wait on their opinion - https://bugs.launchpad.net/neutron/+bug/2020060 - Stateless Feature of Security Group Not Functioning in Case of other Port same compute use stateful Rodolfo is discussing with the reporter more information about this but, since it is linux bridge driver I was not sure about confirming Assigned to Slawek - https://bugs.launchpad.net/neutron/+bug/2020168 - [OVN][SRIOV] traffic problems when sriov and non-sriov ports are bound on the same hypervisor The reporter is not sure if this is a bug in Neutron or in OVN. Someone with SRIOV knowledge might be able to help better. Unassigned Kind regards, Elvira -------------- next part -------------- An HTML attachment was scrubbed... URL: From finarffin at gmail.com Mon May 22 09:51:57 2023 From: finarffin at gmail.com (Jan Wasilewski) Date: Mon, 22 May 2023 11:51:57 +0200 Subject: [manila] Architecture: separation between share and ctl nodes In-Reply-To: References: Message-ID: Hi Goutham, Thank you very much for your clarification. I will proceed this way. Best regards, /Jan sob., 20 maj 2023 o 06:08 Goutham Pacha Ravi napisa?(a): > Hi, > > Yes; that?s totally fine; we should be clarifying in that doc that that > limitation applies to using the Generic driver. Your approach should be > fine for any other external storage systems. > > Thanks, > Goutham > > On Thu, May 18, 2023 at 9:11 AM Jan Wasilewski > wrote: > >> Hi, >> >> I would like to integrate Manila with our current OpenStack cloud. In the >> architecture and configuration instructions, it is mentioned that the share >> node and controller node should be located on separate nodes. However, in >> our environment, each service is located on a separate VM, so I'm wondering >> if it would be reasonable to integrate the controller node and share node >> on a single node. 
>> >> Especially since we are using Huawei Dorado as our backend, and from our >> tests, we have not observed significant traffic generated by it. Therefore, >> I would like to know if it is a sensible approach to have everything >> located on a single node or if we may have missed something important in >> this integration. >> >> Thank you in advance for your insights and advice. >> >> Best regards, >> Jan >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Mon May 22 14:28:12 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 22 May 2023 14:28:12 +0000 Subject: [ptl][tc] New approach for project update presentations Message-ID: <20230522142811.32ysdscuq7bvnf7f@yuggoth.org> I'm reaching out to you about the video recordings that the OpenInfra Foundation has been collecting to capture a project update from each OpenInfa community (including those for individual OpenStack Project Teams). These videos were previously collected leading up to the OpenInfra Summits to showcase what each project had accomplished in the past year/release. This year we are implementing a change to the cadence of collecting this video content, to better fit with the communities' timeline, milestones and achievements. It will also help us better promote the new content, without it getting lost among the news and buzz around the OpenInfra Summit. Moving forward, OpenStack Project Teams will need to make decisions about items, such as the cadence and format of recording the project updates. Cadence: There are various options to pick from. Teams can define a periodic schedule, or stick to bigger milestones like major releases and project anniversaries. Format: Teams can produce pre-recorded presentations with slide materials, and share that. If enough content and volunteers are available, they could also talk about their latest project news and updates during an OpenInfra Live episode. We will post all recordings on the OpenInfra Foundation YouTube channel and promote them through relevant social media accounts. Timeline: The new process will start after this year's OpenInfra Summit, which will happen on June 13-15 in Vancouver, Canada. After the event, I will reach out to help the Technical Committee and individual teams decide on the above items and take next steps in the process. Please let me know if you have any questions in the meantime. -- Jeremy Stanley on behalf of the OpenInfra Foundation Staff -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From work.ceph.user.mailing at gmail.com Sat May 20 18:15:15 2023 From: work.ceph.user.mailing at gmail.com (Work Ceph) Date: Sat, 20 May 2023 15:15:15 -0300 Subject: Mixing IDE and VIRTIO volumes Message-ID: Hello guys, We have a situation where an image requires an IDE bus in KVM. Therefore, we configured it in OpenStack to have the IDE bus being used. However, when we add new volumes to this VM (server), all of them are being allocated as IDE volumes; therefore, we are limited to 4 volumes in total in the VM. Is it possible to mix different types of volumes in a VM in OpenStack? I know that in other platforms, such as when we use KVM directly, proxmox, Apache CloudStack, we can do such combinations, but we were not able to achieve it in OpenStack. Have you guys worked with similar use cases? 
I know that I can convert the image to use virtIO or iSCSI bus, and to do that I need to fix/patch the operating system inside the image. However, I would like to check if there is a method to use a root volume of a VM as IDE, and other volumes as virtio to avoid the limit of volumes that IDE has. -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Mon May 22 15:04:29 2023 From: satish.txt at gmail.com (Satish Patel) Date: Mon, 22 May 2023 11:04:29 -0400 Subject: Mixing IDE and VIRTIO volumes In-Reply-To: References: Message-ID: Let others chime in but all I would say is that VirtIO is much faster than IDE so it's worth patching images if performance matters. On Mon, May 22, 2023 at 10:39?AM Work Ceph wrote: > Hello guys, > We have a situation where an image requires an IDE bus in KVM. Therefore, > we configured it in OpenStack to have the IDE bus being used. However, when > we add new volumes to this VM (server), all of them are being allocated as > IDE volumes; therefore, we are limited to 4 volumes in total in the VM. Is > it possible to mix different types of volumes in a VM in OpenStack? I know > that in other platforms, such as when we use KVM directly, proxmox, Apache > CloudStack, we can do such combinations, but we were not able to achieve it > in OpenStack. > > Have you guys worked with similar use cases? I know that I can convert the > image to use virtIO or iSCSI bus, and to do that I need to fix/patch the > operating system inside the image. However, I would like to check if there > is a method to use a root volume of a VM as IDE, and other volumes as > virtio to avoid the limit of volumes that IDE has. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Mon May 22 15:41:52 2023 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 22 May 2023 17:41:52 +0200 Subject: Mixing IDE and VIRTIO volumes In-Reply-To: References: Message-ID: <20230522154152.md3lv6xjnnnn3zf3@localhost> On 20/05, Work Ceph wrote: > Hello guys, > We have a situation where an image requires an IDE bus in KVM. Therefore, > we configured it in OpenStack to have the IDE bus being used. However, when > we add new volumes to this VM (server), all of them are being allocated as > IDE volumes; therefore, we are limited to 4 volumes in total in the VM. Is > it possible to mix different types of volumes in a VM in OpenStack? I know > that in other platforms, such as when we use KVM directly, proxmox, Apache > CloudStack, we can do such combinations, but we were not able to achieve it > in OpenStack. > > Have you guys worked with similar use cases? I know that I can convert the > image to use virtIO or iSCSI bus, and to do that I need to fix/patch the > operating system inside the image. However, I would like to check if there > is a method to use a root volume of a VM as IDE, and other volumes as > virtio to avoid the limit of volumes that IDE has. Hi, I think I heard some Nova people talking about a new feature to support different buses for different volumes. Unfortunately I don't remember if it's something they were considering doing or something they were already working on. Cheers, Gorka. From geguileo at redhat.com Mon May 22 15:57:16 2023 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 22 May 2023 17:57:16 +0200 Subject: Opentack + FCP Storages In-Reply-To: References: Message-ID: <20230522155716.63gvzsyc2epnir6p@localhost> On 17/05, Jorge Visentini wrote: > Hello. 
> > Today in our environment we only use FCP 3PAR Storages. > Is there a "friendly" way to use FCP Storages with Openstack? > I know and I've already tested Ceph, so I can say that it's the best > storage integration for Openstack, but it's not my case hehe Hi, As a Cinder and OS-Brick developer I use a FC 3PAR system for most of my testing and FC development of os-brick, and the only requirement for it to work is an external Python dependency (python-3parclient) installed wherever cinder-volume is going to ru. I've tried the driver with both FC zone managers, cisco and brocade, and it works as expected. My only complain would be that there are a couple of nuisances and issues, which may be related to my 3PAR system being really, really, old, so I end up using a custom driver that includes my own patches that haven't merged yet [1][2][3]. I also use a custom python-3parclient with my fix that hasn't merged either [4]. For me the most important of those patches is the one that allows me to disable the online copy [2], because I find that this 3PAR feature gives me more problems that benefits, though that may only be to me. If you are doing a full OpenStack deployment with multiple controller services that are running cinder-volume in Active-Passive and then a bunch of compute nodes, just remember that you'll need HBAs in all the controller nodes where cinder-volume could be running as well as all your compute nodes. If you are not using the Zone manager driver you'll need to configure your switches manually to allow those hosts access to the 3PAR. Cheers, Gorka. [1]: https://review.opendev.org/c/openstack/cinder/+/756709 [2]: https://review.opendev.org/c/openstack/cinder/+/756710 [3]: https://review.opendev.org/c/openstack/cinder/+/756711 [4]: https://github.com/hpe-storage/python-3parclient/pull/79 > > All the best! > -- > Att, > Jorge Visentini > +55 55 98432-9868 From jim at kilborns.com Mon May 22 16:13:17 2023 From: jim at kilborns.com (Jim Kilborn) Date: Mon, 22 May 2023 16:13:17 +0000 Subject: upgrade issue with nova/cinder and api version error Message-ID: <46D8C43B01FAC6419C20963F12B88E2E01E9569C67@MAILSERVER.alamois.com> Hello, First time posting here. We have been running a production openstack environment at my office since the kilo release. We are currently on train, and I'm trying to get up to a more recent version. To make it more difficult, we are on centos7, so having to switch to ubuntu as we update versions. The problem that I am having after updaing to victoria, is that when I delete a vm via horizon, the instance disappears but the cinder volume doesn't delete the attachment. It appears this is due to the following error in /var/log/apache2/cinder_error.log ERROR cinder.api.middleware.fault novaclient.exceptions.NotAcceptable: Version 2.89 is not supported by the API. Minimum is 2.1 and maximum is 2.87. (HTTP 406) When I look at the /usr/lib/python3/dist-packages/cinder/compute/nova.py I can see it's using 2.89 in get_server_volume def get_server_volume(context, server_id, volume_id): # Use microversion that includes attachment_id nova = novaclient(context, api_version='2.89') return nova.volumes.get_server_volume(server_id, volume_id) I am not sure why cinder and nova are in disagreement on the api_version. I have verified that they are both upgraded to the victoria release. Anyone have any ideas as to why I would be getting this error or a possible fix? I haven't been able to find any information on this error. 
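In case it helps, the maximum microversion that the running nova-api actually advertises can be confirmed directly against the compute endpoint. This is only a sketch -- the "controller" hostname and port 8774 below are just the defaults for my kind of setup, adjust for your deployment:

$ TOKEN=$(openstack token issue -f value -c id)
$ curl -s -H "X-Auth-Token: $TOKEN" http://controller:8774/v2.1/ | python3 -m json.tool

The "version" block in the response should report "min_version": "2.1" and "version": "2.87" on Victoria, which matches the error above. ("openstack versions show" should give the same min/max microversions per service if you prefer the CLI.)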
Here are the nova package versions: nova-api/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] nova-common/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] nova-conductor/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] nova-novncproxy/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] nova-scheduler/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] python3-nova/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] python3-novaclient/focal-updates,now 2:17.2.1-0ubuntu1~cloud0 all [installed,automatic] Here are the cinder package versions: cinder-api/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] cinder-common/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed,automatic] cinder-scheduler/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] cinder-volume/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] python3-cinder/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed,automatic] python3-cinderclient/focal-updates,now 1:7.2.0-0ubuntu1~cloud0 all [installed] Thanks in advance for any ideas! From geguileo at redhat.com Mon May 22 16:16:19 2023 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 22 May 2023 18:16:19 +0200 Subject: Discuss Fix for Bug #2003179 In-Reply-To: <818b05618c97e49b8707aa738e381fc7f0db8619.camel@redhat.com> References: <818b05618c97e49b8707aa738e381fc7f0db8619.camel@redhat.com> Message-ID: <20230522161619.wrcpt5xr6l7uce72@localhost> On 16/05, Sean Mooney wrote: > i would proably fix thei the way we did in nova > > we instaled a log filter that prevents the preives deams logs at debug level form being logged. > > https://github.com/openstack/nova/blob/master/nova/config.py#L78-L80 > https://github.com/openstack/nova/commit/86a8aac0d76fa149b5e43c73b31227fbcf427278 > > cinder should also insatll a log filter to only log privsep log at info by default Hi, Thanks Sean for the suggestion, unfortunately we won't be going that route in os-brick for the time being, because those changes in Nova were the reasons why I had to add a feature to privsep [1] and os-brick [2]. Without nova logging privsep calls we were literally blind to know what was happening on attach and detach operations in the os-brick code, even with DEBUG log levels enabled in Nova using the `debug=true` config option. The workaround was to modify the log levels in the nova config explicitly, which no customer had to do before, so I had to write a KCS article explaining it [3]. For the record, this issue of the displayed password is also going to happen in later Nova releases now that we have separated os-brick and nova privsep logging levels. I think Eric has a very good suggestion [4] that should be easy to implement. Cheers, Gorka. [1]: https://review.opendev.org/c/openstack/oslo.privsep/+/784098 [2]: https://review.opendev.org/c/openstack/os-brick/+/871835 [3]: https://access.redhat.com/articles/5906971 [4]: https://bugs.launchpad.net/cinder/+bug/2003179/comments/7 > > > > On Tue, 2023-05-16 at 15:11 +0000, Saad, Tony wrote: > > Hello, > > > > I am reaching out to start a discussion about Bug #2003179 https://bugs.launchpad.net/cinder/+bug/2003179 > > > > The password is getting leaked in plain text from https://opendev.org/openstack/oslo.privsep/src/commit/9c026804de74ae23a60ab3c9565d0c689b2b4579/oslo_privsep/daemon.py#L501. 
This logger line does not always contain a password so using mask_password() and mask_dict_password() from https://docs.openstack.org/oslo.utils/latest/reference/strutils.html is probably not the best solution. > > Anyone have any thoughts on how to stop the password from appearing in plain text? > > > > Thanks, > > Tony > > > > > > Internal Use - Confidential > > From jorgevisentini at gmail.com Mon May 22 16:18:01 2023 From: jorgevisentini at gmail.com (Jorge Visentini) Date: Mon, 22 May 2023 13:18:01 -0300 Subject: Opentack + FCP Storages In-Reply-To: <20230522155716.63gvzsyc2epnir6p@localhost> References: <20230522155716.63gvzsyc2epnir6p@localhost> Message-ID: Hi, Gorka. Many, many thanks for the information. We are "old school" and we really like the latency and stability of the 3PAR and FCP suite, so we want to keep this structure and use it with Openstack automation. Have a nice week! Em seg., 22 de mai. de 2023 ?s 12:57, Gorka Eguileor escreveu: > On 17/05, Jorge Visentini wrote: > > Hello. > > > > Today in our environment we only use FCP 3PAR Storages. > > Is there a "friendly" way to use FCP Storages with Openstack? > > I know and I've already tested Ceph, so I can say that it's the best > > storage integration for Openstack, but it's not my case hehe > > Hi, > > As a Cinder and OS-Brick developer I use a FC 3PAR system for most of my > testing and FC development of os-brick, and the only requirement for it > to work is an external Python dependency (python-3parclient) installed > wherever cinder-volume is going to ru. > > I've tried the driver with both FC zone managers, cisco and brocade, and > it works as expected. > > My only complain would be that there are a couple of nuisances and > issues, which may be related to my 3PAR system being really, really, > old, so I end up using a custom driver that includes my own patches that > haven't merged yet [1][2][3]. > > I also use a custom python-3parclient with my fix that hasn't merged > either [4]. > > For me the most important of those patches is the one that allows me to > disable the online copy [2], because I find that this 3PAR feature gives > me more problems that benefits, though that may only be to me. > > If you are doing a full OpenStack deployment with multiple controller > services that are running cinder-volume in Active-Passive and then a > bunch of compute nodes, just remember that you'll need HBAs in all the > controller nodes where cinder-volume could be running as well as all > your compute nodes. If you are not using the Zone manager driver you'll > need to configure your switches manually to allow those hosts access to > the 3PAR. > > Cheers, > Gorka. > > [1]: https://review.opendev.org/c/openstack/cinder/+/756709 > [2]: https://review.opendev.org/c/openstack/cinder/+/756710 > [3]: https://review.opendev.org/c/openstack/cinder/+/756711 > [4]: https://github.com/hpe-storage/python-3parclient/pull/79 > > > > > All the best! > > -- > > Att, > > Jorge Visentini > > +55 55 98432-9868 > > -- Att, Jorge Visentini +55 55 98432-9868 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From geguileo at redhat.com Mon May 22 16:23:07 2023 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 22 May 2023 18:23:07 +0200 Subject: [cinder][dev] Add support in driver - Active/Active High Availability In-Reply-To: References: Message-ID: <20230522162307.py4k27wrqurkm524@localhost> On 15/05, Souza, Nahim wrote: > Hi, Raghavendra, > > Just sharing my experience, I started working on A/A support for NetApp NFS driver, and I followed the same steps you summarized. > Besides that, I think the effort is to understand/test if any of the driver features might break in the A/A environment. > > If anyone knows about anything else we should test, I would be happy to know too. > > Regards, > Nahim Souza. Hi, I see that there is no mention of deploying and configuring the DLM, which is necessary for the critical sections between different hosts. For reference, most drivers require code changes. A very common location were drivers need code changes is when creating the host entities in the storage array. I see that the 3PAR driver has already changed that lock to use the DLM locks [1]. If a driver doesn't support replication then there is no need to split the failover_host method. Cheers, Gorka. [1]: https://github.com/openstack/cinder/blob/7bca35c935cf8566bcf9e4874be78174cb0b6df5/cinder/volume/drivers/hpe/hpe_3par_fc.py#L169 > > > From: U T, Raghavendra > Sent: Monday, May 8, 2023 09:18 > To: openstack-discuss at lists.openstack.org > Subject: [cinder][dev] Add support in driver - Active/Active High Availability > > NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. > > > Hi, > > We wish to add Active/Active High Availability to: > 1] HPE 3par driver - cinder/cinder/volume/drivers/hpe/hpe_3par_common.py > 2] Nimble driver - cinder/cinder/volume/drivers/hpe/nimble.py > > Checked documentation at > https://docs.openstack.org/cinder/latest/contributor/high_availability.html > https://docs.openstack.org/cinder/latest/contributor/high_availability.html#cinder-volume > https://docs.openstack.org/cinder/latest/contributor/high_availability.html#enabling-active-active-on-drivers > > Summary of steps: > 1] In driver code, set SUPPORTS_ACTIVE_ACTIVE = True > 2] Split the method failover_host() into two methods: failover() and failover_completed() > 3] In cinder.conf, specify cluster name in [DEFAULT] section > cluster = > 4] Configure atleast two nodes in HA and perform testing > > Is this sufficient or anything else required ? > > Note: For Nimble driver, replication feature is not yet added. > So can the above step 2 be skipped? > > Appreciate any suggestions / pointers. > > Regards, > Raghavendra Tilay. > > > From smooney at redhat.com Mon May 22 16:30:43 2023 From: smooney at redhat.com (Sean Mooney) Date: Mon, 22 May 2023 17:30:43 +0100 Subject: Mixing IDE and VIRTIO volumes In-Reply-To: <20230522154152.md3lv6xjnnnn3zf3@localhost> References: <20230522154152.md3lv6xjnnnn3zf3@localhost> Message-ID: On Mon, 2023-05-22 at 17:41 +0200, Gorka Eguileor wrote: > On 20/05, Work Ceph wrote: > > Hello guys, > > We have a situation where an image requires an IDE bus in KVM. Therefore, > > we configured it in OpenStack to have the IDE bus being used. However, when > > we add new volumes to this VM (server), all of them are being allocated as > > IDE volumes; therefore, we are limited to 4 volumes in total in the VM. Is > > it possible to mix different types of volumes in a VM in OpenStack? 
I know > > that in other platforms, such as when we use KVM directly, proxmox, Apache > > CloudStack, we can do such combinations, but we were not able to achieve it > > in OpenStack. > > > > Have you guys worked with similar use cases? I know that I can convert the > > image to use virtIO or iSCSI bus, and to do that I need to fix/patch the > > operating system inside the image. However, I would like to check if there > > is a method to use a root volume of a VM as IDE, and other volumes as > > virtio to avoid the limit of volumes that IDE has. > > Hi, > > I think I heard some Nova people talking about a new feature to support > different buses for different volumes. Unfortunately I don't remember if > it's something they were considering doing or something they were > already working on. it was something we were considering supporting in the future if someone volenterred to work on it. i was suggesting it shoudl be a preriqustit to supprot vdpa for cinder volumes in the futrue. currently we do not supprot using multiple buses for block devices in general. even when not using cinder. the hw_disk_bus option applies to all volumes (local or cinder) of type block. the only real excption to that is we have ha_rescue_bus in the event you are rescuing an instance in that case you can be using virtio for evernty else and then use ide or another bus for the rescue disk i.e. hw_disk_bus=virtio hw_rescue_bus=ide libvirt can supprot this as evident by the fact we cann support this for the rescue usecase but there is a non zero amount of work that woudl be required. no one is currenlty working on this we have a downstream tracker to look at this in some future release but based on our internal backlog i doubth anyone form redhat will have capsity to work on this in the next 12 months. if you want to propose such a feature the nova team are open to reviweing a spec code proposal but it would take a effort form member of the comunity to advance this this cycle. effectlivly it would require using metadata on the volume to allow requesting a disk bus per volume. form the ptg conversation i belive nothing is required on the cinder side for that. on the nova side we would need ot modify nova to use this metadata from the vlome and cache it in the instance_system_metadta table or the bdm in our db so that change to the bus after its attach to an instance would have no effect unless the volume was detached and reattached. there is a related edge case. currently you cannot create a cinder volume form an iso and boot form it. the reason for this is the voluem is treated liek a blockdevice instead of a cdrom image. the same or a similar mechinium could be used to say this volume is a block device vs removable media. this would be useful for things like driver disk for windows but if the driver you are tyring to install is virtio-blk or virtio scsi you need a way to attach that driver voluem to a differnt bus to install it. which is why this is related. > > Cheers, > Gorka. > > From fungi at yuggoth.org Mon May 22 17:06:16 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 22 May 2023 17:06:16 +0000 Subject: [tc] August 2023 OpenInfra Board Sync Message-ID: <20230522170616.ulunhyffngzvapd7@yuggoth.org> The Open Infrastructure Foundation Board of Directors is endeavoring to engage in regular check-ins with official OpenInfra projects. The goal is for a loosely structured discussion one-hour in length, involving members of the board and the OpenStack TC, along with other interested community members. 
This is not intended to be a formal presentation, and no materials need to be prepared in advance. I've started an Etherpad where participants can brainstorm potential topics of conversation, time-permitting: https://etherpad.opendev.org/p/2023-08-board-openstack-sync At the end of the May 10 discussion, we tentatively agreed to schedule the next call for 18:00 UTC on Wednesday, August 9. I've attached a calendar file which can serve as a convenient schedule hold for this, in case anyone needs it. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: 2023-08-board-openstack-sync.ics Type: text/calendar Size: 615 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From geguileo at redhat.com Mon May 22 17:49:06 2023 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 22 May 2023 19:49:06 +0200 Subject: Mixing IDE and VIRTIO volumes In-Reply-To: References: <20230522154152.md3lv6xjnnnn3zf3@localhost> Message-ID: <20230522174906.sacebvh7e67yzghg@localhost> On 22/05, Sean Mooney wrote: > On Mon, 2023-05-22 at 17:41 +0200, Gorka Eguileor wrote: > > On 20/05, Work Ceph wrote: > > > Hello guys, > > > We have a situation where an image requires an IDE bus in KVM. Therefore, > > > we configured it in OpenStack to have the IDE bus being used. However, when > > > we add new volumes to this VM (server), all of them are being allocated as > > > IDE volumes; therefore, we are limited to 4 volumes in total in the VM. Is > > > it possible to mix different types of volumes in a VM in OpenStack? I know > > > that in other platforms, such as when we use KVM directly, proxmox, Apache > > > CloudStack, we can do such combinations, but we were not able to achieve it > > > in OpenStack. > > > > > > Have you guys worked with similar use cases? I know that I can convert the > > > image to use virtIO or iSCSI bus, and to do that I need to fix/patch the > > > operating system inside the image. However, I would like to check if there > > > is a method to use a root volume of a VM as IDE, and other volumes as > > > virtio to avoid the limit of volumes that IDE has. > > > > Hi, > > > > I think I heard some Nova people talking about a new feature to support > > different buses for different volumes. Unfortunately I don't remember if > > it's something they were considering doing or something they were > > already working on. > it was something we were considering supporting in the future if someone volenterred to work on it. > i was suggesting it shoudl be a preriqustit to supprot vdpa for cinder volumes in the futrue. > Thanks Sean! That was precisely the conversation I had in my mind but couldn't remember the details. :-) From knikolla at bu.edu Mon May 22 17:52:24 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Mon, 22 May 2023 17:52:24 +0000 Subject: [tc] Technical Committee next weekly meeting on May 23, 2023 Message-ID: <7005EB40-5313-418E-B01F-F1F02E411240@bu.edu> Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held on Tuesday, May 23, 2023 at 1800 UTC on #openstack-tc on OFTC IRC. Please propose items to the agenda by editing the wiki page at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting At the end of the day I will send out an email with the finalized agenda. 
Thank you, Kristi Nikolla -------------- next part -------------- An HTML attachment was scrubbed... URL: From ces.eduardo98 at gmail.com Mon May 22 22:09:00 2023 From: ces.eduardo98 at gmail.com (Carlos Silva) Date: Mon, 22 May 2023 19:09:00 -0300 Subject: [manila] Bobcat hackathon: tackling tech debt Message-ID: Hello, Zorillas and interested stackers! As discussed in the previous week's weekly meeting, we will be starting our hackathon tomorrow, which will be an effort to work on tech debt items [1]. The kick-off meeting will be at 15 UTC in a jitsi bridge [2]. Over tomorrow's meeting, we will be assigning the topics to work on. We can work with small teams or get items to work on things individually. We will use our upstream weekly meeting slot as a checkpoint for the assigned tasks, and the hackathon will run until this Friday (May 26th). [1] https://etherpad.opendev.org/p/zorilla-bobcat-hackathon [2] https://meetpad.opendev.org/ManilaBobcatHackathon Looking forward to it! Regards, carloss -------------- next part -------------- An HTML attachment was scrubbed... URL: From masayuki.igawa at gmail.com Tue May 23 05:41:12 2023 From: masayuki.igawa at gmail.com (Masayuki Igawa) Date: Tue, 23 May 2023 14:41:12 +0900 Subject: upgrade issue with nova/cinder and api version error In-Reply-To: <46D8C43B01FAC6419C20963F12B88E2E01E9569C67@MAILSERVER.alamois.com> References: <46D8C43B01FAC6419C20963F12B88E2E01E9569C67@MAILSERVER.alamois.com> Message-ID: <2152423e-bd99-47b4-aab4-cf101e00fbbd@app.fastmail.com> Hi, > When I look at the > /usr/lib/python3/dist-packages/cinder/compute/nova.py I can see it's > using 2.89 in get_server_volume > > def get_server_volume(context, server_id, volume_id): > # Use microversion that includes attachment_id > nova = novaclient(context, api_version='2.89') > return nova.volumes.get_server_volume(server_id, volume_id) It's weird for me. That function was introduced this patch[1] but it was backported till xena not victoria. So, I wondering if you are using mixed versioned openstack somehow. [1] https://review.opendev.org/q/I612905a1bf4a1706cce913c0d8a6df7a240d599a -- Masayuki Igawa On Tue, May 23, 2023, at 01:13, Jim Kilborn wrote: > Hello, > > First time posting here. > We have been running a production openstack environment at my office > since the kilo release. We are currently on train, and I'm trying to > get up to a more recent version. To make it more difficult, we are on > centos7, so having to switch to ubuntu as we update versions. > > The problem that I am having after updaing to victoria, is that when I > delete a vm via horizon, the instance disappears but the cinder volume > doesn't delete the attachment. > It appears this is due to the following error in > /var/log/apache2/cinder_error.log > > ERROR cinder.api.middleware.fault novaclient.exceptions.NotAcceptable: > Version 2.89 is not supported by the API. Minimum is 2.1 and maximum is > 2.87. (HTTP 406) > > When I look at the > /usr/lib/python3/dist-packages/cinder/compute/nova.py I can see it's > using 2.89 in get_server_volume > > def get_server_volume(context, server_id, volume_id): > # Use microversion that includes attachment_id > nova = novaclient(context, api_version='2.89') > return nova.volumes.get_server_volume(server_id, volume_id) > > I am not sure why cinder and nova are in disagreement on the api_version. > I have verified that they are both upgraded to the victoria release. > > Anyone have any ideas as to why I would be getting this error or a > possible fix? 
I haven't been able to find any information on this error. > > > Here are the nova package versions: > nova-api/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-common/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-conductor/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > nova-novncproxy/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > nova-scheduler/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > python3-nova/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > python3-novaclient/focal-updates,now 2:17.2.1-0ubuntu1~cloud0 all > [installed,automatic] > > Here are the cinder package versions: > cinder-api/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > cinder-common/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed,automatic] > cinder-scheduler/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed] > cinder-volume/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > python3-cinder/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed,automatic] > python3-cinderclient/focal-updates,now 1:7.2.0-0ubuntu1~cloud0 all > [installed] > > > Thanks in advance for any ideas! From raghavendra-uddhav.tilay at hpe.com Tue May 23 05:50:08 2023 From: raghavendra-uddhav.tilay at hpe.com (U T, Raghavendra) Date: Tue, 23 May 2023 05:50:08 +0000 Subject: [cinder][dev] Add support in driver - Active/Active High Availability In-Reply-To: <20230522162307.py4k27wrqurkm524@localhost> References: <20230522162307.py4k27wrqurkm524@localhost> Message-ID: Thank you Gorka for your inputs. -----Original Message----- From: Gorka Eguileor Sent: Monday, May 22, 2023 9:53 PM To: Souza, Nahim Cc: openstack-discuss at lists.openstack.org Subject: Re: [cinder][dev] Add support in driver - Active/Active High Availability On 15/05, Souza, Nahim wrote: > Hi, Raghavendra, > > Just sharing my experience, I started working on A/A support for NetApp NFS driver, and I followed the same steps you summarized. > Besides that, I think the effort is to understand/test if any of the driver features might break in the A/A environment. > > If anyone knows about anything else we should test, I would be happy to know too. > > Regards, > Nahim Souza. Hi, I see that there is no mention of deploying and configuring the DLM, which is necessary for the critical sections between different hosts. For reference, most drivers require code changes. A very common location were drivers need code changes is when creating the host entities in the storage array. I see that the 3PAR driver has already changed that lock to use the DLM locks [1]. If a driver doesn't support replication then there is no need to split the failover_host method. Cheers, Gorka. [1]: https://github.com/openstack/cinder/blob/7bca35c935cf8566bcf9e4874be78174cb0b6df5/cinder/volume/drivers/hpe/hpe_3par_fc.py#L169 > > > From: U T, Raghavendra > Sent: Monday, May 8, 2023 09:18 > To: openstack-discuss at lists.openstack.org > Subject: [cinder][dev] Add support in driver - Active/Active High > Availability > > NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. 
> > > Hi, > > We wish to add Active/Active High Availability to: > 1] HPE 3par driver - > cinder/cinder/volume/drivers/hpe/hpe_3par_common.py > 2] Nimble driver - cinder/cinder/volume/drivers/hpe/nimble.py > > Checked documentation at > INVALID URI REMOVED > ontributor/high_availability.html__;!!NpxR!lH2JQ63rjF_VfeZ3-GLyjF_HxF1 > zel5dqcmGtBMpwroDAextkmaAC6O6MMXQvI0V2yZUuqnrnuzdRseR_nVAnLlJNXsn$ > INVALID URI REMOVED > ontributor/high_availability.html*cinder-volume__;Iw!!NpxR!lH2JQ63rjF_ > VfeZ3-GLyjF_HxF1zel5dqcmGtBMpwroDAextkmaAC6O6MMXQvI0V2yZUuqnrnuzdRseR_ > nVAnG252wrF$ > INVALID URI REMOVED > ontributor/high_availability.html*enabling-active-active-on-drivers__; > Iw!!NpxR!lH2JQ63rjF_VfeZ3-GLyjF_HxF1zel5dqcmGtBMpwroDAextkmaAC6O6MMXQv > I0V2yZUuqnrnuzdRseR_nVAnCQXkiH9$ > > Summary of steps: > 1] In driver code, set SUPPORTS_ACTIVE_ACTIVE = True 2] Split the > method failover_host() into two methods: failover() and > failover_completed() 3] In cinder.conf, specify cluster name in > [DEFAULT] section cluster = 4] Configure atleast two > nodes in HA and perform testing > > Is this sufficient or anything else required ? > > Note: For Nimble driver, replication feature is not yet added. > So can the above step 2 be skipped? > > Appreciate any suggestions / pointers. > > Regards, > Raghavendra Tilay. > > > From fkr at hazardous.org Tue May 23 06:41:43 2023 From: fkr at hazardous.org (Felix Kronlage-Dammers) Date: Tue, 23 May 2023 08:41:43 +0200 Subject: [publiccloud-sig] Reminder - next meeting May 24th - 0700 UTC Message-ID: Hi everyone, tomorrow is the next meeting of the public cloud sig. We meet on IRC in #openstack-operators at 0700 UTC (see my mail from couple weeks ago regarding the shift during summer time: https://lists.openstack.org/pipermail/openstack-discuss/2023-May/033623.html). A preliminary agenda can be found in the pad: https://etherpad.opendev.org/p/publiccloud-sig-meeting See also here for all other details: https://wiki.openstack.org/wiki/PublicCloudSIG read you on wednesday! felix From knikolla at bu.edu Tue May 23 10:10:04 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Tue, 23 May 2023 10:10:04 +0000 Subject: [tc] Technical Committee next weekly meeting on May 23, 2023 In-Reply-To: <7005EB40-5313-418E-B01F-F1F02E411240@bu.edu> References: <7005EB40-5313-418E-B01F-F1F02E411240@bu.edu> Message-ID: Please find below the agenda for today's meeting * Roll call * Follow up on past action items * Gate health check * Broken docs due to inconsistent release naming * Schedule of removing support for Python versions by libraries - how it should align with coordinated releases (tooz case) * Recurring tasks check ** Bare 'recheck' state *** https://etherpad.opendev.org/p/recheck-weekly-summary * Open Reviews ** https://review.opendev.org/q/projects:openstack/governance+is:open On May 22, 2023, at 1:52 PM, Nikolla, Kristi wrote: Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held on Tuesday, May 23, 2023 at 1800 UTC on #openstack-tc on OFTC IRC. Please propose items to the agenda by editing the wiki page at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting At the end of the day I will send out an email with the finalized agenda. Thank you, Kristi Nikolla -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lokendrarathour at gmail.com Tue May 23 10:41:09 2023 From: lokendrarathour at gmail.com (Lokendra Rathour) Date: Tue, 23 May 2023 16:11:09 +0530 Subject: [Port Creation failed] - openstack Wallaby In-Reply-To: References: Message-ID: Hi Team, Issue is yet to be solved. openstack network agent list: (overcloud) [stack at undercloud-loke ~]$ openstack network agent list /usr/lib64/python3.6/site-packages/_yaml/__init__.py:23: DeprecationWarning: The _yaml extension module is now located at yaml._yaml and its location is subject to change. To use the LibYAML-based parser and emitter, import from `yaml`: `from yaml import CLoader as Loader, CDumper as Dumper`. DeprecationWarning +--------------------------------------+------------+----------------------------------+-------------------+-------+-------+--------------------+ | ID | Agent Type | Host | Availability Zone | Alive | State | Binary | +--------------------------------------+------------+----------------------------------+-------------------+-------+-------+--------------------+ | 8e6ce556-84f7-48c7-b9a0-5ebecad648d1 | DHCP agent | overcloud-controller-0.myhsc.com | nova | :-) | UP | neutron-dhcp-agent | | c0f29f3c-7eb0-4667-b522-61323185adac | DHCP agent | overcloud-controller-2.myhsc.com | nova | :-) | UP | neutron-dhcp-agent | | e5f5f950-99dd-4a4d-9702-232ffe9d0475 | DHCP agent | overcloud-controller-1.myhsc.com | nova | :-) | UP | neutron-dhcp-agent | +--------------------------------------+------------+----------------------------------+-------------------+-------+-------+--------------------+ (overcloud) [stack at undercloud-loke ~]$ source stackrc we do not see any OVN Controller agent. Also as reported earlier we see no entry in this chassis DB. Any pointers would be helpful. Thanks, Lokendra On Thu, May 18, 2023 at 8:39?PM Rodolfo Alonso Hernandez < ralonsoh at redhat.com> wrote: > Hello Lokendra: > > Did you check the version of the ovn-controller service in the compute > nodes and the ovn services in the controller nodes? The services should be > in sync. > > What is the "openstack network agent list" output? Do you see the OVN > controller, OVN gateways and OVN metadata entries corresponding to the > compute and controller nodes you have? And did you check the sanity of your > OVN SB database? What is the list of "Chassis" and "Chassis_Private" > registers? Each "Chassis_Private" register must have a "Chassis" register > associated. > > Regards. > > On Wed, May 17, 2023 at 6:14?PM Lokendra Rathour < > lokendrarathour at gmail.com> wrote: > >> Hi Swogat, >> Thanks for the inputs, it was showing a similar issue but somehow the >> issue is not getting resolved. >> we are trying to explore more around it. 
>> >> getting the error in >> ovn-metadata-agent.log >> Cannot find Chassis_Private with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c >> >> detailed: >> 2023-05-17 19:26:31.984 45317 INFO oslo.privsep.daemon [-] privsep daemon >> running as pid 45317 >> 2023-05-17 19:26:32.712 44735 ERROR ovsdbapp.backend.ovs_idl.transaction >> [-] Traceback (most recent call last): >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", >> line 131, in run >> txn.results.put(txn.do_commit()) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", >> line 93, in do_commit >> command.run_idl(txn) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", >> line 172, in run_idl >> record = self.api.lookup(self.table, self.record) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >> line 208, in lookup >> return self._lookup(table, record) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >> line 268, in _lookup >> row = idlutils.row_by_value(self, rl.table, rl.column, record) >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", >> line 114, in row_by_value >> raise RowNotFound(table=table, col=column, match=match) >> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find >> Chassis_Private with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c >> >> 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command [-] >> Error executing command (DbAddCommand): >> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private >> with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c >> 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command >> Traceback (most recent call last): >> 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command >> File >> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", >> line 42, in execute >> >> waiting for your always helpful inputs. >> >> On Tue, May 16, 2023 at 10:47?PM Swogat Pradhan < >> swogatpradhan22 at gmail.com> wrote: >> >>> Hi >>> I am not sure if this will help, but i faced something similar. >>> You might need to check the ovn database entries. >>> http://www.jimmdenton.com/neutron-ovn-private-chassis/ >>> >>> Or maybe try restarting the ovn service from pcs, sometimes issue comes >>> up when ovn doesn't sync up. >>> >>> Again m not sure if this will be of any help to you. >>> >>> With regards, >>> Swogat Pradhan >>> >>> On Tue, 16 May 2023, 10:41 pm Lokendra Rathour, < >>> lokendrarathour at gmail.com> wrote: >>> >>>> Hi All, >>>> Was trying to create OpenStack VM in OpenStack wallaby release, not >>>> able to create VM, it is failing because of Port not getting created. >>>> >>>> The error that we are getting: >>>> nova-compute.log: >>>> >>>> 2023-05-16 18:15:35.495 7 INFO nova.compute.provider_config >>>> [req-faaf38e7-b5ee-43d1-9303-d508285f5ab7 - - - - -] No provider configs >>>> found in /etc/nova/provider_config. If files are present, ensure the Nova >>>> process has access. 
>>>> 2023-05-16 18:15:35.549 7 ERROR nova.cmd.common >>>> [req-8842f11c-fe5a-4ad3-92ea-a6898f482bf0 - - - - -] No db access allowed >>>> in nova-compute: File "/usr/bin/nova-compute", line 10, in >>>> sys.exit(main()) >>>> File "/usr/lib/python3.6/site-packages/nova/cmd/compute.py", line 59, >>>> in main >>>> topic=compute_rpcapi.RPC_TOPIC) >>>> File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, in >>>> create >>>> utils.raise_if_old_compute() >>>> File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1068, in >>>> raise_if_old_compute >>>> ctxt, ['nova-compute']) >>>> File "/usr/lib/python3.6/site-packages/nova/objects/service.py", line >>>> 563, in get_minimum_version_all_cells >>>> binaries) >>>> File "/usr/lib/python3.6/site-packages/nova/context.py", line 544, in >>>> scatter_gather_all_cells >>>> fn, *args, **kwargs) >>>> File "/usr/lib/python3.6/site-packages/nova/context.py", line 432, in >>>> scatter_gather_cells >>>> with target_cell(context, cell_mapping) as cctxt: >>>> File "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__ >>>> return next(self.gen) >>>> >>>> >>>> neutron/ovn-metadata-agent.log >>>> >>>> 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command >>>> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private >>>> with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 >>>> 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command >>>> 2023-05-16 22:36:41.876 45204 ERROR >>>> ovsdbapp.backend.ovs_idl.transaction [-] Traceback (most recent call last): >>>> File >>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", >>>> line 131, in run >>>> txn.results.put(txn.do_commit()) >>>> File >>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", >>>> line 93, in do_commit >>>> command.run_idl(txn) >>>> File >>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", >>>> line 172, in run_idl >>>> record = self.api.lookup(self.table, self.record) >>>> File >>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >>>> line 208, in lookup >>>> return self._lookup(table, record) >>>> File >>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >>>> line 268, in _lookup >>>> row = idlutils.row_by_value(self, rl.table, rl.column, record) >>>> File >>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", >>>> line 114, in row_by_value >>>> raise RowNotFound(table=table, col=column, match=match) >>>> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find >>>> Chassis_Private with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 >>>> >>>> any input to help get this issue fixed would be of great help. >>>> thanks >>>> -- >>>> ~ Lokendra >>>> skype: lokendrarathour >>>> >>>> >>>> >> >> -- >> ~ Lokendra >> skype: lokendrarathour >> >> >> -- ~ Lokendra skype: lokendrarathour -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Tue May 23 12:01:39 2023 From: michal.arbet at ultimum.io (Michal Arbet) Date: Tue, 23 May 2023 14:01:39 +0200 Subject: [kolla] How to patch images during build In-Reply-To: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: I am very glad that someone asked for an option to patch kolla images. I've already proposed patches for kolla here [1] and here [2]. But unfortunately I didn't get that many votes to merge into master and I abandoned this. 
[1] https://review.opendev.org/c/openstack/kolla/+/829296 [2] https://review.opendev.org/c/openstack/kolla/+/829295 With these above patches you can patch files inside every container. Maybe we can discuss this again ?? For example now xena, yoga, zed, antelope has oslo.messaging broken : https://bugs.launchpad.net/oslo.messaging/+bug/2019978 fixed by https://review.opendev.org/c/openstack/oslo.messaging/+/866617 As I am using my kolla patches in my downstream kolla git repo i've only created patches/ directory and place fix for openstack-base container :) patches/ patches/openstack-base patches/openstack-base/series patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch But, you still can use template-override https://docs.openstack.org/kolla/latest/admin/image-building.html . Thanks Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Po???? 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley napsal: > On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: > > Yes, you can do that, but note bene mitaka not supported. > [...] > > Not only unsupported, but the stable/mitaka branch of > openstack/keystone was deleted when it reached EOL in 2017. You may > instead want to specify `reference = mitaka-eol` (assuming Git tags > also work there). That should get you the final state of the > stable/mitaka branch prior to its deletion. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wodel.youchi at gmail.com Tue May 23 12:19:04 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Tue, 23 May 2023 13:19:04 +0100 Subject: Does Openstack have the notion of tenant admin? Message-ID: Hi, Does Openstack have the notion of tenant admin? If not, can a role be created to simulate such notion? Regards. Virus-free.www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue May 23 12:31:29 2023 From: smooney at redhat.com (Sean Mooney) Date: Tue, 23 May 2023 13:31:29 +0100 Subject: Does Openstack have the notion of tenant admin? In-Reply-To: References: Message-ID: On Tue, 2023-05-23 at 13:19 +0100, wodel youchi wrote: > Hi, > > Does Openstack have the notion of tenant admin? no it does not. there is global admin or you can use member. > > If not, can a role be created to simulate such notion? not really you could use custom policy to simulate it but the real qustion you have to ask/answer is what woudl a teant admin be able to do that a project member cant. you woudl then need to create custom policy rules for all service to enable that persona and assocate that with a custom role. finally you would have to assging that role to the relevent tenant admins. > > Regards. > > > Virus-free.www.avast.com > > <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> From wodel.youchi at gmail.com Tue May 23 13:02:05 2023 From: wodel.youchi at gmail.com (wodel youchi) Date: Tue, 23 May 2023 14:02:05 +0100 Subject: multi-region deployment Message-ID: Hi, I have been reading about multi-region in openstack. >From my understanding, that keystone and horizon are shared and deployed only in the first region. Is this correct? If yes, what happens if the first region goes down? Regards. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eblock at nde.ag Tue May 23 13:36:13 2023 From: eblock at nde.ag (Eugen Block) Date: Tue, 23 May 2023 13:36:13 +0000 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> References: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> Message-ID: <20230523133613.Horde.FQupEQ3O_c7c0sp-Egr6XIu@webmail.nde.ag> Hi, I don't see an object_count > 0 for all incremental backups or the full backup. I tried both with a "full" volume (from image) as well as en empty volume, put a filesystem on it and copied tiny files onto it. This is the result: controller02:~ # openstack volume backup list +--------------------------------------+--------------+-------------+-----------+------+ | ID | Name | Description | Status | Size | +--------------------------------------+--------------+-------------+-----------+------+ | a8a448e7-8bfd-46e3-81bf-3b1d607893e7 | inc-backup2 | None | available | 4 | | 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 | inc-backup1 | None | available | 4 | | 125c23cd-a5e8-4a7a-b59a-015d0bc5902c | full-backup1 | None | available | 4 | +--------------------------------------+--------------+-------------+-----------+------+ controller02:~ # for i in `openstack volume backup list -c ID -f value`; do openstack volume backup show $i -c id -c is_incremental -c object_count -f value; done a8a448e7-8bfd-46e3-81bf-3b1d607893e7 True 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 True 125c23cd-a5e8-4a7a-b59a-015d0bc5902c False This is still Victoria, though, I think I have a Wallaby test installation, I'll try that as well. In which case should object_count be > 0? All my installations have ceph as storage backend. Thanks, Eugen Zitat von Masayuki Igawa : > Hi Satish, > >> Whenever I take incremental backup it shows a similar size of original >> volume. Technically It should be smaller. Question is does ceph support >> incremental backup with cinder? > > IIUC, it would be expected behavior. According to the API Doc[1], > "size" is "The size of the volume, in gibibytes (GiB)." > So, it's not the actual size of the snapshot itself. > > What about the "object_count" of "openstack volume backup show" output? > The incremental's one should be zero or less than the full backup at least? > > [1] > https://docs.openstack.org/api-ref/block-storage/v3/?expanded=show-backup-detail-detail,list-backups-with-detail-detail#id428 > > -- Masayuki Igawa > > On Wed, May 17, 2023, at 03:51, Satish Patel wrote: >> Folks, >> >> I have ceph storage for my openstack and configure cinder-volume and >> cinder-backup service for my disaster solution. I am trying to use the >> cinder-backup incremental option to save storage space but somehow It >> doesn't work the way it should work. >> >> Whenever I take incremental backup it shows a similar size of original >> volume. Technically It should be smaller. Question is does ceph support >> incremental backup with cinder? >> >> I am running a Yoga release. 
>> >> $ openstack volume list >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> | ID | Name | Status | Size >> | Attached to | >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 >> | Attached to spatel-foo on /dev/sdc | >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >> ### Create full backup >> $ openstack volume backup create --name spatel-vol-backup spatel-vol --force >> +-------+--------------------------------------+ >> | Field | Value | >> +-------+--------------------------------------+ >> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | >> | name | spatel-vol-backup | >> +-------+--------------------------------------+ >> >> ### Create incremental >> $ openstack volume backup create --name spatel-vol-backup-1 >> --incremental --force spatel-vol >> +-------+--------------------------------------+ >> | Field | Value | >> +-------+--------------------------------------+ >> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | >> | name | spatel-vol-backup-1 | >> +-------+--------------------------------------+ >> >> $ openstack volume backup list >> +--------------------------------------+---------------------+-------------+-----------+------+ >> | ID | Name | >> Description | Status | Size | >> +--------------------------------------+---------------------+-------------+-----------+------+ >> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None >> | available | 10 | >> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None >> | available | 10 | >> +--------------------------------------+---------------------+-------------+-----------+------+ >> My incremental backup still shows 10G size which should be lower >> compared to the first backup. From maksim.malchuk at gmail.com Tue May 23 13:47:05 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Tue, 23 May 2023 16:47:05 +0300 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: IMHO, patching - is not a production way. On Tue, May 23, 2023 at 3:11?PM Michal Arbet wrote: > I am very glad that someone asked for an option to patch kolla images. > I've already proposed patches for kolla here [1] and here [2]. > But unfortunately I didn't get that many votes to merge into master and I > abandoned this. > > [1] https://review.opendev.org/c/openstack/kolla/+/829296 > [2] https://review.opendev.org/c/openstack/kolla/+/829295 > > With these above patches you can patch files inside every container. > Maybe we can discuss this again ?? > > For example now xena, yoga, zed, antelope has oslo.messaging broken : > > https://bugs.launchpad.net/oslo.messaging/+bug/2019978 > fixed by > https://review.opendev.org/c/openstack/oslo.messaging/+/866617 > > As I am using my kolla patches in my downstream kolla git repo i've only > created patches/ directory and place fix for openstack-base container :) > > patches/ > patches/openstack-base > patches/openstack-base/series > patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch > > But, you still can use template-override > https://docs.openstack.org/kolla/latest/admin/image-building.html . > > Thanks > > Michal Arbet > Openstack Engineer > > Ultimum Technologies a.s. > Na Po???? 
1047/26, 11000 Praha 1 > Czech Republic > > +420 604 228 897 > michal.arbet at ultimum.io > *https://ultimum.io * > > LinkedIn | Twitter > | Facebook > > > > st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley > napsal: > >> On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: >> > Yes, you can do that, but note bene mitaka not supported. >> [...] >> >> Not only unsupported, but the stable/mitaka branch of >> openstack/keystone was deleted when it reached EOL in 2017. You may >> instead want to specify `reference = mitaka-eol` (assuming Git tags >> also work there). That should get you the final state of the >> stable/mitaka branch prior to its deletion. >> -- >> Jeremy Stanley >> > -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Tue May 23 13:58:19 2023 From: eblock at nde.ag (Eugen Block) Date: Tue, 23 May 2023 13:58:19 +0000 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: <20230523133613.Horde.FQupEQ3O_c7c0sp-Egr6XIu@webmail.nde.ag> References: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> <20230523133613.Horde.FQupEQ3O_c7c0sp-Egr6XIu@webmail.nde.ag> Message-ID: <20230523135819.Horde.SQdB_-j6lQN2wos2fvnaW0Z@webmail.nde.ag> I see the same for Wallaby, object_count is always 0. Zitat von Eugen Block : > Hi, > > I don't see an object_count > 0 for all incremental backups or the > full backup. I tried both with a "full" volume (from image) as well > as en empty volume, put a filesystem on it and copied tiny files > onto it. This is the result: > > controller02:~ # openstack volume backup list > +--------------------------------------+--------------+-------------+-----------+------+ > | ID | Name | Description > | Status | Size | > +--------------------------------------+--------------+-------------+-----------+------+ > | a8a448e7-8bfd-46e3-81bf-3b1d607893e7 | inc-backup2 | None > | available | 4 | > | 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 | inc-backup1 | None > | available | 4 | > | 125c23cd-a5e8-4a7a-b59a-015d0bc5902c | full-backup1 | None > | available | 4 | > +--------------------------------------+--------------+-------------+-----------+------+ > > controller02:~ # for i in `openstack volume backup list -c ID -f > value`; do openstack volume backup show $i -c id -c is_incremental > -c object_count -f value; done > a8a448e7-8bfd-46e3-81bf-3b1d607893e7 > True > > 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 > True > > 125c23cd-a5e8-4a7a-b59a-015d0bc5902c > False > > > This is still Victoria, though, I think I have a Wallaby test > installation, I'll try that as well. In which case should > object_count be > 0? All my installations have ceph as storage > backend. > > Thanks, > Eugen > > Zitat von Masayuki Igawa : > >> Hi Satish, >> >>> Whenever I take incremental backup it shows a similar size of original >>> volume. Technically It should be smaller. Question is does ceph support >>> incremental backup with cinder? >> >> IIUC, it would be expected behavior. According to the API Doc[1], >> "size" is "The size of the volume, in gibibytes (GiB)." >> So, it's not the actual size of the snapshot itself. >> >> What about the "object_count" of "openstack volume backup show" output? >> The incremental's one should be zero or less than the full backup at least? 
>> >> [1] >> https://docs.openstack.org/api-ref/block-storage/v3/?expanded=show-backup-detail-detail,list-backups-with-detail-detail#id428 >> >> -- Masayuki Igawa >> >> On Wed, May 17, 2023, at 03:51, Satish Patel wrote: >>> Folks, >>> >>> I have ceph storage for my openstack and configure cinder-volume and >>> cinder-backup service for my disaster solution. I am trying to use the >>> cinder-backup incremental option to save storage space but somehow It >>> doesn't work the way it should work. >>> >>> Whenever I take incremental backup it shows a similar size of original >>> volume. Technically It should be smaller. Question is does ceph support >>> incremental backup with cinder? >>> >>> I am running a Yoga release. >>> >>> $ openstack volume list >>> +--------------------------------------+------------+------------+------+-------------------------------------+ >>> | ID | Name | Status | Size >>> | Attached to | >>> +--------------------------------------+------------+------------+------+-------------------------------------+ >>> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 >>> | Attached to spatel-foo on /dev/sdc | >>> +--------------------------------------+------------+------------+------+-------------------------------------+ >>> >>> ### Create full backup >>> $ openstack volume backup create --name spatel-vol-backup >>> spatel-vol --force >>> +-------+--------------------------------------+ >>> | Field | Value | >>> +-------+--------------------------------------+ >>> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | >>> | name | spatel-vol-backup | >>> +-------+--------------------------------------+ >>> >>> ### Create incremental >>> $ openstack volume backup create --name spatel-vol-backup-1 >>> --incremental --force spatel-vol >>> +-------+--------------------------------------+ >>> | Field | Value | >>> +-------+--------------------------------------+ >>> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | >>> | name | spatel-vol-backup-1 | >>> +-------+--------------------------------------+ >>> >>> $ openstack volume backup list >>> +--------------------------------------+---------------------+-------------+-----------+------+ >>> | ID | Name | >>> Description | Status | Size | >>> +--------------------------------------+---------------------+-------------+-----------+------+ >>> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None >>> | available | 10 | >>> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None >>> | available | 10 | >>> +--------------------------------------+---------------------+-------------+-----------+------+ >>> My incremental backup still shows 10G size which should be lower >>> compared to the first backup. From berndbausch at gmail.com Tue May 23 14:01:11 2023 From: berndbausch at gmail.com (berndbausch at gmail.com) Date: Tue, 23 May 2023 23:01:11 +0900 Subject: Does Openstack have the notion of tenant admin? In-Reply-To: References: Message-ID: <057d01d98d7f$092c9520$1b85bf60$@gmail.com> There is a plan to introduce a "project manager" role next year or so. See https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html (if you are in a hurry, scroll down to "phase 3"). -----Original Message----- From: Sean Mooney Sent: Tuesday, May 23, 2023 9:31 PM To: wodel youchi ; OpenStack Discuss Subject: Re: Does Openstack have the notion of tenant admin? On Tue, 2023-05-23 at 13:19 +0100, wodel youchi wrote: > Hi, > > Does Openstack have the notion of tenant admin? no it does not. 
There is the global admin role, or you can use member.

>
> If not, can a role be created to simulate such notion?

Not really. You could use custom policy to simulate it, but the real question
you have to ask/answer is what a tenant admin would be able to do that a
project member can't. You would then need to create custom policy rules for
all services to enable that persona and associate them with a custom role.
Finally, you would have to assign that role to the relevant tenant admins.

>
> Regards.

From Danny.Webb at thehutgroup.com Tue May 23 14:01:09 2023
From: Danny.Webb at thehutgroup.com (Danny Webb)
Date: Tue, 23 May 2023 14:01:09 +0000
Subject: multi-region deployment
In-Reply-To: 
References: 
Message-ID: 

We tested this out here at THG (a bit too late, as we'd already deployed
multiple regions). What we did was install a stretched galera cluster across
multiple DCs with clustered proxysql fronting it so we could force writes to a
single node. We then deployed keystone in each region on top of this. We put
keystone behind a geo-loadbalancer so each region would prefer talking to its
local keystones, and proxysql would force writes to a single node in the
stretched cluster, avoiding issues with commit latency and deadlocks.

Horizon doesn't need to exist in a single region; it can exist in all regions
as long as it has API connectivity to all regions' endpoints and the "global"
keystone.

If you are using kolla, we found it easier to have a specific deployment for
keystone only, and then each region had its normal deployment but with a
remote keystone.
________________________________
From: wodel youchi 
Sent: 23 May 2023 14:02
To: OpenStack Discuss 
Subject: multi-region deployment

CAUTION: This email originates from outside THG
________________________________
Hi,

I have been reading about multi-region in openstack.

From my understanding, keystone and horizon are shared and deployed only in
the first region.

Is this correct? If yes, what happens if the first region goes down?

Regards.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fungi at yuggoth.org Tue May 23 14:15:15 2023
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Tue, 23 May 2023 14:15:15 +0000
Subject: Does Openstack have the notion of tenant admin?
In-Reply-To: 
References: 
Message-ID: <20230523141514.4fwei4aiuhk53d2j@yuggoth.org>

On 2023-05-23 13:31:29 +0100 (+0100), Sean Mooney wrote:
> On Tue, 2023-05-23 at 13:19 +0100, wodel youchi wrote:
> > Does Openstack have the notion of tenant admin?
>
> no it does not.
>
> there is global admin or you can use member.
>
> > If not, can a role be created to simulate such notion?
>
> not really
>
> you could use custom policy to simulate it but the real qustion
> you have to ask/answer is what woudl a teant admin be able to do
> that a project member cant.
[...]

Developers have been working recently on adding a read-only "reader" role to
their respective services as an initial phase of the Consistent and Secure
Default RBAC goal[*], so you might think of it this way: people who need to
be able to make changes to project resources (project members) are
conceptually akin to your tenant admin idea, while people who only need to be
able to look at status and settings for project resources (project readers)
are limited to just those capabilities and cannot make changes.
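To make the custom-role route Sean describes concrete: on the keystone side
such a persona is just an ordinary role plus a role assignment, and it only
gains meaning once each service's policy is overridden to reference it. A
minimal sketch (the role, project and user names here are invented for
illustration):

# Create a custom role and give it to whoever should "administer" the project
openstack role create project-admin
openstack role add --project demo --user alice project-admin

# The role grants nothing by itself; each service then needs policy overrides
# that grant selected rules to "role:project-admin and project_id:%(project_id)s".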
In phase 3, the plan (as it stands now) is to add a project "manager" role which will gain exclusive control of lower level resource API methods, further limiting the current project member role. [*] https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From Danny.Webb at thehutgroup.com Tue May 23 14:17:51 2023 From: Danny.Webb at thehutgroup.com (Danny Webb) Date: Tue, 23 May 2023 14:17:51 +0000 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: You can already do this with the kolla image builder which seems to me to be a much better solution than patching containers post creation. ________________________________ From: Michal Arbet Sent: 23 May 2023 13:01 To: openstack-discuss at lists.openstack.org Subject: Re: [kolla] How to patch images during build CAUTION: This email originates from outside THG ________________________________ I am very glad that someone asked for an option to patch kolla images. I've already proposed patches for kolla here [1] and here [2]. But unfortunately I didn't get that many votes to merge into master and I abandoned this. [1] https://review.opendev.org/c/openstack/kolla/+/829296 [2] https://review.opendev.org/c/openstack/kolla/+/829295 With these above patches you can patch files inside every container. Maybe we can discuss this again ?? For example now xena, yoga, zed, antelope has oslo.messaging broken : https://bugs.launchpad.net/oslo.messaging/+bug/2019978 fixed by https://review.opendev.org/c/openstack/oslo.messaging/+/866617 As I am using my kolla patches in my downstream kolla git repo i've only created patches/ directory and place fix for openstack-base container :) patches/ patches/openstack-base patches/openstack-base/series patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch But, you still can use template-override https://docs.openstack.org/kolla/latest/admin/image-building.html . Thanks Michal Arbet Openstack Engineer [https://www.google.com/a/ultimum.io/images/logo.gif] Ultimum Technologies a.s. Na Po???? 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io https://ultimum.io LinkedIn | Twitter | Facebook st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley > napsal: On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: > Yes, you can do that, but note bene mitaka not supported. [...] Not only unsupported, but the stable/mitaka branch of openstack/keystone was deleted when it reached EOL in 2017. You may instead want to specify `reference = mitaka-eol` (assuming Git tags also work there). That should get you the final state of the stable/mitaka branch prior to its deletion. -- Jeremy Stanley -------------- next part -------------- An HTML attachment was scrubbed... URL: From Russell.Stather at ignitiongroup.co.za Tue May 23 15:28:29 2023 From: Russell.Stather at ignitiongroup.co.za (Russell Stather) Date: Tue, 23 May 2023 15:28:29 +0000 Subject: Designate and neutron - only creating records for floating ips and not servers or ports Message-ID: Hi We have a problem where we can create a floating ip and the neutron integration happily creates the dns record. However, if we create a port or a server the record does not get created. 
This is a charm installation, with designate using designate-bind as the backend.

Any suggestions or places to look would be appreciated.

I can't see any errors in the designate logs or the neutron logs.

Thanks

Russell
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From smooney at redhat.com Tue May 23 15:32:33 2023
From: smooney at redhat.com (Sean Mooney)
Date: Tue, 23 May 2023 16:32:33 +0100
Subject: Does Openstack have the notion of tenant admin?
In-Reply-To: <20230523141514.4fwei4aiuhk53d2j@yuggoth.org>
References: <20230523141514.4fwei4aiuhk53d2j@yuggoth.org>
Message-ID: 

On Tue, 2023-05-23 at 14:15 +0000, Jeremy Stanley wrote:
> On 2023-05-23 13:31:29 +0100 (+0100), Sean Mooney wrote:
> > On Tue, 2023-05-23 at 13:19 +0100, wodel youchi wrote:
> > > Does Openstack have the notion of tenant admin?
> >
> > no it does not.
> >
> > there is global admin or you can use member.
> >
> > > If not, can a role be created to simulate such notion?
> >
> > not really
> >
> > you could use custom policy to simulate it but the real qustion
> > you have to ask/answer is what woudl a teant admin be able to do
> > that a project member cant.
> [...]
>
> Developers have been working recently on adding a read-only "reader"
> role to their respective services as an initial phase of the
> Consistent and Secure Default RBAC goal[*], so you might think of it
> as people who need to be able to make changes to project resources
> (project members) are conceptually akin to your tenant admin idea
> while people who only need to be able to look at status and settings
> for project resources (project readers) are limited to just those
> capabilities and cannot make changes.
>
> In phase 3, the plan (as it stands now) is to add a project
> "manager" role which will gain exclusive control of lower level
> resource API methods, further limiting the current project member
> role.

Not necessarily. Limiting the scope of project member was not the intent of
adding project manager. The original intent was to allow a project manager to
do more privileged operations that would normally require admin.

At least for nova, I don't think we plan to remove any permissions from users
with project member today. We were, however, considering allowing project
manager to have additional capabilities. A project member can lock an
instance, but an admin can override that; a project member cannot unlock an
instance that an admin has locked. We could add the ability for project
managers to unlock an instance that a project member has locked, or to lock
the instance such that a project member cannot unlock it. An admin would still
be able to override a project manager, as it can with project member today.

I would consider removing permissions from project member and requiring
project manager instead to be a breaking API change. There might be specific
cases where we decide that is justified, but unless we are planning to do a
new major version of the nova API, I expect project manager to be a purely
additive change: adding new capabilities that would previously have been
admin-only by default, for example the ability to cold migrate or live migrate
a VM. But I do not expect us to restrict what a project member can do.
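As a purely illustrative sketch of what such an additive change can look like
for an operator today (this is a local policy override, not an upstream
default; the rule names below are nova's migration policy targets, but
defaults vary per release, so check the output of
oslopolicy-sample-generator --namespace nova before copying anything):

# Grant cold and live migration to a project-scoped "manager"-style role while
# keeping the normal admin override intact. Append to any existing overrides
# rather than replacing /etc/nova/policy.yaml wholesale.
cat >> /etc/nova/policy.yaml <<'EOF'
"os_compute_api:os-migrate-server:migrate": "rule:context_is_admin or (role:manager and project_id:%(project_id)s)"
"os_compute_api:os-migrate-server:migrate_live": "rule:context_is_admin or (role:manager and project_id:%(project_id)s)"
EOF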
live migration is the example usecase although there have been other discussed in the past https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html#implement-support-for-project-manager-personas any changes in this regard will likely require per project specs to detail exactly what will and will not change. for example keystone might desire to allow project managers to delegate any of the roles they have to other users in a project or give a project manager the ability to add or remove a user form the project. if that was a change that was desirable i woudl expect to see a keystone spec that would detail exactly how that will be done. similarly i expect any usage of project manager in nova to be accompanied by a spec to describe that. while we have examples we do not have any detailed proposals for nova at this time. > > [*] https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html > From eblock at nde.ag Tue May 23 15:36:10 2023 From: eblock at nde.ag (Eugen Block) Date: Tue, 23 May 2023 15:36:10 +0000 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: <20230523135819.Horde.SQdB_-j6lQN2wos2fvnaW0Z@webmail.nde.ag> References: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> <20230523133613.Horde.FQupEQ3O_c7c0sp-Egr6XIu@webmail.nde.ag> <20230523135819.Horde.SQdB_-j6lQN2wos2fvnaW0Z@webmail.nde.ag> Message-ID: <20230523153610.Horde.6JFxWRvW9NAED5MLq81G037@webmail.nde.ag> I looked through the code with a colleague, apparently the code to increase object counters is not executed with ceph as backend. Is that assumption correct? Would be interesting to know for which backends that would actually increase per backup. Zitat von Eugen Block : > I see the same for Wallaby, object_count is always 0. > > Zitat von Eugen Block : > >> Hi, >> >> I don't see an object_count > 0 for all incremental backups or the >> full backup. I tried both with a "full" volume (from image) as well >> as en empty volume, put a filesystem on it and copied tiny files >> onto it. This is the result: >> >> controller02:~ # openstack volume backup list >> +--------------------------------------+--------------+-------------+-----------+------+ >> | ID | Name | Description >> | Status | Size | >> +--------------------------------------+--------------+-------------+-----------+------+ >> | a8a448e7-8bfd-46e3-81bf-3b1d607893e7 | inc-backup2 | None >> | available | 4 | >> | 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 | inc-backup1 | None >> | available | 4 | >> | 125c23cd-a5e8-4a7a-b59a-015d0bc5902c | full-backup1 | None >> | available | 4 | >> +--------------------------------------+--------------+-------------+-----------+------+ >> >> controller02:~ # for i in `openstack volume backup list -c ID -f >> value`; do openstack volume backup show $i -c id -c is_incremental >> -c object_count -f value; done >> a8a448e7-8bfd-46e3-81bf-3b1d607893e7 >> True >> >> 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 >> True >> >> 125c23cd-a5e8-4a7a-b59a-015d0bc5902c >> False >> >> >> This is still Victoria, though, I think I have a Wallaby test >> installation, I'll try that as well. In which case should >> object_count be > 0? All my installations have ceph as storage >> backend. >> >> Thanks, >> Eugen >> >> Zitat von Masayuki Igawa : >> >>> Hi Satish, >>> >>>> Whenever I take incremental backup it shows a similar size of original >>>> volume. Technically It should be smaller. Question is does ceph support >>>> incremental backup with cinder? 
>>> >>> IIUC, it would be expected behavior. According to the API Doc[1], >>> "size" is "The size of the volume, in gibibytes (GiB)." >>> So, it's not the actual size of the snapshot itself. >>> >>> What about the "object_count" of "openstack volume backup show" output? >>> The incremental's one should be zero or less than the full backup at least? >>> >>> [1] >>> https://docs.openstack.org/api-ref/block-storage/v3/?expanded=show-backup-detail-detail,list-backups-with-detail-detail#id428 >>> >>> -- Masayuki Igawa >>> >>> On Wed, May 17, 2023, at 03:51, Satish Patel wrote: >>>> Folks, >>>> >>>> I have ceph storage for my openstack and configure cinder-volume and >>>> cinder-backup service for my disaster solution. I am trying to use the >>>> cinder-backup incremental option to save storage space but somehow It >>>> doesn't work the way it should work. >>>> >>>> Whenever I take incremental backup it shows a similar size of original >>>> volume. Technically It should be smaller. Question is does ceph support >>>> incremental backup with cinder? >>>> >>>> I am running a Yoga release. >>>> >>>> $ openstack volume list >>>> +--------------------------------------+------------+------------+------+-------------------------------------+ >>>> | ID | Name | Status | Size >>>> | Attached to | >>>> +--------------------------------------+------------+------------+------+-------------------------------------+ >>>> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | 10 >>>> | Attached to spatel-foo on /dev/sdc | >>>> +--------------------------------------+------------+------------+------+-------------------------------------+ >>>> >>>> ### Create full backup >>>> $ openstack volume backup create --name spatel-vol-backup >>>> spatel-vol --force >>>> +-------+--------------------------------------+ >>>> | Field | Value | >>>> +-------+--------------------------------------+ >>>> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | >>>> | name | spatel-vol-backup | >>>> +-------+--------------------------------------+ >>>> >>>> ### Create incremental >>>> $ openstack volume backup create --name spatel-vol-backup-1 >>>> --incremental --force spatel-vol >>>> +-------+--------------------------------------+ >>>> | Field | Value | >>>> +-------+--------------------------------------+ >>>> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | >>>> | name | spatel-vol-backup-1 | >>>> +-------+--------------------------------------+ >>>> >>>> $ openstack volume backup list >>>> +--------------------------------------+---------------------+-------------+-----------+------+ >>>> | ID | Name | >>>> Description | Status | Size | >>>> +--------------------------------------+---------------------+-------------+-----------+------+ >>>> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None >>>> | available | 10 | >>>> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None >>>> | available | 10 | >>>> +--------------------------------------+---------------------+-------------+-----------+------+ >>>> My incremental backup still shows 10G size which should be lower >>>> compared to the first backup. 
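One way to see what the Ceph backup driver is actually doing, independent of
the object_count field, is to look at the backup pool directly: the first full
backup creates a base RBD image per volume and each incremental adds an RBD
snapshot containing only the changed extents. A sketch, assuming the default
"backups" pool and the driver's usual base-image naming
(volume-<volume_id>.backup.base) — both depend on backup_ceph_pool and the
release, the differential path only applies when the source volume itself
lives on Ceph, and the volume ID below is just the one from Satish's earlier
output:

# List backup images, their snapshots, and the real space used per snapshot
rbd -p backups ls
rbd -p backups snap ls volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.base
rbd du backups/volume-285a49a6-0e03-49e5-abf1-1c1efbfeb5f2.backup.base

If incrementals are working, "rbd du" should report a much smaller USED value
for each incremental snapshot than for the initial full backup, even though
the backup records in cinder all show the volume's full size.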
From maksim.malchuk at gmail.com Tue May 23 15:44:01 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Tue, 23 May 2023 18:44:01 +0300 Subject: multi-region deployment In-Reply-To: References: Message-ID: We use a similar approach to deploy the 'service' region first, which contains of Keystone+Horizon+MariaDB+Memcached and HA (keepalived, haproxy, galera) components on 3 separate controller nodes. Then we deploy KVM regions (Compute, Ironic, etc) and add them into the 'service' region, also integrate each with external CEPH backend (for Cinder, Glance, etc). With HA configuration and correct architecture each region shouldn't go down. On Tue, May 23, 2023 at 5:10?PM Danny Webb wrote: > We tested this out here at THG (a bit too late as we'd already deployed > multiple regions). What we did was install a stretched galera cluster > across multiple DCs with clustered proxysql fronting it so we could force > writes to a single node. We then deployed keystone in each regions on top > of this. We put keysone behind a geo-loadbalancer so each region would > prefer talking to their local keystones and proxy sql would force writes to > a single node in the stretched cluster avoiding issues with commit latency > and deadlocks. > > Horizon doesn't need to exist in a single region, it can exist in all > regions as long as it has api connectivity to all regions endponts and the > "global" keystone. > > If you are using kolla we found it easier to have a specific deployment > for keystone only and then each region had it's normal deployment but with > a remote keystone. > ------------------------------ > *From:* wodel youchi > *Sent:* 23 May 2023 14:02 > *To:* OpenStack Discuss > *Subject:* multi-region deployment > > > * CAUTION: This email originates from outside THG * > ------------------------------ > Hi, > > I have been reading about multi-region in openstack. > > From my understanding, that keystone and horizon are shared and deployed > only in the first region. > > Is this correct? If yes, what happens if the first region goes down? > > Regards. > -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue May 23 16:02:05 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 23 May 2023 16:02:05 +0000 Subject: Does Openstack have the notion of tenant admin? In-Reply-To: References: <20230523141514.4fwei4aiuhk53d2j@yuggoth.org> Message-ID: <20230523160204.645dahh45aly2y4g@yuggoth.org> On 2023-05-23 16:32:33 +0100 (+0100), Sean Mooney wrote: [...] > limiting the scope of project member was not the intent of adding > project manager. the orginal intent was to allow proejct-manger to > do more pivladge oepration that would normally requrie admin. [...] Thanks! That nuance wasn't clear to me from the goal document, but with the examples you cited it makes perfect sense and it's awesome that it will enable deployments to expose additional features to users that way which they can't access today. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From senrique at redhat.com Tue May 23 16:23:17 2023 From: senrique at redhat.com (Sofia Enriquez) Date: Tue, 23 May 2023 17:23:17 +0100 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: <20230523153610.Horde.6JFxWRvW9NAED5MLq81G037@webmail.nde.ag> References: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> <20230523133613.Horde.FQupEQ3O_c7c0sp-Egr6XIu@webmail.nde.ag> <20230523135819.Horde.SQdB_-j6lQN2wos2fvnaW0Z@webmail.nde.ag> <20230523153610.Horde.6JFxWRvW9NAED5MLq81G037@webmail.nde.ag> Message-ID: https://web.archive.org/web/20160404120859/http://gorka.eguileor.com/inside-cinders-incremental-backup/?replytocom=2267 On Tue, May 23, 2023 at 4:39?PM Eugen Block wrote: > I looked through the code with a colleague, apparently the code to > increase object counters is not executed with ceph as backend. Is that > assumption correct? Would be interesting to know for which backends > that would actually increase per backup. > > Zitat von Eugen Block : > > > I see the same for Wallaby, object_count is always 0. > > > > Zitat von Eugen Block : > > > >> Hi, > >> > >> I don't see an object_count > 0 for all incremental backups or the > >> full backup. I tried both with a "full" volume (from image) as well > >> as en empty volume, put a filesystem on it and copied tiny files > >> onto it. This is the result: > >> > >> controller02:~ # openstack volume backup list > >> > +--------------------------------------+--------------+-------------+-----------+------+ > >> | ID | Name | Description > >> | Status | Size | > >> > +--------------------------------------+--------------+-------------+-----------+------+ > >> | a8a448e7-8bfd-46e3-81bf-3b1d607893e7 | inc-backup2 | None > >> | available | 4 | > >> | 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 | inc-backup1 | None > >> | available | 4 | > >> | 125c23cd-a5e8-4a7a-b59a-015d0bc5902c | full-backup1 | None > >> | available | 4 | > >> > +--------------------------------------+--------------+-------------+-----------+------+ > >> > >> controller02:~ # for i in `openstack volume backup list -c ID -f > >> value`; do openstack volume backup show $i -c id -c is_incremental > >> -c object_count -f value; done > >> a8a448e7-8bfd-46e3-81bf-3b1d607893e7 > >> True > >> > >> 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 > >> True > >> > >> 125c23cd-a5e8-4a7a-b59a-015d0bc5902c > >> False > >> > >> > >> This is still Victoria, though, I think I have a Wallaby test > >> installation, I'll try that as well. In which case should > >> object_count be > 0? All my installations have ceph as storage > >> backend. > >> > >> Thanks, > >> Eugen > >> > >> Zitat von Masayuki Igawa : > >> > >>> Hi Satish, > >>> > >>>> Whenever I take incremental backup it shows a similar size of original > >>>> volume. Technically It should be smaller. Question is does ceph > support > >>>> incremental backup with cinder? > >>> > >>> IIUC, it would be expected behavior. According to the API Doc[1], > >>> "size" is "The size of the volume, in gibibytes (GiB)." > >>> So, it's not the actual size of the snapshot itself. > >>> > >>> What about the "object_count" of "openstack volume backup show" output? > >>> The incremental's one should be zero or less than the full backup at > least? 
> >>> > >>> [1] > >>> > https://docs.openstack.org/api-ref/block-storage/v3/?expanded=show-backup-detail-detail,list-backups-with-detail-detail#id428 > >>> > >>> -- Masayuki Igawa > >>> > >>> On Wed, May 17, 2023, at 03:51, Satish Patel wrote: > >>>> Folks, > >>>> > >>>> I have ceph storage for my openstack and configure cinder-volume and > >>>> cinder-backup service for my disaster solution. I am trying to use the > >>>> cinder-backup incremental option to save storage space but somehow It > >>>> doesn't work the way it should work. > >>>> > >>>> Whenever I take incremental backup it shows a similar size of original > >>>> volume. Technically It should be smaller. Question is does ceph > support > >>>> incremental backup with cinder? > >>>> > >>>> I am running a Yoga release. > >>>> > >>>> $ openstack volume list > >>>> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >>>> | ID | Name | Status | > Size > >>>> | Attached to | > >>>> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >>>> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | > 10 > >>>> | Attached to spatel-foo on /dev/sdc | > >>>> > +--------------------------------------+------------+------------+------+-------------------------------------+ > >>>> > >>>> ### Create full backup > >>>> $ openstack volume backup create --name spatel-vol-backup > >>>> spatel-vol --force > >>>> +-------+--------------------------------------+ > >>>> | Field | Value | > >>>> +-------+--------------------------------------+ > >>>> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | > >>>> | name | spatel-vol-backup | > >>>> +-------+--------------------------------------+ > >>>> > >>>> ### Create incremental > >>>> $ openstack volume backup create --name spatel-vol-backup-1 > >>>> --incremental --force spatel-vol > >>>> +-------+--------------------------------------+ > >>>> | Field | Value | > >>>> +-------+--------------------------------------+ > >>>> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | > >>>> | name | spatel-vol-backup-1 | > >>>> +-------+--------------------------------------+ > >>>> > >>>> $ openstack volume backup list > >>>> > +--------------------------------------+---------------------+-------------+-----------+------+ > >>>> | ID | Name | > >>>> Description | Status | Size | > >>>> > +--------------------------------------+---------------------+-------------+-----------+------+ > >>>> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None > >>>> | available | 10 | > >>>> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None > >>>> | available | 10 | > >>>> > +--------------------------------------+---------------------+-------------+-----------+------+ > >>>> My incremental backup still shows 10G size which should be lower > >>>> compared to the first backup. > > > > > -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jim at kilborns.com Tue May 23 16:28:20 2023 From: jim at kilborns.com (Jim Kilborn) Date: Tue, 23 May 2023 16:28:20 +0000 Subject: upgrade issue with nova/cinder and api version error In-Reply-To: <2152423e-bd99-47b4-aab4-cf101e00fbbd@app.fastmail.com> References: <46D8C43B01FAC6419C20963F12B88E2E01E9569C67@MAILSERVER.alamois.com>, <2152423e-bd99-47b4-aab4-cf101e00fbbd@app.fastmail.com> Message-ID: <46D8C43B01FAC6419C20963F12B88E2E01E956A40A@MAILSERVER.alamois.com> Well, that is very confusing. When I check that file, it comes from python3-cinder, and that package appears to be from victoria, as I dont have a zena repo installed. This is from the ubuntu repos, so maybe it got backported too far. # grep 2.89 /usr/lib/python3/dist-packages/cinder/compute/nova.py nova = novaclient(context, api_version='2.89') # dpkg -S /usr/lib/python3/dist-packages/cinder/compute/nova.py python3-cinder: /usr/lib/python3/dist-packages/cinder/compute/nova.py # apt-cache showpkg python3-cinder Package: python3-cinder Versions: 2:17.4.0-0ubuntu1~cloud4 (/var/lib/apt/lists/ubuntu-cloud.archive.canonical.com_ubuntu_dists_focal-updates_victoria_main_binary-amd64_Packages) # ls /etc/apt/sources.list.d cloudarchive-stein.list cloudarchive-stein.list.distUpgrade cloudarchive-stein.list.save cloudarchive-train.list cloudarchive-train.list.save cloudarchive-ussuri.list cloudarchive-ussuri.list.distUpgrade cloudarchive-ussuri.list.save cloudarchive-victoria.list cloudarchive-victoria.list.save gluster-ubuntu-glusterfs-6-bionic.list gluster-ubuntu-glusterfs-6-bionic.list.distUpgrade gluster-ubuntu-glusterfs-6-bionic.list.save gluster-ubuntu-glusterfs-6-focal.list gluster-ubuntu-glusterfs-6-focal.list.save mariadb.list mariadb.list.distUpgrade mariadb.list.save ________________________________________ From: Masayuki Igawa [masayuki.igawa at gmail.com] Sent: Tuesday, May 23, 2023 12:41 AM To: openstack-discuss at lists.openstack.org Subject: Re: upgrade issue with nova/cinder and api version error Hi, > When I look at the > /usr/lib/python3/dist-packages/cinder/compute/nova.py I can see it's > using 2.89 in get_server_volume > > def get_server_volume(context, server_id, volume_id): > # Use microversion that includes attachment_id > nova = novaclient(context, api_version='2.89') > return nova.volumes.get_server_volume(server_id, volume_id) It's weird for me. That function was introduced this patch[1] but it was backported till xena not victoria. So, I wondering if you are using mixed versioned openstack somehow. [1] https://review.opendev.org/q/I612905a1bf4a1706cce913c0d8a6df7a240d599a -- Masayuki Igawa On Tue, May 23, 2023, at 01:13, Jim Kilborn wrote: > Hello, > > First time posting here. > We have been running a production openstack environment at my office > since the kilo release. We are currently on train, and I'm trying to > get up to a more recent version. To make it more difficult, we are on > centos7, so having to switch to ubuntu as we update versions. > > The problem that I am having after updaing to victoria, is that when I > delete a vm via horizon, the instance disappears but the cinder volume > doesn't delete the attachment. > It appears this is due to the following error in > /var/log/apache2/cinder_error.log > > ERROR cinder.api.middleware.fault novaclient.exceptions.NotAcceptable: > Version 2.89 is not supported by the API. Minimum is 2.1 and maximum is > 2.87. 
(HTTP 406) > > When I look at the > /usr/lib/python3/dist-packages/cinder/compute/nova.py I can see it's > using 2.89 in get_server_volume > > def get_server_volume(context, server_id, volume_id): > # Use microversion that includes attachment_id > nova = novaclient(context, api_version='2.89') > return nova.volumes.get_server_volume(server_id, volume_id) > > I am not sure why cinder and nova are in disagreement on the api_version. > I have verified that they are both upgraded to the victoria release. > > Anyone have any ideas as to why I would be getting this error or a > possible fix? I haven't been able to find any information on this error. > > > Here are the nova package versions: > nova-api/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-common/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-conductor/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > nova-novncproxy/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > nova-scheduler/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > python3-nova/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > python3-novaclient/focal-updates,now 2:17.2.1-0ubuntu1~cloud0 all > [installed,automatic] > > Here are the cinder package versions: > cinder-api/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > cinder-common/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed,automatic] > cinder-scheduler/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed] > cinder-volume/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > python3-cinder/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed,automatic] > python3-cinderclient/focal-updates,now 1:7.2.0-0ubuntu1~cloud0 all > [installed] > > > Thanks in advance for any ideas! From jim at kilborns.com Tue May 23 16:46:10 2023 From: jim at kilborns.com (Jim Kilborn) Date: Tue, 23 May 2023 16:46:10 +0000 Subject: upgrade issue with nova/cinder and api version error In-Reply-To: <46D8C43B01FAC6419C20963F12B88E2E01E956A40A@MAILSERVER.alamois.com> References: <46D8C43B01FAC6419C20963F12B88E2E01E9569C67@MAILSERVER.alamois.com>, <2152423e-bd99-47b4-aab4-cf101e00fbbd@app.fastmail.com>, <46D8C43B01FAC6419C20963F12B88E2E01E956A40A@MAILSERVER.alamois.com> Message-ID: <46D8C43B01FAC6419C20963F12B88E2E01E956A459@MAILSERVER.alamois.com> Well, thanks for pointing me to the fact that code wasnt in victoria. I went ahead and removed and purged the cinder packages, and reinstalled from victoria. Now it shows the api verision of 2.51, like it should. Maybe at some point the wrong repo was added and removed, causing a package update that shouldnt have happened. This issue is resolved. ________________________________________ From: Jim Kilborn Sent: Tuesday, May 23, 2023 11:28 AM To: Masayuki Igawa; openstack-discuss at lists.openstack.org Subject: RE: upgrade issue with nova/cinder and api version error Well, that is very confusing. When I check that file, it comes from python3-cinder, and that package appears to be from victoria, as I dont have a zena repo installed. This is from the ubuntu repos, so maybe it got backported too far. 
# grep 2.89 /usr/lib/python3/dist-packages/cinder/compute/nova.py nova = novaclient(context, api_version='2.89') # dpkg -S /usr/lib/python3/dist-packages/cinder/compute/nova.py python3-cinder: /usr/lib/python3/dist-packages/cinder/compute/nova.py # apt-cache showpkg python3-cinder Package: python3-cinder Versions: 2:17.4.0-0ubuntu1~cloud4 (/var/lib/apt/lists/ubuntu-cloud.archive.canonical.com_ubuntu_dists_focal-updates_victoria_main_binary-amd64_Packages) # ls /etc/apt/sources.list.d cloudarchive-stein.list cloudarchive-stein.list.distUpgrade cloudarchive-stein.list.save cloudarchive-train.list cloudarchive-train.list.save cloudarchive-ussuri.list cloudarchive-ussuri.list.distUpgrade cloudarchive-ussuri.list.save cloudarchive-victoria.list cloudarchive-victoria.list.save gluster-ubuntu-glusterfs-6-bionic.list gluster-ubuntu-glusterfs-6-bionic.list.distUpgrade gluster-ubuntu-glusterfs-6-bionic.list.save gluster-ubuntu-glusterfs-6-focal.list gluster-ubuntu-glusterfs-6-focal.list.save mariadb.list mariadb.list.distUpgrade mariadb.list.save ________________________________________ From: Masayuki Igawa [masayuki.igawa at gmail.com] Sent: Tuesday, May 23, 2023 12:41 AM To: openstack-discuss at lists.openstack.org Subject: Re: upgrade issue with nova/cinder and api version error Hi, > When I look at the > /usr/lib/python3/dist-packages/cinder/compute/nova.py I can see it's > using 2.89 in get_server_volume > > def get_server_volume(context, server_id, volume_id): > # Use microversion that includes attachment_id > nova = novaclient(context, api_version='2.89') > return nova.volumes.get_server_volume(server_id, volume_id) It's weird for me. That function was introduced this patch[1] but it was backported till xena not victoria. So, I wondering if you are using mixed versioned openstack somehow. [1] https://review.opendev.org/q/I612905a1bf4a1706cce913c0d8a6df7a240d599a -- Masayuki Igawa On Tue, May 23, 2023, at 01:13, Jim Kilborn wrote: > Hello, > > First time posting here. > We have been running a production openstack environment at my office > since the kilo release. We are currently on train, and I'm trying to > get up to a more recent version. To make it more difficult, we are on > centos7, so having to switch to ubuntu as we update versions. > > The problem that I am having after updaing to victoria, is that when I > delete a vm via horizon, the instance disappears but the cinder volume > doesn't delete the attachment. > It appears this is due to the following error in > /var/log/apache2/cinder_error.log > > ERROR cinder.api.middleware.fault novaclient.exceptions.NotAcceptable: > Version 2.89 is not supported by the API. Minimum is 2.1 and maximum is > 2.87. (HTTP 406) > > When I look at the > /usr/lib/python3/dist-packages/cinder/compute/nova.py I can see it's > using 2.89 in get_server_volume > > def get_server_volume(context, server_id, volume_id): > # Use microversion that includes attachment_id > nova = novaclient(context, api_version='2.89') > return nova.volumes.get_server_volume(server_id, volume_id) > > I am not sure why cinder and nova are in disagreement on the api_version. > I have verified that they are both upgraded to the victoria release. > > Anyone have any ideas as to why I would be getting this error or a > possible fix? I haven't been able to find any information on this error. 
> > > Here are the nova package versions: > nova-api/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-common/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-conductor/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > nova-novncproxy/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > nova-scheduler/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all > [installed] > python3-nova/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > python3-novaclient/focal-updates,now 2:17.2.1-0ubuntu1~cloud0 all > [installed,automatic] > > Here are the cinder package versions: > cinder-api/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > cinder-common/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed,automatic] > cinder-scheduler/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed] > cinder-volume/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > python3-cinder/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all > [installed,automatic] > python3-cinderclient/focal-updates,now 1:7.2.0-0ubuntu1~cloud0 all > [installed] > > > Thanks in advance for any ideas! From ozzzo at yahoo.com Tue May 23 18:35:12 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Tue, 23 May 2023 18:35:12 +0000 (UTC) Subject: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit In-Reply-To: <1696731980.1287315.1684502957871@mail.yahoo.com> References: <1696731980.1287315.1684502957871.ref@mail.yahoo.com> <1696731980.1287315.1684502957871@mail.yahoo.com> Message-ID: <1692984653.1578834.1684866912916@mail.yahoo.com> Nobody replied to this Friday afternoon so I'm trying again: On Friday, May 19, 2023, 09:29:17 AM EDT, Albert Braden wrote: We have 2052 groups in our LDAP server. We recently started getting an error when we try to list groups: $ os group list --domain AUTH.OURDOMAIN.COM Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. (HTTP 500) I read the "Additional LDAP integration settings" section in [1] and then tried setting various values of page_size (10, 100, 1000) in the [ldap] section of keystone.conf but that didn't make a difference. What am I? missing? [1] https://docs.openstack.org/keystone/train/admin/configuration.html#identity-ldap-server-set-up Here's the stack trace: 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application [req-198741c6-58b2-46b1-8622-bae1fc5c5280 d64c83e1ea954c368e9fe08a5d8450a1 47dc15c280c9436fadac4d41f1d54a64 - default default] Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator.: keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application Traceback (most recent call last): 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 996, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? attrlist, attrsonly) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 689, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return func(self, conn, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? 
File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 824, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? attrsonly) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 870, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return self.search_ext_s(base,scope,filterstr,attrlist,attrsonly,None,None,timeout=self.timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 1286, in search_ext_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return self._apply_method_s(SimpleLDAPObject.search_ext_s,*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 1224, in _apply_method_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return func(self,*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 864, in search_ext_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return self.result(msgid,all=1,timeout=timeout)[1] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 756, in result 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? resp_type, resp_data, resp_msgid = self.result2(msgid,all,timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 760, in result2 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? resp_type, resp_data, resp_msgid, resp_ctrls = self.result3(msgid,all,timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 767, in result3 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? resp_ctrl_classes=resp_ctrl_classes 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 774, in result4 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? ldap_result = self._ldap_call(self._l.result4,msgid,all,timeout,add_ctrls,add_intermediates,add_extop) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 340, in _ldap_call 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? reraise(exc_type, exc_value, exc_traceback) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/compat.py", line 46, in reraise 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? raise exc_value 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 324, in _ldap_call 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? 
result = func(*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application ldap.SIZELIMIT_EXCEEDED: {'msgtype': 100, 'msgid': 2, 'result': 4, 'desc': 'Size limit exceeded', 'ctrls': []} 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application During handling of the above exception, another exception occurred: 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application Traceback (most recent call last): 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? rv = self.dispatch_request() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return self.view_functions[rule.endpoint](**req.view_args) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line 480, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? resp = resource(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/flask/views.py", line 88, in view 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return self.dispatch_request(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line 595, in dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? resp = meth(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/api/groups.py", line 59, in get 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return self._list_groups() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/api/groups.py", line 86, in _list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? hints=hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/common/manager.py", line 116, in wrapped 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? __ret_val = __f(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 414, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return f(self, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 424, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return f(self, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 1329, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? ref_list = driver.list_groups(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? 
File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 116, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return self.group.get_all_filtered(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 474, in get_all_filtered 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? for group in self.get_all(query, hints)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1647, in get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? for x in self._ldap_get_all(hints, ldap_filter)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/common/driver_hints.py", line 42, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return f(self, hints, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1600, in _ldap_get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? attrs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 998, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? raise exception.LDAPSizeLimitExceeded() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Tue May 23 19:56:42 2023 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 23 May 2023 12:56:42 -0700 Subject: Designate and neutron - only creating records for floating ips and not servers or ports In-Reply-To: References: Message-ID: Hi Russell, This sounds like you are trying to set up "case 3" as described in the neutron documentation[1]. I would start by walking through this document to make sure the right extensions are enabled in neutron. Also note the requirements the extensions have listed in this section: https://docs.openstack.org/neutron/2023.1/admin/config-dns-int-ext-serv.html#configuration-of-the-externally-accessible-network-for-use-cases-3b-and-3c I'm pretty sure this is a configuration issue on the neutron extension side, but just to be sure, you can watch the Designate API log when you create a port. If there is no request from neutron, it's getting blocked prior to sending the request to Designate. Michael [1] https://docs.openstack.org/neutron/2023.1/admin/config-dns-int-ext-serv.html#use-case-3-ports-are-published-directly-in-the-external-dns-service On Tue, May 23, 2023 at 8:29?AM Russell Stather wrote: > > Hi > > We have a problem where we can create a floating ip and the neutron integration happily creates the dns record. > > However, if we create a port or a server the record does not get created. > > This is a charm installation, with designate using designate-bind as the backend. > > Any suggestions or places to look would be appreciated. 
> > I can't see any errors in the designate logs or the neutron logs. > > Thanks > > Russell > From michal.arbet at ultimum.io Tue May 23 23:44:05 2023 From: michal.arbet at ultimum.io (Michal Arbet) Date: Wed, 24 May 2023 01:44:05 +0200 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: Haha, really ? I don't know what to say ? Have you ever check deb or rpm packages ? And their patches ? On Tue, May 23, 2023, 15:47 Maksim Malchuk wrote: > IMHO, patching - is not a production way. > > On Tue, May 23, 2023 at 3:11?PM Michal Arbet > wrote: > >> I am very glad that someone asked for an option to patch kolla images. >> I've already proposed patches for kolla here [1] and here [2]. >> But unfortunately I didn't get that many votes to merge into master and I >> abandoned this. >> >> [1] https://review.opendev.org/c/openstack/kolla/+/829296 >> [2] https://review.opendev.org/c/openstack/kolla/+/829295 >> >> With these above patches you can patch files inside every container. >> Maybe we can discuss this again ?? >> >> For example now xena, yoga, zed, antelope has oslo.messaging broken : >> >> https://bugs.launchpad.net/oslo.messaging/+bug/2019978 >> fixed by >> https://review.opendev.org/c/openstack/oslo.messaging/+/866617 >> >> As I am using my kolla patches in my downstream kolla git repo i've only >> created patches/ directory and place fix for openstack-base container :) >> >> patches/ >> patches/openstack-base >> patches/openstack-base/series >> patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch >> >> But, you still can use template-override >> https://docs.openstack.org/kolla/latest/admin/image-building.html . >> >> Thanks >> >> Michal Arbet >> Openstack Engineer >> >> Ultimum Technologies a.s. >> Na Po???? 1047/26, 11000 Praha 1 >> Czech Republic >> >> +420 604 228 897 >> michal.arbet at ultimum.io >> *https://ultimum.io * >> >> LinkedIn | >> Twitter | Facebook >> >> >> >> st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley >> napsal: >> >>> On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: >>> > Yes, you can do that, but note bene mitaka not supported. >>> [...] >>> >>> Not only unsupported, but the stable/mitaka branch of >>> openstack/keystone was deleted when it reached EOL in 2017. You may >>> instead want to specify `reference = mitaka-eol` (assuming Git tags >>> also work there). That should get you the final state of the >>> stable/mitaka branch prior to its deletion. >>> -- >>> Jeremy Stanley >>> >> > > -- > Regards, > Maksim Malchuk > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Tue May 23 23:44:58 2023 From: michal.arbet at ultimum.io (Michal Arbet) Date: Wed, 24 May 2023 01:44:58 +0200 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: How ? Say to me please ? How you can patch oslo.messaging except template override On Tue, May 23, 2023, 16:17 Danny Webb wrote: > You can already do this with the kolla image builder which seems to me to > be a much better solution than patching containers post creation. 
> ------------------------------ > *From:* Michal Arbet > *Sent:* 23 May 2023 13:01 > *To:* openstack-discuss at lists.openstack.org < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [kolla] How to patch images during build > > > * CAUTION: This email originates from outside THG * > ------------------------------ > I am very glad that someone asked for an option to patch kolla images. > I've already proposed patches for kolla here [1] and here [2]. > But unfortunately I didn't get that many votes to merge into master and I > abandoned this. > > [1] https://review.opendev.org/c/openstack/kolla/+/829296 > [2] https://review.opendev.org/c/openstack/kolla/+/829295 > > With these above patches you can patch files inside every container. > Maybe we can discuss this again ?? > > For example now xena, yoga, zed, antelope has oslo.messaging broken : > > https://bugs.launchpad.net/oslo.messaging/+bug/2019978 > fixed by > https://review.opendev.org/c/openstack/oslo.messaging/+/866617 > > As I am using my kolla patches in my downstream kolla git repo i've only > created patches/ directory and place fix for openstack-base container :) > > patches/ > patches/openstack-base > patches/openstack-base/series > patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch > > But, you still can use template-override > https://docs.openstack.org/kolla/latest/admin/image-building.html . > > Thanks > > Michal Arbet > Openstack Engineer > > Ultimum Technologies a.s. > Na Po???? 1047/26, 11000 Praha 1 > Czech Republic > > +420 604 228 897 > michal.arbet at ultimum.io > *https://ultimum.io * > > LinkedIn | Twitter > | Facebook > > > > st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley > napsal: > > On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: > > Yes, you can do that, but note bene mitaka not supported. > [...] > > Not only unsupported, but the stable/mitaka branch of > openstack/keystone was deleted when it reached EOL in 2017. You may > instead want to specify `reference = mitaka-eol` (assuming Git tags > also work there). That should get you the final state of the > stable/mitaka branch prior to its deletion. > -- > Jeremy Stanley > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed May 24 00:26:03 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 23 May 2023 20:26:03 -0400 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: This is very interesting that there is no standard or best way to patch images. Everyone uses their own way to handle it. Now I am very curious to see how folks running it in production with patching and building images. I am about to deploy kolla on production and trying to learn all best practices from experts. On Tue, May 23, 2023 at 7:47?PM Michal Arbet wrote: > How ? > > Say to me please ? How you can patch oslo.messaging except template > override > > On Tue, May 23, 2023, 16:17 Danny Webb wrote: > >> You can already do this with the kolla image builder which seems to me to >> be a much better solution than patching containers post creation. 
>> ------------------------------ >> *From:* Michal Arbet >> *Sent:* 23 May 2023 13:01 >> *To:* openstack-discuss at lists.openstack.org < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [kolla] How to patch images during build >> >> >> * CAUTION: This email originates from outside THG * >> ------------------------------ >> I am very glad that someone asked for an option to patch kolla images. >> I've already proposed patches for kolla here [1] and here [2]. >> But unfortunately I didn't get that many votes to merge into master and I >> abandoned this. >> >> [1] https://review.opendev.org/c/openstack/kolla/+/829296 >> [2] https://review.opendev.org/c/openstack/kolla/+/829295 >> >> With these above patches you can patch files inside every container. >> Maybe we can discuss this again ?? >> >> For example now xena, yoga, zed, antelope has oslo.messaging broken : >> >> https://bugs.launchpad.net/oslo.messaging/+bug/2019978 >> fixed by >> https://review.opendev.org/c/openstack/oslo.messaging/+/866617 >> >> As I am using my kolla patches in my downstream kolla git repo i've only >> created patches/ directory and place fix for openstack-base container :) >> >> patches/ >> patches/openstack-base >> patches/openstack-base/series >> patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch >> >> But, you still can use template-override >> https://docs.openstack.org/kolla/latest/admin/image-building.html . >> >> Thanks >> >> Michal Arbet >> Openstack Engineer >> >> Ultimum Technologies a.s. >> Na Po???? 1047/26, 11000 Praha 1 >> Czech Republic >> >> +420 604 228 897 >> michal.arbet at ultimum.io >> *https://ultimum.io * >> >> LinkedIn | >> Twitter | Facebook >> >> >> >> st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley >> napsal: >> >> On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: >> > Yes, you can do that, but note bene mitaka not supported. >> [...] >> >> Not only unsupported, but the stable/mitaka branch of >> openstack/keystone was deleted when it reached EOL in 2017. You may >> instead want to specify `reference = mitaka-eol` (assuming Git tags >> also work there). That should get you the final state of the >> stable/mitaka branch prior to its deletion. >> -- >> Jeremy Stanley >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkajinam at redhat.com Wed May 24 05:57:43 2023 From: tkajinam at redhat.com (Takashi Kajinami) Date: Wed, 24 May 2023 14:57:43 +0900 Subject: [neutron] Deprecate networking-odl project In-Reply-To: References: Message-ID: Hello, May I know the current status of this topic ? I have no objections but would like to understand when the deprecation happens, because we maintain implementations to deploy networking-odl (Though it has never been tested in our CI and I doubt anyone is using it nowadays) and I want to make sure we deprecate it timely after the project is deprecated. Thank you, Takashi On Wed, Mar 29, 2023 at 1:53?AM Rodolfo Alonso Hernandez < ralonsoh at redhat.com> wrote: > Hello all: > > During the last releases, the support for "networking-odl" project has > decreased. Currently there is no active developer or maintainer in the > community. This project depends on https://www.opendaylight.org/; the > latest version released is Sulfur (16) while the version still used in the > CI is Sodium (11) [1]. > > I would like first to make a call for developers to update this project. > But if this is not possible, I will then start the procedure to deprecate > it [2] (**not to retire it**). 
> > Regards. > > [1] > https://github.com/openstack/networking-odl/blob/db5c79b3ee5054feb8a17df130e4ce3a95ec64c2/.zuul.d/jobs.yaml#L172 > [2] > https://docs.openstack.org/project-team-guide/repository.html#deprecating-a-repository > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Joern.Kaster at epg.com Wed May 24 06:22:49 2023 From: Joern.Kaster at epg.com (=?iso-8859-1?Q?Kaster=2C_J=F6rn?=) Date: Wed, 24 May 2023 06:22:49 +0000 Subject: AW: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit In-Reply-To: <1692984653.1578834.1684866912916@mail.yahoo.com> References: <1696731980.1287315.1684502957871.ref@mail.yahoo.com> <1696731980.1287315.1684502957871@mail.yahoo.com> <1692984653.1578834.1684866912916@mail.yahoo.com> Message-ID: Hello Albert, have seen your message on monday and think that it was replied personaly in the meantime. Anyway. I think this problem is not dedicated to the openstack services. The problem is caused by the ldap server. Which one do you use? Look in the documentation of the ldap server to configure a larger size limit. greets from here J?rn ________________________________ Von: Albert Braden Gesendet: Dienstag, 23. Mai 2023 20:35 An: OpenStack Discuss Betreff: Re: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit OUTSIDE-EPG! Nobody replied to this Friday afternoon so I'm trying again: On Friday, May 19, 2023, 09:29:17 AM EDT, Albert Braden wrote: We have 2052 groups in our LDAP server. We recently started getting an error when we try to list groups: $ os group list --domain AUTH.OURDOMAIN.COM Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. (HTTP 500) I read the "Additional LDAP integration settings" section in [1] and then tried setting various values of page_size (10, 100, 1000) in the [ldap] section of keystone.conf but that didn't make a difference. What am I missing? [1] https://docs.openstack.org/keystone/train/admin/configuration.html#identity-ldap-server-set-up Here's the stack trace: 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application [req-198741c6-58b2-46b1-8622-bae1fc5c5280 d64c83e1ea954c368e9fe08a5d8450a1 47dc15c280c9436fadac4d41f1d54a64 - default default] Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator.: keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 
[...]
keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 1329, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application ref_list = driver.list_groups(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 116, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.group.get_all_filtered(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 474, in get_all_filtered 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application for group in self.get_all(query, hints)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1647, in get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application for x in self._ldap_get_all(hints, ldap_filter)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/common/driver_hints.py", line 42, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return f(self, hints, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1600, in _ldap_get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application attrs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 998, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application raise exception.LDAPSizeLimitExceeded() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Wed May 24 06:28:27 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Wed, 24 May 2023 08:28:27 +0200 Subject: [neutron] Deprecate networking-odl project In-Reply-To: References: Message-ID: +1, exactly same situation with OSA. ??, 24 ??? 2023 ?., 08:05 Takashi Kajinami : > Hello, > > May I know the current status of this topic ? > > I have no objections but would like to understand when the deprecation > happens, because > we maintain implementations to deploy networking-odl (Though it has never > been tested in our CI > and I doubt anyone is using it nowadays) and I want to make sure we > deprecate it timely after > the project is deprecated. > > Thank you, > Takashi > > On Wed, Mar 29, 2023 at 1:53?AM Rodolfo Alonso Hernandez < > ralonsoh at redhat.com> wrote: > >> Hello all: >> >> During the last releases, the support for "networking-odl" project has >> decreased. Currently there is no active developer or maintainer in the >> community. This project depends on https://www.opendaylight.org/; the >> latest version released is Sulfur (16) while the version still used in the >> CI is Sodium (11) [1]. >> >> I would like first to make a call for developers to update this project. 
>> But if this is not possible, I will then start the procedure to deprecate >> it [2] (**not to retire it**). >> >> Regards. >> >> [1] >> https://github.com/openstack/networking-odl/blob/db5c79b3ee5054feb8a17df130e4ce3a95ec64c2/.zuul.d/jobs.yaml#L172 >> [2] >> https://docs.openstack.org/project-team-guide/repository.html#deprecating-a-repository >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maksim.malchuk at gmail.com Wed May 24 06:41:00 2023 From: maksim.malchuk at gmail.com (Maksim Malchuk) Date: Wed, 24 May 2023 09:41:00 +0300 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: The correct and only way to apply patches on images - is build them from source. You should patch the code, not binary files. On Wed, May 24, 2023 at 3:32?AM Satish Patel wrote: > This is very interesting that there is no standard or best way to patch > images. Everyone uses their own way to handle it. Now I am very curious to > see how folks running it in production with patching and building images. I > am about to deploy kolla on production and trying to learn all best > practices from experts. > > On Tue, May 23, 2023 at 7:47?PM Michal Arbet > wrote: > >> How ? >> >> Say to me please ? How you can patch oslo.messaging except template >> override >> >> On Tue, May 23, 2023, 16:17 Danny Webb >> wrote: >> >>> You can already do this with the kolla image builder which seems to me >>> to be a much better solution than patching containers post creation. >>> ------------------------------ >>> *From:* Michal Arbet >>> *Sent:* 23 May 2023 13:01 >>> *To:* openstack-discuss at lists.openstack.org < >>> openstack-discuss at lists.openstack.org> >>> *Subject:* Re: [kolla] How to patch images during build >>> >>> >>> * CAUTION: This email originates from outside THG * >>> ------------------------------ >>> I am very glad that someone asked for an option to patch kolla images. >>> I've already proposed patches for kolla here [1] and here [2]. >>> But unfortunately I didn't get that many votes to merge into master and >>> I abandoned this. >>> >>> [1] https://review.opendev.org/c/openstack/kolla/+/829296 >>> [2] https://review.opendev.org/c/openstack/kolla/+/829295 >>> >>> With these above patches you can patch files inside every container. >>> Maybe we can discuss this again ?? >>> >>> For example now xena, yoga, zed, antelope has oslo.messaging broken : >>> >>> https://bugs.launchpad.net/oslo.messaging/+bug/2019978 >>> fixed by >>> https://review.opendev.org/c/openstack/oslo.messaging/+/866617 >>> >>> As I am using my kolla patches in my downstream kolla git repo i've only >>> created patches/ directory and place fix for openstack-base container :) >>> >>> patches/ >>> patches/openstack-base >>> patches/openstack-base/series >>> patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch >>> >>> But, you still can use template-override >>> https://docs.openstack.org/kolla/latest/admin/image-building.html . >>> >>> Thanks >>> >>> Michal Arbet >>> Openstack Engineer >>> >>> Ultimum Technologies a.s. >>> Na Po???? 1047/26, 11000 Praha 1 >>> Czech Republic >>> >>> +420 604 228 897 >>> michal.arbet at ultimum.io >>> *https://ultimum.io * >>> >>> LinkedIn | >>> Twitter | Facebook >>> >>> >>> >>> st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley >>> napsal: >>> >>> On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: >>> > Yes, you can do that, but note bene mitaka not supported. >>> [...] 
>>> >>> Not only unsupported, but the stable/mitaka branch of >>> openstack/keystone was deleted when it reached EOL in 2017. You may >>> instead want to specify `reference = mitaka-eol` (assuming Git tags >>> also work there). That should get you the final state of the >>> stable/mitaka branch prior to its deletion. >>> -- >>> Jeremy Stanley >>> >>> -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Wed May 24 07:51:02 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Wed, 24 May 2023 09:51:02 +0200 Subject: [neutron] Deprecate networking-odl project In-Reply-To: References: Message-ID: Hello: I will start the deprecation process during this week. As commented, the project is still testing against very old versions of ODL and there are no maintainers involved in the upgrade. This is why I'm proposing the deprecation of this project. The deprecation stops any new development of the master branch but doesn't remove the stable ones. If you have any questions or concerns, please let me know. Regards. On Wed, May 24, 2023 at 8:29?AM Dmitriy Rabotyagov wrote: > +1, exactly same situation with OSA. > > ??, 24 ??? 2023 ?., 08:05 Takashi Kajinami : > >> Hello, >> >> May I know the current status of this topic ? >> >> I have no objections but would like to understand when the deprecation >> happens, because >> we maintain implementations to deploy networking-odl (Though it has never >> been tested in our CI >> and I doubt anyone is using it nowadays) and I want to make sure we >> deprecate it timely after >> the project is deprecated. >> >> Thank you, >> Takashi >> >> On Wed, Mar 29, 2023 at 1:53?AM Rodolfo Alonso Hernandez < >> ralonsoh at redhat.com> wrote: >> >>> Hello all: >>> >>> During the last releases, the support for "networking-odl" project has >>> decreased. Currently there is no active developer or maintainer in the >>> community. This project depends on https://www.opendaylight.org/; the >>> latest version released is Sulfur (16) while the version still used in the >>> CI is Sodium (11) [1]. >>> >>> I would like first to make a call for developers to update this project. >>> But if this is not possible, I will then start the procedure to deprecate >>> it [2] (**not to retire it**). >>> >>> Regards. >>> >>> [1] >>> https://github.com/openstack/networking-odl/blob/db5c79b3ee5054feb8a17df130e4ce3a95ec64c2/.zuul.d/jobs.yaml#L172 >>> [2] >>> https://docs.openstack.org/project-team-guide/repository.html#deprecating-a-repository >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Wed May 24 09:00:00 2023 From: eblock at nde.ag (Eugen Block) Date: Wed, 24 May 2023 09:00:00 +0000 Subject: [cinder-backup][ceph] cinder-backup support of incremental backup with ceph backend In-Reply-To: References: <7af7d0f9-ff01-482d-91e4-1b99a9659222@app.fastmail.com> <20230523133613.Horde.FQupEQ3O_c7c0sp-Egr6XIu@webmail.nde.ag> <20230523135819.Horde.SQdB_-j6lQN2wos2fvnaW0Z@webmail.nde.ag> <20230523153610.Horde.6JFxWRvW9NAED5MLq81G037@webmail.nde.ag> Message-ID: <20230524090000.Horde.VE36OoWo-mrIVy8d5ZTDAGh@webmail.nde.ag> Thank you Sofia, that is quite helpful. 
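One way to cross-check what the increments really consume is to look at the backup images on the Ceph side rather than at the API's size field. A rough sketch, assuming the RBD backup driver with its usual 'backups' pool (the pool and image names are placeholders to adapt to your deployment):

# list the backup images created by cinder-backup
rbd -p backups ls
# show provisioned vs. actually used space for a backup image and its snapshots
rbd -p backups du volume-<volume_id>.backup.base

With the Ceph backend the increments are stored as snapshots/diffs on that base image, so the USED column there reflects the real footprint of each incremental backup, while the size field in the API simply mirrors the source volume size.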
Zitat von Sofia Enriquez : > https://web.archive.org/web/20160404120859/http://gorka.eguileor.com/inside-cinders-incremental-backup/?replytocom=2267 > > On Tue, May 23, 2023 at 4:39?PM Eugen Block wrote: > >> I looked through the code with a colleague, apparently the code to >> increase object counters is not executed with ceph as backend. Is that >> assumption correct? Would be interesting to know for which backends >> that would actually increase per backup. >> >> Zitat von Eugen Block : >> >> > I see the same for Wallaby, object_count is always 0. >> > >> > Zitat von Eugen Block : >> > >> >> Hi, >> >> >> >> I don't see an object_count > 0 for all incremental backups or the >> >> full backup. I tried both with a "full" volume (from image) as well >> >> as en empty volume, put a filesystem on it and copied tiny files >> >> onto it. This is the result: >> >> >> >> controller02:~ # openstack volume backup list >> >> >> +--------------------------------------+--------------+-------------+-----------+------+ >> >> | ID | Name | Description >> >> | Status | Size | >> >> >> +--------------------------------------+--------------+-------------+-----------+------+ >> >> | a8a448e7-8bfd-46e3-81bf-3b1d607893e7 | inc-backup2 | None >> >> | available | 4 | >> >> | 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 | inc-backup1 | None >> >> | available | 4 | >> >> | 125c23cd-a5e8-4a7a-b59a-015d0bc5902c | full-backup1 | None >> >> | available | 4 | >> >> >> +--------------------------------------+--------------+-------------+-----------+------+ >> >> >> >> controller02:~ # for i in `openstack volume backup list -c ID -f >> >> value`; do openstack volume backup show $i -c id -c is_incremental >> >> -c object_count -f value; done >> >> a8a448e7-8bfd-46e3-81bf-3b1d607893e7 >> >> True >> >> >> >> 3d11faa0-d67c-432d-afb1-ff44f6a3b4a7 >> >> True >> >> >> >> 125c23cd-a5e8-4a7a-b59a-015d0bc5902c >> >> False >> >> >> >> >> >> This is still Victoria, though, I think I have a Wallaby test >> >> installation, I'll try that as well. In which case should >> >> object_count be > 0? All my installations have ceph as storage >> >> backend. >> >> >> >> Thanks, >> >> Eugen >> >> >> >> Zitat von Masayuki Igawa : >> >> >> >>> Hi Satish, >> >>> >> >>>> Whenever I take incremental backup it shows a similar size of original >> >>>> volume. Technically It should be smaller. Question is does ceph >> support >> >>>> incremental backup with cinder? >> >>> >> >>> IIUC, it would be expected behavior. According to the API Doc[1], >> >>> "size" is "The size of the volume, in gibibytes (GiB)." >> >>> So, it's not the actual size of the snapshot itself. >> >>> >> >>> What about the "object_count" of "openstack volume backup show" output? >> >>> The incremental's one should be zero or less than the full backup at >> least? >> >>> >> >>> [1] >> >>> >> https://docs.openstack.org/api-ref/block-storage/v3/?expanded=show-backup-detail-detail,list-backups-with-detail-detail#id428 >> >>> >> >>> -- Masayuki Igawa >> >>> >> >>> On Wed, May 17, 2023, at 03:51, Satish Patel wrote: >> >>>> Folks, >> >>>> >> >>>> I have ceph storage for my openstack and configure cinder-volume and >> >>>> cinder-backup service for my disaster solution. I am trying to use the >> >>>> cinder-backup incremental option to save storage space but somehow It >> >>>> doesn't work the way it should work. >> >>>> >> >>>> Whenever I take incremental backup it shows a similar size of original >> >>>> volume. Technically It should be smaller. 
Question is does ceph >> support >> >>>> incremental backup with cinder? >> >>>> >> >>>> I am running a Yoga release. >> >>>> >> >>>> $ openstack volume list >> >>>> >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >>>> | ID | Name | Status | >> Size >> >>>> | Attached to | >> >>>> >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >>>> | 285a49a6-0e03-49e5-abf1-1c1efbfeb5f2 | spatel-vol | backing-up | >> 10 >> >>>> | Attached to spatel-foo on /dev/sdc | >> >>>> >> +--------------------------------------+------------+------------+------+-------------------------------------+ >> >>>> >> >>>> ### Create full backup >> >>>> $ openstack volume backup create --name spatel-vol-backup >> >>>> spatel-vol --force >> >>>> +-------+--------------------------------------+ >> >>>> | Field | Value | >> >>>> +-------+--------------------------------------+ >> >>>> | id | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | >> >>>> | name | spatel-vol-backup | >> >>>> +-------+--------------------------------------+ >> >>>> >> >>>> ### Create incremental >> >>>> $ openstack volume backup create --name spatel-vol-backup-1 >> >>>> --incremental --force spatel-vol >> >>>> +-------+--------------------------------------+ >> >>>> | Field | Value | >> >>>> +-------+--------------------------------------+ >> >>>> | id | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | >> >>>> | name | spatel-vol-backup-1 | >> >>>> +-------+--------------------------------------+ >> >>>> >> >>>> $ openstack volume backup list >> >>>> >> +--------------------------------------+---------------------+-------------+-----------+------+ >> >>>> | ID | Name | >> >>>> Description | Status | Size | >> >>>> >> +--------------------------------------+---------------------+-------------+-----------+------+ >> >>>> | 294b58af-771b-4a9f-bb7b-c37a4f84d678 | spatel-vol-backup-1 | None >> >>>> | available | 10 | >> >>>> | 4351d9d3-85fa-4cd5-b21d-619b3385aefc | spatel-vol-backup | None >> >>>> | available | 10 | >> >>>> >> +--------------------------------------+---------------------+-------------+-----------+------+ >> >>>> My incremental backup still shows 10G size which should be lower >> >>>> compared to the first backup. >> >> >> >> >> > > -- > > Sof?a Enriquez > > she/her > > Software Engineer > > Red Hat PnT > > IRC: @enriquetaso > @RedHat Red Hat > Red Hat > > From swogatpradhan22 at gmail.com Wed May 24 10:12:09 2023 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Wed, 24 May 2023 15:42:09 +0530 Subject: Add netapp storage in edge site | Wallaby | DCN Message-ID: Hi, I have a DCN setup and there is a requirement to use a netapp storage device in one of the edge sites. Can someone please confirm if it is possible? And if so then should i add the parameters in the edge deployment script or the central deployment script. Any suggestions? With regards, Swogat Pradhan -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbauza at redhat.com Wed May 24 10:24:10 2023 From: sbauza at redhat.com (Sylvain Bauza) Date: Wed, 24 May 2023 12:24:10 +0200 Subject: [nova][ops] EOL'ing stable/train ? Message-ID: Hi folks, in particular operators... We discussed yesterday during the nova meeting [1] about our stable branches and eventually, we were wondering whether we should EOL [2] the stable/train branch for Nova. Why so ? Two points : 1/ The gate is failing at the moment for the branch. 
2/ Two CVEs (CVE-2022-47951 [3] and CVE-2023-2088 [4]) aren't fixed in this branch. It would be difficult to fix the CVEs in the upstream branch but hopefully AFAIK all the OpenStack distros already fixed them for their related releases that use Train. So, any concerns ? TBH, I'm not really happy with EOL, but it would be bizarre if we say "oh yeah we support Train backports" but we don't really fix the most important issues... -Sylvain (who will propose the train-eol tag change next week if he doesn't see any concern before) [1] https://meetings.opendev.org/meetings/nova/2023/nova.2023-05-23-16.01.log.html#l-152 [2] https://docs.openstack.org/project-team-guide/stable-branches.html#end-of-life [3] https://security.openstack.org/ossa/OSSA-2023-002.html [4] https://security.openstack.org/ossa/OSSA-2023-003.html
-------------- next part --------------
An HTML attachment was scrubbed... URL:
From ozzzo at yahoo.com Wed May 24 13:23:14 2023 From: ozzzo at yahoo.com (Albert Braden) Date: Wed, 24 May 2023 13:23:14 +0000 (UTC) Subject: AW: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit In-Reply-To: References: <1696731980.1287315.1684502957871.ref@mail.yahoo.com> <1696731980.1287315.1684502957871@mail.yahoo.com> <1692984653.1578834.1684866912916@mail.yahoo.com> Message-ID: <1120247677.1939190.1684934594843@mail.yahoo.com>
The Keystone documentation [1] appears to indicate that LDAP limitations can be worked around by enabling paging, using the page_size setting. Am I reading it wrong?
[1] https://docs.openstack.org/keystone/train/admin/configuration.html#identity-ldap-server-set-up
On Wednesday, May 24, 2023, 02:34:23 AM EDT, Kaster, Jörn wrote:
Hello Albert, have seen your message on Monday and think that it was replied personally in the meantime. Anyway. I think this problem is not dedicated to the openstack services. The problem is caused by the ldap server. Which one do you use? Look in the documentation of the ldap server to configure a larger size limit. greets from here Jörn
Von: Albert Braden Gesendet: Dienstag, 23. Mai 2023 20:35 An: OpenStack Discuss Betreff: Re: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit
OUTSIDE-EPG!
Nobody replied to this Friday afternoon so I'm trying again:
On Friday, May 19, 2023, 09:29:17 AM EDT, Albert Braden wrote:
We have 2052 groups in our LDAP server. We recently started getting an error when we try to list groups:
$ os group list --domain AUTH.OURDOMAIN.COM Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. (HTTP 500)
I read the "Additional LDAP integration settings" section in [1] and then tried setting various values of page_size (10, 100, 1000) in the [ldap] section of keystone.conf but that didn't make a difference. What am I missing?
[1] https://docs.openstack.org/keystone/train/admin/configuration.html#identity-ldap-server-set-up
Here's the stack trace:
2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application [req-198741c6-58b2-46b1-8622-bae1fc5c5280 d64c83e1ea954c368e9fe08a5d8450a1 47dc15c280c9436fadac4d41f1d54a64 - default default] Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator.: keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator.
[...]
File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 424, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return f(self, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 1329, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? ref_list = driver.list_groups(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 116, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return self.group.get_all_filtered(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 474, in get_all_filtered 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? for group in self.get_all(query, hints)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1647, in get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? for x in self._ldap_get_all(hints, ldap_filter)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/common/driver_hints.py", line 42, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? return f(self, hints, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1600, in _ldap_get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? attrs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 998, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application? ? raise exception.LDAPSizeLimitExceeded() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Wed May 24 13:45:46 2023 From: michal.arbet at ultimum.io (Michal Arbet) Date: Wed, 24 May 2023 15:45:46 +0200 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: Okay, let's say you found a critical bug and you have already sent a patch for review. This - as you know can take quite a long time - gate broken, unit tests are not working etc etc but this is still regular fix and merged in other versions. Moreover upstream guys need to release a new pip package and amend upper constraints. But you need to fix your production now ..how can you do it in kolla now ? You cannot. As I said before as an example .. Oslo messaging is broken from xena to antelope, if your rabbitmq will go down on this versions ..your clients will not connect again to cluster - this is critical bug. So, let's check how upstream guys dealing with it : 1, Patch applied upstream ? 
- master - OK merged - https://review.opendev.org/c/openstack/oslo.messaging/+/866617 - antelope - OK merged - https://review.opendev.org/c/openstack/oslo.messaging/+/883533 - zed - they don't care - https://review.opendev.org/c/openstack/oslo.messaging/+/883537 - xena - they don't care - https://review.opendev.org/c/openstack/oslo.messaging/+/883539 - yoga - they don't care - https://review.opendev.org/c/openstack/oslo.messaging/+/883538 2, Okay, antelope merged , is the new version released as this is a critical bug ? - no , they again don't care - * 0602d1a1 (HEAD -> master, origin/master, origin/HEAD) Increase ACK_REQUEUE_EVERY_SECONDS_MAX to exceed default kombu_reconnect_delay (Andrew Bogott, 5 weeks ago - 2023-04-20 15:27:58 -0500) * fd2381c7 (tag: 14.3.0) Disable greenthreads for RabbitDriver "listen" connections (Arnaud Morin, 3 months ago - 2023-03-03 11:24:27 +0100) Last version is 14.3.0 and fix is still not released in pypi repo. Other versions ? check 1. Let's check how ubuntu handled this problem : python-oslo.messaging (12.13.0-0ubuntu1.1) jammy; urgency=medium * d/gbp.conf: Create stable/yoga branch. * d/p/revert-limit-maximum-timeout-in-the-poll-loop.patch: This reverts an upstream patch that is preventing active/active rabbitmq from failing over when a node goes down (LP: #1993149). -- Corey Bryant Thu, 20 Oct 2022 15:48:16 -0400 They patched the buggy version !! Kolla dropped binary builds ...so you can't install dependencies from apt repository where it is patched, and you don't have a way how to patch your python library. Patching is normal way how to fix a problem, you don't have always option to bump version, you need patch code and kolla just don't have this option. - On Wed, May 24, 2023, 08:41 Maksim Malchuk wrote: > The correct and only way to apply patches on images - is build them from > source. You should patch the code, not binary files. > > On Wed, May 24, 2023 at 3:32?AM Satish Patel wrote: > >> This is very interesting that there is no standard or best way to patch >> images. Everyone uses their own way to handle it. Now I am very curious to >> see how folks running it in production with patching and building images. I >> am about to deploy kolla on production and trying to learn all best >> practices from experts. >> >> On Tue, May 23, 2023 at 7:47?PM Michal Arbet >> wrote: >> >>> How ? >>> >>> Say to me please ? How you can patch oslo.messaging except template >>> override >>> >>> On Tue, May 23, 2023, 16:17 Danny Webb >>> wrote: >>> >>>> You can already do this with the kolla image builder which seems to me >>>> to be a much better solution than patching containers post creation. >>>> ------------------------------ >>>> *From:* Michal Arbet >>>> *Sent:* 23 May 2023 13:01 >>>> *To:* openstack-discuss at lists.openstack.org < >>>> openstack-discuss at lists.openstack.org> >>>> *Subject:* Re: [kolla] How to patch images during build >>>> >>>> >>>> * CAUTION: This email originates from outside THG * >>>> ------------------------------ >>>> I am very glad that someone asked for an option to patch kolla images. >>>> I've already proposed patches for kolla here [1] and here [2]. >>>> But unfortunately I didn't get that many votes to merge into master and >>>> I abandoned this. >>>> >>>> [1] https://review.opendev.org/c/openstack/kolla/+/829296 >>>> [2] https://review.opendev.org/c/openstack/kolla/+/829295 >>>> >>>> With these above patches you can patch files inside every container. >>>> Maybe we can discuss this again ?? 
>>>> >>>> For example now xena, yoga, zed, antelope has oslo.messaging broken : >>>> >>>> https://bugs.launchpad.net/oslo.messaging/+bug/2019978 >>>> fixed by >>>> https://review.opendev.org/c/openstack/oslo.messaging/+/866617 >>>> >>>> As I am using my kolla patches in my downstream kolla git repo i've >>>> only created patches/ directory and place fix for openstack-base container >>>> :) >>>> >>>> patches/ >>>> patches/openstack-base >>>> patches/openstack-base/series >>>> patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch >>>> >>>> But, you still can use template-override >>>> https://docs.openstack.org/kolla/latest/admin/image-building.html . >>>> >>>> Thanks >>>> >>>> Michal Arbet >>>> Openstack Engineer >>>> >>>> Ultimum Technologies a.s. >>>> Na Po???? 1047/26, 11000 Praha 1 >>>> Czech Republic >>>> >>>> +420 604 228 897 >>>> michal.arbet at ultimum.io >>>> *https://ultimum.io * >>>> >>>> LinkedIn | >>>> Twitter | Facebook >>>> >>>> >>>> >>>> st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley >>>> napsal: >>>> >>>> On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: >>>> > Yes, you can do that, but note bene mitaka not supported. >>>> [...] >>>> >>>> Not only unsupported, but the stable/mitaka branch of >>>> openstack/keystone was deleted when it reached EOL in 2017. You may >>>> instead want to specify `reference = mitaka-eol` (assuming Git tags >>>> also work there). That should get you the final state of the >>>> stable/mitaka branch prior to its deletion. >>>> -- >>>> Jeremy Stanley >>>> >>>> > > -- > Regards, > Maksim Malchuk > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Danny.Webb at thehutgroup.com Wed May 24 13:53:45 2023 From: Danny.Webb at thehutgroup.com (Danny Webb) Date: Wed, 24 May 2023 13:53:45 +0000 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: You do it with kolla image builder, all of the dockerfiles are jinja and you can override or add in new blocks (such as patching a file, changing to different versions of a package etc). Eg: https://github.com/openstack/kolla/blob/master/docker/rabbitmq/Dockerfile.j2 So what we have is a set of pipelines that create patches based on upstream openstack repos, and then we apply these patches in our image build pipelines by changing files in image build process. This way you have immutable containers that you aren't trying to change after the fact in the kolla-ansible process. ________________________________ From: Michal Arbet Sent: 24 May 2023 14:45 To: Maksim Malchuk Cc: Satish Patel ; Danny Webb ; openstack-discuss Subject: Re: [kolla] How to patch images during build CAUTION: This email originates from outside THG ________________________________ Okay, let's say you found a critical bug and you have already sent a patch for review. This - as you know can take quite a long time - gate broken, unit tests are not working etc etc but this is still regular fix and merged in other versions. Moreover upstream guys need to release a new pip package and amend upper constraints. But you need to fix your production now ..how can you do it in kolla now ? You cannot. As I said before as an example .. Oslo messaging is broken from xena to antelope, if your rabbitmq will go down on this versions ..your clients will not connect again to cluster - this is critical bug. So, let's check how upstream guys dealing with it : 1, Patch applied upstream ? 
- master - OK merged - https://review.opendev.org/c/openstack/oslo.messaging/+/866617 - antelope - OK merged - https://review.opendev.org/c/openstack/oslo.messaging/+/883533 - zed - they don't care - https://review.opendev.org/c/openstack/oslo.messaging/+/883537 - xena - they don't care - https://review.opendev.org/c/openstack/oslo.messaging/+/883539 - yoga - they don't care - https://review.opendev.org/c/openstack/oslo.messaging/+/883538 2, Okay, antelope merged , is the new version released as this is a critical bug ? - no , they again don't care - * 0602d1a1 (HEAD -> master, origin/master, origin/HEAD) Increase ACK_REQUEUE_EVERY_SECONDS_MAX to exceed default kombu_reconnect_delay (Andrew Bogott, 5 weeks ago - 2023-04-20 15:27:58 -0500) * fd2381c7 (tag: 14.3.0) Disable greenthreads for RabbitDriver "listen" connections (Arnaud Morin, 3 months ago - 2023-03-03 11:24:27 +0100) Last version is 14.3.0 and fix is still not released in pypi repo. Other versions ? check 1. Let's check how ubuntu handled this problem : python-oslo.messaging (12.13.0-0ubuntu1.1) jammy; urgency=medium * d/gbp.conf: Create stable/yoga branch. * d/p/revert-limit-maximum-timeout-in-the-poll-loop.patch: This reverts an upstream patch that is preventing active/active rabbitmq from failing over when a node goes down (LP: #1993149). -- Corey Bryant > Thu, 20 Oct 2022 15:48:16 -0400 They patched the buggy version !! Kolla dropped binary builds ...so you can't install dependencies from apt repository where it is patched, and you don't have a way how to patch your python library. Patching is normal way how to fix a problem, you don't have always option to bump version, you need patch code and kolla just don't have this option. - On Wed, May 24, 2023, 08:41 Maksim Malchuk > wrote: The correct and only way to apply patches on images - is build them from source. You should patch the code, not binary files. On Wed, May 24, 2023 at 3:32?AM Satish Patel > wrote: This is very interesting that there is no standard or best way to patch images. Everyone uses their own way to handle it. Now I am very curious to see how folks running it in production with patching and building images. I am about to deploy kolla on production and trying to learn all best practices from experts. On Tue, May 23, 2023 at 7:47?PM Michal Arbet > wrote: How ? Say to me please ? How you can patch oslo.messaging except template override On Tue, May 23, 2023, 16:17 Danny Webb > wrote: You can already do this with the kolla image builder which seems to me to be a much better solution than patching containers post creation. ________________________________ From: Michal Arbet > Sent: 23 May 2023 13:01 To: openstack-discuss at lists.openstack.org > Subject: Re: [kolla] How to patch images during build CAUTION: This email originates from outside THG ________________________________ I am very glad that someone asked for an option to patch kolla images. I've already proposed patches for kolla here [1] and here [2]. But unfortunately I didn't get that many votes to merge into master and I abandoned this. [1] https://review.opendev.org/c/openstack/kolla/+/829296 [2] https://review.opendev.org/c/openstack/kolla/+/829295 With these above patches you can patch files inside every container. Maybe we can discuss this again ?? 
For example now xena, yoga, zed, antelope has oslo.messaging broken : https://bugs.launchpad.net/oslo.messaging/+bug/2019978 fixed by https://review.opendev.org/c/openstack/oslo.messaging/+/866617 As I am using my kolla patches in my downstream kolla git repo i've only created patches/ directory and place fix for openstack-base container :) patches/ patches/openstack-base patches/openstack-base/series patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch But, you still can use template-override https://docs.openstack.org/kolla/latest/admin/image-building.html . Thanks Michal Arbet Openstack Engineer [https://www.google.com/a/ultimum.io/images/logo.gif] Ultimum Technologies a.s. Na Po???? 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io https://ultimum.io LinkedIn | Twitter | Facebook st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley > napsal: On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: > Yes, you can do that, but note bene mitaka not supported. [...] Not only unsupported, but the stable/mitaka branch of openstack/keystone was deleted when it reached EOL in 2017. You may instead want to specify `reference = mitaka-eol` (assuming Git tags also work there). That should get you the final state of the stable/mitaka branch prior to its deletion. -- Jeremy Stanley -- Regards, Maksim Malchuk -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Wed May 24 13:55:06 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Wed, 24 May 2023 15:55:06 +0200 Subject: [neutron] Resignation announcement of Nate, Hongbin and Takashi Message-ID: Hello all: Today I would like to announce that Nate Johnston, Takashi Yamamoto and Hongbin Lu are stepping down as Neutron core reviewers and Neutron drivers. Before sending this email, I contacted them and Nate and Takashi agreed on this decision; I tried twice to contact Hongbin with no response. In order to have a participative Neutron community, the Neutron core and Neutron drivers teams should be made up of active members in code reviews, meeting attendance and spec reviewal. Nate, Takashi and Hongbin have been very active and valuable members of the Neutron (and OpenStack) community in the past; for that I would like to thank you for your contributions. Tomorrow I'll update the Launchpad and gerrit groups. Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Joern.Kaster at epg.com Wed May 24 14:56:49 2023 From: Joern.Kaster at epg.com (=?iso-8859-1?Q?Kaster=2C_J=F6rn?=) Date: Wed, 24 May 2023 14:56:49 +0000 Subject: AW: AW: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit In-Reply-To: <1120247677.1939190.1684934594843@mail.yahoo.com> References: <1696731980.1287315.1684502957871.ref@mail.yahoo.com> <1696731980.1287315.1684502957871@mail.yahoo.com> <1692984653.1578834.1684866912916@mail.yahoo.com> <1120247677.1939190.1684934594843@mail.yahoo.com> Message-ID: Seems that page size option is a client feature that ldap server dont must respect. Look. at [1] I read it a little bit, that the page size only affects the number of data entries to be returned to the client, but the server has to calculate all the entries in the directory. Can you look in the ldap servers logfile and/or increase the debug_level in your configuration? 
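For reference, two separate limits are in play here: keystone's client-side paging hint and the directory server's own size limit. A minimal sketch of both, assuming an OpenLDAP backend (the bind DN below is a placeholder, and the limits syntax differs for other directory servers):

# keystone.conf - ask keystone's LDAP driver to use the paged-results control
[ldap]
page_size = 100

# slapd.conf (OpenLDAP) - raise the server-side limit for the bind DN keystone uses,
# otherwise the server may still stop at its default sizelimit regardless of paging
limits dn.exact="cn=keystone,ou=services,dc=example,dc=com" size.soft=unlimited size.hard=unlimited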
[1] https://serverfault.com/questions/328671/paging-using-ldapsearch/329027 [https://cdn.sstatic.net/Sites/serverfault/Img/apple-touch-icon at 2.png?v=9b1f48ae296b] Paging using ldapsearch I am searching an LDAP directory that has a much larger number of results than the sizelimit currently set,500, by slapd.conf that for all intents and purposes cannot be changed) My idea was to keep serverfault.com ________________________________ Von: Albert Braden Gesendet: Mittwoch, 24. Mai 2023 15:23 An: openstack-discuss at lists.openstack.org ; Kaster, J?rn Betreff: Re: AW: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit OUTSIDE-EPG! The Keystone documentation [1] appears to indicate that LDAP limitations can be worked around by enabling paging, using the page_size setting. Am I reading it wrong? [1] https://docs.openstack.org/keystone/train/admin/configuration.html#identity-ldap-server-set-up On Wednesday, May 24, 2023, 02:34:23 AM EDT, Kaster, J?rn wrote: Hello Albert, have seen your message on monday and think that it was replied personaly in the meantime. Anyway. I think this problem is not dedicated to the openstack services. The problem is caused by the ldap server. Which one do you use? Look in the documentation of the ldap server to configure a larger size limit. greets from here J?rn ________________________________ Von: Albert Braden Gesendet: Dienstag, 23. Mai 2023 20:35 An: OpenStack Discuss Betreff: Re: [kolla] [train] [keystone] Number of User/Group entities returned by LDAP exceeded size limit OUTSIDE-EPG! Nobody replied to this Friday afternoon so I'm trying again: On Friday, May 19, 2023, 09:29:17 AM EDT, Albert Braden wrote: We have 2052 groups in our LDAP server. We recently started getting an error when we try to list groups: $ os group list --domain AUTH.OURDOMAIN.COM Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. (HTTP 500) I read the "Additional LDAP integration settings" section in [1] and then tried setting various values of page_size (10, 100, 1000) in the [ldap] section of keystone.conf but that didn't make a difference. What am I missing? [1] https://docs.openstack.org/keystone/train/admin/configuration.html#identity-ldap-server-set-up Here's the stack trace: 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application [req-198741c6-58b2-46b1-8622-bae1fc5c5280 d64c83e1ea954c368e9fe08a5d8450a1 47dc15c280c9436fadac4d41f1d54a64 - default default] Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator.: keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 
2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application Traceback (most recent call last): 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 996, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application attrlist, attrsonly) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 689, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return func(self, conn, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 824, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application attrsonly) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 870, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.search_ext_s(base,scope,filterstr,attrlist,attrsonly,None,None,timeout=self.timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 1286, in search_ext_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self._apply_method_s(SimpleLDAPObject.search_ext_s,*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 1224, in _apply_method_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return func(self,*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 864, in search_ext_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.result(msgid,all=1,timeout=timeout)[1] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 756, in result 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp_type, resp_data, resp_msgid = self.result2(msgid,all,timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 760, in result2 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp_type, resp_data, resp_msgid, resp_ctrls = self.result3(msgid,all,timeout) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 767, in result3 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp_ctrl_classes=resp_ctrl_classes 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 774, in result4 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application ldap_result = self._ldap_call(self._l.result4,msgid,all,timeout,add_ctrls,add_intermediates,add_extop) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 340, in _ldap_call 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application reraise(exc_type, exc_value, exc_traceback) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File 
"/usr/lib64/python3.6/site-packages/ldap/compat.py", line 46, in reraise 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application raise exc_value 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib64/python3.6/site-packages/ldap/ldapobject.py", line 324, in _ldap_call 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application result = func(*args,**kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application ldap.SIZELIMIT_EXCEEDED: {'msgtype': 100, 'msgid': 2, 'result': 4, 'desc': 'Size limit exceeded', 'ctrls': []} 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application During handling of the above exception, another exception occurred: 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application Traceback (most recent call last): 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application rv = self.dispatch_request() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.view_functions[rule.endpoint](**req.view_args) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line 480, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp = resource(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask/views.py", line 88, in view 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.dispatch_request(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/flask_restful/__init__.py", line 595, in dispatch_request 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application resp = meth(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/api/groups.py", line 59, in get 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self._list_groups() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/api/groups.py", line 86, in _list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application hints=hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/common/manager.py", line 116, in wrapped 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application __ret_val = __f(*args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 414, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return f(self, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 424, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return f(self, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR 
keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/core.py", line 1329, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application ref_list = driver.list_groups(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 116, in list_groups 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return self.group.get_all_filtered(hints) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/core.py", line 474, in get_all_filtered 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application for group in self.get_all(query, hints)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1647, in get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application for x in self._ldap_get_all(hints, ldap_filter)] 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/common/driver_hints.py", line 42, in wrapper 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application return f(self, hints, *args, **kwargs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 1600, in _ldap_get_all 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application attrs) 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application File "/usr/lib/python3.6/site-packages/keystone/identity/backends/ldap/common.py", line 998, in search_s 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application raise exception.LDAPSizeLimitExceeded() 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application keystone.exception.LDAPSizeLimitExceeded: Number of User/Group entities returned by LDAP exceeded size limit. Contact your LDAP administrator. 2023-05-15 20:18:41.932 36 ERROR keystone.server.flask.application -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed May 24 15:28:13 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 24 May 2023 11:28:13 -0400 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: Thank you michal for explaining the pain. I was also thinking how to put those things together for smooth rollout. @Danny - We are running a yoga release that means the Oslo bugfix hasn't merged yet. In the above example how do I patch my images and which images that patch will go? Yoga oslo bug - https://review.opendev.org/c/openstack/oslo.messaging/+/883538 Do you have a snippet for Dockerfile.j2 to apply to the above patch? Just looking for sample :) On Wed, May 24, 2023 at 9:53?AM Danny Webb wrote: > You do it with kolla image builder, all of the dockerfiles are jinja and > you can override or add in new blocks (such as patching a file, changing to > different versions of a package etc). Eg: > > > https://github.com/openstack/kolla/blob/master/docker/rabbitmq/Dockerfile.j2 > > So what we have is a set of pipelines that create patches based on > upstream openstack repos, and then we apply these patches in our image > build pipelines by changing files in image build process. 
This way you > have immutable containers that you aren't trying to change after the fact > in the kolla-ansible process. > ------------------------------ > *From:* Michal Arbet > *Sent:* 24 May 2023 14:45 > *To:* Maksim Malchuk > *Cc:* Satish Patel ; Danny Webb < > Danny.Webb at thehutgroup.com>; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [kolla] How to patch images during build > > > * CAUTION: This email originates from outside THG * > ------------------------------ > Okay, let's say you found a critical bug and you have already sent a patch > for review. > This - as you know can take quite a long time - gate broken, unit tests > are not working etc etc but this is still regular fix and merged in other > versions. > Moreover upstream guys need to release a new pip package and amend upper > constraints. > > But you need to fix your production now ..how can you do it in kolla now ? > You cannot. > > As I said before as an example .. Oslo messaging is broken from xena to > antelope, if your rabbitmq will go down on this versions ..your clients > will not connect again to cluster - this is critical bug. > So, let's check how upstream guys dealing with it : > > 1, Patch applied upstream ? > - master - OK merged - > https://review.opendev.org/c/openstack/oslo.messaging/+/866617 > - antelope - OK merged - > https://review.opendev.org/c/openstack/oslo.messaging/+/883533 > - zed - they don't care - > https://review.opendev.org/c/openstack/oslo.messaging/+/883537 > - xena - they don't care - > https://review.opendev.org/c/openstack/oslo.messaging/+/883539 > - yoga - they don't care - > https://review.opendev.org/c/openstack/oslo.messaging/+/883538 > > 2, Okay, antelope merged , is the new version released as this is a > critical bug ? > - no , they again don't care - > > * 0602d1a1 (HEAD -> master, origin/master, origin/HEAD) Increase > ACK_REQUEUE_EVERY_SECONDS_MAX to exceed default kombu_reconnect_delay > (Andrew Bogott, 5 weeks ago - 2023-04-20 15:27:58 -0500) > * fd2381c7 (tag: 14.3.0) Disable greenthreads for RabbitDriver "listen" > connections (Arnaud Morin, 3 months ago - 2023-03-03 11:24:27 +0100) > > Last version is 14.3.0 and fix is still not released in pypi repo. > > Other versions ? check 1. > > Let's check how ubuntu handled this problem : > > python-oslo.messaging (12.13.0-0ubuntu1.1) jammy; urgency=medium > > * d/gbp.conf: Create stable/yoga branch. > * d/p/revert-limit-maximum-timeout-in-the-poll-loop.patch: This reverts > an upstream patch that is preventing active/active rabbitmq from > failing over when a node goes down (LP: #1993149). > > -- Corey Bryant Thu, 20 Oct 2022 15:48:16 -0400 > > > They patched the buggy version !! Kolla dropped binary builds ...so you can't install dependencies from apt repository where it is patched, and you don't have a way how to patch > > your python library. > > > Patching is normal way how to fix a problem, you don't have always option to bump version, you need patch code and kolla just don't have this option. > > > > > - > > On Wed, May 24, 2023, 08:41 Maksim Malchuk > wrote: > > The correct and only way to apply patches on images - is build them from > source. You should patch the code, not binary files. > > On Wed, May 24, 2023 at 3:32?AM Satish Patel wrote: > > This is very interesting that there is no standard or best way to patch > images. Everyone uses their own way to handle it. Now I am very curious to > see how folks running it in production with patching and building images. 
I > am about to deploy kolla on production and trying to learn all best > practices from experts. > > On Tue, May 23, 2023 at 7:47?PM Michal Arbet > wrote: > > How ? > > Say to me please ? How you can patch oslo.messaging except template > override > > On Tue, May 23, 2023, 16:17 Danny Webb wrote: > > You can already do this with the kolla image builder which seems to me to > be a much better solution than patching containers post creation. > ------------------------------ > *From:* Michal Arbet > *Sent:* 23 May 2023 13:01 > *To:* openstack-discuss at lists.openstack.org < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [kolla] How to patch images during build > > > * CAUTION: This email originates from outside THG * > ------------------------------ > I am very glad that someone asked for an option to patch kolla images. > I've already proposed patches for kolla here [1] and here [2]. > But unfortunately I didn't get that many votes to merge into master and I > abandoned this. > > [1] https://review.opendev.org/c/openstack/kolla/+/829296 > [2] https://review.opendev.org/c/openstack/kolla/+/829295 > > With these above patches you can patch files inside every container. > Maybe we can discuss this again ?? > > For example now xena, yoga, zed, antelope has oslo.messaging broken : > > https://bugs.launchpad.net/oslo.messaging/+bug/2019978 > fixed by > https://review.opendev.org/c/openstack/oslo.messaging/+/866617 > > As I am using my kolla patches in my downstream kolla git repo i've only > created patches/ directory and place fix for openstack-base container :) > > patches/ > patches/openstack-base > patches/openstack-base/series > patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch > > But, you still can use template-override > https://docs.openstack.org/kolla/latest/admin/image-building.html . > > Thanks > > Michal Arbet > Openstack Engineer > > Ultimum Technologies a.s. > Na Po???? 1047/26, 11000 Praha 1 > Czech Republic > > +420 604 228 897 > michal.arbet at ultimum.io > *https://ultimum.io * > > LinkedIn | Twitter > | Facebook > > > > st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley > napsal: > > On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: > > Yes, you can do that, but note bene mitaka not supported. > [...] > > Not only unsupported, but the stable/mitaka branch of > openstack/keystone was deleted when it reached EOL in 2017. You may > instead want to specify `reference = mitaka-eol` (assuming Git tags > also work there). That should get you the final state of the > stable/mitaka branch prior to its deletion. > -- > Jeremy Stanley > > > > -- > Regards, > Maksim Malchuk > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Wed May 24 17:09:42 2023 From: michal.arbet at ultimum.io (Michal Arbet) Date: Wed, 24 May 2023 19:09:42 +0200 Subject: [kolla] How to patch images during build In-Reply-To: References: <20230517181500.gvb2dvjmhc3jyt4g@yuggoth.org> Message-ID: I know that there is that option. I just think that there should be a simpler option to patch. Michal Arbet Openstack Engineer Ultimum Technologies a.s. Na Po???? 1047/26, 11000 Praha 1 Czech Republic +420 604 228 897 michal.arbet at ultimum.io *https://ultimum.io * LinkedIn | Twitter | Facebook st 24. 5. 
2023 v 15:53 odes?latel Danny Webb napsal: > You do it with kolla image builder, all of the dockerfiles are jinja and > you can override or add in new blocks (such as patching a file, changing to > different versions of a package etc). Eg: > > > https://github.com/openstack/kolla/blob/master/docker/rabbitmq/Dockerfile.j2 > > So what we have is a set of pipelines that create patches based on > upstream openstack repos, and then we apply these patches in our image > build pipelines by changing files in image build process. This way you > have immutable containers that you aren't trying to change after the fact > in the kolla-ansible process. > ------------------------------ > *From:* Michal Arbet > *Sent:* 24 May 2023 14:45 > *To:* Maksim Malchuk > *Cc:* Satish Patel ; Danny Webb < > Danny.Webb at thehutgroup.com>; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [kolla] How to patch images during build > > > * CAUTION: This email originates from outside THG * > ------------------------------ > Okay, let's say you found a critical bug and you have already sent a patch > for review. > This - as you know can take quite a long time - gate broken, unit tests > are not working etc etc but this is still regular fix and merged in other > versions. > Moreover upstream guys need to release a new pip package and amend upper > constraints. > > But you need to fix your production now ..how can you do it in kolla now ? > You cannot. > > As I said before as an example .. Oslo messaging is broken from xena to > antelope, if your rabbitmq will go down on this versions ..your clients > will not connect again to cluster - this is critical bug. > So, let's check how upstream guys dealing with it : > > 1, Patch applied upstream ? > - master - OK merged - > https://review.opendev.org/c/openstack/oslo.messaging/+/866617 > - antelope - OK merged - > https://review.opendev.org/c/openstack/oslo.messaging/+/883533 > - zed - they don't care - > https://review.opendev.org/c/openstack/oslo.messaging/+/883537 > - xena - they don't care - > https://review.opendev.org/c/openstack/oslo.messaging/+/883539 > - yoga - they don't care - > https://review.opendev.org/c/openstack/oslo.messaging/+/883538 > > 2, Okay, antelope merged , is the new version released as this is a > critical bug ? > - no , they again don't care - > > * 0602d1a1 (HEAD -> master, origin/master, origin/HEAD) Increase > ACK_REQUEUE_EVERY_SECONDS_MAX to exceed default kombu_reconnect_delay > (Andrew Bogott, 5 weeks ago - 2023-04-20 15:27:58 -0500) > * fd2381c7 (tag: 14.3.0) Disable greenthreads for RabbitDriver "listen" > connections (Arnaud Morin, 3 months ago - 2023-03-03 11:24:27 +0100) > > Last version is 14.3.0 and fix is still not released in pypi repo. > > Other versions ? check 1. > > Let's check how ubuntu handled this problem : > > python-oslo.messaging (12.13.0-0ubuntu1.1) jammy; urgency=medium > > * d/gbp.conf: Create stable/yoga branch. > * d/p/revert-limit-maximum-timeout-in-the-poll-loop.patch: This reverts > an upstream patch that is preventing active/active rabbitmq from > failing over when a node goes down (LP: #1993149). > > -- Corey Bryant Thu, 20 Oct 2022 15:48:16 -0400 > > > They patched the buggy version !! Kolla dropped binary builds ...so you can't install dependencies from apt repository where it is patched, and you don't have a way how to patch > > your python library. 
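As an illustration of the template-override route mentioned in this thread, a rough sketch of applying a local patch to openstack-base at build time. The patch file name, the venv path and the presence of the patch utility in the image are assumptions, and the footer block name may differ between releases:

# template-override.j2, used as: kolla-build --template-override template-override.j2 openstack-base
{% extends parent_template %}

{% block openstack_base_footer %}
# hypothetical patch file made available in the image's build context
COPY fix-rabbitmq-issue.patch /tmp/
RUN cd /var/lib/kolla/venv/lib/python3*/site-packages \
    && patch -p1 < /tmp/fix-rabbitmq-issue.patch \
    && rm -f /tmp/fix-rabbitmq-issue.patch
{% endblock %}

This works, but it is exactly the kind of per-deployment boilerplate that a first-class patches/ mechanism in kolla would avoid.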
> > > Patching is normal way how to fix a problem, you don't have always option to bump version, you need patch code and kolla just don't have this option. > > > > > - > > On Wed, May 24, 2023, 08:41 Maksim Malchuk > wrote: > > The correct and only way to apply patches on images - is build them from > source. You should patch the code, not binary files. > > On Wed, May 24, 2023 at 3:32?AM Satish Patel wrote: > > This is very interesting that there is no standard or best way to patch > images. Everyone uses their own way to handle it. Now I am very curious to > see how folks running it in production with patching and building images. I > am about to deploy kolla on production and trying to learn all best > practices from experts. > > On Tue, May 23, 2023 at 7:47?PM Michal Arbet > wrote: > > How ? > > Say to me please ? How you can patch oslo.messaging except template > override > > On Tue, May 23, 2023, 16:17 Danny Webb wrote: > > You can already do this with the kolla image builder which seems to me to > be a much better solution than patching containers post creation. > ------------------------------ > *From:* Michal Arbet > *Sent:* 23 May 2023 13:01 > *To:* openstack-discuss at lists.openstack.org < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [kolla] How to patch images during build > > > * CAUTION: This email originates from outside THG * > ------------------------------ > I am very glad that someone asked for an option to patch kolla images. > I've already proposed patches for kolla here [1] and here [2]. > But unfortunately I didn't get that many votes to merge into master and I > abandoned this. > > [1] https://review.opendev.org/c/openstack/kolla/+/829296 > [2] https://review.opendev.org/c/openstack/kolla/+/829295 > > With these above patches you can patch files inside every container. > Maybe we can discuss this again ?? > > For example now xena, yoga, zed, antelope has oslo.messaging broken : > > https://bugs.launchpad.net/oslo.messaging/+bug/2019978 > fixed by > https://review.opendev.org/c/openstack/oslo.messaging/+/866617 > > As I am using my kolla patches in my downstream kolla git repo i've only > created patches/ directory and place fix for openstack-base container :) > > patches/ > patches/openstack-base > patches/openstack-base/series > patches/openstack-base/fix-rabbitmq-issue-opendev-883538.patch > > But, you still can use template-override > https://docs.openstack.org/kolla/latest/admin/image-building.html . > > Thanks > > Michal Arbet > Openstack Engineer > > Ultimum Technologies a.s. > Na Po???? 1047/26, 11000 Praha 1 > Czech Republic > > +420 604 228 897 > michal.arbet at ultimum.io > *https://ultimum.io * > > LinkedIn | Twitter > | Facebook > > > > st 17. 5. 2023 v 20:19 odes?latel Jeremy Stanley > napsal: > > On 2023-05-17 21:02:02 +0300 (+0300), Maksim Malchuk wrote: > > Yes, you can do that, but note bene mitaka not supported. > [...] > > Not only unsupported, but the stable/mitaka branch of > openstack/keystone was deleted when it reached EOL in 2017. You may > instead want to specify `reference = mitaka-eol` (assuming Git tags > also work there). That should get you the final state of the > stable/mitaka branch prior to its deletion. > -- > Jeremy Stanley > > > > -- > Regards, > Maksim Malchuk > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fadhel.bedda at gmail.com Wed May 24 19:33:44 2023 From: fadhel.bedda at gmail.com (BEDDA Fadhel) Date: Wed, 24 May 2023 21:33:44 +0200 Subject: Message d'erreur Message-ID: Good morning who can help me to work around this problem: root at controller certs(keystone)# openstack project create --domain default --description "Service Project" service Failed to discover available identity versions when contacting https://controller.cloud.net:5000/v3. Attempting to parse version from URL. SSL exception connecting to https://controller.cloud.net:5000/v3/auth/tokens: HTTPSConnectionPool(host='controller.cloud.net', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:997)'))) -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.ramirez at opencloud.es Wed May 24 19:57:46 2023 From: luis.ramirez at opencloud.es (Luis Ramirez) Date: Wed, 24 May 2023 21:57:46 +0200 Subject: Message d'erreur In-Reply-To: References: Message-ID: Set OS_CACERT pointing to the CA certificate. export OS_CACERT=/path/to/ca.crt Reference: http://docs.openstack.org/user-guide/common/cli-set-environment-variables-using-openstack-rc.html Br, Luis Rmz Blockchain, DevOps & Open Source Cloud Solutions Architect ---------------------------------------- Founder & CEO OpenCloud.es luis.ramirez at opencloud.es Skype ID: d.overload Hangouts: luis.ramirez at opencloud.es [image: ?] +34 911 950 123 / [image: ?]+39 392 1289553 / [image: ?]+49 152 26917722 / ?esk? republika: +420 774 274 882 ----------------------------------------------------- El mi?, 24 may 2023 a las 21:46, BEDDA Fadhel () escribi?: > Good morning > who can help me to work around this problem: > > root at controller certs(keystone)# openstack project create --domain default --description "Service Project" service > Failed to discover available identity versions when contacting https://controller.cloud.net:5000/v3. Attempting to parse version from URL. > SSL exception connecting to https://controller.cloud.net:5000/v3/auth/tokens: HTTPSConnectionPool(host='controller.cloud.net', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:997)'))) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From techstep at gmail.com Wed May 24 19:58:19 2023 From: techstep at gmail.com (Rob Jefferson) Date: Wed, 24 May 2023 15:58:19 -0400 Subject: [kolla-ansible][yoga] Installation failure while `creating shard root mysql user` Message-ID: I'm trying to set up Yoga (as part of a way of upgrading from Xena to Yoga to Zed) using kolla-ansible, and I'm finding that I'm getting the following error at the `kolla-ansible/ansible/roles/mariadb/tasks/register.yml` playbook: TASK [mariadb : Creating shard root mysql user] [...] fatal: [control01]: FAILED! 
=> { "changed": false, "msg": "Can not parse the inner module output: b'[WARNING]: Failure using method (v2_playbook_on_play_start) in callback plugin (): 'Play' object has no attribute 'get_path' [WARNING]: Failure using method (v2_playbook_on_task_start) in callback plugin (): list index out of range [WARNING]: Failure using method (v2_runner_on_ok) in callback plugin (): list index out of range { "custom_stats": {}, "global_custom_stats": {}, "plays": [], "stats\": { "localhost": { "changed\": 0, "failures\": 0, "ignored\": 0, "ok": 1, "rescued\": 0, "skipped\": 0, "unreachable\": 0 } } }' "} I am not sure how to fix this error, or determine what's causing it. I've tried matching up Ansible versions between my installation environment and Yoga, I've tried upgrading Kolla, and so forth. For the installation environment, I'm using: * Debian 11 * Ansible 4.10.0 * ansible-core 2.11.2 * docker-ce 5.20.10 Any help with this would be most appreciated. Thanks, Rob From abishop at redhat.com Wed May 24 20:47:11 2023 From: abishop at redhat.com (Alan Bishop) Date: Wed, 24 May 2023 13:47:11 -0700 Subject: Add netapp storage in edge site | Wallaby | DCN In-Reply-To: References: Message-ID: On Wed, May 24, 2023 at 3:15?AM Swogat Pradhan wrote: > Hi, > I have a DCN setup and there is a requirement to use a netapp storage > device in one of the edge sites. > Can someone please confirm if it is possible? > I see from prior email to this list that you're using tripleo, so I'll respond with that in mind. There are many factors that come into play, but I suspect the short answer to your question is no. Tripleo's DCN architecture requires the cinder-volume service running at edge sites to run in active-active mode, where there are separate instances running on three nodes in to for the service to be highly available (HA).The problem is that only a small number of cinder drivers support running A/A, and NetApp's drivers do not support A/A. It's conceivable you could create a custom tripleo role that deploys just a single node running cinder-volume with a NetApp backend, but it wouldn't be HA. It's also conceivable you could locate the NetApp system in the central site's controlplane, but there are extremely difficult constraints you'd need to overcome: - Network latency between the central and edge sites would mean the disk performance would be bad. - You'd be limited to using iSCSI (FC wouldn't work) - Tripleo disables cross-AZ attachments, so the only way for an edge site to access a NetApp volume would be to configure the cinder-volume service running in the controlplane with a backend availability zone set to the edge site's AZ. You mentioned the NetApp is needed "in one of the edge sites," but in reality the NetApp would be available in one, AND ONLY ONE edge site, and it would also not be available to any instances running in the central site. Alan > And if so then should i add the parameters in the edge deployment script > or the central deployment script. > Any suggestions? > > With regards, > Swogat Pradhan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Wed May 24 21:29:36 2023 From: michal.arbet at ultimum.io (Michal Arbet) Date: Wed, 24 May 2023 23:29:36 +0200 Subject: [kolla-ansible][yoga] Installation failure while `creating shard root mysql user` In-Reply-To: References: Message-ID: Hi , this is incompatibility in Ansible version I believe. And if task is created via kolla-toolbox ..check version there .... 
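A quick way to compare the two, as a sketch (container and package names as commonly deployed by kolla-ansible):

# on the deployment host
ansible --version
pip show ansible ansible-core kolla-ansible

# inside the kolla_toolbox container on a controller, where kolla-ansible
# typically runs database modules such as mysql_user
docker exec kolla_toolbox ansible --version

If the host-side versions fall outside the range documented for the Yoga release of kolla-ansible, aligning them (or rebuilding the deployment virtualenv) is usually the first thing to try.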
On Wed, May 24, 2023, 22:06 Rob Jefferson wrote: > I'm trying to set up Yoga (as part of a way of upgrading from Xena to > Yoga to Zed) using kolla-ansible, and I'm finding that I'm getting the > following error at the > `kolla-ansible/ansible/roles/mariadb/tasks/register.yml` playbook: > > TASK [mariadb : Creating shard root mysql user] > [...] > fatal: [control01]: FAILED! => > { > "changed": false, > "msg": "Can not parse the inner module output: > > b'[WARNING]: Failure using method (v2_playbook_on_play_start) in callback > plugin > ( object > at 0x7fa4e6e18d00>): 'Play' object has no attribute 'get_path' > [WARNING]: Failure using method (v2_playbook_on_task_start) in callback > plugin > ( object > at 0x7fa4e6e18d00>): list index out of range > [WARNING]: Failure using method (v2_runner_on_ok) in callback plugin > ( object at 0x7fa4e6e18d00>): list index out of range > { "custom_stats": {}, > "global_custom_stats": {}, > "plays": [], > "stats\": { > "localhost": { > "changed\": 0, > "failures\": 0, > "ignored\": 0, > "ok": 1, > "rescued\": 0, > "skipped\": 0, > "unreachable\": 0 > } > } > }' > "} > > I am not sure how to fix this error, or determine what's causing it. > I've tried matching up Ansible versions between my installation > environment and Yoga, I've tried upgrading Kolla, and so forth. > > For the installation environment, I'm using: > > * Debian 11 > * Ansible 4.10.0 > * ansible-core 2.11.2 > * docker-ce 5.20.10 > > Any help with this would be most appreciated. > > Thanks, > > Rob > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.arbet at ultimum.io Wed May 24 21:30:05 2023 From: michal.arbet at ultimum.io (Michal Arbet) Date: Wed, 24 May 2023 23:30:05 +0200 Subject: [kolla-ansible][yoga] Installation failure while `creating shard root mysql user` In-Reply-To: References: Message-ID: I deployed yoga several days ago and everything was working as expected . On Wed, May 24, 2023, 22:06 Rob Jefferson wrote: > I'm trying to set up Yoga (as part of a way of upgrading from Xena to > Yoga to Zed) using kolla-ansible, and I'm finding that I'm getting the > following error at the > `kolla-ansible/ansible/roles/mariadb/tasks/register.yml` playbook: > > TASK [mariadb : Creating shard root mysql user] > [...] > fatal: [control01]: FAILED! => > { > "changed": false, > "msg": "Can not parse the inner module output: > > b'[WARNING]: Failure using method (v2_playbook_on_play_start) in callback > plugin > ( object > at 0x7fa4e6e18d00>): 'Play' object has no attribute 'get_path' > [WARNING]: Failure using method (v2_playbook_on_task_start) in callback > plugin > ( object > at 0x7fa4e6e18d00>): list index out of range > [WARNING]: Failure using method (v2_runner_on_ok) in callback plugin > ( object at 0x7fa4e6e18d00>): list index out of range > { "custom_stats": {}, > "global_custom_stats": {}, > "plays": [], > "stats\": { > "localhost": { > "changed\": 0, > "failures\": 0, > "ignored\": 0, > "ok": 1, > "rescued\": 0, > "skipped\": 0, > "unreachable\": 0 > } > } > }' > "} > > I am not sure how to fix this error, or determine what's causing it. > I've tried matching up Ansible versions between my installation > environment and Yoga, I've tried upgrading Kolla, and so forth. > > For the installation environment, I'm using: > > * Debian 11 > * Ansible 4.10.0 > * ansible-core 2.11.2 > * docker-ce 5.20.10 > > Any help with this would be most appreciated. 
> > Thanks, > > Rob > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From 1148203170 at qq.com Thu May 25 03:16:05 2023 From: 1148203170 at qq.com (=?gb18030?B?3/Xf9d/1?=) Date: Thu, 25 May 2023 11:16:05 +0800 Subject: [nova]vnc and spice used together in openstack Message-ID: Hi, When using the console, we enabled both nova spicehtml5proxy and nova novancproxy on the controller node, and enabled vnc and spice on compute1 and compute2, respectively. We found that there was a problem with the coexistence of the two, The log is as follows.  ERROR oslo_messaging.rpc.server [req-28674cdd-f06e-466f-8611-516efc8fbe3c 069c1371a95e4ebaa59eea6ee70362bc dd4e9ff713be4311978334ae0d0c0a5e - default default] Exception during message handling: ValueError: Field `port' cannot be None 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server Traceback (most recent call last): 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 235, in inner 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     return func(*args, **kwargs) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/exception_wrapper.py", line 79, in wrapped 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     function_name, call_dict, binary, tb) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     self.force_reraise() 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     raise value 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/exception_wrapper.py", line 69, in wrapped 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     return f(self, context, *args, **kw) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/manager.py", line 219, in decorated_function 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     kwargs['instance'], e, sys.exc_info()) 2023-05-25 10:58:05.851 7 ERROR 
oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     self.force_reraise() 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     raise value 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/manager.py", line 207, in decorated_function 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     return function(self, context, *args, **kwargs) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/nova/compute/manager.py", line 5906, in get_spice_console 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     access_url_base=CONF.spice.html5proxy_base_url, 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 307, in __init__ 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     setattr(self, key, kwargs[key]) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 72, in setter 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     field_value = field.coerce(self, name, value) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_versionedobjects/fields.py", line 207, in coerce 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     return self._null(obj, attr) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server   File "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_versionedobjects/fields.py", line 185, in _null 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server     raise ValueError(_("Field `%s' cannot be None") % attr) 2023-05-25 10:58:05.851 7 ERROR oslo_messaging.rpc.server ValueError: Field `port' cannot be None I don't know why there is an appeal issue or what special mechanism OpenStack has for the console. I can only choose one. I hope you can help us. Thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: From songwenping at inspur.com Thu May 25 06:05:46 2023 From: songwenping at inspur.com (=?gb2312?B?QWxleCBTb25nICjLzs7Exr0p?=) Date: Thu, 25 May 2023 06:05:46 +0000 Subject: pep8 error of newest devstack Message-ID: <594480c84ac343dd906ee575c0fbf4c4@inspur.com> Hi, When I check pep8 with the newest devstack env by the cmd ??tox ?Ce pep8??, it raise many packages conflict. The log is as follows. root at song-ctrl:/opt/stack/cyborg# tox -e pep8 pep8 create: /opt/stack/cyborg/.tox/shared pep8 installdeps: -chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements. 
txt ERROR: invocation failed (exit code 1), logfile: /opt/stack/cyborg/.tox/shared/log/pep8-1.log ============================================================================ ============================================================== log start ============================================================================ ============================================================== Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Collecting pbr!=2.1.0,>=0.11 (from -r /opt/stack/cyborg/requirements.txt (line 5)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/01/06/4ab11bf70db5a60689fc521b636 849c8593eb67a2c6bdf73a16c72d16a12/pbr-5.11.1-py2.py3-none-any.whl (112 kB) Collecting pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.0.0 (from -r /opt/stack/cyborg/requirements.txt (line 6)) Using cached pecan-1.4.2-py3-none-any.whl Collecting WSME>=0.10.1 (from -r /opt/stack/cyborg/requirements.txt (line 7)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d2/a0/54b71fd2e1ab4f838acf472726a 14dadb6d1c77adc18d7e2c062dd955ff9/WSME-0.11.0-py3-none-any.whl (59 kB) Collecting eventlet>=0.26.0 (from -r /opt/stack/cyborg/requirements.txt (line 8)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/90/97/928b89de2e23cc67136eccccf1c 122adf74ffdb65bbf7d2964b937cedd4f/eventlet-0.33.3-py2.py3-none-any.whl (226 kB) Collecting oslo.i18n>=1.5.0 (from -r /opt/stack/cyborg/requirements.txt (line 9)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/62/5a/ced9667f5a35d712734bf3449af ad763283e31ea9a903137eb42df29d948/oslo.i18n-6.0.0-py3-none-any.whl (46 kB) Collecting oslo.config!=4.3.0,!=4.4.0,>=1.1.0 (from -r /opt/stack/cyborg/requirements.txt (line 10)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ce/15/97fb14b7a1692693610a8e00e2a 08e4186d6cdd875b6ac24c912a429b665/oslo.config-9.1.1-py3-none-any.whl (128 kB) Collecting oslo.log>=5.0.0 (from -r /opt/stack/cyborg/requirements.txt (line 11)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9c/24/526545a76513c741c65eeb70a85 c93287f10665147c7cff7e0eb24918d43/oslo.log-5.2.0-py3-none-any.whl (71 kB) Collecting oslo.context>=2.9.0 (from -r /opt/stack/cyborg/requirements.txt (line 12)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/7f/69/3a16785c49890cce2f7f14ee652 5c76e7116f9dad44f122f5e3670e43970/oslo.context-5.1.1-py3-none-any.whl (20 kB) ERROR: Cannot install oslo.messaging>=14.1.0 because these package versions have conflicting dependencies. The conflict is caused by: The user requested oslo.messaging>=14.1.0 The user requested (constraint) oslo-messaging===14.3.0 To fix this you could try to: 1. loosen the range of package versions you've specified 2. remove package versions to allow pip attempt to solve the dependency conflict ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dep endency-conflicts ============================================================================ =============================================================== log end ============================================================================ =============================================================== ERROR: could not install deps [-chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements. 
txt]; v = InvocationError('/opt/stack/cyborg/.tox/shared/bin/python -m pip install -chttps://releases.openstack.org/constraints/upper/master -r/opt/stack/cyborg/requirements.txt -r/opt/stack/cyborg/test-requirements.txt', 1) ____________________________________________________________________________ _______________________________________________________________ summary ____________________________________________________________________________ _______________________________________________________________ ERROR: pep8: could not install deps [-chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements. txt]; v = InvocationError('/opt/stack/cyborg/.tox/shared/bin/python -m pip install -chttps://releases.openstack.org/constraints/upper/master -r/opt/stack/cyborg/requirements.txt -r/opt/stack/cyborg/test-requirements.txt', 1) I hope you can help us. Thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3774 bytes Desc: not available URL: From swogatpradhan22 at gmail.com Thu May 25 07:09:12 2023 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Thu, 25 May 2023 12:39:12 +0530 Subject: Add netapp storage in edge site | Wallaby | DCN In-Reply-To: References: Message-ID: Hi Alan, So, can I include the cinder-netapp-storage.yaml file in the central site and then use the new backend to add storage to edge VM's? I believe it is not possible right?? as the cinder volume in the edge won't have the config for the netapp. With regards, Swogat Pradhan On Thu, May 25, 2023 at 2:17?AM Alan Bishop wrote: > > > On Wed, May 24, 2023 at 3:15?AM Swogat Pradhan > wrote: > >> Hi, >> I have a DCN setup and there is a requirement to use a netapp storage >> device in one of the edge sites. >> Can someone please confirm if it is possible? >> > > I see from prior email to this list that you're using tripleo, so I'll > respond with that in mind. > > There are many factors that come into play, but I suspect the short answer > to your question is no. > > Tripleo's DCN architecture requires the cinder-volume service running at > edge sites to run in active-active > mode, where there are separate instances running on three nodes in to for > the service to be highly > available (HA).The problem is that only a small number of cinder drivers > support running A/A, and NetApp's > drivers do not support A/A. > > It's conceivable you could create a custom tripleo role that deploys just > a single node running cinder-volume > with a NetApp backend, but it wouldn't be HA. > > It's also conceivable you could locate the NetApp system in the central > site's controlplane, but there are > extremely difficult constraints you'd need to overcome: > - Network latency between the central and edge sites would mean the disk > performance would be bad. > - You'd be limited to using iSCSI (FC wouldn't work) > - Tripleo disables cross-AZ attachments, so the only way for an edge site > to access a NetApp volume > would be to configure the cinder-volume service running in the > controlplane with a backend availability > zone set to the edge site's AZ. You mentioned the NetApp is needed "in one > of the edge sites," but in > reality the NetApp would be available in one, AND ONLY ONE edge site, and > it would also not be available > to any instances running in the central site. 
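For illustration only, a minimal sketch of that backend-availability-zone idea written as a raw cinder.conf stanza on the central cinder-volume service (the section name, AZ name and driver settings below are assumptions for the example, not values from this thread, and in a tripleo deployment they would normally be generated from the NetApp environment file rather than edited by hand):

# Hypothetical backend stanza; every name here is an example
cat >> /etc/cinder/cinder.conf <<'EOF'
[netapp-edge1]
volume_backend_name = netapp-edge1
# Pin the backend to the edge site's AZ so only that AZ attaches its volumes
backend_availability_zone = edge1-az
volume_driver = cinder.volume.drivers.netapp.common.NetAppDriver
netapp_storage_family = ontap_cluster
netapp_storage_protocol = iscsi
EOF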
> > Alan > > >> And if so then should i add the parameters in the edge deployment script >> or the central deployment script. >> Any suggestions? >> >> With regards, >> Swogat Pradhan >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike at idia.ac.za Thu May 25 07:16:50 2023 From: mike at idia.ac.za (Mike Currin) Date: Thu, 25 May 2023 09:16:50 +0200 Subject: [ironic] Deploy images crashing Dell server BIOS using UEFI boot Message-ID: Hi All, We have a Xena based Openstack deployment, recently we deployed 60+ nodes in our research cluster with Ironic which worked well. All of these were deployed using a standard process I'll describe below. We recently took delivery of a new Dell R6625 server with NVMe devices onlym which only support UEFI boot, so we are trying to get that working. The server PXEs and downloads the RAM disk and then the Deploy image, once running that it immediately crashes (I assume when running linuxefi). We tested UEFI deploy on an existing Dell R640 server, that server works with BIOS but we swapped it over to UEFI and it does the same, so it wasn't due to the much bigger/different architecture (AMD vs Intel) server. We have a few older servers in a test setup (which are Dell R630's) which are working fine and don't do this behaviour. We haven't tried them on our production setup as if even if they worked it wouldn't help us move forward. I made a video showing this: https://www.dropbox.com/s/5jbn1qpylxaevqb/uefiboot2.mov?dl=0 In the iDRAC we just get that the "System BIOS has halted" and somewhere I said to change hardware that you recently added, which feels unlikely as 2 different servers both working elsewhere with totally different hardware,. I've done a iDRAC Serial console debug but it isn't showing me much that is of any use. This is our entire process to deploy a node (some is once off of course, I've not included the network setup): openstack flavor create --ram 256000 --disk 20 --vcpus 32 --public our-baremetal openstack flavor set our-baremetal --property capabilities:boot_mode="uefi" We downloaded the latest (Xena) images from: https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/ I also tried the latest Centos9 ones just to try them, made no difference. We then extract the image and make a small mod to the service start (we found it didn't bring the NIC immediately up so put a ping delay in the ExecStart), but that's not part of this problem. 
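As an illustration of that kind of ramdisk tweak, a minimal sketch of a systemd drop-in that waits for the network before starting the agent (the unit name, gateway address and retry loop below are assumptions, not details taken from this message):

# Run from the unpacked ramdisk root; adjust the unit name to the real IPA service
mkdir -p etc/systemd/system/ironic-python-agent.service.d
cat > etc/systemd/system/ironic-python-agent.service.d/wait-for-nic.conf <<'EOF'
[Service]
# Block startup until the PXE gateway answers, so the NIC is up before the agent runs
ExecStartPre=/bin/sh -c 'until ping -c 1 -W 1 192.0.2.1; do sleep 1; done'
EOF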
openstack image create --disk-format aki --container-format aki --public --file ironic-agent.kernel deploy-vmlinuz openstack image create --disk-format ari --container-format ari --public --file ironic-agent.initramfs-ping-patched deploy-initrd Then to do the deploy: export HOSTNAME= export MGMTIP= openstack baremetal node create --driver ipmi --name $HOSTNAME --driver-info ipmi_port=623 --driver-info ipmi_username=root --driver-info 'ipmi_password=' --driver-info ipmi_address=$MGMTIP --resource-class baremetal-resource-class --property cpus=32 --property memory_mb=256000 --property local_gb=20 --property cpu_arch=x86_64 --driver-info deploy_ramdisk=$(openstack image show deploy-initrd -f value -c id) --driver-info deploy_kernel=$(openstack image show deploy-vmlinuz -f value -c id) NODE=$(openstack baremetal node show -f value -c uuid $HOSTNAME) openstack baremetal node set $NODE --property capabilities='boot_mode:uefi' openstack baremetal port create --node $NODE --physical-network physnet3 openstack baremetal node manage $NODE --wait && openstack baremetal node list && openstack baremetal node provide $NODE && openstack baremetal node list openstack server create --use-config-drive --image --flavor our-baremetal --security-group worker --network ironic-network --key-name servername Does any one have any more info to help or any suggestions as to something more I could try, I'm out of ideas. I know that UEFI itself works on both the servers, we have a setup with Ubuntu MAAS and it can deploy perfectly fine using its process with the UEFI setup so, it's something on the Ironic deploy image that's causing us this problem. Regards, Mike From katonalala at gmail.com Thu May 25 07:38:20 2023 From: katonalala at gmail.com (Lajos Katona) Date: Thu, 25 May 2023 09:38:20 +0200 Subject: [neutron] Resignation announcement of Nate, Hongbin and Takashi In-Reply-To: References: Message-ID: Hi, Thanks Rodolfo for taking care of the members lists, and checking with Nate, Takashi and Hongbin. Big thanks to Nate, Takashi and Hongbin for their work in the community, and I hope we will have a chance to work together again :-) Lajos Rodolfo Alonso Hernandez ezt ?rta (id?pont: 2023. m?j. 24., Sze, 16:05): > Hello all: > > Today I would like to announce that Nate Johnston, Takashi Yamamoto and > Hongbin Lu are stepping down as Neutron core reviewers and Neutron drivers. > Before sending this email, I contacted them and Nate and Takashi agreed on > this decision; I tried twice to contact Hongbin with no response. > > In order to have a participative Neutron community, the Neutron core and > Neutron drivers teams should be made up of active members in code reviews, > meeting attendance and spec reviewal. Nate, Takashi and Hongbin have been > very active and valuable members of the Neutron (and OpenStack) community > in the past; for that I would like to thank you for your contributions. > > Tomorrow I'll update the Launchpad and gerrit groups. > > Regards. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Thu May 25 09:40:36 2023 From: geguileo at redhat.com (Gorka Eguileor) Date: Thu, 25 May 2023 11:40:36 +0200 Subject: upgrade issue with nova/cinder and api version error In-Reply-To: <46D8C43B01FAC6419C20963F12B88E2E01E9569C67@MAILSERVER.alamois.com> References: <46D8C43B01FAC6419C20963F12B88E2E01E9569C67@MAILSERVER.alamois.com> Message-ID: <20230525094036.5on7butnf5tumnow@localhost> On 22/05, Jim Kilborn wrote: > Hello, > > First time posting here. 
> We have been running a production openstack environment at my office since the kilo release. We are currently on train, and I'm trying to get up to a more recent version. To make it more difficult, we are on centos7, so having to switch to ubuntu as we update versions. > > The problem that I am having after updaing to victoria, is that when I delete a vm via horizon, the instance disappears but the cinder volume doesn't delete the attachment. > It appears this is due to the following error in /var/log/apache2/cinder_error.log > > ERROR cinder.api.middleware.fault novaclient.exceptions.NotAcceptable: Version 2.89 is not supported by the API. Minimum is 2.1 and maximum is 2.87. (HTTP 406) > > When I look at the /usr/lib/python3/dist-packages/cinder/compute/nova.py I can see it's using 2.89 in get_server_volume > > def get_server_volume(context, server_id, volume_id): > # Use microversion that includes attachment_id > nova = novaclient(context, api_version='2.89') > return nova.volumes.get_server_volume(server_id, volume_id) > Hi, That's a bug on the Ubuntu packages [1] caused by the incorrect backport of a CVE [2] patches (it was a complex backport), and in comment #6 Brian explains how to fix the Ubuntu backport [3]. Cheers, Gorka. [1]: https://bugs.launchpad.net/cinder/+bug/2020382 [2]: https://bugs.launchpad.net/nova/+bug/2004555 [3]: https://bugs.launchpad.net/cinder/+bug/2020382/comments/6 > I am not sure why cinder and nova are in disagreement on the api_version. > I have verified that they are both upgraded to the victoria release. > > Anyone have any ideas as to why I would be getting this error or a possible fix? I haven't been able to find any information on this error. > > > Here are the nova package versions: > nova-api/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-common/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-conductor/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-novncproxy/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > nova-scheduler/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > python3-nova/focal-updates,now 2:22.4.0-0ubuntu1~cloud3 all [installed] > python3-novaclient/focal-updates,now 2:17.2.1-0ubuntu1~cloud0 all [installed,automatic] > > Here are the cinder package versions: > cinder-api/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > cinder-common/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed,automatic] > cinder-scheduler/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > cinder-volume/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed] > python3-cinder/focal-updates,now 2:17.4.0-0ubuntu1~cloud3 all [installed,automatic] > python3-cinderclient/focal-updates,now 1:7.2.0-0ubuntu1~cloud0 all [installed] > > > Thanks in advance for any ideas! > From fungi at yuggoth.org Thu May 25 11:46:38 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 25 May 2023 11:46:38 +0000 Subject: pep8 error of newest devstack In-Reply-To: <594480c84ac343dd906ee575c0fbf4c4@inspur.com> References: <594480c84ac343dd906ee575c0fbf4c4@inspur.com> Message-ID: <20230525114637.476aqf6ptd2kkwd7@yuggoth.org> On 2023-05-25 06:05:46 +0000 (+0000), Alex Song (???) wrote: > When I check pep8 with the newest devstack env by the cmd ?tox ?e > pep8?, it raise many packages conflict. The log is as follows. [...] > ERROR: Cannot install oslo.messaging>=14.1.0 because these package > versions have conflicting dependencies. 
> > The conflict is caused by: > > The user requested oslo.messaging>=14.1.0 > > The user requested (constraint) oslo-messaging===14.3.0 [...] It's unclear from your message what version of the CPython interpreter you're using. The current releases of oslo.messaging require Python 3.8 or newer, so if you're trying to do this with older Python that would explain the error you're seeing. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From iurygregory at gmail.com Thu May 25 12:25:10 2023 From: iurygregory at gmail.com (Iury Gregory) Date: Thu, 25 May 2023 09:25:10 -0300 Subject: [ironic] Deploy images crashing Dell server BIOS using UEFI boot In-Reply-To: References: Message-ID: Hello Mike, We had this issue in the past and there was a bug tracker upstream [1], the upstream fix was merged during xena cycle [2]. I would attempt to check the firmware present in the machine and try to upgrade to see if it helps. I've also noticed we just had another change in IPA with a fix for efibootmgr [3], not sure if the image you downloaded from tarballs already contains. [1] https://storyboard.openstack.org/#!/story/2008962 [2] https://review.opendev.org/c/openstack/ironic-python-agent/+/795862 [3] https://review.opendev.org/c/openstack/ironic-python-agent/+/881762 Em qui., 25 de mai. de 2023 ?s 04:19, Mike Currin escreveu: > Hi All, > > We have a Xena based Openstack deployment, recently we deployed 60+ > nodes in our research cluster with Ironic which worked well. All of > these were deployed using a standard process I'll describe below. > > We recently took delivery of a new Dell R6625 server with NVMe devices > onlym which only support UEFI boot, so we are trying to get that > working. > > The server PXEs and downloads the RAM disk and then the Deploy image, > once running that it immediately crashes (I assume when running > linuxefi). We tested UEFI deploy on an existing Dell R640 server, > that server works with BIOS but we swapped it over to UEFI and it does > the same, so it wasn't due to the much bigger/different architecture > (AMD vs Intel) server. We have a few older servers in a test setup > (which are Dell R630's) which are working fine and don't do this > behaviour. We haven't tried them on our production setup as if even > if they worked it wouldn't help us move forward. > > I made a video showing this: > https://www.dropbox.com/s/5jbn1qpylxaevqb/uefiboot2.mov?dl=0 > In the iDRAC we just get that the "System BIOS has halted" and > somewhere I said to change hardware that you recently added, which > feels unlikely as 2 different servers both working elsewhere with > totally different hardware,. > > I've done a iDRAC Serial console debug but it isn't showing me much > that is of any use. > > This is our entire process to deploy a node (some is once off of > course, I've not included the network setup): > > openstack flavor create --ram 256000 --disk 20 --vcpus 32 --public > our-baremetal > openstack flavor set our-baremetal --property capabilities:boot_mode="uefi" > > We downloaded the latest (Xena) images from: > https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/ > I also tried the latest Centos9 ones just to try them, made no difference. 
> > We then extract the image and make a small mod to the service start > (we found it didn't bring the NIC immediately up so put a ping delay > in the ExecStart), but that's not part of this problem. > > openstack image create --disk-format aki --container-format aki > --public --file ironic-agent.kernel deploy-vmlinuz > openstack image create --disk-format ari --container-format ari > --public --file ironic-agent.initramfs-ping-patched deploy-initrd > > Then to do the deploy: > export HOSTNAME= > export MGMTIP= > > openstack baremetal node create --driver ipmi --name $HOSTNAME > --driver-info ipmi_port=623 --driver-info ipmi_username=root > --driver-info 'ipmi_password=' --driver-info > ipmi_address=$MGMTIP --resource-class baremetal-resource-class > --property cpus=32 --property memory_mb=256000 --property local_gb=20 > --property cpu_arch=x86_64 --driver-info deploy_ramdisk=$(openstack > image show deploy-initrd -f value -c id) --driver-info > deploy_kernel=$(openstack image show deploy-vmlinuz -f value -c id) > NODE=$(openstack baremetal node show -f value -c uuid $HOSTNAME) > openstack baremetal node set $NODE --property capabilities='boot_mode:uefi' > > openstack baremetal port create --node $NODE > --physical-network physnet3 > openstack baremetal node manage $NODE --wait && openstack baremetal > node list && openstack baremetal node provide $NODE && openstack > baremetal node list > > openstack server create --use-config-drive --image --flavor > our-baremetal --security-group worker --network ironic-network > --key-name servername > > Does any one have any more info to help or any suggestions as to > something more I could try, I'm out of ideas. I know that UEFI itself > works on both the servers, we have a setup with Ubuntu MAAS and it can > deploy perfectly fine using its process with the UEFI setup so, it's > something on the Ironic deploy image that's causing us this problem. > > Regards, > Mike > > -- *Att[]'s* *Iury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Ironic PTL * *Senior Software Engineer at Red Hat Brazil* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From senrique at redhat.com Thu May 25 12:45:23 2023 From: senrique at redhat.com (Sofia Enriquez) Date: Thu, 25 May 2023 13:45:23 +0100 Subject: Cinder Bug Report 2023-05-25 Message-ID: Hello Argonauts, Sorry for the late email. Cinder Bug Meeting Etherpad *Low* - Cached images duplicated per host. - *Status:* Unassigned. - [rbac] User with Reader role can extend/reserve/retype/unreserve/update_readonly volume. - *Status:* Tempest proposed to master. - [rbac] User with Reader role can create/delete/update/set-bootable volume. - *Status:* Tempest proposed to master. Cheers -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From dfgyuri at yandex.com Thu May 25 00:22:24 2023 From: dfgyuri at yandex.com (Yuri Blas) Date: Wed, 24 May 2023 17:22:24 -0700 Subject: [Glance] [Ceph] [Zed] Glance no longer working after upgrading Ceph cluster to 17.2.6 Message-ID: <308621684971757@mail.yandex.com> An HTML attachment was scrubbed... 
URL: From akekane at redhat.com Thu May 25 15:22:27 2023 From: akekane at redhat.com (Abhishek Kekane) Date: Thu, 25 May 2023 20:52:27 +0530 Subject: [Glance] [Ceph] [Zed] Glance no longer working after upgrading Ceph cluster to 17.2.6 In-Reply-To: <308621684971757@mail.yandex.com> References: <308621684971757@mail.yandex.com> Message-ID: Hi Yuri, Can you specify a glance-store version you are using in your environment? Thanks & Best Regards, Abhishek Kekane On Thu, May 25, 2023 at 6:28?PM Yuri Blas wrote: > Hi folks, > > I am using ceph/rbd as backing store and after upgrading my ceph cluster > to 17.2.6 from 17.2.5, > glance is now erroring out when attempting to create new images with: > HttpException: 500: Server Error for url: > http://10.70.70.107:9292/v2/images/e6faf7ed-313e-4d6d-9191-611dd3bd64d1/file, > Internal Server Error > > Cinder and nova are functioning fine without issues, glance-api.logs show > the following > > ERROR glance_store._drivers.rbd [None > req-5fd02661-ce31-4140-838e-3b0ccd765247 3eeb56a678424e938023d6de390e5661 > c3e91973ff21476c80e4273850c8d98e - - default default] Failed to store image > 626977ce-c66a-4bd6-901d-c9f693be5aa0 Store Exception [errno 95] RBD > operation not supported (error creating snapshot b'snap' from > b'626977ce-c66a-4bd6-901d-c9f693be5aa0'): rbd.OperationNotSupported: [errno > 95] RBD operation not supported (error creating snapshot b'snap' from > b'626977ce-c66a-4bd6-901d-c9f693be5aa0') > ERROR glance.api.v2.image_data [None > req-5fd02661-ce31-4140-838e-3b0ccd765247 3eeb56a678424e938023d6de390e5661 > c3e91973ff21476c80e4273850c8d98e - - default default] Failed to upload > image data due to internal error: rbd.OperationNotSupported: [errno 95] RBD > operation not supported (error creating snapshot b'snap' from > b'626977ce-c66a-4bd6-901d-c9f693be5aa0') > ERROR glance.common.wsgi [None req-5fd02661-ce31-4140-838e-3b0ccd765247 > 3eeb56a678424e938023d6de390e5661 c3e91973ff21476c80e4273850c8d98e - - > default default] Caught error: [errno 95] RBD operation not supported > (error creating snapshot b'snap' from > b'626977ce-c66a-4bd6-901d-c9f693be5aa0'): rbd.OperationNotSupported: [errno > 95] RBD operation not supported (error creating snapshot b'snap' from > b'626977ce-c66a-4bd6-901d-c9f693be5aa0') > > > > Any help with this would be most appreciated. > > Thanks, > > -Yuri > -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Thu May 25 15:40:00 2023 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 25 May 2023 11:40:00 -0400 Subject: [kolla-ansible][yoga] Installation failure while `creating shard root mysql user` In-Reply-To: References: Message-ID: I am running yoga in 3 productions without any issue. never heard of this error before. On Wed, May 24, 2023 at 5:38?PM Michal Arbet wrote: > Hi , this is incompatibility in Ansible version I believe. And if task is > created via kolla-toolbox ..check version there .... > > On Wed, May 24, 2023, 22:06 Rob Jefferson wrote: > >> I'm trying to set up Yoga (as part of a way of upgrading from Xena to >> Yoga to Zed) using kolla-ansible, and I'm finding that I'm getting the >> following error at the >> `kolla-ansible/ansible/roles/mariadb/tasks/register.yml` playbook: >> >> TASK [mariadb : Creating shard root mysql user] >> [...] >> fatal: [control01]: FAILED! 
=> >> { >> "changed": false, >> "msg": "Can not parse the inner module output: >> >> b'[WARNING]: Failure using method (v2_playbook_on_play_start) in callback >> plugin >> (> object >> at 0x7fa4e6e18d00>): 'Play' object has no attribute 'get_path' >> [WARNING]: Failure using method (v2_playbook_on_task_start) in callback >> plugin >> (> object >> at 0x7fa4e6e18d00>): list index out of range >> [WARNING]: Failure using method (v2_runner_on_ok) in callback plugin >> (> object at 0x7fa4e6e18d00>): list index out of range >> { "custom_stats": {}, >> "global_custom_stats": {}, >> "plays": [], >> "stats\": { >> "localhost": { >> "changed\": 0, >> "failures\": 0, >> "ignored\": 0, >> "ok": 1, >> "rescued\": 0, >> "skipped\": 0, >> "unreachable\": 0 >> } >> } >> }' >> "} >> >> I am not sure how to fix this error, or determine what's causing it. >> I've tried matching up Ansible versions between my installation >> environment and Yoga, I've tried upgrading Kolla, and so forth. >> >> For the installation environment, I'm using: >> >> * Debian 11 >> * Ansible 4.10.0 >> * ansible-core 2.11.2 >> * docker-ce 5.20.10 >> >> Any help with this would be most appreciated. >> >> Thanks, >> >> Rob >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ykarel at redhat.com Thu May 25 15:56:16 2023 From: ykarel at redhat.com (Yatin Karel) Date: Thu, 25 May 2023 21:26:16 +0530 Subject: pep8 error of newest devstack In-Reply-To: <594480c84ac343dd906ee575c0fbf4c4@inspur.com> References: <594480c84ac343dd906ee575c0fbf4c4@inspur.com> Message-ID: Hi Alex, The issue looks due to an outdated mirror(i.e https://pypi.tuna.tsinghua.edu.cn), it should work fine with an up to date mirror. See below 14.3.0 is not available in the pypi index you using:- $ curl https://pypi.tuna.tsinghua.edu.cn/simple/oslo-messaging/ 2>/dev/null|grep oslo.messaging-14 oslo.messaging-14.0.0-py3-none-any.whl
oslo.messaging-14.0.0.tar.gz
oslo.messaging-14.1.0-py3-none-any.whl
oslo.messaging-14.1.0.tar.gz
oslo.messaging-14.2.0-py3-none-any.whl
oslo.messaging-14.2.0.tar.gz

V/s

$ curl https://pypi.org/simple/oslo-messaging/ 2>/dev/null|grep oslo.messaging-14
oslo.messaging-14.0.0-py3-none-any.whl
oslo.messaging-14.0.0.tar.gz
oslo.messaging-14.1.0-py3-none-any.whl
oslo.messaging-14.1.0.tar.gz
oslo.messaging-14.2.0-py3-none-any.whl
oslo.messaging-14.2.0.tar.gz
oslo.messaging-14.3.0-py3-none-any.whl
oslo.messaging-14.3.0.tar.gz
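As a minimal sketch of working around such a stale mirror locally (assuming the upstream index is reachable from the build host), pip can be pointed back at pypi.org before recreating the tox environment:

# Point pip (including the pip tox runs inside its venv) at the upstream index
mkdir -p ~/.config/pip
cat > ~/.config/pip/pip.conf <<'EOF'
[global]
index-url = https://pypi.org/simple
EOF
# Recreate the env so constraint resolution is retried against the new index
tox -r -e pep8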
Thanks and Regards Yatin Karel On Thu, May 25, 2023 at 11:46?AM Alex Song (???) wrote: > > > Hi, > > When I check pep8 with the newest devstack env by the cmd ?tox ?e pep8?, > it raise many packages conflict. The log is as follows. > > > > > root at song-ctrl:/opt/stack/cyborg# tox -e pep8 > > pep8 create: /opt/stack/cyborg/.tox/shared > > pep8 installdeps: -c > https://releases.openstack.org/constraints/upper/master, > -r/opt/stack/cyborg/requirements.txt, > -r/opt/stack/cyborg/test-requirements.txt > > > > > > > > ERROR: invocation failed (exit code 1), logfile: > /opt/stack/cyborg/.tox/shared/log/pep8-1.log > > ========================================================================================================================================== > log start > ========================================================================================================================================== > > Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple > > Collecting pbr!=2.1.0,>=0.11 (from -r /opt/stack/cyborg/requirements.txt > (line 5)) > > Using cached > https://pypi.tuna.tsinghua.edu.cn/packages/01/06/4ab11bf70db5a60689fc521b636849c8593eb67a2c6bdf73a16c72d16a12/pbr-5.11.1-py2.py3-none-any.whl (112 > kB) > > Collecting pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.0.0 (from -r > /opt/stack/cyborg/requirements.txt (line 6)) > > Using cached pecan-1.4.2-py3-none-any.whl > > Collecting WSME>=0.10.1 (from -r /opt/stack/cyborg/requirements.txt (line > 7)) > > Using cached > https://pypi.tuna.tsinghua.edu.cn/packages/d2/a0/54b71fd2e1ab4f838acf472726a14dadb6d1c77adc18d7e2c062dd955ff9/WSME-0.11.0-py3-none-any.whl (59 > kB) > > Collecting eventlet>=0.26.0 (from -r /opt/stack/cyborg/requirements.txt > (line 8)) > > Using cached > https://pypi.tuna.tsinghua.edu.cn/packages/90/97/928b89de2e23cc67136eccccf1c122adf74ffdb65bbf7d2964b937cedd4f/eventlet-0.33.3-py2.py3-none-any.whl (226 > kB) > > Collecting oslo.i18n>=1.5.0 (from -r /opt/stack/cyborg/requirements.txt > (line 9)) > > Using cached > https://pypi.tuna.tsinghua.edu.cn/packages/62/5a/ced9667f5a35d712734bf3449afad763283e31ea9a903137eb42df29d948/oslo.i18n-6.0.0-py3-none-any.whl (46 > kB) > > Collecting oslo.config!=4.3.0,!=4.4.0,>=1.1.0 (from -r > /opt/stack/cyborg/requirements.txt (line 10)) > > Using cached > https://pypi.tuna.tsinghua.edu.cn/packages/ce/15/97fb14b7a1692693610a8e00e2a08e4186d6cdd875b6ac24c912a429b665/oslo.config-9.1.1-py3-none-any.whl (128 > kB) > > Collecting oslo.log>=5.0.0 (from -r /opt/stack/cyborg/requirements.txt > (line 11)) > > Using cached > https://pypi.tuna.tsinghua.edu.cn/packages/9c/24/526545a76513c741c65eeb70a85c93287f10665147c7cff7e0eb24918d43/oslo.log-5.2.0-py3-none-any.whl (71 > kB) > > Collecting oslo.context>=2.9.0 (from -r /opt/stack/cyborg/requirements.txt > (line 12)) > > Using cached > https://pypi.tuna.tsinghua.edu.cn/packages/7f/69/3a16785c49890cce2f7f14ee6525c76e7116f9dad44f122f5e3670e43970/oslo.context-5.1.1-py3-none-any.whl (20 > kB) > > ERROR: Cannot install oslo.messaging>=14.1.0 because these package > versions have conflicting dependencies. > > > > The conflict is caused by: > > The user requested oslo.messaging>=14.1.0 > > The user requested (constraint) oslo-messaging===14.3.0 > > > > To fix this you could try to: > > 1. loosen the range of package versions you've specified > > 2. 
remove package versions to allow pip attempt to solve the dependency > conflict > > > > ERROR: ResolutionImpossible: for help visit > https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts > > > > =========================================================================================================================================== > log end > =========================================================================================================================================== > > ERROR: could not install deps [-c > https://releases.openstack.org/constraints/upper/master, > -r/opt/stack/cyborg/requirements.txt, > -r/opt/stack/cyborg/test-requirements.txt]; v = > InvocationError('/opt/stack/cyborg/.tox/shared/bin/python -m pip install -c > https://releases.openstack.org/constraints/upper/master -r/opt/stack/cyborg/requirements.txt > -r/opt/stack/cyborg/test-requirements.txt', 1) > > ___________________________________________________________________________________________________________________________________________ > summary > ___________________________________________________________________________________________________________________________________________ > > ERROR: pep8: could not install deps [-c > https://releases.openstack.org/constraints/upper/master, > -r/opt/stack/cyborg/requirements.txt, > -r/opt/stack/cyborg/test-requirements.txt]; v = > InvocationError('/opt/stack/cyborg/.tox/shared/bin/python -m pip install -c > https://releases.openstack.org/constraints/upper/master -r/opt/stack/cyborg/requirements.txt > -r/opt/stack/cyborg/test-requirements.txt', 1) > > > > > > > > I hope you can help us. Thank you very much. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliver.weinmann at me.com Thu May 25 16:13:13 2023 From: oliver.weinmann at me.com (Oliver Weinmann) Date: Thu, 25 May 2023 18:13:13 +0200 Subject: [kolla-ansible][yoga] Installation failure while `creating shard root mysql user` In-Reply-To: References: Message-ID: <97CAE929-5D7D-42FC-AF8E-4B0FFEFF7159@me.com> An HTML attachment was scrubbed... URL: From techstep at gmail.com Thu May 25 18:42:07 2023 From: techstep at gmail.com (Rob Jefferson) Date: Thu, 25 May 2023 14:42:07 -0400 Subject: [kolla-ansible][yoga] Installation failure while `creating shard root mysql user` In-Reply-To: References: Message-ID: I've compared the Ansible versions -- both the install and kolla-toolbox are running the same Ansible version (ansible-core 2.11.12). I made sure of that before I wrote the initial email, and made doubly sure today. I've also compared ansible-galaxy module versions, but that should be more relevant on the kolla-toolbox side. Maybe the issue is with one of the Python modules -- would anyone who has this working mind doing a `pip freeze` and `ansible-galaxy collection list` and sending what you have, just for a basis of comparison? Thanks, Rob On Wed, May 24, 2023 at 5:29?PM Michal Arbet wrote: > > Hi , this is incompatibility in Ansible version I believe. And if task is created via kolla-toolbox ..check version there .... > > On Wed, May 24, 2023, 22:06 Rob Jefferson wrote: >> >> I'm trying to set up Yoga (as part of a way of upgrading from Xena to >> Yoga to Zed) using kolla-ansible, and I'm finding that I'm getting the >> following error at the >> `kolla-ansible/ansible/roles/mariadb/tasks/register.yml` playbook: >> >> TASK [mariadb : Creating shard root mysql user] >> [...] >> fatal: [control01]: FAILED! 
=> >> { >> "changed": false, >> "msg": "Can not parse the inner module output: >> >> b'[WARNING]: Failure using method (v2_playbook_on_play_start) in callback >> plugin (> object >> at 0x7fa4e6e18d00>): 'Play' object has no attribute 'get_path' >> [WARNING]: Failure using method (v2_playbook_on_task_start) in callback >> plugin (> object >> at 0x7fa4e6e18d00>): list index out of range >> [WARNING]: Failure using method (v2_runner_on_ok) in callback plugin >> (> object at 0x7fa4e6e18d00>): list index out of range >> { "custom_stats": {}, >> "global_custom_stats": {}, >> "plays": [], >> "stats\": { >> "localhost": { >> "changed\": 0, >> "failures\": 0, >> "ignored\": 0, >> "ok": 1, >> "rescued\": 0, >> "skipped\": 0, >> "unreachable\": 0 >> } >> } >> }' >> "} >> >> I am not sure how to fix this error, or determine what's causing it. >> I've tried matching up Ansible versions between my installation >> environment and Yoga, I've tried upgrading Kolla, and so forth. >> >> For the installation environment, I'm using: >> >> * Debian 11 >> * Ansible 4.10.0 >> * ansible-core 2.11.2 >> * docker-ce 5.20.10 >> >> Any help with this would be most appreciated. >> >> Thanks, >> >> Rob >> From techstep at gmail.com Thu May 25 20:58:09 2023 From: techstep at gmail.com (Rob Jefferson) Date: Thu, 25 May 2023 16:58:09 -0400 Subject: [kolla-ansible][yoga] Installation failure while `creating shard root mysql user` In-Reply-To: References: Message-ID: Update: So far, I have: * crafted a new installer (we build a Docker container to deploy to OpenStack); * grabbed the most recent Yoga Kolla release; * built and pushed containers from that; and * grabbed the most recent Kolla-Ansible release for Yoga. >From what I'm gathering right now, I now appear to have a functional OpenStack Yoga install. I suspect the update that changed the `ansible.posix` version to less than `1.5.4` and building the new images did it. Thanks, all! Rob On Thu, May 25, 2023 at 2:42?PM Rob Jefferson wrote: > > I've compared the Ansible versions -- both the install and > kolla-toolbox are running the same Ansible version (ansible-core > 2.11.12). I made sure of that before I wrote the initial email, and > made doubly sure today. > > I've also compared ansible-galaxy module versions, but that should be > more relevant on the kolla-toolbox side. > > Maybe the issue is with one of the Python modules -- would anyone who > has this working mind doing a `pip freeze` and `ansible-galaxy > collection list` and sending what you have, just for a basis of > comparison? > > Thanks, > > Rob > > > On Wed, May 24, 2023 at 5:29?PM Michal Arbet wrote: > > > > Hi , this is incompatibility in Ansible version I believe. And if task is created via kolla-toolbox ..check version there .... > > > > On Wed, May 24, 2023, 22:06 Rob Jefferson wrote: > >> > >> I'm trying to set up Yoga (as part of a way of upgrading from Xena to > >> Yoga to Zed) using kolla-ansible, and I'm finding that I'm getting the > >> following error at the > >> `kolla-ansible/ansible/roles/mariadb/tasks/register.yml` playbook: > >> > >> TASK [mariadb : Creating shard root mysql user] > >> [...] > >> fatal: [control01]: FAILED! 
=> > >> { > >> "changed": false, > >> "msg": "Can not parse the inner module output: > >> > >> b'[WARNING]: Failure using method (v2_playbook_on_play_start) in callback > >> plugin ( >> object > >> at 0x7fa4e6e18d00>): 'Play' object has no attribute 'get_path' > >> [WARNING]: Failure using method (v2_playbook_on_task_start) in callback > >> plugin ( >> object > >> at 0x7fa4e6e18d00>): list index out of range > >> [WARNING]: Failure using method (v2_runner_on_ok) in callback plugin > >> ( >> object at 0x7fa4e6e18d00>): list index out of range > >> { "custom_stats": {}, > >> "global_custom_stats": {}, > >> "plays": [], > >> "stats\": { > >> "localhost": { > >> "changed\": 0, > >> "failures\": 0, > >> "ignored\": 0, > >> "ok": 1, > >> "rescued\": 0, > >> "skipped\": 0, > >> "unreachable\": 0 > >> } > >> } > >> }' > >> "} > >> > >> I am not sure how to fix this error, or determine what's causing it. > >> I've tried matching up Ansible versions between my installation > >> environment and Yoga, I've tried upgrading Kolla, and so forth. > >> > >> For the installation environment, I'm using: > >> > >> * Debian 11 > >> * Ansible 4.10.0 > >> * ansible-core 2.11.2 > >> * docker-ce 5.20.10 > >> > >> Any help with this would be most appreciated. > >> > >> Thanks, > >> > >> Rob > >> From songwenping at inspur.com Fri May 26 02:08:31 2023 From: songwenping at inspur.com (=?utf-8?B?QWxleCBTb25nICjlrovmloflubMp?=) Date: Fri, 26 May 2023 02:08:31 +0000 Subject: =?utf-8?B?562U5aSNOiBwZXA4IGVycm9yIG9mIG5ld2VzdCBkZXZzdGFjaw==?= In-Reply-To: <20230525114637.476aqf6ptd2kkwd7@yuggoth.org> References: <594480c84ac343dd906ee575c0fbf4c4@inspur.com> <20230525114637.476aqf6ptd2kkwd7@yuggoth.org> Message-ID: Thanks Jeremy, my python is 3.8, the problem is my pypi is not outdate. -----????----- ???: Jeremy Stanley [mailto:fungi at yuggoth.org] ????: 2023?5?25? 19:47 ???: openstack-discuss at lists.openstack.org ??: Re: pep8 error of newest devstack On 2023-05-25 06:05:46 +0000 (+0000), Alex Song (???) wrote: > When I check pep8 with the newest devstack env by the cmd ?tox ?e > pep8?, it raise many packages conflict. The log is as follows. [...] > ERROR: Cannot install oslo.messaging>=14.1.0 because these package > versions have conflicting dependencies. > > The conflict is caused by: > > The user requested oslo.messaging>=14.1.0 > > The user requested (constraint) oslo-messaging===14.3.0 [...] It's unclear from your message what version of the CPython interpreter you're using. The current releases of oslo.messaging require Python 3.8 or newer, so if you're trying to do this with older Python that would explain the error you're seeing. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3774 bytes Desc: not available URL: From abishop at redhat.com Fri May 26 04:33:14 2023 From: abishop at redhat.com (Alan Bishop) Date: Thu, 25 May 2023 21:33:14 -0700 Subject: Add netapp storage in edge site | Wallaby | DCN In-Reply-To: References: Message-ID: On Thu, May 25, 2023 at 12:09?AM Swogat Pradhan wrote: > Hi Alan, > So, can I include the cinder-netapp-storage.yaml file in the central site > and then use the new backend to add storage to edge VM's? > Where is the NetApp physically located? Tripleo's DCN architecture assumes the storage is physically located at the same site where the cinder-volume service will be deployed. 
If you include the cinder-netapp-storage.yaml environment file in the central site's controlplane, then VMs at the edge site will encounter the problems I outlined earlier (network latency, no ability to do cross-AZ attachments). > I believe it is not possible right?? as the cinder volume in the edge > won't have the config for the netapp. > The cinder-volume services at an edge site are meant to manage storage devices at that site. If the NetApp is at the edge site, ideally you'd include some variation of the cinder-netapp-storage.yaml environment file in the edge site's deployment. However, then you're faced with the fact that the NetApp driver doesn't support A/A, which is required for c-vol services running at edge sites (In case you're not familiar with these details, tripleo runs all cinder-volume services in active/passive mode under pacemaker on controllers in the controlplane. Thus, only a single instance runs at any time, and pacemaker provides HA by moving the service to another controller if the first one goes down. However, pacemaker is not available at edge sites, and so to get HA, multiple instances of the cinder-volume service run simultaneously on 3 nodes (A/A), using etcd as a Distributed Lock Manager (DLM) to coordinate things. But drivers must specifically support running A/A, and the NetApp driver does NOT.) Alan > With regards, > Swogat Pradhan > > On Thu, May 25, 2023 at 2:17?AM Alan Bishop wrote: > >> >> >> On Wed, May 24, 2023 at 3:15?AM Swogat Pradhan >> wrote: >> >>> Hi, >>> I have a DCN setup and there is a requirement to use a netapp storage >>> device in one of the edge sites. >>> Can someone please confirm if it is possible? >>> >> >> I see from prior email to this list that you're using tripleo, so I'll >> respond with that in mind. >> >> There are many factors that come into play, but I suspect the short >> answer to your question is no. >> >> Tripleo's DCN architecture requires the cinder-volume service running at >> edge sites to run in active-active >> mode, where there are separate instances running on three nodes in to for >> the service to be highly >> available (HA).The problem is that only a small number of cinder drivers >> support running A/A, and NetApp's >> drivers do not support A/A. >> >> It's conceivable you could create a custom tripleo role that deploys just >> a single node running cinder-volume >> with a NetApp backend, but it wouldn't be HA. >> >> It's also conceivable you could locate the NetApp system in the central >> site's controlplane, but there are >> extremely difficult constraints you'd need to overcome: >> - Network latency between the central and edge sites would mean the disk >> performance would be bad. >> - You'd be limited to using iSCSI (FC wouldn't work) >> - Tripleo disables cross-AZ attachments, so the only way for an edge site >> to access a NetApp volume >> would be to configure the cinder-volume service running in the >> controlplane with a backend availability >> zone set to the edge site's AZ. You mentioned the NetApp is needed "in >> one of the edge sites," but in >> reality the NetApp would be available in one, AND ONLY ONE edge site, and >> it would also not be available >> to any instances running in the central site. >> >> Alan >> >> >>> And if so then should i add the parameters in the edge deployment script >>> or the central deployment script. >>> Any suggestions? >>> >>> With regards, >>> Swogat Pradhan >>> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From swogatpradhan22 at gmail.com Fri May 26 04:39:02 2023 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Fri, 26 May 2023 10:09:02 +0530 Subject: Add netapp storage in edge site | Wallaby | DCN In-Reply-To: References: Message-ID: Hi Alan, My netapp storage is located in edge site itself. As the networks are routable my central site is able to reach the netapp storage ip address (ping response is 30ms-40ms). Let's say i included the netapp storage yaml in central site deployment script (which is not recommended) and i am able to create the volumes as it is reachable from controller nodes. Will i be able to mount those volumes in edge site VM's?? And if i am able to do so, then how will the data flow?? When storing something in the netapp volume will the data flow through the central site controller and get stored in the storage space? With regards, Swogat Pradhan On Fri, 26 May 2023, 10:03 am Alan Bishop, wrote: > > > On Thu, May 25, 2023 at 12:09?AM Swogat Pradhan > wrote: > >> Hi Alan, >> So, can I include the cinder-netapp-storage.yaml file in the central site >> and then use the new backend to add storage to edge VM's? >> > > Where is the NetApp physically located? Tripleo's DCN architecture assumes > the storage is physically located at the same site where the cinder-volume > service will be deployed. If you include the cinder-netapp-storage.yaml > environment file in the central site's controlplane, then VMs at the edge > site will encounter the problems I outlined earlier (network latency, no > ability to do cross-AZ attachments). > > >> I believe it is not possible right?? as the cinder volume in the edge >> won't have the config for the netapp. >> > > The cinder-volume services at an edge site are meant to manage storage > devices at that site. If the NetApp is at the edge site, ideally you'd > include some variation of the cinder-netapp-storage.yaml environment file > in the edge site's deployment. However, then you're faced with the fact > that the NetApp driver doesn't support A/A, which is required for c-vol > services running at edge sites (In case you're not familiar with these > details, tripleo runs all cinder-volume services in active/passive mode > under pacemaker on controllers in the controlplane. Thus, only a single > instance runs at any time, and pacemaker provides HA by moving the service > to another controller if the first one goes down. However, pacemaker is not > available at edge sites, and so to get HA, multiple instances of the > cinder-volume service run simultaneously on 3 nodes (A/A), using etcd as a > Distributed Lock Manager (DLM) to coordinate things. But drivers must > specifically support running A/A, and the NetApp driver does NOT.) > > Alan > > >> With regards, >> Swogat Pradhan >> >> On Thu, May 25, 2023 at 2:17?AM Alan Bishop wrote: >> >>> >>> >>> On Wed, May 24, 2023 at 3:15?AM Swogat Pradhan < >>> swogatpradhan22 at gmail.com> wrote: >>> >>>> Hi, >>>> I have a DCN setup and there is a requirement to use a netapp storage >>>> device in one of the edge sites. >>>> Can someone please confirm if it is possible? >>>> >>> >>> I see from prior email to this list that you're using tripleo, so I'll >>> respond with that in mind. >>> >>> There are many factors that come into play, but I suspect the short >>> answer to your question is no. 
>>> >>> Tripleo's DCN architecture requires the cinder-volume service running at >>> edge sites to run in active-active >>> mode, where there are separate instances running on three nodes in to >>> for the service to be highly >>> available (HA).The problem is that only a small number of cinder drivers >>> support running A/A, and NetApp's >>> drivers do not support A/A. >>> >>> It's conceivable you could create a custom tripleo role that deploys >>> just a single node running cinder-volume >>> with a NetApp backend, but it wouldn't be HA. >>> >>> It's also conceivable you could locate the NetApp system in the central >>> site's controlplane, but there are >>> extremely difficult constraints you'd need to overcome: >>> - Network latency between the central and edge sites would mean the disk >>> performance would be bad. >>> - You'd be limited to using iSCSI (FC wouldn't work) >>> - Tripleo disables cross-AZ attachments, so the only way for an edge >>> site to access a NetApp volume >>> would be to configure the cinder-volume service running in the >>> controlplane with a backend availability >>> zone set to the edge site's AZ. You mentioned the NetApp is needed "in >>> one of the edge sites," but in >>> reality the NetApp would be available in one, AND ONLY ONE edge site, >>> and it would also not be available >>> to any instances running in the central site. >>> >>> Alan >>> >>> >>>> And if so then should i add the parameters in the edge deployment >>>> script or the central deployment script. >>>> Any suggestions? >>>> >>>> With regards, >>>> Swogat Pradhan >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Fri May 26 11:52:12 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 26 May 2023 13:52:12 +0200 Subject: [neutron] Neutron drivers meeting cancelled Message-ID: Hello Neutrinos: Due to the lack of agenda [1], today's meeting is cancelled. Have a nice weekend! [1]https://wiki.openstack.org/wiki/Meetings/NeutronDrivers -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Fri May 26 15:26:26 2023 From: zigo at debian.org (Thomas Goirand) Date: Fri, 26 May 2023 17:26:26 +0200 Subject: [nova]vnc and spice used together in openstack In-Reply-To: References: Message-ID: On 5/25/23 05:16, ??? wrote: > Hi, > > When using the console, we enabled both nova spicehtml5proxy and nova > novancproxy on the controller node, and enabled vnc and spice on > compute1 and compute2, respectively. We found that there was a problem > with the coexistence of the two, The log is as follows. > > ?ERROR oslo_messaging.rpc.server > [req-28674cdd-f06e-466f-8611-516efc8fbe3c > [...] > > I don't know why there is an appeal issue or what special mechanism > OpenStack has for the console. I can only choose one. I hope you can > help us. Thank you very much. Hi, This isn't a limitation of OpenStack, but in KVM itself, as your VM instances can only output video to one and only one backend at a time (which is chosen when the VM starts, it cannot be changed at runtime). I hope this helps, Cheers, Thomas Goirand (zigo) From zigo at debian.org Fri May 26 16:19:09 2023 From: zigo at debian.org (Thomas Goirand) Date: Fri, 26 May 2023 18:19:09 +0200 Subject: [nova][ops] EOL'ing stable/train ? In-Reply-To: References: Message-ID: <2019509c-bbe6-864e-bd32-e19ae084d8ba@debian.org> On 5/24/23 12:24, Sylvain Bauza wrote: > Hi folks, in particular operators... 
> > We discussed yesterday during the nova meeting [1] about our stable > branches and eventually, we were wondering whether we should EOL [2] the > stable/train branch for Nova. > > Why so ? Two points : > 1/ The gate is failing at the moment for the branch. > 2/ Two CVEs (CVE-2022-47951 [3] and CVE-2023-2088 [4]) aren't fixed in > this branch. Hi, This is very disappointing to see these CVE as the cause for deprecating the branches. It should have been the opposite way: it should have triggered some effort to fix them... :/ FYI, I tried to get the fix in, and managed to break instead of fixing. An interesting way to fix CVE-2022-47951 could be to completely disable VMDK support. How hard would this be? As for CVE-2023-2088, the issue is implementing the force > It would be difficult to fix the CVEs in the upstream branch but > hopefully AFAIK all the OpenStack distros already fixed them for their > related releases that use Train. So far, I haven't seen such a fix, neither in Ubuntu or RedHat, on any version prior to ussuri. If you have a link to a working patch, please let me know. Cheers, Thomas Goirand (zigo) From dfgyuri at yandex.com Fri May 26 00:33:59 2023 From: dfgyuri at yandex.com (Yuri Blas) Date: Thu, 25 May 2023 17:33:59 -0700 Subject: [Glance] [Ceph] [Zed] Glance no longer working after upgrading Ceph cluster to 17.2.6 In-Reply-To: References: <308621684971757@mail.yandex.com> Message-ID: <475041685060560@mail.yandex.com> An HTML attachment was scrubbed... URL: From songwenping at inspur.com Fri May 26 02:07:02 2023 From: songwenping at inspur.com (=?utf-8?B?QWxleCBTb25nICjlrovmloflubMp?=) Date: Fri, 26 May 2023 02:07:02 +0000 Subject: =?utf-8?B?562U5aSNOiBwZXA4IGVycm9yIG9mIG5ld2VzdCBkZXZzdGFjaw==?= In-Reply-To: References: Message-ID: <8c08d70a2056402da674a960d9af6853@inspur.com> Thanks, Yatin Karel, it?s really the pypi problem. ???: Yatin Karel [mailto:ykarel at redhat.com] ????: 2023?5?25? 23:56 ???: Alex Song (???) ??: openstack-discuss ??: Re: pep8 error of newest devstack Hi Alex, The issue looks due to an outdated mirror(i.e https://pypi.tuna.tsinghua.edu.cn), it should work fine with an up to date mirror. See below 14.3.0 is not available in the pypi index you using:- $ curl https://pypi.tuna.tsinghua.edu.cn/simple/oslo-messaging/ 2>/dev/null|grep oslo.messaging-14 oslo.messaging-14.0.0-py3-none-any.whl
oslo.messaging-14.0.0.tar.gz
oslo.messaging-14.1.0-py3-none-any.whl
oslo.messaging-14.1.0.tar.gz
oslo.messaging-14.2.0-py3-none-any.whl
oslo.messaging-14.2.0.tar.gz

V/s

$ curl https://pypi.org/simple/oslo-messaging/ 2>/dev/null|grep oslo.messaging-14
oslo.messaging-14.0.0-py3-none-any.whl
oslo.messaging-14.0.0.tar.gz
oslo.messaging-14.1.0-py3-none-any.whl
oslo.messaging-14.1.0.tar.gz
oslo.messaging-14.2.0-py3-none-any.whl
oslo.messaging-14.2.0.tar.gz
oslo.messaging-14.3.0-py3-none-any.whl
oslo.messaging-14.3.0.tar.gz
Thanks and Regards Yatin Karel On Thu, May 25, 2023 at 11:46?AM Alex Song (???) > wrote: Hi, When I check pep8 with the newest devstack env by the cmd ?tox ?e pep8?, it raise many packages conflict. The log is as follows. root at song-ctrl:/opt/stack/cyborg# tox -e pep8 pep8 create: /opt/stack/cyborg/.tox/shared pep8 installdeps: -chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements.txt ERROR: invocation failed (exit code 1), logfile: /opt/stack/cyborg/.tox/shared/log/pep8-1.log ========================================================================================================================================== log start ========================================================================================================================================== Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Collecting pbr!=2.1.0,>=0.11 (from -r /opt/stack/cyborg/requirements.txt (line 5)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/01/06/4ab11bf70db5a60689fc521b636849c8593eb67a2c6bdf73a16c72d16a12/pbr-5.11.1-py2.py3-none-any.whl (112 kB) Collecting pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.0.0 (from -r /opt/stack/cyborg/requirements.txt (line 6)) Using cached pecan-1.4.2-py3-none-any.whl Collecting WSME>=0.10.1 (from -r /opt/stack/cyborg/requirements.txt (line 7)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d2/a0/54b71fd2e1ab4f838acf472726a14dadb6d1c77adc18d7e2c062dd955ff9/WSME-0.11.0-py3-none-any.whl (59 kB) Collecting eventlet>=0.26.0 (from -r /opt/stack/cyborg/requirements.txt (line 8)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/90/97/928b89de2e23cc67136eccccf1c122adf74ffdb65bbf7d2964b937cedd4f/eventlet-0.33.3-py2.py3-none-any.whl (226 kB) Collecting oslo.i18n>=1.5.0 (from -r /opt/stack/cyborg/requirements.txt (line 9)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/62/5a/ced9667f5a35d712734bf3449afad763283e31ea9a903137eb42df29d948/oslo.i18n-6.0.0-py3-none-any.whl (46 kB) Collecting oslo.config!=4.3.0,!=4.4.0,>=1.1.0 (from -r /opt/stack/cyborg/requirements.txt (line 10)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ce/15/97fb14b7a1692693610a8e00e2a08e4186d6cdd875b6ac24c912a429b665/oslo.config-9.1.1-py3-none-any.whl (128 kB) Collecting oslo.log>=5.0.0 (from -r /opt/stack/cyborg/requirements.txt (line 11)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9c/24/526545a76513c741c65eeb70a85c93287f10665147c7cff7e0eb24918d43/oslo.log-5.2.0-py3-none-any.whl (71 kB) Collecting oslo.context>=2.9.0 (from -r /opt/stack/cyborg/requirements.txt (line 12)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/7f/69/3a16785c49890cce2f7f14ee6525c76e7116f9dad44f122f5e3670e43970/oslo.context-5.1.1-py3-none-any.whl (20 kB) ERROR: Cannot install oslo.messaging>=14.1.0 because these package versions have conflicting dependencies. The conflict is caused by: The user requested oslo.messaging>=14.1.0 The user requested (constraint) oslo-messaging===14.3.0 To fix this you could try to: 1. loosen the range of package versions you've specified 2. 
remove package versions to allow pip attempt to solve the dependency conflict ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts =========================================================================================================================================== log end =========================================================================================================================================== ERROR: could not install deps [-chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements.txt]; v = InvocationError('/opt/stack/cyborg/.tox/shared/bin/python -m pip install -chttps://releases.openstack.org/constraints/upper/master -r/opt/stack/cyborg/requirements.txt -r/opt/stack/cyborg/test-requirements.txt', 1) ___________________________________________________________________________________________________________________________________________ summary ___________________________________________________________________________________________________________________________________________ ERROR: pep8: could not install deps [-chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements.txt]; v = InvocationError('/opt/stack/cyborg/.tox/shared/bin/python -m pip install -chttps://releases.openstack.org/constraints/upper/master -r/opt/stack/cyborg/requirements.txt -r/opt/stack/cyborg/test-requirements.txt', 1) I hope you can help us. Thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3774 bytes Desc: not available URL: From songwenping at inspur.com Fri May 26 09:56:04 2023 From: songwenping at inspur.com (=?utf-8?B?QWxleCBTb25nICjlrovmloflubMp?=) Date: Fri, 26 May 2023 09:56:04 +0000 Subject: =?utf-8?B?562U5aSNOiBwZXA4IGVycm9yIG9mIG5ld2VzdCBkZXZzdGFjaw==?= References: Message-ID: <889ca52337d7449ab6e35a124ebe4159@inspur.com> Hi, yatain: The openstacksdk project also cannot check pep8, other projects is ok such as nova, cyborg. the error is as follows. 
### version information ``` pre-commit version: 3.3.2 git --version: git version 2.25.1 sys.version: 3.8.10 (default, Mar 13 2023, 10:26:41) [GCC 9.4.0] sys.executable: /opt/stack/openstacksdk/.tox/pep8/bin/python os.name: posix sys.platform: linux ``` ### error information ``` An unexpected error has occurred: CalledProcessError: command: ('/usr/bin/git', 'fetch', 'origin', '--tags') return code: 128 stdout: (none) stderr: (none) ``` ``` Traceback (most recent call last): File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/store.py", line 202, in clone_strategy self._shallow_clone(ref, _git_cmd) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/store.py", line 184, in _shallow_clone git_cmd('-c', git_config, 'fetch', 'origin', ref, '--depth=1') File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/store.py", line 199, in _git_cmd cmd_output_b('git', *args, cwd=directory, env=env) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/util.py", line 110, in cmd_output_b raise CalledProcessError(returncode, cmd, stdout_b, stderr_b) pre_commit.util.CalledProcessError: command: ('/usr/bin/git', '-c', 'protocol.version=2', 'fetch', 'origin', 'v4.4.0', '--depth=1') return code: 128 stdout: (none) stderr: (none) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/error_handler.py", line 73, in error_handler yield File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/main.py", line 414, in main return run(args.config, store, args) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/commands/run.py", line 425, in run for hook in all_hooks(config, store) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/repository.py", line 252, in all_hooks return tuple( File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/repository.py", line 255, in for hook in _repository_hooks(repo, store, root_config) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/repository.py", line 230, in _repository_hooks return _cloned_repository_hooks(repo_config, store, root_config) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/repository.py", line 196, in _cloned_repository_hooks manifest_path = os.path.join(store.clone(repo, rev), C.MANIFEST_FILE) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/store.py", line 206, in clone return self._new_repo(repo, ref, deps, clone_strategy) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/store.py", line 163, in _new_repo make_strategy(directory) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/store.py", line 204, in clone_strategy self._complete_clone(ref, _git_cmd) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/store.py", line 176, in _complete_clone git_cmd('fetch', 'origin', '--tags') File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/store.py", line 199, in _git_cmd cmd_output_b('git', *args, cwd=directory, env=env) File "/opt/stack/openstacksdk/.tox/pep8/lib/python3.8/site-packages/pre_commit/util.py", line 110, in cmd_output_b raise CalledProcessError(returncode, cmd, stdout_b, stderr_b) pre_commit.util.CalledProcessError: command: 
('/usr/bin/git', 'fetch', 'origin', '--tags') return code: 128 stdout: (none) stderr: (none) Please help, thanks. ???: Alex Song (???) ????: 2023?5?26? 10:07 ???: 'ykarel at redhat.com' ??: 'openstack-discuss at lists.openstack.org' ??: ??: pep8 error of newest devstack Thanks, Yatin Karel, it?s really the pypi problem. ???: Yatin Karel [mailto:ykarel at redhat.com] ????: 2023?5?25? 23:56 ???: Alex Song (???) > ??: openstack-discuss > ??: Re: pep8 error of newest devstack Hi Alex, The issue looks due to an outdated mirror(i.e https://pypi.tuna.tsinghua.edu.cn), it should work fine with an up to date mirror. See below 14.3.0 is not available in the pypi index you using:- $ curl https://pypi.tuna.tsinghua.edu.cn/simple/oslo-messaging/ 2>/dev/null|grep oslo.messaging-14 oslo.messaging-14.0.0-py3-none-any.whl
oslo.messaging-14.0.0.tar.gz
oslo.messaging-14.1.0-py3-none-any.whl
oslo.messaging-14.1.0.tar.gz
oslo.messaging-14.2.0-py3-none-any.whl
oslo.messaging-14.2.0.tar.gz

V/s

$ curl https://pypi.org/simple/oslo-messaging/ 2>/dev/null|grep oslo.messaging-14
oslo.messaging-14.0.0-py3-none-any.whl
oslo.messaging-14.0.0.tar.gz
oslo.messaging-14.1.0-py3-none-any.whl
oslo.messaging-14.1.0.tar.gz
oslo.messaging-14.2.0-py3-none-any.whl
oslo.messaging-14.2.0.tar.gz
oslo.messaging-14.3.0-py3-none-any.whl
oslo.messaging-14.3.0.tar.gz
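
If resyncing or switching the mirror is not convenient, one workaround sketch (assuming the node can reach pypi.org directly) is to override the index just for the tox run and recreate the environment:

$ export PIP_INDEX_URL=https://pypi.org/simple
$ tox -r -e pep8

The index URL and the -r (recreate environments) flag above are only illustrative; any mirror that already carries the oslo.messaging 14.3.0 release pinned by upper-constraints will work just as well.
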
Thanks and Regards Yatin Karel On Thu, May 25, 2023 at 11:46?AM Alex Song (???) > wrote: Hi, When I check pep8 with the newest devstack env by the cmd ?tox ?e pep8?, it raise many packages conflict. The log is as follows. root at song-ctrl:/opt/stack/cyborg# tox -e pep8 pep8 create: /opt/stack/cyborg/.tox/shared pep8 installdeps: -chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements.txt ERROR: invocation failed (exit code 1), logfile: /opt/stack/cyborg/.tox/shared/log/pep8-1.log ========================================================================================================================================== log start ========================================================================================================================================== Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Collecting pbr!=2.1.0,>=0.11 (from -r /opt/stack/cyborg/requirements.txt (line 5)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/01/06/4ab11bf70db5a60689fc521b636849c8593eb67a2c6bdf73a16c72d16a12/pbr-5.11.1-py2.py3-none-any.whl (112 kB) Collecting pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.0.0 (from -r /opt/stack/cyborg/requirements.txt (line 6)) Using cached pecan-1.4.2-py3-none-any.whl Collecting WSME>=0.10.1 (from -r /opt/stack/cyborg/requirements.txt (line 7)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d2/a0/54b71fd2e1ab4f838acf472726a14dadb6d1c77adc18d7e2c062dd955ff9/WSME-0.11.0-py3-none-any.whl (59 kB) Collecting eventlet>=0.26.0 (from -r /opt/stack/cyborg/requirements.txt (line 8)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/90/97/928b89de2e23cc67136eccccf1c122adf74ffdb65bbf7d2964b937cedd4f/eventlet-0.33.3-py2.py3-none-any.whl (226 kB) Collecting oslo.i18n>=1.5.0 (from -r /opt/stack/cyborg/requirements.txt (line 9)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/62/5a/ced9667f5a35d712734bf3449afad763283e31ea9a903137eb42df29d948/oslo.i18n-6.0.0-py3-none-any.whl (46 kB) Collecting oslo.config!=4.3.0,!=4.4.0,>=1.1.0 (from -r /opt/stack/cyborg/requirements.txt (line 10)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/ce/15/97fb14b7a1692693610a8e00e2a08e4186d6cdd875b6ac24c912a429b665/oslo.config-9.1.1-py3-none-any.whl (128 kB) Collecting oslo.log>=5.0.0 (from -r /opt/stack/cyborg/requirements.txt (line 11)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/9c/24/526545a76513c741c65eeb70a85c93287f10665147c7cff7e0eb24918d43/oslo.log-5.2.0-py3-none-any.whl (71 kB) Collecting oslo.context>=2.9.0 (from -r /opt/stack/cyborg/requirements.txt (line 12)) Using cached https://pypi.tuna.tsinghua.edu.cn/packages/7f/69/3a16785c49890cce2f7f14ee6525c76e7116f9dad44f122f5e3670e43970/oslo.context-5.1.1-py3-none-any.whl (20 kB) ERROR: Cannot install oslo.messaging>=14.1.0 because these package versions have conflicting dependencies. The conflict is caused by: The user requested oslo.messaging>=14.1.0 The user requested (constraint) oslo-messaging===14.3.0 To fix this you could try to: 1. loosen the range of package versions you've specified 2. 
remove package versions to allow pip attempt to solve the dependency conflict ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts =========================================================================================================================================== log end =========================================================================================================================================== ERROR: could not install deps [-chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements.txt]; v = InvocationError('/opt/stack/cyborg/.tox/shared/bin/python -m pip install -chttps://releases.openstack.org/constraints/upper/master -r/opt/stack/cyborg/requirements.txt -r/opt/stack/cyborg/test-requirements.txt', 1) ___________________________________________________________________________________________________________________________________________ summary ___________________________________________________________________________________________________________________________________________ ERROR: pep8: could not install deps [-chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/cyborg/requirements.txt, -r/opt/stack/cyborg/test-requirements.txt]; v = InvocationError('/opt/stack/cyborg/.tox/shared/bin/python -m pip install -chttps://releases.openstack.org/constraints/upper/master -r/opt/stack/cyborg/requirements.txt -r/opt/stack/cyborg/test-requirements.txt', 1) I hope you can help us. Thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3774 bytes Desc: not available URL: From fungi at yuggoth.org Fri May 26 17:10:59 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 26 May 2023 17:10:59 +0000 Subject: [nova][ops] EOL'ing stable/train ? In-Reply-To: <2019509c-bbe6-864e-bd32-e19ae084d8ba@debian.org> References: <2019509c-bbe6-864e-bd32-e19ae084d8ba@debian.org> Message-ID: <20230526171058.xi5rqf6pslvctxnz@yuggoth.org> On 2023-05-26 18:19:09 +0200 (+0200), Thomas Goirand wrote: > On 5/24/23 12:24, Sylvain Bauza wrote: [...] > As for CVE-2023-2088, the issue is implementing the force > > > It would be difficult to fix the CVEs in the upstream branch but > > hopefully AFAIK all the OpenStack distros already fixed them for their > > related releases that use Train. > > So far, I haven't seen such a fix, neither in Ubuntu or RedHat, on any > version prior to ussuri. If you have a link to a working patch, please let > me know. I think he may be referring to Red Hat. As I understand it, they went with the https://wiki.openstack.org/wiki/OSSN/OSSN-0092 approach (mitigation through configuration only, disabling attachment-delete functionality for users). I may be wrong though, as I was not privy to their internal discussions. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From smooney at redhat.com Fri May 26 17:50:32 2023 From: smooney at redhat.com (Sean Mooney) Date: Fri, 26 May 2023 18:50:32 +0100 Subject: [nova][ops] EOL'ing stable/train ? 
In-Reply-To: <20230526171058.xi5rqf6pslvctxnz@yuggoth.org> References: <2019509c-bbe6-864e-bd32-e19ae084d8ba@debian.org> <20230526171058.xi5rqf6pslvctxnz@yuggoth.org> Message-ID: <6e0ccdfb53da64234a7700745c844c920ba21457.camel@redhat.com> On Fri, 2023-05-26 at 17:10 +0000, Jeremy Stanley wrote: > On 2023-05-26 18:19:09 +0200 (+0200), Thomas Goirand wrote: > > On 5/24/23 12:24, Sylvain Bauza wrote: > [...] > > As for CVE-2023-2088, the issue is implementing the force > > > > > It would be difficult to fix the CVEs in the upstream branch but > > > hopefully AFAIK all the OpenStack distros already fixed them for their > > > related releases that use Train. > > > > So far, I haven't seen such a fix, neither in Ubuntu or RedHat, on any > > version prior to ussuri. If you have a link to a working patch, please let > > me know. for redhat openstack plathform 16 (trian) we fixed the vmdk issue (CVE-2022-47951) by increasing the version of oslo.utils? that we shiped to ensure it had the relevant json format options to inspect the iamge and bacislly used the same fix as on master. we also did that for queens / osp 13 the qemu wersion we used supprot this all the way back to 13/queens so that made that approch more viable. we cant do that upstream as it would break people but the way i would have prefered to do this? would have been to simply vendor the functionality in nova and continue the backport upstream without bumping the min oslo verions. we have done that in the past for other libs. > I think he may be referring to Red Hat. As I understand it, they > went with the https://wiki.openstack.org/wiki/OSSN/OSSN-0092 > approach (mitigation through configuration only, disabling > attachment-delete functionality for users). I may be wrong though, > as I was not privy to their internal discussions. downstream technically we never supproted VMDK in our product we did nto block it either but custoemr are not expect to use vmdk images with our downstream product. we still fixed the issue assumeing our customer cant contol what there customers are uploadign to ther openstack clouds. From marosvarchola at sunray.sk Sun May 28 00:06:38 2023 From: marosvarchola at sunray.sk (=?UTF-8?Q?Maro=C5=A1_Varchola?=) Date: Sun, 28 May 2023 02:06:38 +0200 Subject: CRITICAL! RabbitMQ PackageCloud repos will be not more available from today - affected Openstack-ansible Message-ID: <83dd077c3248b87e7bee15ebd4b88477@sunray.sk> From today, the PackageCloud repos will be not more available from today. Official information: https://github.com/rabbitmq/rabbitmq-server/discussions/8386 BUG on launchpad: https://bugs.launchpad.net/openstack-ansible/+bug/2021410 -- S pozdravom/Yours Sincerely, Maro? Varchola +421 950 401 060 Astrov? 6116/2, 071 01 Michalovce, Slovakia marosvarchola at sunray.sk ---- T?to spr?va bola digit?lne podp?san? PGP certifik?tom/ This message was digitally signed by PGP certificate From noonedeadpunk at gmail.com Sun May 28 00:27:15 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Sun, 28 May 2023 02:27:15 +0200 Subject: CRITICAL! RabbitMQ PackageCloud repos will be not more available from today - affected Openstack-ansible In-Reply-To: <83dd077c3248b87e7bee15ebd4b88477@sunray.sk> References: <83dd077c3248b87e7bee15ebd4b88477@sunray.sk> Message-ID: That is just ridiculous... We have just switched from cloudsmith because it's rotating packages too aggressively... ??, 28 ??? 2023 ?., 02:13 Maro? Varchola : > From today, the PackageCloud repos will be not more available from > today. 
Official information: > https://github.com/rabbitmq/rabbitmq-server/discussions/8386 > > BUG on launchpad: > https://bugs.launchpad.net/openstack-ansible/+bug/2021410 > > -- > S pozdravom/Yours Sincerely, > Maro? Varchola > +421 950 401 060 > Astrov? 6116/2, 071 01 Michalovce, Slovakia > marosvarchola at sunray.sk > > ---- > T?to spr?va bola digit?lne podp?san? PGP certifik?tom/ > This message was digitally signed by PGP certificate > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From swogatpradhan22 at gmail.com Sun May 28 14:58:53 2023 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Sun, 28 May 2023 20:28:53 +0530 Subject: Add netapp storage in edge site | Wallaby | DCN In-Reply-To: References: Message-ID: Hi Alan, If you please could help me clarify this. With regards, Swogat Pradhan On Fri, 26 May 2023, 10:09 am Swogat Pradhan, wrote: > Hi Alan, > My netapp storage is located in edge site itself. > As the networks are routable my central site is able to reach the netapp > storage ip address (ping response is 30ms-40ms). > Let's say i included the netapp storage yaml in central site deployment > script (which is not recommended) and i am able to create the volumes as it > is reachable from controller nodes. > Will i be able to mount those volumes in edge site VM's?? And if i am able > to do so, then how will the data flow?? When storing something in the > netapp volume will the data flow through the central site controller and > get stored in the storage space? > > With regards, > Swogat Pradhan > > On Fri, 26 May 2023, 10:03 am Alan Bishop, wrote: > >> >> >> On Thu, May 25, 2023 at 12:09?AM Swogat Pradhan < >> swogatpradhan22 at gmail.com> wrote: >> >>> Hi Alan, >>> So, can I include the cinder-netapp-storage.yaml file in the central >>> site and then use the new backend to add storage to edge VM's? >>> >> >> Where is the NetApp physically located? Tripleo's DCN architecture >> assumes the storage is physically located at the same site where the >> cinder-volume service will be deployed. If you include the >> cinder-netapp-storage.yaml environment file in the central site's >> controlplane, then VMs at the edge site will encounter the problems I >> outlined earlier (network latency, no ability to do cross-AZ attachments). >> >> >>> I believe it is not possible right?? as the cinder volume in the edge >>> won't have the config for the netapp. >>> >> >> The cinder-volume services at an edge site are meant to manage storage >> devices at that site. If the NetApp is at the edge site, ideally you'd >> include some variation of the cinder-netapp-storage.yaml environment file >> in the edge site's deployment. However, then you're faced with the fact >> that the NetApp driver doesn't support A/A, which is required for c-vol >> services running at edge sites (In case you're not familiar with these >> details, tripleo runs all cinder-volume services in active/passive mode >> under pacemaker on controllers in the controlplane. Thus, only a single >> instance runs at any time, and pacemaker provides HA by moving the service >> to another controller if the first one goes down. However, pacemaker is not >> available at edge sites, and so to get HA, multiple instances of the >> cinder-volume service run simultaneously on 3 nodes (A/A), using etcd as a >> Distributed Lock Manager (DLM) to coordinate things. But drivers must >> specifically support running A/A, and the NetApp driver does NOT.) 
>> >> Alan >> >> >>> With regards, >>> Swogat Pradhan >>> >>> On Thu, May 25, 2023 at 2:17?AM Alan Bishop wrote: >>> >>>> >>>> >>>> On Wed, May 24, 2023 at 3:15?AM Swogat Pradhan < >>>> swogatpradhan22 at gmail.com> wrote: >>>> >>>>> Hi, >>>>> I have a DCN setup and there is a requirement to use a netapp storage >>>>> device in one of the edge sites. >>>>> Can someone please confirm if it is possible? >>>>> >>>> >>>> I see from prior email to this list that you're using tripleo, so I'll >>>> respond with that in mind. >>>> >>>> There are many factors that come into play, but I suspect the short >>>> answer to your question is no. >>>> >>>> Tripleo's DCN architecture requires the cinder-volume service running >>>> at edge sites to run in active-active >>>> mode, where there are separate instances running on three nodes in to >>>> for the service to be highly >>>> available (HA).The problem is that only a small number of cinder >>>> drivers support running A/A, and NetApp's >>>> drivers do not support A/A. >>>> >>>> It's conceivable you could create a custom tripleo role that deploys >>>> just a single node running cinder-volume >>>> with a NetApp backend, but it wouldn't be HA. >>>> >>>> It's also conceivable you could locate the NetApp system in the central >>>> site's controlplane, but there are >>>> extremely difficult constraints you'd need to overcome: >>>> - Network latency between the central and edge sites would mean the >>>> disk performance would be bad. >>>> - You'd be limited to using iSCSI (FC wouldn't work) >>>> - Tripleo disables cross-AZ attachments, so the only way for an edge >>>> site to access a NetApp volume >>>> would be to configure the cinder-volume service running in the >>>> controlplane with a backend availability >>>> zone set to the edge site's AZ. You mentioned the NetApp is needed "in >>>> one of the edge sites," but in >>>> reality the NetApp would be available in one, AND ONLY ONE edge site, >>>> and it would also not be available >>>> to any instances running in the central site. >>>> >>>> Alan >>>> >>>> >>>>> And if so then should i add the parameters in the edge deployment >>>>> script or the central deployment script. >>>>> Any suggestions? >>>>> >>>>> With regards, >>>>> Swogat Pradhan >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon May 29 07:31:53 2023 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 29 May 2023 09:31:53 +0200 Subject: [largescale-sig] Next meeting: May 31, 8utc Message-ID: Hi everyone, The Large Scale SIG will be meeting this Wednesday in #openstack-operators on OFTC IRC, at 8UTC, our APAC+EU-friendly time. I will be chairing. You can doublecheck how that UTC time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20230531T08 Feel free to add topics to the agenda: https://etherpad.opendev.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From ykarel at redhat.com Mon May 29 12:29:23 2023 From: ykarel at redhat.com (Yatin Karel) Date: Mon, 29 May 2023 17:59:23 +0530 Subject: [neutron] Bug Deputy Report May 22 - 28 Message-ID: Hello Neutron Team !! 
Please find bug report from May 22nd to 28th, Undecided bugs needs further triage:- *Critical:-* - https://bugs.launchpad.net/neutron/+bug/2020363 - [stable/train/] openstacksdk-functional-devstack fails with POST_FAILURE Fixed with https://review.opendev.org/c/openstack/neutron/+/883890 Assigned to ykarel *High:-* - https://bugs.launchpad.net/neutron/+bug/2020698 - neutron-tempest-plugin-bgpvpn-bagpipe job unstable Job made non voting temporary with https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/884270 Assigned to bhaley - https://bugs.launchpad.net/neutron/+bug/2021457 - [fwaas]The firewall group without any port in active status Fix proposed https://review.opendev.org/c/openstack/neutron-fwaas/+/884335 Assigned to ZhouHeng *Medium:-* - https://bugs.launchpad.net/neutron/+bug/2020802 - Make DB migration "Add indexes to RBACs" conditional Unassigned *Low:-* - https://bugs.launchpad.net/neutron/+bug/2020552 - trunk_details missing sub port MAC addresses for LIST Requested more info Unassigned *Wishlist:-* - https://bugs.launchpad.net/neutron/+bug/2020358 - [RFE] Allow to limit conntrack entries per tenant to avoid "nf_conntrack: table full, dropping packet" Unassigned - https://bugs.launchpad.net/neutron/+bug/2020823 - [RFE] Add flavor/service provider support to routers in the L3 OVN plugin Assigned to Miguel Lavalle *Undecided:-* - https://bugs.launchpad.net/neutron/+bug/2020771 - [ml2/ovn] Binding_host_id shouldn't be set when a port is down Requested more info - https://bugs.launchpad.net/neutron/+bug/2020349 - Neutron Dynamic Routing : Connection to peers lost Discussion ongoing, possible misconfiguration - https://bugs.launchpad.net/neutron/+bug/2020328 - Concurrent create VM failed because of vif plug timeout Requested more info Thanks and Regards Yatin Karel -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Mon May 29 21:38:21 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Mon, 29 May 2023 21:38:21 +0000 Subject: [tc] Technical Committee next weekly meeting on May 30, 2023 Message-ID: Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held on Tuesday, May 30, 2023 at 1800 UTC on #openstack-tc on OFTC IRC. The up-to-date agenda for the meeting can be found at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting Thank you, Kristi Nikolla From ralonsoh at redhat.com Tue May 30 07:50:14 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 30 May 2023 09:50:14 +0200 Subject: [Port Creation failed] - openstack Wallaby In-Reply-To: References: Message-ID: Hi Lokendra: Please open a launchpad bug with all the system information, including the chassis list. It could be useful to know why you are deploying DHCP agents on a ML2/OVN deployment. If you upgraded/changed the chassis ID, please report that too. If you are using ML2/OVN, you should also check what is the status of the OVN agents (I don't see that in the agent list you printed). Regards. On Tue, May 23, 2023 at 12:41?PM Lokendra Rathour wrote: > Hi Team, > Issue is yet to be solved. > > openstack network agent list: > (overcloud) [stack at undercloud-loke ~]$ openstack network agent list > /usr/lib64/python3.6/site-packages/_yaml/__init__.py:23: > DeprecationWarning: The _yaml extension module is now located at yaml._yaml > and its location is subject to change. 
To use the LibYAML-based parser and > emitter, import from `yaml`: `from yaml import CLoader as Loader, CDumper > as Dumper`. > DeprecationWarning > > +--------------------------------------+------------+----------------------------------+-------------------+-------+-------+--------------------+ > | ID | Agent Type | Host > | Availability Zone | Alive | State | Binary | > > +--------------------------------------+------------+----------------------------------+-------------------+-------+-------+--------------------+ > | 8e6ce556-84f7-48c7-b9a0-5ebecad648d1 | DHCP agent | > overcloud-controller-0.myhsc.com | nova | :-) | UP | > neutron-dhcp-agent | > | c0f29f3c-7eb0-4667-b522-61323185adac | DHCP agent | > overcloud-controller-2.myhsc.com | nova | :-) | UP | > neutron-dhcp-agent | > | e5f5f950-99dd-4a4d-9702-232ffe9d0475 | DHCP agent | > overcloud-controller-1.myhsc.com | nova | :-) | UP | > neutron-dhcp-agent | > > +--------------------------------------+------------+----------------------------------+-------------------+-------+-------+--------------------+ > (overcloud) [stack at undercloud-loke ~]$ source stackrc > we do not see any OVN Controller agent. > > Also as reported earlier we see no entry in this chassis DB. > > Any pointers would be helpful. > > Thanks, > Lokendra > > > > > > > > On Thu, May 18, 2023 at 8:39?PM Rodolfo Alonso Hernandez < > ralonsoh at redhat.com> wrote: > >> Hello Lokendra: >> >> Did you check the version of the ovn-controller service in the compute >> nodes and the ovn services in the controller nodes? The services should be >> in sync. >> >> What is the "openstack network agent list" output? Do you see the OVN >> controller, OVN gateways and OVN metadata entries corresponding to the >> compute and controller nodes you have? And did you check the sanity of your >> OVN SB database? What is the list of "Chassis" and "Chassis_Private" >> registers? Each "Chassis_Private" register must have a "Chassis" register >> associated. >> >> Regards. >> >> On Wed, May 17, 2023 at 6:14?PM Lokendra Rathour < >> lokendrarathour at gmail.com> wrote: >> >>> Hi Swogat, >>> Thanks for the inputs, it was showing a similar issue but somehow the >>> issue is not getting resolved. >>> we are trying to explore more around it. 
>>> >>> getting the error in >>> ovn-metadata-agent.log >>> Cannot find Chassis_Private with >>> name=f80799b9-0cf4-4413-bb4b-e36278c73f6c >>> >>> detailed: >>> 2023-05-17 19:26:31.984 45317 INFO oslo.privsep.daemon [-] privsep >>> daemon running as pid 45317 >>> 2023-05-17 19:26:32.712 44735 ERROR ovsdbapp.backend.ovs_idl.transaction >>> [-] Traceback (most recent call last): >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", >>> line 131, in run >>> txn.results.put(txn.do_commit()) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", >>> line 93, in do_commit >>> command.run_idl(txn) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", >>> line 172, in run_idl >>> record = self.api.lookup(self.table, self.record) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >>> line 208, in lookup >>> return self._lookup(table, record) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >>> line 268, in _lookup >>> row = idlutils.row_by_value(self, rl.table, rl.column, record) >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", >>> line 114, in row_by_value >>> raise RowNotFound(table=table, col=column, match=match) >>> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find >>> Chassis_Private with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c >>> >>> 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command [-] >>> Error executing command (DbAddCommand): >>> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private >>> with name=f80799b9-0cf4-4413-bb4b-e36278c73f6c >>> 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command >>> Traceback (most recent call last): >>> 2023-05-17 19:26:32.713 44735 ERROR ovsdbapp.backend.ovs_idl.command >>> File >>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", >>> line 42, in execute >>> >>> waiting for your always helpful inputs. >>> >>> On Tue, May 16, 2023 at 10:47?PM Swogat Pradhan < >>> swogatpradhan22 at gmail.com> wrote: >>> >>>> Hi >>>> I am not sure if this will help, but i faced something similar. >>>> You might need to check the ovn database entries. >>>> http://www.jimmdenton.com/neutron-ovn-private-chassis/ >>>> >>>> Or maybe try restarting the ovn service from pcs, sometimes issue comes >>>> up when ovn doesn't sync up. >>>> >>>> Again m not sure if this will be of any help to you. >>>> >>>> With regards, >>>> Swogat Pradhan >>>> >>>> On Tue, 16 May 2023, 10:41 pm Lokendra Rathour, < >>>> lokendrarathour at gmail.com> wrote: >>>> >>>>> Hi All, >>>>> Was trying to create OpenStack VM in OpenStack wallaby release, not >>>>> able to create VM, it is failing because of Port not getting created. >>>>> >>>>> The error that we are getting: >>>>> nova-compute.log: >>>>> >>>>> 2023-05-16 18:15:35.495 7 INFO nova.compute.provider_config >>>>> [req-faaf38e7-b5ee-43d1-9303-d508285f5ab7 - - - - -] No provider configs >>>>> found in /etc/nova/provider_config. If files are present, ensure the Nova >>>>> process has access. 
>>>>> 2023-05-16 18:15:35.549 7 ERROR nova.cmd.common >>>>> [req-8842f11c-fe5a-4ad3-92ea-a6898f482bf0 - - - - -] No db access allowed >>>>> in nova-compute: File "/usr/bin/nova-compute", line 10, in >>>>> sys.exit(main()) >>>>> File "/usr/lib/python3.6/site-packages/nova/cmd/compute.py", line >>>>> 59, in main >>>>> topic=compute_rpcapi.RPC_TOPIC) >>>>> File "/usr/lib/python3.6/site-packages/nova/service.py", line 264, >>>>> in create >>>>> utils.raise_if_old_compute() >>>>> File "/usr/lib/python3.6/site-packages/nova/utils.py", line 1068, in >>>>> raise_if_old_compute >>>>> ctxt, ['nova-compute']) >>>>> File "/usr/lib/python3.6/site-packages/nova/objects/service.py", >>>>> line 563, in get_minimum_version_all_cells >>>>> binaries) >>>>> File "/usr/lib/python3.6/site-packages/nova/context.py", line 544, >>>>> in scatter_gather_all_cells >>>>> fn, *args, **kwargs) >>>>> File "/usr/lib/python3.6/site-packages/nova/context.py", line 432, >>>>> in scatter_gather_cells >>>>> with target_cell(context, cell_mapping) as cctxt: >>>>> File "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__ >>>>> return next(self.gen) >>>>> >>>>> >>>>> neutron/ovn-metadata-agent.log >>>>> >>>>> 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command >>>>> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private >>>>> with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 >>>>> 2023-05-16 22:33:41.871 45204 ERROR ovsdbapp.backend.ovs_idl.command >>>>> 2023-05-16 22:36:41.876 45204 ERROR >>>>> ovsdbapp.backend.ovs_idl.transaction [-] Traceback (most recent call last): >>>>> File >>>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", >>>>> line 131, in run >>>>> txn.results.put(txn.do_commit()) >>>>> File >>>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", >>>>> line 93, in do_commit >>>>> command.run_idl(txn) >>>>> File >>>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", >>>>> line 172, in run_idl >>>>> record = self.api.lookup(self.table, self.record) >>>>> File >>>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >>>>> line 208, in lookup >>>>> return self._lookup(table, record) >>>>> File >>>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", >>>>> line 268, in _lookup >>>>> row = idlutils.row_by_value(self, rl.table, rl.column, record) >>>>> File >>>>> "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", >>>>> line 114, in row_by_value >>>>> raise RowNotFound(table=table, col=column, match=match) >>>>> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find >>>>> Chassis_Private with name=b6e6f0d3-40c6-4d4e-8ef6-c935fa027bd2 >>>>> >>>>> any input to help get this issue fixed would be of great help. >>>>> thanks >>>>> -- >>>>> ~ Lokendra >>>>> skype: lokendrarathour >>>>> >>>>> >>>>> >>> >>> -- >>> ~ Lokendra >>> skype: lokendrarathour >>> >>> >>> > > -- > ~ Lokendra > skype: lokendrarathour > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rdhasman at redhat.com Tue May 30 11:02:10 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 30 May 2023 16:32:10 +0530 Subject: [kolla-ansible][yoga] Glance backend cinder Privsep daemon failed to start operation not permitted In-Reply-To: References: <323313cf9ef907a0bbbdce18e7ae71654385e1f2.camel@redhat.com> Message-ID: Hi, On Mon, Apr 17, 2023 at 9:19?PM wodel youchi wrote: > Hi, > > I managed to get another step forward, but I hit another wall. > Glance-api was trying to mount the NFS share on the controller node, not > on the compute, so I installed nfs-utils on it, now I have this error : > > Glance will mount the share on the node in which it's running. The default location for mounting is "/var/lib/glance/mnt" and can be configured via config option "cinder_mount_point_base"[1] Since Glance is running on the controller node in your case, it will mount the share on the controller. > 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder >> [req-35f49fc4-15bc-4d3b-b93e-e039c1d0f3fa 0439953e7cfe4a13a1b4bb118b5dc3c4 >> b0f76b5c6dcb457fa716762bbf954837 - default default] Exception while >> accessing to cinder volume 3d4712ef-7cb5-4a0b-bc1e-cfbebd8fa902.: >> FileExistsError: [Errno 17] File exists: >> '/var/lib/glance/mnt/nfs/f6f6b4ee42b4f3522a75f422887010ad2c47f8624f97bf3623b13014f22186b7' >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder Traceback >> (most recent call last): >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/_drivers/cinder.py", >> line 788, in _open_cinder_volume >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder device >> = connect_volume_nfs() >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", >> line 391, in inner >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder return >> f(*args, **kwargs) >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/_drivers/cinder.py", >> line 786, in connect_volume_nfs >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder >> root_helper, options) >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/common/fs_mount.py", >> line 359, in mount >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder >> rootwrap_helper, options) >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/common/fs_mount.py", >> line 247, in mount >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder >> os.makedirs(mountpoint) >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder File >> "/usr/lib64/python3.6/os.py", line 220, in makedirs >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder >> mkdir(name, mode) >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder >> FileExistsError: [Errno 17] File exists: >> '/var/lib/glance/mnt/nfs/f6f6b4ee42b4f3522a75f422887010ad2c47f8624f97bf3623b13014f22186b7' >> 2023-04-17 16:33:08.814 62 ERROR glance_store._drivers.cinder >> 2023-04-17 16:33:08.906 62 ERROR glance_store._drivers.cinder >> [req-35f49fc4-15bc-4d3b-b93e-e039c1d0f3fa 0439953e7cfe4a13a1b4bb118b5dc3c4 >> b0f76b5c6dcb457fa716762bbf954837 - default default] Failed to write to >> volume 
3d4712ef-7cb5-4a0b-bc1e-cfbebd8fa902.: FileExistsError: [Errno 17] >> File exists: >> '/var/lib/glance/mnt/nfs/f6f6b4ee42b4f3522a75f422887010ad2c47f8624f97bf3623b13014f22186b7' >> 2023-04-17 16:33:08.939 62 ERROR glance.api.v2.image_data >> [req-35f49fc4-15bc-4d3b-b93e-e039c1d0f3fa 0439953e7cfe4a13a1b4bb118b5dc3c4 >> b0f76b5c6dcb457fa716762bbf954837 - default default] Failed to upload image >> data due to internal error: FileExistsError: [Errno 17] File exists: >> '/var/lib/glance/mnt/nfs/f6f6b4ee42b4f3522a75f422887010ad2c47f8624f97bf3623b13014f22186b7' >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> [req-35f49fc4-15bc-4d3b-b93e-e039c1d0f3fa 0439953e7cfe4a13a1b4bb118b5dc3c4 >> b0f76b5c6dcb457fa716762bbf954837 - default default] Caught error: [Errno >> 17] File exists: >> '/var/lib/glance/mnt/nfs/f6f6b4ee42b4f3522a75f422887010ad2c47f8624f97bf3623b13014f22186b7': >> FileExistsError: [Errno 17] File exists: >> '/var/lib/glance/mnt/nfs/f6f6b4ee42b4f3522a75f422887010ad2c47f8624f97bf3623b13014f22186b7' >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi Traceback (most >> recent call last): >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/common/wsgi.py", >> line 1332, in __call__ >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi request, >> **action_args) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/common/wsgi.py", >> line 1370, in dispatch >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi return >> method(*args, **kwargs) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/common/utils.py", >> line 414, in wrapped >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi return func(self, >> req, *args, **kwargs) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/api/v2/image_data.py", >> line 303, in upload >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> self._restore(image_repo, image) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", >> line 227, in __exit__ >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> self.force_reraise() >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", >> line 200, in force_reraise >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi raise self.value >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/api/v2/image_data.py", >> line 163, in upload >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> image.set_data(data, size, backend=backend) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/notifier.py", line >> 497, in set_data >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> _send_notification(notify_error, 'image.upload', msg) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", >> line 227, in __exit__ >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> self.force_reraise() >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_utils/excutils.py", >> line 200, in force_reraise >> 
2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi raise self.value >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/notifier.py", line >> 444, in set_data >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> set_active=set_active) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/quota/__init__.py", >> line 323, in set_data >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> set_active=set_active) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/location.py", line >> 585, in set_data >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> self._upload_to_store(data, verifier, backend, size) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance/location.py", line >> 485, in _upload_to_store >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi verifier=verifier) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/multi_backend.py", >> line 399, in add_with_multihash >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi image_id, data, >> size, hashing_algo, store, context, verifier) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/multi_backend.py", >> line 481, in store_add_to_backend_with_multihash >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi image_id, data, >> size, hashing_algo, context=context, verifier=verifier) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/driver.py", >> line 279, in add_adapter >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi metadata_dict) = >> store_add_fun(*args, **kwargs) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/capabilities.py", >> line 176, in op_checker >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi return >> store_op_fun(store, *args, **kwargs) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/_drivers/cinder.py", >> line 1028, in add >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi raise >> errors.get(e.errno, e) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/_drivers/cinder.py", >> line 985, in add >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi with >> self._open_cinder_volume(client, volume, 'wb') as f: >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__ >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi return >> next(self.gen) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/_drivers/cinder.py", >> line 788, in _open_cinder_volume >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi device = >> connect_volume_nfs() >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", >> line 391, in inner >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi return f(*args, >> **kwargs) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> 
"/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/_drivers/cinder.py", >> line 786, in connect_volume_nfs >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi root_helper, >> options) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/common/fs_mount.py", >> line 359, in mount >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi rootwrap_helper, >> options) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/var/lib/kolla/venv/lib/python3.6/site-packages/glance_store/common/fs_mount.py", >> line 247, in mount >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> os.makedirs(mountpoint) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi File >> "/usr/lib64/python3.6/os.py", line 220, in makedirs >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi mkdir(name, mode) >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi FileExistsError: >> [Errno 17] File exists: >> '/var/lib/glance/mnt/nfs/f6f6b4ee42b4f3522a75f422887010ad2c47f8624f97bf3623b13014f22186b7' >> 2023-04-17 16:33:08.957 62 ERROR glance.common.wsgi >> > > This is a strange error, we acquire a lock on the export path before it is mounted[2] and again acquire a threading lock on the mountpoint[3]. Then we check if the path is already mounted and only create the directory. we also remove the directory during unmount[4]. Are you using multiple glance processes with concurrent operations running? I can see we have a limitation because of the threading lock that only prevents access to the thread of the same process. [1] https://github.com/openstack/glance_store/blob/9bd9cf4fcd8a0aedc98fafb983fc19744e404015/glance_store/_drivers/cinder/store.py#L399 [2] https://github.com/openstack/glance_store/blob/9bd9cf4fcd8a0aedc98fafb983fc19744e404015/glance_store/_drivers/cinder/nfs.py#L82 [3] https://github.com/openstack/glance_store/blob/9bd9cf4fcd8a0aedc98fafb983fc19744e404015/glance_store/common/fs_mount.py#L242 [4] https://github.com/openstack/glance_store/blob/9bd9cf4fcd8a0aedc98fafb983fc19744e404015/glance_store/common/fs_mount.py#L339 Thanks Rajat Dhasmana > Regards. > > > > Virus-free.www.avast.com > > <#m_-7349147206883296287_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Tue May 30 11:19:50 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 30 May 2023 16:49:50 +0530 Subject: [kolla-ansible][yoga] Does glance support cinder-nfs as backend? In-Reply-To: References: Message-ID: Hi, On Mon, Apr 17, 2023 at 1:59?PM wodel youchi wrote: > Hi, > > The openstak documentation says that glance supports cinder as backend, > but it does not exclude any backend used by cinder itself. > > Although cinder as a backend for glance is not thoroughly tested against all backends, it should work for our community drivers like LVM, NFS and RBD. > I'm having trouble configuring glance to use a cinder backend which is > backed by an nfs share. > > Is this configuration supported? > > First, the rootwrap was missing, after adding it, I faced the lack of > privileges, which was corrected by starting the glance-api container in > privileged mode and finally I am facing a non existing filter error. > > Glance is trying to mount the nfs share to use it!!! Which I don't > understand , why mount a share that is already mounted by cinder which > glance is supposed to use as an intermediary!!!? 
> > Glance is a different and independent service from cinder and has a separate user account as well. Certain deployments don't allow access of one service to the resources available to another service. Cinder, by default, mounts the share in /var/lib/cinder directory and allowing Glance the access to this directory is not a great idea for some deployments, hence, the code we have mounts the share again in /var/lib/glance directory by default. The glance directory might have different permissions, user access, SELinux context etc. which needs to be honoured. > When I push an image I get this error: > > Stderr: '/var/lib/kolla/venv/bin/glance-rootwrap: Unauthorized command: > mount -t nfs 20.1.0.32:/kolla_nfs > /var/lib/glance/mnt/nfs/f6f6b4ee42b4f3522a75f422887010ad2c47f8624 > f97bf3623b13014f22186b7 (no filter matched)\n' > > Did you add the filters required[1] in your rootwrap.d folder? It allows mount and umount to work, see L#15-16. [1] https://github.com/openstack/glance_store/blob/9bd9cf4fcd8a0aedc98fafb983fc19744e404015/etc/glance/rootwrap.d/glance_cinder_store.filters > Regards. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Tue May 30 11:31:57 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 30 May 2023 17:01:57 +0530 Subject: [Cinder][LVM backend] LVM vg backed by a shared LUN In-Reply-To: References: Message-ID: Hi, On Wed, Apr 26, 2023 at 4:42?PM wodel youchi wrote: > Hi, > > The examples I could find on the internet using LVM as backend for Cinder, > they expose a local disk using lvm via Cinder. > > I did this configuration and I am wondering if it's correct, especially > from a "simultaneous access" point of view. > > I have an iSCSI target backed by targetcli that exposes a LUN to my > compute nodes. I did configure the iscsi connexion manually on each one of > them and they all see the LUN, then on one of them I created the > cinder-volumes VG (the other nodes can see the modifications), then I > configured Cinder with lvm backend using this VG and it worked. I created > some volumes on it without issues using my account. But what about when > there are multiple tenants that try to create multiple volumes on it, is > this configuration safe? > > I might not be 100% correct but I don't think it should affect anything. The backend, here LVM, doesn't have any information of the LUN association with the project and OpenStack does the management of associating volumes (OpenStack terminology of LUNs) with a particular project also managing the access via keystone roles and scopes. The backend shouldn't worry about the access of a LUN from a different project since "project" is an OpenStack concept which is handled in the OpenStack layer itself. Unless a LUN export/map request is coming from outside of OpenStack, proper authorization and authentication should be maintained. Thanks Rajat Dhasmana > Regards. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Tue May 30 11:42:10 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 30 May 2023 17:12:10 +0530 Subject: [cinder] Midcycle-1 Status (31st May) Message-ID: Hello Argonauts, As you know we had planned our first midcycle of 2023.2 Bobcat cycle on 31st May, 2023. I created an etherpad[1] around a month ago and have been providing constant reminders in every cinder meeting about adding topics. 
At this point, no topic has been added which makes me believe that we don't require a midcycle discussion. I will wait till today EOD to see if any new topics are added, and if not, we will continue with our usual video+IRC (end of the month) meeting[2]. [1] https://etherpad.opendev.org/p/cinder-bobcat-midcycles [2] https://etherpad.opendev.org/p/cinder-bobcat-meetings Thanks Rajat Dhasmana -------------- next part -------------- An HTML attachment was scrubbed... URL: From benedikt.trefzer at cirrax.com Tue May 30 12:29:41 2023 From: benedikt.trefzer at cirrax.com (Benedikt Trefzer) Date: Tue, 30 May 2023 14:29:41 +0200 Subject: [puppet] puppet module improvements Message-ID: <86d026c5-ea18-62da-ed3b-f83c533fd772@cirrax.com> Hi all I use the openstack puppet modules to deploy openstack. I'd like to suggest some improvements to the modules and like to know what the community thinks about: 1.) use proper types for parameters Parameter validation is done with 'validate_legacy...' instead of defining the class/resource parameter with a proper type all over the code. I cannot imaging of any advantage not using proper type definitions. Instead using typed parameters would be more efficient an code readability would increase. 2.) params.pp This is the legacy way to define parameter defaults for different OS's. In the modern puppet world a module specific hiera structure is used, which eliminates the need of params.pp class (with inheritance and include). The usage of hiera improves readability and flexibility (every parameter can be overwritten on request, eg. change of packages names etc.) This also eliminate the restriction that the modules can only be used by certain OS'es (osfamily 'RedHat' or 'Debian'). 3.) Eliminate "if OS=='bla' {" statements in code These statements make the code very inflexible. It cannot be overruled if necessary (eg. if I use custom packages to install and do not need the code provided in the if statement). Instead a parameter should be used with a default provided in hiera. Since there is lot of code to change I do not expect this to be done in a single commit (per module) but in steps probably in more than one release cycle. But defining this as best practice for openstack puppet modules and start using above in new commits would bring the code forward. Finally: These are suggestions open for discussion. In no way I like to critic the current state of the puppet modules (which is quite good, but a bit legacy) or the people working on the modules. This is just a feedback/suggestion from an operator using the modules on a daily basis. Regards Benedikt Trefzer From fkr at osb-alliance.com Tue May 30 14:24:42 2023 From: fkr at osb-alliance.com (Felix Kronlage-Dammers) Date: Tue, 30 May 2023 16:24:42 +0200 Subject: Invitation to further discuss the notion of a 'domain manager' role Message-ID: <5F43A54D-1332-476D-AF59-0E1DDFBC8964@osb-alliance.com> Hi, We (SCS Project) are looking for CSPs / Operators who would like to join a discussion on the notion of a ?domain manager role?. To a lot of you this topic is far from new and has been brought up just recently on this list as well[1]. On the upcoming Monday (June 5th) at 15:05 CEST we would like to have a kickstarting discussion of CSPs who currently solve the hurdle of not having such a role in their environments and/or their custom frontends. We will meet here: Through the Public Cloud SIG I?ve reached out to Cleura, OTC as well as Catalyst Cloud. From the german scene PlusServer will join the discussion. 
As part of the discussion on monday we want to see, where we can move towards a joint-effort to maybe see that such a role can be introduced earlier than anticipated. For that it sure is good to understand wether any of the participating CSPs do have ?stuff in their drawer? (code, knowledge, experience, ?) that they could bring to the table. regards, felix [1] -- Felix Kronlage-Dammers Product Owner IaaS & Operations Sovereign Cloud Stack Sovereign Cloud Stack ? standardized, built and operated by many Ein Projekt der Open Source Business Alliance - Bundesverband f?r digitale Souver?nit?t e.V. Tel.: +49-30-206539-205 | Matrix: @fkronlage:matrix.org | fkr at osb-alliance.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 862 bytes Desc: OpenPGP digital signature URL: From geert.geurts at linuxfabrik.ch Sun May 28 12:13:32 2023 From: geert.geurts at linuxfabrik.ch (Geert Geurts) Date: Sun, 28 May 2023 14:13:32 +0200 Subject: swift container retention Message-ID: Hallo, I've gotten the assignment to implement a backup solution using openstack swift containers. What we want is that the files of customers are kept in the customers own container and for those files to retain their versions on updates of the file. So far so good, this is all possible, and extensively documented online. Now, the second part is more complicated it appears where I thougth this would be a quite common usage example of object storage. I cannot find how I can automagically delete versions of files that are older then X days. There is the X-delete-after header, but this header needs to be added for each transaction. What I want is for all objects in a container to get deleted after X days without having to add a header once the object is uploaded. Does anyone have suggestions on how I could implement this requirement? Please find below a grafical repesentation of what I want Versioning disabled Automated add headers delete-after Versioning enabled Usage on the account of LF Usage on the account of client ? ? ? ? ? ? ? ? ?????????????????????????? ?????????????????????????? ? ? ? ? ? Versions-container ? ? Current-container ? ? ? ? ? ?????????????????????????? ?????????????????????????? ? ? ? ? ?File1 v123456 ? ? File1 ? ?File1 v567899 ? ? File2 ? ? ? ? File3 ? ?File3 v234567 ? ? ... ? ? ? ? FileN ? ?????????????????????????? ?????????????????????????? ? ? ? ? ?????????????????????????????????????????? Modifications cause current to be moved to Versioned-container Best regards, Geert From rdhasman at redhat.com Tue May 30 11:25:43 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Tue, 30 May 2023 16:55:43 +0530 Subject: [cinder] - Test cases or procedure for active/active support validation In-Reply-To: References: Message-ID: Hi Jean-Pierre, There is a thread[1] on the same topic initiated by the HPE 3PAR team. you can follow the discussion points in it. Hope that helps. [1] https://lists.openstack.org/pipermail/openstack-discuss/2023-May/033577.html Thanks Rajat Dhasmana On Tue, Apr 25, 2023 at 1:55?AM Roquesalane, Jean-Pierre < Jeanpierre.Roquesalane at dell.com> wrote: > Hi community! > > > > Here at Dell, we?re in the process of certifying our Dell PowerMax storage > platform against cinder volume Active/active configuration. 
I was wondering > how vendors who already support this functionality have managed to handle > this and if they can share their testing scenario. > > > > Thank you. > > > > JP > > > > *Jean-Pierre Roquesalane* > > SW System Sr Principal Engineer - OpenStack Champion > > *Dell Technologies* | Security and Ecosystem Software > > Mobile +33 6 21 86 79 04 <1-613-314-8106> > > Email : jeanpierre.roquesalane at dell.com > > > > Internal Use - Confidential > Dell S.A, Si?ge Social 1 rond point Benjamin Franklin 34000 Montpellier. > Capital 1,782,769 Euros, 351 528 229 RCS Montpellier ?APE 4651Z -TVA > Intracommunautaire FR 20 351 528 229, SIRET 351 528 229 00088 Vat Number : > FR 20351528229 (France) / IT00001709997 (Italy) / ESN0012622G (Spain) > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 88008 bytes Desc: not available URL: From abishop at redhat.com Tue May 30 15:13:09 2023 From: abishop at redhat.com (Alan Bishop) Date: Tue, 30 May 2023 08:13:09 -0700 Subject: Add netapp storage in edge site | Wallaby | DCN In-Reply-To: References: Message-ID: On Thu, May 25, 2023 at 9:39?PM Swogat Pradhan wrote: > Hi Alan, > My netapp storage is located in edge site itself. > As the networks are routable my central site is able to reach the netapp > storage ip address (ping response is 30ms-40ms). > Let's say i included the netapp storage yaml in central site deployment > script (which is not recommended) and i am able to create the volumes as it > is reachable from controller nodes. > Will i be able to mount those volumes in edge site VM's?? And if i am able > to do so, then how will the data flow?? When storing something in the > netapp volume will the data flow through the central site controller and > get stored in the storage space? > A cinder-volume service running in the central site's controplane will be able to work with a netapp backend that's physically located at an edge site. The good news is the c-vol service will be HA because it will be controlled by pacemaker running on the controllers. In order for VMs at the edge site to access volumes on the netapp, you'll need to set the CinderNetappAvailabilityZone [1] to the edge site's AZ. [1] https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/wallaby/deployment/cinder/cinder-backend-netapp-puppet.yaml#L43 To attach a netapp volume, nova-compute at the edge will interact with cinder-volume in the controlplane, and cinder-volume will in turn interact with the netapp. This will happen over central <=> edge network connections. Eventually, nova will directly connect to the netapp, so all traffic from the VM to the netapp will occur within the edge site. Data will not flow through the cinder-volume service, but there are restrictions and limitations: - Only that one edge site can access the netapp backend - If the central <=> edge network connection then you won't be able to attach or detach a netapp volume (but active connections will continue to work) Of course, there are operations where cinder services are in the data path (e.g. creating a volume from an image), but not when a VM is accessing a volume. 
Alan > With regards, > Swogat Pradhan > > On Fri, 26 May 2023, 10:03 am Alan Bishop, wrote: > >> >> >> On Thu, May 25, 2023 at 12:09?AM Swogat Pradhan < >> swogatpradhan22 at gmail.com> wrote: >> >>> Hi Alan, >>> So, can I include the cinder-netapp-storage.yaml file in the central >>> site and then use the new backend to add storage to edge VM's? >>> >> >> Where is the NetApp physically located? Tripleo's DCN architecture >> assumes the storage is physically located at the same site where the >> cinder-volume service will be deployed. If you include the >> cinder-netapp-storage.yaml environment file in the central site's >> controlplane, then VMs at the edge site will encounter the problems I >> outlined earlier (network latency, no ability to do cross-AZ attachments). >> >> >>> I believe it is not possible right?? as the cinder volume in the edge >>> won't have the config for the netapp. >>> >> >> The cinder-volume services at an edge site are meant to manage storage >> devices at that site. If the NetApp is at the edge site, ideally you'd >> include some variation of the cinder-netapp-storage.yaml environment file >> in the edge site's deployment. However, then you're faced with the fact >> that the NetApp driver doesn't support A/A, which is required for c-vol >> services running at edge sites (In case you're not familiar with these >> details, tripleo runs all cinder-volume services in active/passive mode >> under pacemaker on controllers in the controlplane. Thus, only a single >> instance runs at any time, and pacemaker provides HA by moving the service >> to another controller if the first one goes down. However, pacemaker is not >> available at edge sites, and so to get HA, multiple instances of the >> cinder-volume service run simultaneously on 3 nodes (A/A), using etcd as a >> Distributed Lock Manager (DLM) to coordinate things. But drivers must >> specifically support running A/A, and the NetApp driver does NOT.) >> >> Alan >> >> >>> With regards, >>> Swogat Pradhan >>> >>> On Thu, May 25, 2023 at 2:17?AM Alan Bishop wrote: >>> >>>> >>>> >>>> On Wed, May 24, 2023 at 3:15?AM Swogat Pradhan < >>>> swogatpradhan22 at gmail.com> wrote: >>>> >>>>> Hi, >>>>> I have a DCN setup and there is a requirement to use a netapp storage >>>>> device in one of the edge sites. >>>>> Can someone please confirm if it is possible? >>>>> >>>> >>>> I see from prior email to this list that you're using tripleo, so I'll >>>> respond with that in mind. >>>> >>>> There are many factors that come into play, but I suspect the short >>>> answer to your question is no. >>>> >>>> Tripleo's DCN architecture requires the cinder-volume service running >>>> at edge sites to run in active-active >>>> mode, where there are separate instances running on three nodes in to >>>> for the service to be highly >>>> available (HA).The problem is that only a small number of cinder >>>> drivers support running A/A, and NetApp's >>>> drivers do not support A/A. >>>> >>>> It's conceivable you could create a custom tripleo role that deploys >>>> just a single node running cinder-volume >>>> with a NetApp backend, but it wouldn't be HA. >>>> >>>> It's also conceivable you could locate the NetApp system in the central >>>> site's controlplane, but there are >>>> extremely difficult constraints you'd need to overcome: >>>> - Network latency between the central and edge sites would mean the >>>> disk performance would be bad. 
>>>> - You'd be limited to using iSCSI (FC wouldn't work) >>>> - Tripleo disables cross-AZ attachments, so the only way for an edge >>>> site to access a NetApp volume >>>> would be to configure the cinder-volume service running in the >>>> controlplane with a backend availability >>>> zone set to the edge site's AZ. You mentioned the NetApp is needed "in >>>> one of the edge sites," but in >>>> reality the NetApp would be available in one, AND ONLY ONE edge site, >>>> and it would also not be available >>>> to any instances running in the central site. >>>> >>>> Alan >>>> >>>> >>>>> And if so then should i add the parameters in the edge deployment >>>>> script or the central deployment script. >>>>> Any suggestions? >>>>> >>>>> With regards, >>>>> Swogat Pradhan >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbauza at redhat.com Tue May 30 18:57:46 2023 From: sbauza at redhat.com (Sylvain Bauza) Date: Tue, 30 May 2023 20:57:46 +0200 Subject: [nova] Nova Spec review day next Tuesday 6 June Message-ID: Hey Nova community, As a reminder, you can find the Nova agenda for the 2023.2 Bobcat release here https://releases.openstack.org/bobcat/schedule.html As you can see, we plan to do a round of reviews for all the open specifications in the openstack/nova-specs repository by next Tuesday June 6th. If you're a developer wanting to propose a new feature in Nova and you have been said to provide a spec, or if you know that you need to create a spec, or if you already have an open spec, please make sure to upload/update your file before next Tuesday so we'll look at it. Also, if you can look again on your spec during the day, it would be nice as we could try to discuss your spec during this day. Thanks, -Sylvain -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Tue May 30 20:19:02 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Tue, 30 May 2023 13:19:02 -0700 Subject: [ironic] Retirement of ironic-prometheus-exporter bugfix/2.1 branch Message-ID: Hi all, The ironic-prometheus-exporter bugfix/2.1 branch is currently broken in CI, and unused based on a poll of known-downstream users of bugfix branches. It's my intention to retire this branch in the same manner for other Ironic bugfix branches -- by tagging it EOL. If you have an objection, please bring it up soon. Thank you, Jay Faulkner Ironic PTL -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Tue May 30 20:20:28 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Tue, 30 May 2023 13:20:28 -0700 Subject: [ironic] OpenInfra Summit planning/attendance Message-ID: Hi all, We're doing some light coordination of Ironic-interested operators and contributors at the OpenInfra Summit. If you're a part of the larger Ironic community and you're interested in meeting up with the team, seeing talks about Ironic specifically, or attending the team dinner please go to https://etherpad.opendev.org/p/ironic-openinfra-2023 and comment accordingly. Thanks! See many of you in two short weeks! - Jay Faulkner Ironic PTL -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jay at gr-oss.io Tue May 30 20:30:06 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Tue, 30 May 2023 13:30:06 -0700 Subject: [ironic][stable] Proposing EOL of ironic project branches older than Wallaby In-Reply-To: References: Message-ID: Hey, I'm trying to clean up zuul-config-errors for Ironic, and Train has reared its head again: https://review.opendev.org/c/openstack/ironic-lib/+/884722. Is there still value in continuing to keep Train (and perhaps, Ussuri and Victoria) in EM at this point? Should we migrate them to EOL? What do you all think? - Jay Faulkner Ironic PTL On Tue, Nov 1, 2022 at 3:12?PM Steve Baker wrote: > > On 12/10/22 05:53, Jay Faulkner wrote: > > We discussed stable branches in the most recent ironic meeting ( > https://meetings.opendev.org/meetings/ironic/2022/ironic.2022-10-10-15.01.log.txt). > The decision was made to do the following: > > EOL these branches: > - stable/queens > - stable/rocky > - stable/stein > > Reduce testing considerably on these branches, and only backport critical > bugfixes or security bugfixes: > - stable/train > - stable/ussuri > - stable/victoria > > Just coming back to this, keeping stable/train jobs green has become > untenable so I think its time we consider EOLing it. > > It is the extended-maintenance branch of interest to me, so I'd be fine > with stable/ussuri and stable/victoria being EOLed also. > > Our remaining branches will continue to get most eligible patches > backported to them. > > This email, plus earlier communications including a tweet, will serve as > notice that these branches are being EOL'd. > > Thanks, > Jay Faulkner > > On Tue, Oct 4, 2022 at 11:18 AM Jay Faulkner wrote: > >> Hi all, >> >> Ironic has a large amount of stable branches still in EM. We need to take >> action to ensure those branches are either retired or have CI repaired to >> the point of being usable. >> >> Specifically, I'm looking at these branches across all Ironic projects: >> - stable/queens >> - stable/rocky >> - stable/stein >> - stable/train >> - stable/ussuri >> - stable/victoria >> >> In lieu of any volunteers to maintain the CI, my recommendation for all >> the branches listed above is that they be marked EOL. If someone wants to >> volunteer to maintain CI for those branches, they can propose one of the >> below paths be taken instead: >> >> 1 - Someone volunteers to maintain these branches, and also report the >> status of CI of these older branches periodically on the Ironic whiteboard >> and in Ironic meetings. If you feel strongly that one of these branches >> needs to continue to be in service; volunteering in this way is how to save >> them. >> >> 2 - We seriously reduce CI. Basically removing all tempest tests to >> ensure that CI remains reliable and able to merge emergency or security >> fixes when needed. In some cases; this still requires CI fixes as some >> older inspector branches are failing *installing packages* in unit tests. I >> would still like, in this case, that someone volunteers to ensure the >> minimalist CI remains happy. >> >> My intention is to let this message serve as notice and a waiting period; >> and if I've not heard any response here or in Monday's Ironic meeting (in 6 >> days), I will begin taking action on retiring these branches. >> >> This is simply a start; other branches (including bugfix branches) are >> also in bad shape in CI, but getting these retired will significantly >> reduce the surface area of projects and branches to evaluate. 
>> >> I know it's painful to drop support for these branches; but we've >> provided good EM support for these branches for a long time and by pruning >> them away, we'll be able to save time to dedicate to other items. >> >> Thanks, >> Jay Faulkner >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.zanetti at catalystcloud.nz Wed May 31 02:07:10 2023 From: david.zanetti at catalystcloud.nz (David Zanetti) Date: Wed, 31 May 2023 14:07:10 +1200 Subject: Invitation to further discuss the notion of a 'domain manager' role In-Reply-To: <5F43A54D-1332-476D-AF59-0E1DDFBC8964@osb-alliance.com> References: <5F43A54D-1332-476D-AF59-0E1DDFBC8964@osb-alliance.com> Message-ID: <65138016e2a14945d4fabdb31a2bba9ef6675596.camel@catalystcloud.nz> Hi All. We're very interested in this subject as we'd like to be able to give some of our larger customers easy ways for them to self-manage projects, users, and groups, and permissions against projects for users/groups. Meeting time is less than ideal for me, being NZST based (UTC+12), it would help if it was slightly earlier CEST time, say 1300 CEST or earlier which would be late evening here. Keen to hear what others have in mind and share what we've experimented with. -- David Zanetti Chief Technology Officer Catalyst Cloud Aotearoa New Zealand's cloud provider e david.zanetti at catalystcloud.nz m +64-21-402260 w catalystcloud.nz Follow us on LinkedIn Level 5, 2 Commerce Street, Auckland 1010 Confidentiality Notice: This email is intended for the named recipients only. It may contain privileged, confidential or copyright information. If you are not the named recipient, any use, reliance upon, disclosure or copying of this email or its attachments is unauthorised. If you have received this email in error, please reply via email or call +64 4 499 2267.? On Tue, 2023-05-30 at 16:24 +0200, Felix Kronlage-Dammers wrote: > > Hi, > We (SCS Project) are looking for CSPs / Operators who would like to > join a discussion on the notion of a ?domain manager role?. > To a lot of you this topic is far from new and has been brought up > just recently on this list as well[1]. > On the upcoming Monday (June 5th) at 15:05 CEST we would like to have > a kickstarting discussion of CSPs who currently solve the hurdle of > not having such a role in their environments and/or their custom > frontends. > We will meet here: > Through the Public Cloud SIG I?ve reached out to Cleura, OTC as well > as Catalyst Cloud. From the german scene PlusServer will join the > discussion. > As part of the discussion on monday we want to see, where we can move > towards a joint-effort to maybe see that such a role can be > introduced earlier than anticipated. For that it sure is good to > understand wether any of the participating CSPs do have ?stuff in > their drawer? (code, knowledge, experience, ?) that they could bring > to the table. > regards, > felix > [1] https://lists.openstack.org/pipermail/openstack-discuss/2023-May/033803.html> > -- > Felix Kronlage-Dammers > Product Owner IaaS & Operations Sovereign Cloud Stack > Sovereign Cloud Stack ? standardized, built and operated by many > Ein Projekt der Open Source Business Alliance - Bundesverband f?r > digitale Souver?nit?t e.V. > Tel.: +49-30-206539-205 | Matrix: @fkronlage:matrix.org | > fkr at osb-alliance.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/png Size: 6356 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: This is a digitally signed message part URL: From oliver.weinmann at me.com Wed May 31 05:07:16 2023 From: oliver.weinmann at me.com (Oliver Weinmann) Date: Wed, 31 May 2023 07:07:16 +0200 Subject: [kolla][cinder] cinder containers api, volume, backup unhealthy Message-ID: Hi, after replacing my control nodes with new nodes (all bare-metal) somehow the cinder containers are no longer starting. I checked the logs on one of the control nodes and I see this in the api-eror.log: 2023-05-30 21:31:36.350636 Timeout when reading response headers from daemon process 'cinder-api': /var/www/cgi-bin/cinder/cinder-wsgi 2023-05-30 21:31:37.827101 mod_wsgi (pid=22): Failed to exec Python script file '/var/www/cgi-bin/cinder/cinder-wsgi'. 2023-05-30 21:31:37.827168 mod_wsgi (pid=22): Exception occurred processing WSGI script '/var/www/cgi-bin/cinder/cinder-wsgi'. 2023-05-30 21:31:37.828005 Traceback (most recent call last): 2023-05-30 21:31:37.828046?? File "/var/www/cgi-bin/cinder/cinder-wsgi", line 52, in 2023-05-30 21:31:37.828053???? application = initialize_application() 2023-05-30 21:31:37.828058?? File "/var/lib/kolla/venv/lib/python3.6/site-packages/cinder/wsgi/wsgi.py", line 44, in initialize_application 2023-05-30 21:31:37.828063???? coordination.COORDINATOR.start() 2023-05-30 21:31:37.828068?? File "/var/lib/kolla/venv/lib/python3.6/site-packages/cinder/coordination.py", line 86, in start 2023-05-30 21:31:37.828071 self.coordinator.start(start_heart=True) 2023-05-30 21:31:37.828075?? File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/coordination.py", line 689, in start 2023-05-30 21:31:37.828078 super(CoordinationDriverWithExecutor, self).start(start_heart) 2023-05-30 21:31:37.828083?? File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/coordination.py", line 426, in start 2023-05-30 21:31:37.828086???? self._start() 2023-05-30 21:31:37.828090?? File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/drivers/etcd3gw.py", line 224, in _start 2023-05-30 21:31:37.828093???? self._membership_lease = self.client.lease(self.membership_timeout) 2023-05-30 21:31:37.828098?? File "/var/lib/kolla/venv/lib/python3.6/site-packages/etcd3gw/client.py", line 122, in lease 2023-05-30 21:31:37.828111???? json={"TTL": ttl, "ID": 0}) 2023-05-30 21:31:37.828116?? File "/var/lib/kolla/venv/lib/python3.6/site-packages/etcd3gw/client.py", line 88, in post 2023-05-30 21:31:37.828123???? resp.reason 2023-05-30 21:31:37.828154 etcd3gw.exceptions.ConnectionTimeoutError: Gateway Time-out All other containers are working just fine. Even the cinder_scheduler container works fine. So far I have tried the following: remove the cinder containers including its volume from one control node mariadb_recovery Reboot all control nodes. kolla-ansible reconfigure --tags cinder,nova Any help is highly appreciated. 
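For reference, the traceback above is raised while cinder-api starts its
tooz coordination backend, so probing that backend directly from a control
node may help narrow things down. A minimal sketch, assuming python3 with
tooz available (e.g. inside the cinder_api container) and using whatever
backend_url is set in the [coordination] section of cinder.conf (the
address below is only a placeholder):

    from tooz import coordination

    # Must match [coordination]/backend_url in cinder.conf; the host and
    # port here are placeholders for the real etcd endpoint.
    backend_url = "etcd3+http://192.0.2.10:2379"

    coord = coordination.get_coordinator(backend_url, b"probe-member")
    # This performs the same lease/heartbeat setup that fails in the
    # traceback above.
    coord.start(start_heart=True)
    print("coordination backend reachable")
    coord.stop()

If this hangs or raises the same Gateway Time-out, the problem is between
the control node and etcd (or whatever proxies it) rather than in the
cinder containers themselves.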
Cheers, Oliver From fkr at osb-alliance.com Wed May 31 05:53:04 2023 From: fkr at osb-alliance.com (Felix Kronlage-Dammers) Date: Wed, 31 May 2023 07:53:04 +0200 Subject: Invitation to further discuss the notion of a 'domain manager' role In-Reply-To: <65138016e2a14945d4fabdb31a2bba9ef6675596.camel@catalystcloud.nz> References: <5F43A54D-1332-476D-AF59-0E1DDFBC8964@osb-alliance.com> <65138016e2a14945d4fabdb31a2bba9ef6675596.camel@catalystcloud.nz> Message-ID: <2A23530F-EF7F-4BD5-A450-7D7B801ECEB2@osb-alliance.com> Hi David, On 31 May 2023, at 4:07, David Zanetti wrote: > Meeting time is less than ideal for me, being NZST based (UTC+12), it > would help if it was slightly earlier CEST time, say 1300 CEST or > earlier which would be late evening here. indeed it is anything but easy to find a slot that suits all likely TZs, so for the first one I picked one of the ?overflow? slots we have in the SCS project. We?re usually fairly good with minute taking, so I?ll post a link to the minutes afterwards and I?d be happy to have a call with you afterwards as well and file you in on the discussion. If the discussion shows that this is going to be followed up on, I?ll make sure to schedule further meetings in a slot that allows NZST to participate so that you can join. In the last meeting of the publicloud sig meeting[1] we also discussed that this would be a good topic for one of the PTG-Sessions in Vancouver. felix [1] https://meetings.opendev.org/meetings/publiccloud_sig/2023/publiccloud_sig.2023-05-24-07.03.log.txt -- Felix Kronlage-Dammers Product Owner IaaS & Operations Sovereign Cloud Stack Sovereign Cloud Stack ? standardized, built and operated by many Ein Projekt der Open Source Business Alliance - Bundesverband f?r digitale Souver?nit?t e.V. Tel.: +49-30-206539-205 | Matrix: @fkronlage:matrix.org | fkr at osb-alliance.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 862 bytes Desc: OpenPGP digital signature URL: From mnasiadka at gmail.com Wed May 31 06:46:28 2023 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Wed, 31 May 2023 08:46:28 +0200 Subject: [kolla][cinder] cinder containers api, volume, backup unhealthy In-Reply-To: References: Message-ID: Hi Oliver, Looking at the output - the problem is with connection to etcd coordination backend. Best regards, Michal > On 31 May 2023, at 07:07, Oliver Weinmann wrote: > > Hi, > > after replacing my control nodes with new nodes (all bare-metal) somehow the cinder containers are no longer starting. > > I checked the logs on one of the control nodes and I see this in the api-eror.log: > > 2023-05-30 21:31:36.350636 Timeout when reading response headers from daemon process 'cinder-api': /var/www/cgi-bin/cinder/cinder-wsgi > 2023-05-30 21:31:37.827101 mod_wsgi (pid=22): Failed to exec Python script file '/var/www/cgi-bin/cinder/cinder-wsgi'. > 2023-05-30 21:31:37.827168 mod_wsgi (pid=22): Exception occurred processing WSGI script '/var/www/cgi-bin/cinder/cinder-wsgi'. 
> 2023-05-30 21:31:37.828005 Traceback (most recent call last): > 2023-05-30 21:31:37.828046 File "/var/www/cgi-bin/cinder/cinder-wsgi", line 52, in > 2023-05-30 21:31:37.828053 application = initialize_application() > 2023-05-30 21:31:37.828058 File "/var/lib/kolla/venv/lib/python3.6/site-packages/cinder/wsgi/wsgi.py", line 44, in initialize_application > 2023-05-30 21:31:37.828063 coordination.COORDINATOR.start() > 2023-05-30 21:31:37.828068 File "/var/lib/kolla/venv/lib/python3.6/site-packages/cinder/coordination.py", line 86, in start > 2023-05-30 21:31:37.828071 self.coordinator.start(start_heart=True) > 2023-05-30 21:31:37.828075 File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/coordination.py", line 689, in start > 2023-05-30 21:31:37.828078 super(CoordinationDriverWithExecutor, self).start(start_heart) > 2023-05-30 21:31:37.828083 File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/coordination.py", line 426, in start > 2023-05-30 21:31:37.828086 self._start() > 2023-05-30 21:31:37.828090 File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/drivers/etcd3gw.py", line 224, in _start > 2023-05-30 21:31:37.828093 self._membership_lease = self.client.lease(self.membership_timeout) > 2023-05-30 21:31:37.828098 File "/var/lib/kolla/venv/lib/python3.6/site-packages/etcd3gw/client.py", line 122, in lease > 2023-05-30 21:31:37.828111 json={"TTL": ttl, "ID": 0}) > 2023-05-30 21:31:37.828116 File "/var/lib/kolla/venv/lib/python3.6/site-packages/etcd3gw/client.py", line 88, in post > 2023-05-30 21:31:37.828123 resp.reason > 2023-05-30 21:31:37.828154 etcd3gw.exceptions.ConnectionTimeoutError: Gateway Time-out > > All other containers are working just fine. Even the cinder_scheduler container works fine. > > So far I have tried the following: > > remove the cinder containers including its volume from one control node > > mariadb_recovery > > Reboot all control nodes. > > kolla-ansible reconfigure --tags cinder,nova > > Any help is highly appreciated. > > Cheers, > > Oliver > > From thierry at openstack.org Wed May 31 08:24:50 2023 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 31 May 2023 10:24:50 +0200 Subject: [largescale-sig] Next meeting: May 31, 8utc In-Reply-To: References: Message-ID: <775ed838-3197-370b-32d7-ed78bee7dde4@openstack.org> Here is the summary of our SIG meeting today. We discussed our future plans (Forum session, OpenInfra Live episode, OVN load testing with "typical" openstack patterns) and welcomed a new SIG participant. You can read the detailed meeting logs at: https://meetings.opendev.org/meetings/large_scale_sig/2023/large_scale_sig.2023-05-31-08.00.html Our next meeting will be in-person in Vancouver, at a Forum session happening June 14, 2:30pm local time. Hope to see a lot of you there to discuss challenges of running a large scale openstack deployment, live! Our next IRC meeting will be June 28, 8:00UTC on #openstack-operators on OFTC, before we break for the northern-hemisphere summer. Regards, -- Thierry Carrez (ttx) From geguileo at redhat.com Wed May 31 08:50:54 2023 From: geguileo at redhat.com (Gorka Eguileor) Date: Wed, 31 May 2023 10:50:54 +0200 Subject: [Cinder][LVM backend] LVM vg backed by a shared LUN In-Reply-To: References: Message-ID: <20230531085054.bu6sueebzzjyy36z@localhost> On 26/04, wodel youchi wrote: > Hi, > > The examples I could find on the internet using LVM as backend for Cinder, > they expose a local disk using lvm via Cinder. 
> > I did this configuration and I am wondering if it's correct, especially > from a "simultaneous access" point of view. > > I have an iSCSI target backed by targetcli that exposes a LUN to my compute > nodes. I did configure the iscsi connexion manually on each one of them and > they all see the LUN, then on one of them I created the cinder-volumes VG > (the other nodes can see the modifications), then I configured Cinder with > lvm backend using this VG and it worked. I created some volumes on it > without issues using my account. But what about when there are multiple > tenants that try to create multiple volumes on it, is this configuration > safe? > > Regards. Hi, I don't understand which of these 2 scenarios you have: - You are sharing an LVM volume manually without using Cinder with each compute node and then you want to use the same LVM VG with Cinder for dynamically allocated volumes. - You manually connected a volume to each compute to confirm that everything was working as expected and then you disconnected them and are using Cinder to dynamically allocate volumes for computes. Even though I don't understand which one is your scenario, it doesn't matter as both should be safe as long as the target iqn you defined for your manual volumes don't use the same format as cinder volumes (to avoid possible collision with a volume created from Cinder) because Cinder LVM+LIO uses a per-volume target, so there won't be conflict with the ones you create manually nor will Cinder touch what you already have there. Cheers, Gorka. From rdhasman at redhat.com Wed May 31 09:40:52 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Wed, 31 May 2023 15:10:52 +0530 Subject: [cinder] Midcycle-1 Status (31st May) In-Reply-To: References: Message-ID: Hello, Top posting for update. Since I don't see any topics added to the midcycle etherpad[1] yet, I would conclude cancelling the midcycle and we will have a Video+IRC meeting instead. Following are the details: Date: 31 May, 2023 Time: 1400-1500 UTC IRC: #openstack-meeting-alt Video: will be shared in #openstack-cinder channel before meeting [1] https://etherpad.opendev.org/p/cinder-bobcat-midcycles Thanks Rajat Dhasmana On Tue, May 30, 2023 at 5:12?PM Rajat Dhasmana wrote: > Hello Argonauts, > > As you know we had planned our first midcycle of 2023.2 Bobcat cycle on > 31st May, 2023. > I created an etherpad[1] around a month ago and have been providing > constant reminders in every cinder meeting about adding topics. > At this point, no topic has been added which makes me believe that we > don't require a midcycle discussion. > I will wait till today EOD to see if any new topics are added, and if not, > we will continue with our usual video+IRC (end of the month) meeting[2]. > > [1] https://etherpad.opendev.org/p/cinder-bobcat-midcycles > [2] https://etherpad.opendev.org/p/cinder-bobcat-meetings > > Thanks > Rajat Dhasmana > -------------- next part -------------- An HTML attachment was scrubbed... URL: From senrique at redhat.com Wed May 31 10:36:24 2023 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 31 May 2023 11:36:24 +0100 Subject: Cinder Bug Report 2023-05-31 Message-ID: Hello Argonauts, Cinder Bug Meeting Etherpad *Medium* - Creating a snapshot failed by schema validation when its description contains a new line control character . - *Status:* Unassigned. I'm kind of lost on this one if any Argonaut would like to give it a look. - Cached images duplicated per host. - *Status:* Unassigned. 
- HPE 3par: Unable to create clones of replicated vol. - *Status*: Fix proposed to master. - NetApp ONTAP enhanced instance creation using file copy API should allow use of DNS name for NFS share . - *Status*: Unassigned. Is this related to https://bugs.launchpad.net/cinder/+bug/2020733? - NetApp ONTAP driver does not include snapshot- files for capacity calculations of provisioned_capacity_gb . - *Status*: Unassigned. - *Low* - Using NetApp ONTAP file copy for copy offload needs volume provider_location set. - *Status:* Unassigned. Cheers -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From garcetto at gmail.com Wed May 31 10:52:17 2023 From: garcetto at gmail.com (garcetto) Date: Wed, 31 May 2023 12:52:17 +0200 Subject: openstack backup volume - error with rootwrap Message-ID: goodmorning, using kolla-openstack zed cluster, cinder backend is nfs, cinder-backup is another nfs too. when i try to do backup: #openstack volume backup create 43e97a6d-c965-4127-8334-df5b5e8e8a05 (on cinder-backup container) (/var/log/cinder-backup.log) " ERROR oslo_messaging.rpc.server Command: sudo cinder-rootwrap /etc/cinder/rootwrap.conf chown 42407 /var/lib/cinder/mnt/6c827b655049ba4f069fd5f103620555/volume-20f3c72c-0b57-4500-b552-5056d12fd9c7 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Exit code: 1 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Stdout: '' 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Stderr: "/bin/chown: changing ownership of '/var/lib/cinder/mnt/6c827b655049ba4f069fd5f103620555/volume-20f3c72c-0b57-4500-b552-5056d12fd9c7': Invalid argument\n" 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server " could anyone give me advice how to proceed? thank you for your time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kozhukalov at gmail.com Wed May 31 12:27:44 2023 From: kozhukalov at gmail.com (Vladimir Kozhukalov) Date: Wed, 31 May 2023 15:27:44 +0300 Subject: [openstack-helm] Storyboard cleanup Message-ID: Dear Openstack-helmers, Minor announcement about OSH Storyboard. During May/2023 I have been putting some effort into cleaning up the Storyboard. More than 50 stories have been reviewed, verified and updated as merged or invalid. Now we still have 44 active stories in the OSH project group and I am going to clean them up even more. Also the infra team helped us to unlink 3 retired OSH projects from the OSH Storyboard project group. So now only active projects openstack-helm, openstack-helm-infra and openstack-helm-images are in the project group. -- Best regards, Kozhukalov Vladimir -------------- next part -------------- An HTML attachment was scrubbed... URL: From adivya1.singh at gmail.com Wed May 31 13:15:07 2023 From: adivya1.singh at gmail.com (Adivya Singh) Date: Wed, 31 May 2023 18:45:07 +0530 Subject: Affinity Group in Open Stack Message-ID: Hi Guys, Anything i can check on this , more from the server logs and the switch logs Packet drops when using trex traffic generator when device under test and traffic generator are not in same compute node But if we launch the instance using affinity and traffic generator and client on the same node , it works fine Regards Adivya Singh -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From senrique at redhat.com Wed May 31 13:47:43 2023 From: senrique at redhat.com (Sofia Enriquez) Date: Wed, 31 May 2023 14:47:43 +0100 Subject: openstack backup volume - error with rootwrap In-Reply-To: References: Message-ID: Hi Garcetto, hope you are doing well, Are the c-vol services facing the same issue? From the logs looks like a configuration problem with kolla. Cheers, On Wed, May 31, 2023 at 11:56?AM garcetto wrote: > goodmorning, > using kolla-openstack zed cluster, cinder backend is nfs, cinder-backup > is another nfs too. > when i try to do backup: > > #openstack volume backup create 43e97a6d-c965-4127-8334-df5b5e8e8a05 > > (on cinder-backup container) > (/var/log/cinder-backup.log) > " > ERROR oslo_messaging.rpc.server Command: sudo cinder-rootwrap > /etc/cinder/rootwrap.conf chown 42407 > /var/lib/cinder/mnt/6c827b655049ba4f069fd5f103620555/volume-20f3c72c-0b57-4500-b552-5056d12fd9c7 > 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Exit code: 1 > 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Stdout: '' > 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Stderr: > "/bin/chown: changing ownership of > '/var/lib/cinder/mnt/6c827b655049ba4f069fd5f103620555/volume-20f3c72c-0b57-4500-b552-5056d12fd9c7': > Invalid argument\n" > 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server > " > > could anyone give me advice how to proceed? > thank you for your time. > > -- Sof?a Enriquez she/her Software Engineer Red Hat PnT IRC: @enriquetaso @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From garcetto at gmail.com Wed May 31 14:03:28 2023 From: garcetto at gmail.com (garcetto) Date: Wed, 31 May 2023 16:03:28 +0200 Subject: openstack backup volume - error with rootwrap In-Reply-To: References: Message-ID: hi, what do you mean for c-vol? it you mean cinder volumes, yes they work fine. and don't understand how kolla can be a problem, rootwrap is and openstack "standard" tool. for what i can see, the problem is that cinder-backup starts, create the volume to backup (i can see it briefly on the storage, nfs share) THEN suddlenly it is removed AND that error appears. thank you On Wed, May 31, 2023 at 3:47?PM Sofia Enriquez wrote: > Hi Garcetto, hope you are doing well, > > Are the c-vol services facing the same issue? From the logs looks like a > configuration problem with kolla. > > Cheers, > > On Wed, May 31, 2023 at 11:56?AM garcetto wrote: > >> goodmorning, >> using kolla-openstack zed cluster, cinder backend is nfs, cinder-backup >> is another nfs too. >> when i try to do backup: >> >> #openstack volume backup create 43e97a6d-c965-4127-8334-df5b5e8e8a05 >> >> (on cinder-backup container) >> (/var/log/cinder-backup.log) >> " >> ERROR oslo_messaging.rpc.server Command: sudo cinder-rootwrap >> /etc/cinder/rootwrap.conf chown 42407 >> /var/lib/cinder/mnt/6c827b655049ba4f069fd5f103620555/volume-20f3c72c-0b57-4500-b552-5056d12fd9c7 >> 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Exit code: 1 >> 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Stdout: '' >> 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Stderr: >> "/bin/chown: changing ownership of >> '/var/lib/cinder/mnt/6c827b655049ba4f069fd5f103620555/volume-20f3c72c-0b57-4500-b552-5056d12fd9c7': >> Invalid argument\n" >> 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server >> " >> >> could anyone give me advice how to proceed? >> thank you for your time. 
>> >> > > -- > > Sof?a Enriquez > > she/her > > Software Engineer > > Red Hat PnT > > IRC: @enriquetaso > @RedHat Red Hat > Red Hat > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From garcetto at gmail.com Wed May 31 14:07:13 2023 From: garcetto at gmail.com (garcetto) Date: Wed, 31 May 2023 16:07:13 +0200 Subject: [kolla-ansible][zed] nfs as backend for nova Message-ID: good afternoon, it is a clear doc or tutorial on how to configure an nfs share as backend storage for nova root vm volumes? (find for ceph but not for nfs) thank you -------------- next part -------------- An HTML attachment was scrubbed... URL: From adivya1.singh at gmail.com Wed May 31 14:11:42 2023 From: adivya1.singh at gmail.com (Adivya Singh) Date: Wed, 31 May 2023 19:41:42 +0530 Subject: openstack backup volume - error with rootwrap In-Reply-To: References: Message-ID: Hi, This looks like more of a permission issue in the NFS with respect to ownership of the mount point with respect to Open Stack On Wed, May 31, 2023 at 4:27?PM garcetto wrote: > goodmorning, > using kolla-openstack zed cluster, cinder backend is nfs, cinder-backup > is another nfs too. > when i try to do backup: > > #openstack volume backup create 43e97a6d-c965-4127-8334-df5b5e8e8a05 > > (on cinder-backup container) > (/var/log/cinder-backup.log) > " > ERROR oslo_messaging.rpc.server Command: sudo cinder-rootwrap > /etc/cinder/rootwrap.conf chown 42407 > /var/lib/cinder/mnt/6c827b655049ba4f069fd5f103620555/volume-20f3c72c-0b57-4500-b552-5056d12fd9c7 > 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Exit code: 1 > 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Stdout: '' > 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server Stderr: > "/bin/chown: changing ownership of > '/var/lib/cinder/mnt/6c827b655049ba4f069fd5f103620555/volume-20f3c72c-0b57-4500-b552-5056d12fd9c7': > Invalid argument\n" > 2023-05-31 12:37:37.551 7 ERROR oslo_messaging.rpc.server > " > > could anyone give me advice how to proceed? > thank you for your time. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elfosardo at gmail.com Wed May 31 15:53:14 2023 From: elfosardo at gmail.com (Riccardo Pittau) Date: Wed, 31 May 2023 17:53:14 +0200 Subject: [ironic] retirement of old bugfix branches Message-ID: Hello everyone! We're going to remove old bugfix branches in the next few weeks, all bugfix branches that are older than 6 months are officially EoL. This includes: - for ironic bugfix/18.0 bugfix/18.1 bugfix/19.0 bugfix/20.2 bugfix/21.0 bugfix/21.2 - for ironic-inspector bugfix/10.7 bugfix/10.9 bugfix/10.12 bugfix/11.0 bugfix/11.2 -for ironic-python-agent bugfix/8.0 bugfix/8.1 bugfix/8.3 bugfix/8.6 bugfix/9.0 bugfix/9.2 Thanks! Ciao Riccardo -------------- next part -------------- An HTML attachment was scrubbed... URL: From swogatpradhan22 at gmail.com Wed May 31 17:16:10 2023 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Wed, 31 May 2023 22:46:10 +0530 Subject: Add netapp storage in edge site | Wallaby | DCN In-Reply-To: References: Message-ID: Hi Alan, Thanks for your clarification, the way you suggested will solve my issue. 
But i already have a netapp backend in my central site and to add another backend should i follow this documentation: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.0/html/custom_block_storage_back_end_deployment_guide/ref_configuration-sample-environment-file_custom-cinder-back-end And should i remove the old netapp environment file and use the new custom environment file created using the above mentioned guide?? I already have a prod workload in the currently deployed netapp and I do not want to cause any issues in that netapp storage. With regards, Swogat Pradhan On Tue, May 30, 2023 at 8:43?PM Alan Bishop wrote: > > > On Thu, May 25, 2023 at 9:39?PM Swogat Pradhan > wrote: > >> Hi Alan, >> My netapp storage is located in edge site itself. >> As the networks are routable my central site is able to reach the netapp >> storage ip address (ping response is 30ms-40ms). >> Let's say i included the netapp storage yaml in central site deployment >> script (which is not recommended) and i am able to create the volumes as it >> is reachable from controller nodes. >> Will i be able to mount those volumes in edge site VM's?? And if i am >> able to do so, then how will the data flow?? When storing something in the >> netapp volume will the data flow through the central site controller and >> get stored in the storage space? >> > > A cinder-volume service running in the central site's controplane will be > able to work with a netapp backend that's physically located at an edge > site. The good news is the c-vol service will be HA because it will be > controlled by pacemaker running on the controllers. > > In order for VMs at the edge site to access volumes on the netapp, you'll > need to set the CinderNetappAvailabilityZone [1] to the edge site's AZ. > > [1] > https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/wallaby/deployment/cinder/cinder-backend-netapp-puppet.yaml#L43 > > To attach a netapp volume, nova-compute at the edge will interact with > cinder-volume in the controlplane, and cinder-volume will in turn interact > with the netapp. This will happen over central <=> edge network > connections. Eventually, nova will directly connect to the netapp, so all > traffic from the VM to the netapp will occur within the edge site. Data > will not flow through the cinder-volume service, but there are restrictions > and limitations: > - Only that one edge site can access the netapp backend > - If the central <=> edge network connection then you won't be able to > attach or detach a netapp volume (but active connections will continue to > work) > > Of course, there are operations where cinder services are in the data path > (e.g. creating a volume from an image), but not when a VM is accessing a > volume. > > Alan > > >> With regards, >> Swogat Pradhan >> >> On Fri, 26 May 2023, 10:03 am Alan Bishop, wrote: >> >>> >>> >>> On Thu, May 25, 2023 at 12:09?AM Swogat Pradhan < >>> swogatpradhan22 at gmail.com> wrote: >>> >>>> Hi Alan, >>>> So, can I include the cinder-netapp-storage.yaml file in the central >>>> site and then use the new backend to add storage to edge VM's? >>>> >>> >>> Where is the NetApp physically located? Tripleo's DCN architecture >>> assumes the storage is physically located at the same site where the >>> cinder-volume service will be deployed. 
If you include the >>> cinder-netapp-storage.yaml environment file in the central site's >>> controlplane, then VMs at the edge site will encounter the problems I >>> outlined earlier (network latency, no ability to do cross-AZ attachments). >>> >>> >>>> I believe it is not possible right?? as the cinder volume in the edge >>>> won't have the config for the netapp. >>>> >>> >>> The cinder-volume services at an edge site are meant to manage storage >>> devices at that site. If the NetApp is at the edge site, ideally you'd >>> include some variation of the cinder-netapp-storage.yaml environment file >>> in the edge site's deployment. However, then you're faced with the fact >>> that the NetApp driver doesn't support A/A, which is required for c-vol >>> services running at edge sites (In case you're not familiar with these >>> details, tripleo runs all cinder-volume services in active/passive mode >>> under pacemaker on controllers in the controlplane. Thus, only a single >>> instance runs at any time, and pacemaker provides HA by moving the service >>> to another controller if the first one goes down. However, pacemaker is not >>> available at edge sites, and so to get HA, multiple instances of the >>> cinder-volume service run simultaneously on 3 nodes (A/A), using etcd as a >>> Distributed Lock Manager (DLM) to coordinate things. But drivers must >>> specifically support running A/A, and the NetApp driver does NOT.) >>> >>> Alan >>> >>> >>>> With regards, >>>> Swogat Pradhan >>>> >>>> On Thu, May 25, 2023 at 2:17?AM Alan Bishop wrote: >>>> >>>>> >>>>> >>>>> On Wed, May 24, 2023 at 3:15?AM Swogat Pradhan < >>>>> swogatpradhan22 at gmail.com> wrote: >>>>> >>>>>> Hi, >>>>>> I have a DCN setup and there is a requirement to use a netapp storage >>>>>> device in one of the edge sites. >>>>>> Can someone please confirm if it is possible? >>>>>> >>>>> >>>>> I see from prior email to this list that you're using tripleo, so I'll >>>>> respond with that in mind. >>>>> >>>>> There are many factors that come into play, but I suspect the short >>>>> answer to your question is no. >>>>> >>>>> Tripleo's DCN architecture requires the cinder-volume service running >>>>> at edge sites to run in active-active >>>>> mode, where there are separate instances running on three nodes in to >>>>> for the service to be highly >>>>> available (HA).The problem is that only a small number of cinder >>>>> drivers support running A/A, and NetApp's >>>>> drivers do not support A/A. >>>>> >>>>> It's conceivable you could create a custom tripleo role that deploys >>>>> just a single node running cinder-volume >>>>> with a NetApp backend, but it wouldn't be HA. >>>>> >>>>> It's also conceivable you could locate the NetApp system in the >>>>> central site's controlplane, but there are >>>>> extremely difficult constraints you'd need to overcome: >>>>> - Network latency between the central and edge sites would mean the >>>>> disk performance would be bad. >>>>> - You'd be limited to using iSCSI (FC wouldn't work) >>>>> - Tripleo disables cross-AZ attachments, so the only way for an edge >>>>> site to access a NetApp volume >>>>> would be to configure the cinder-volume service running in the >>>>> controlplane with a backend availability >>>>> zone set to the edge site's AZ. You mentioned the NetApp is needed "in >>>>> one of the edge sites," but in >>>>> reality the NetApp would be available in one, AND ONLY ONE edge site, >>>>> and it would also not be available >>>>> to any instances running in the central site. 
>>>>> >>>>> Alan >>>>> >>>>> >>>>>> And if so then should i add the parameters in the edge deployment >>>>>> script or the central deployment script. >>>>>> Any suggestions? >>>>>> >>>>>> With regards, >>>>>> Swogat Pradhan >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From swogatpradhan22 at gmail.com Wed May 31 17:43:43 2023 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Wed, 31 May 2023 23:13:43 +0530 Subject: Add netapp storage in edge site | Wallaby | DCN In-Reply-To: References: Message-ID: Hi Alan, Can you please check if the environment file attached is in correct format and how it should be? Also should i remove the old netapp environment file and use the new custom environment file created (attached). With regards, Swogat Pradhan On Wed, May 31, 2023 at 10:46?PM Swogat Pradhan wrote: > Hi Alan, > Thanks for your clarification, the way you suggested will solve my issue. > But i already have a netapp backend in my central site and to add another > backend should i follow this documentation: > https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.0/html/custom_block_storage_back_end_deployment_guide/ref_configuration-sample-environment-file_custom-cinder-back-end > > And should i remove the old netapp environment file and use the new custom > environment file created using the above mentioned guide?? > > I already have a prod workload in the currently deployed netapp and I do > not want to cause any issues in that netapp storage. > > > With regards, > Swogat Pradhan > > On Tue, May 30, 2023 at 8:43?PM Alan Bishop wrote: > >> >> >> On Thu, May 25, 2023 at 9:39?PM Swogat Pradhan >> wrote: >> >>> Hi Alan, >>> My netapp storage is located in edge site itself. >>> As the networks are routable my central site is able to reach the netapp >>> storage ip address (ping response is 30ms-40ms). >>> Let's say i included the netapp storage yaml in central site deployment >>> script (which is not recommended) and i am able to create the volumes as it >>> is reachable from controller nodes. >>> Will i be able to mount those volumes in edge site VM's?? And if i am >>> able to do so, then how will the data flow?? When storing something in the >>> netapp volume will the data flow through the central site controller and >>> get stored in the storage space? >>> >> >> A cinder-volume service running in the central site's controplane will be >> able to work with a netapp backend that's physically located at an edge >> site. The good news is the c-vol service will be HA because it will be >> controlled by pacemaker running on the controllers. >> >> In order for VMs at the edge site to access volumes on the netapp, you'll >> need to set the CinderNetappAvailabilityZone [1] to the edge site's AZ. >> >> [1] >> https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/wallaby/deployment/cinder/cinder-backend-netapp-puppet.yaml#L43 >> >> To attach a netapp volume, nova-compute at the edge will interact with >> cinder-volume in the controlplane, and cinder-volume will in turn interact >> with the netapp. This will happen over central <=> edge network >> connections. Eventually, nova will directly connect to the netapp, so all >> traffic from the VM to the netapp will occur within the edge site. 
Data >> will not flow through the cinder-volume service, but there are restrictions >> and limitations: >> - Only that one edge site can access the netapp backend >> - If the central <=> edge network connection then you won't be able to >> attach or detach a netapp volume (but active connections will continue to >> work) >> >> Of course, there are operations where cinder services are in the data >> path (e.g. creating a volume from an image), but not when a VM is accessing >> a volume. >> >> Alan >> >> >>> With regards, >>> Swogat Pradhan >>> >>> On Fri, 26 May 2023, 10:03 am Alan Bishop, wrote: >>> >>>> >>>> >>>> On Thu, May 25, 2023 at 12:09?AM Swogat Pradhan < >>>> swogatpradhan22 at gmail.com> wrote: >>>> >>>>> Hi Alan, >>>>> So, can I include the cinder-netapp-storage.yaml file in the central >>>>> site and then use the new backend to add storage to edge VM's? >>>>> >>>> >>>> Where is the NetApp physically located? Tripleo's DCN architecture >>>> assumes the storage is physically located at the same site where the >>>> cinder-volume service will be deployed. If you include the >>>> cinder-netapp-storage.yaml environment file in the central site's >>>> controlplane, then VMs at the edge site will encounter the problems I >>>> outlined earlier (network latency, no ability to do cross-AZ attachments). >>>> >>>> >>>>> I believe it is not possible right?? as the cinder volume in the edge >>>>> won't have the config for the netapp. >>>>> >>>> >>>> The cinder-volume services at an edge site are meant to manage storage >>>> devices at that site. If the NetApp is at the edge site, ideally you'd >>>> include some variation of the cinder-netapp-storage.yaml environment file >>>> in the edge site's deployment. However, then you're faced with the fact >>>> that the NetApp driver doesn't support A/A, which is required for c-vol >>>> services running at edge sites (In case you're not familiar with these >>>> details, tripleo runs all cinder-volume services in active/passive mode >>>> under pacemaker on controllers in the controlplane. Thus, only a single >>>> instance runs at any time, and pacemaker provides HA by moving the service >>>> to another controller if the first one goes down. However, pacemaker is not >>>> available at edge sites, and so to get HA, multiple instances of the >>>> cinder-volume service run simultaneously on 3 nodes (A/A), using etcd as a >>>> Distributed Lock Manager (DLM) to coordinate things. But drivers must >>>> specifically support running A/A, and the NetApp driver does NOT.) >>>> >>>> Alan >>>> >>>> >>>>> With regards, >>>>> Swogat Pradhan >>>>> >>>>> On Thu, May 25, 2023 at 2:17?AM Alan Bishop >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Wed, May 24, 2023 at 3:15?AM Swogat Pradhan < >>>>>> swogatpradhan22 at gmail.com> wrote: >>>>>> >>>>>>> Hi, >>>>>>> I have a DCN setup and there is a requirement to use a netapp >>>>>>> storage device in one of the edge sites. >>>>>>> Can someone please confirm if it is possible? >>>>>>> >>>>>> >>>>>> I see from prior email to this list that you're using tripleo, so >>>>>> I'll respond with that in mind. >>>>>> >>>>>> There are many factors that come into play, but I suspect the short >>>>>> answer to your question is no. 
>>>>>> >>>>>> Tripleo's DCN architecture requires the cinder-volume service running >>>>>> at edge sites to run in active-active >>>>>> mode, where there are separate instances running on three nodes in to >>>>>> for the service to be highly >>>>>> available (HA).The problem is that only a small number of cinder >>>>>> drivers support running A/A, and NetApp's >>>>>> drivers do not support A/A. >>>>>> >>>>>> It's conceivable you could create a custom tripleo role that deploys >>>>>> just a single node running cinder-volume >>>>>> with a NetApp backend, but it wouldn't be HA. >>>>>> >>>>>> It's also conceivable you could locate the NetApp system in the >>>>>> central site's controlplane, but there are >>>>>> extremely difficult constraints you'd need to overcome: >>>>>> - Network latency between the central and edge sites would mean the >>>>>> disk performance would be bad. >>>>>> - You'd be limited to using iSCSI (FC wouldn't work) >>>>>> - Tripleo disables cross-AZ attachments, so the only way for an edge >>>>>> site to access a NetApp volume >>>>>> would be to configure the cinder-volume service running in the >>>>>> controlplane with a backend availability >>>>>> zone set to the edge site's AZ. You mentioned the NetApp is needed >>>>>> "in one of the edge sites," but in >>>>>> reality the NetApp would be available in one, AND ONLY ONE edge site, >>>>>> and it would also not be available >>>>>> to any instances running in the central site. >>>>>> >>>>>> Alan >>>>>> >>>>>> >>>>>>> And if so then should i add the parameters in the edge deployment >>>>>>> script or the central deployment script. >>>>>>> Any suggestions? >>>>>>> >>>>>>> With regards, >>>>>>> Swogat Pradhan >>>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cinder-netapp-custom-env - Copy.yml Type: application/octet-stream Size: 1984 bytes Desc: not available URL: