From jmazzone at uchicago.edu Thu Jul 1 01:13:09 2021 From: jmazzone at uchicago.edu (Jeffrey Mazzone) Date: Thu, 1 Jul 2021 01:13:09 +0000 Subject: [nova][placement] Openstack only building one VM per machine in cluster, then runs out of resources In-Reply-To: References: <7329E762-12FA-4F97-B85F-2D36E1E256C6@uchicago.edu> <6717C9A7-FBBD-4560-A636-6842A0B3C8EC@uchicago.edu> <24CEA8FD-BCDD-4953-8E55-01A4717F68EC@uchicago.edu> Message-ID: <47839686-4CAC-4854-8B37-1DB9E458BA6E@uchicago.edu> On Jun 30, 2021, at 5:06 PM, melanie witt > wrote: I suggest you run the 'openstack resource provider show --allocations' command as Balazs mentioned earlier to show all of the allocations (used resources) on the compute node. I also suggest you run the 'nova-manage placement audit' tool [1] as Sylvain mentioned earlier to show whether there are any orphaned allocations, i.e. allocations that are for instances that no longer exist. The consumer UUID is the instance UUID. I did both of those suggestions. "openstack resource provider show —allocations" shows what is expected. No additional orphaned vms and the resources used is correct. Here is an example of a different set of hosts and zones. This host had 2x 16 core vms on it before the cluster went into this state. You can see them both below. The nova-manage audit commands do not show any orphans either. ~# openstack resource provider show 41ecee2a-ec24-48e5-8b9d-24065d67238a --allocations +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Field | Value | +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | uuid | 41ecee2a-ec24-48e5-8b9d-24065d67238a | | name | kh09-56 | | generation | 55 | | root_provider_uuid | 41ecee2a-ec24-48e5-8b9d-24065d67238a | | parent_provider_uuid | None | | allocations | {'d6b9d19c-1ba9-44c2-97ab-90098509b872': {'resources': {'DISK_GB': 50, 'MEMORY_MB': 16384, 'VCPU': 16}, 'consumer_generation': 1}, 'e0a8401a-0bb6-4612-a496-6a794ebe6cd0': {'resources': {'DISK_GB': 50, 'MEMORY_MB': 16384, 'VCPU': 16}, 'consumer_generation': 1}} | +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ Usage on the resource provider: ~# openstack resource provider usage show 41ecee2a-ec24-48e5-8b9d-24065d67238a +----------------+-------+ | resource_class | usage | +----------------+-------+ | VCPU | 32 | | MEMORY_MB | 32768 | | DISK_GB | 100 | +----------------+-------+ All of that looks correct. 
Requesting it to check allocations for a 4 VCPU vm also shows it as a candidate: ~# openstack allocation candidate list --resource VCPU=4 | grep 41ecee2a-ec24-48e5-8b9d-24065d67238a | 41 | VCPU=4 | 41ecee2a-ec24-48e5-8b9d-24065d67238a | VCPU=32/1024,MEMORY_MB=32768/772714,DISK_GB=100/7096 In the placement database, under the used column, also shows the correct values for the information provided above with those 2 vms on it: +---------------------+------------+-------+----------------------+--------------------------------------+-------------------+-------+ | created_at | updated_at | id | resource_provider_id | consumer_id | resource_class_id | used | +---------------------+------------+-------+----------------------+--------------------------------------+-------------------+-------+ | 2021-06-02 18:45:05 | NULL | 4060 | 125 | e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 2 | 50 | | 2021-06-02 18:45:05 | NULL | 4061 | 125 | e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 1 | 16384 | | 2021-06-02 18:45:05 | NULL | 4062 | 125 | e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 0 | 16 | | 2021-06-04 18:39:13 | NULL | 7654 | 125 | d6b9d19c-1ba9-44c2-97ab-90098509b872 | 2 | 50 | | 2021-06-04 18:39:13 | NULL | 7655 | 125 | d6b9d19c-1ba9-44c2-97ab-90098509b872 | 1 | 16384 | | 2021-06-04 18:39:13 | NULL | 7656 | 125 | d6b9d19c-1ba9-44c2-97ab-90098509b872 | 0 | 16 | Trying to build a vm though.. I get the placement error with the improperly calculated “Used” values. 2021-06-30 19:51:39.732 43832 WARNING placement.objects.allocation [req-de225c66-8297-4b34-9380-26cf9385d658 a770bde56c9d49e68facb792cf69088c 6da06417e0004cbb87c1e64fe1978de5 - default default] Over capacity for VCPU on resource provider b749130c-a368-4332-8a1f-8411851b4b2a. Needed: 4, Used: 18509, Capacity: 1024.0 Outside of changing the allocation ratio, im completely lost. Im confident it has to do with that improper calculation of the used value but how is it being calculated if it isn’t being added up from fixed values in the database as has been suggested? Thanks in advance! -Jeff M The tl;dr on how the value is calculated is there's a table called 'allocations' in the placement database that holds all the values for resource providers and resource classes and it has a 'used' column. If you add up all of the 'used' values for a resource class (VCPU) and resource provider (compute node) then that will be the total used of that resource on that resource provider. You can see this data by 'openstack resource provider show --allocations' as well. The allocation ratio will not affect the value of 'used' but it will affect the working value of 'total' to be considered higher than it actually is in order to oversubscribe. If a compute node has 64 cores and cpu_allocation ratio is 16 then 64 * 16 = 1024 cores will be allowed for placement on that compute node. You likely have "orphaned" allocations for the compute node/resource provider that are not mapped to instances any more and you can use 'nova-manage placement audit' to find those and optionally delete them. Doing that will cleanup your resource provider. First, I would run it without specifying --delete just to see what it shows without modifying anything. -------------- next part -------------- An HTML attachment was scrubbed... 
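To make that summation concrete, here is a minimal query in that spirit, using the column names visible in the allocations dump above (the database name, provider UUID and class-id mapping are taken from this thread's examples and may differ in other deployments):

  # class ids 0/1/2 correspond to VCPU/MEMORY_MB/DISK_GB in the dump above;
  # the result should match 'openstack resource provider usage show'
  mysql placement -e "
    SELECT a.resource_class_id, SUM(a.used) AS used
    FROM allocations a
    JOIN resource_providers rp ON rp.id = a.resource_provider_id
    WHERE rp.uuid = '41ecee2a-ec24-48e5-8b9d-24065d67238a'
    GROUP BY a.resource_class_id;"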
URL: From mark.kirkwood at catalyst.net.nz Thu Jul 1 01:39:07 2021 From: mark.kirkwood at catalyst.net.nz (Mark Kirkwood) Date: Thu, 1 Jul 2021 13:39:07 +1200 Subject: [Swift] Object replication failures on newly upgraded servers In-Reply-To: <20210629231101.240482d0@suzdal.zaitcev.lan> References: <20210603012230.65f2bc33@suzdal.zaitcev.lan> <20210629231101.240482d0@suzdal.zaitcev.lan> Message-ID: On 30/06/21 4:11 pm, Pete Zaitcev wrote: > On Wed, 30 Jun 2021 13:32:54 +1200 > Mark Kirkwood wrote: > >> Jun 30 06:40:34 cat-hlz-ostor003 object-server: Error syncing partition: >> LockTimeout (10s) /srv/node/obj06/objects-20/544096/.lock > Why do you have 20 policies? Sounds rather unusual. > >> So, I need to figure out why we are timing out! > Sorry, I don't have enough operator experience with this. > In my case it's just not enough workers for the number > of the nodes, but I'm sure your setup is more complex. > Thanks Pete! No we only have 4 policies - but we numbered the additional ones to match the region numbers (10, 20, 30)! Hah! I was just looking at the object-server worker count (e.g: 16 on a node with 32 cores) and thinking 'hmmm...' when I saw your message. Will experiment with increasing that. Cheers Mark From gmann at ghanshyammann.com Thu Jul 1 02:29:51 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 30 Jun 2021 21:29:51 -0500 Subject: [all][tc] Technical Committee next weekly meeting on July 1st at 1500 UTC In-Reply-To: <17a5584edbc.e62ace7a185349.751069679551853439@ghanshyammann.com> References: <17a5584edbc.e62ace7a185349.751069679551853439@ghanshyammann.com> Message-ID: <17a5fe6afc8.105ed2591324722.2845643384995259557@ghanshyammann.com> Hello Everyone, Below is the agenda for Today's TC meeting schedule at 1500 UTC in #openstack-tc IRC OFTC channel. -https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting == Agenda for tomorrow's TC meeting == * Roll call * Follow up on past action items * Gate health check (dansmith/yoctozepto) ** http://paste.openstack.org/show/jD6kAP9tHk7PZr2nhv8h/ * Migration from 'Freenode' to 'OFTC' (gmann) ** https://etherpad.opendev.org/p/openstack-irc-migration-to-oftc * Governance non-active repos retirement & cleanup ** https://etherpad.opendev.org/p/governance-repos-cleanup * Election official assignments ** http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023060.html * Items for next Newsletter ** https://etherpad.opendev.org/p/newsletter-openstack-news * Open Reviews ** https://review.opendev.org/q/project:openstack/governance+is:open -gmann ---- On Mon, 28 Jun 2021 21:06:52 -0500 Ghanshyam Mann wrote ---- > Hello Everyone, > > Technical Committee's next weekly meeting is scheduled for July 1st at 1500 UTC. > > If you would like to add topics for discussion, please add them to the below wiki page by > Wednesday, June 30th, at 2100 UTC. 
> > https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting > > -gmann > > From laurentfdumont at gmail.com Thu Jul 1 03:51:10 2021 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Wed, 30 Jun 2021 23:51:10 -0400 Subject: [nova][placement] Openstack only building one VM per machine in cluster, then runs out of resources In-Reply-To: <47839686-4CAC-4854-8B37-1DB9E458BA6E@uchicago.edu> References: <7329E762-12FA-4F97-B85F-2D36E1E256C6@uchicago.edu> <6717C9A7-FBBD-4560-A636-6842A0B3C8EC@uchicago.edu> <24CEA8FD-BCDD-4953-8E55-01A4717F68EC@uchicago.edu> <47839686-4CAC-4854-8B37-1DB9E458BA6E@uchicago.edu> Message-ID: I'm curious to see if I can reproduce the issue in my test-env. I never tried puppet-openstack so might as well see how it goes! The ServerFault issue mentions the puppet-openstack integration being used to deploy Ussuri? Specifically, the puppet modules being at the 17.4 version? But looking at https://docs.openstack.org/puppet-openstack-guide/latest/install/releases.html - the modules for Ussuri should be at 16.x? Could it be some kind of weird setup of the deployment modules for Ussuri/placement that didn't go as planned? On Wed, Jun 30, 2021 at 9:13 PM Jeffrey Mazzone wrote: > On Jun 30, 2021, at 5:06 PM, melanie witt wrote: > > I suggest you run the 'openstack resource provider show > --allocations' command as Balazs mentioned earlier to show all of the > allocations (used resources) on the compute node. I also suggest you run > the 'nova-manage placement audit' tool [1] as Sylvain mentioned earlier to > show whether there are any orphaned allocations, i.e. allocations that are > for instances that no longer exist. The consumer UUID is the instance UUID. > > I did both of those suggestions. "openstack resource provider show UUID> —allocations" shows what is expected. No additional orphaned vms and > the resources used is correct. Here is an example of a different set of > hosts and zones. This host had 2x 16 core vms on it before the cluster went > into this state. You can see them both below. The nova-manage audit > commands do not show any orphans either. 
> > ~# openstack resource provider show 41ecee2a-ec24-48e5-8b9d-24065d67238a --allocations > +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | Field | Value | > +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | uuid | 41ecee2a-ec24-48e5-8b9d-24065d67238a | > | name | kh09-56 | > | generation | 55 | > | root_provider_uuid | 41ecee2a-ec24-48e5-8b9d-24065d67238a | > | parent_provider_uuid | None | > | allocations | {'d6b9d19c-1ba9-44c2-97ab-90098509b872': {'resources': {'DISK_GB': 50, 'MEMORY_MB': 16384, 'VCPU': 16}, 'consumer_generation': 1}, 'e0a8401a-0bb6-4612-a496-6a794ebe6cd0': {'resources': {'DISK_GB': 50, 'MEMORY_MB': 16384, 'VCPU': 16}, 'consumer_generation': 1}} | > +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > > > Usage on the resource provider: > > ~# openstack resource provider usage show 41ecee2a-ec24-48e5-8b9d-24065d67238a > +----------------+-------+ > | resource_class | usage | > +----------------+-------+ > | VCPU | 32 | > | MEMORY_MB | 32768 | > | DISK_GB | 100 | > +----------------+-------+ > > > All of that looks correct. Requesting it to check allocations for a 4 VCPU > vm also shows it as a candidate: > > ~# openstack allocation candidate list --resource VCPU=4 | grep 41ecee2a-ec24-48e5-8b9d-24065d67238a > | 41 | VCPU=4 | 41ecee2a-ec24-48e5-8b9d-24065d67238a | VCPU=32/1024,MEMORY_MB=32768/772714,DISK_GB=100/7096 > > > In the placement database, under the used column, also shows the correct > values for the information provided above with those 2 vms on it: > > +---------------------+------------+-------+----------------------+--------------------------------------+-------------------+-------+ > | created_at | updated_at | id | resource_provider_id | consumer_id | resource_class_id | used | > +---------------------+------------+-------+----------------------+--------------------------------------+-------------------+-------+ > | 2021-06-02 18:45:05 | NULL | 4060 | 125 | e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 2 | 50 | > | 2021-06-02 18:45:05 | NULL | 4061 | 125 | e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 1 | 16384 | > | 2021-06-02 18:45:05 | NULL | 4062 | 125 | e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 0 | 16 | > | 2021-06-04 18:39:13 | NULL | 7654 | 125 | d6b9d19c-1ba9-44c2-97ab-90098509b872 | 2 | 50 | > | 2021-06-04 18:39:13 | NULL | 7655 | 125 | d6b9d19c-1ba9-44c2-97ab-90098509b872 | 1 | 16384 | > | 2021-06-04 18:39:13 | NULL | 7656 | 125 | d6b9d19c-1ba9-44c2-97ab-90098509b872 | 0 | 16 | > > > > Trying to build a vm though.. I get the placement error with the > improperly calculated “Used” values. 
> > 2021-06-30 19:51:39.732 43832 WARNING placement.objects.allocation [req-de225c66-8297-4b34-9380-26cf9385d658 a770bde56c9d49e68facb792cf69088c 6da06417e0004cbb87c1e64fe1978de5 - default default] Over capacity for VCPU on resource provider b749130c-a368-4332-8a1f-8411851b4b2a. Needed: 4, Used: 18509, Capacity: 1024.0 > > > Outside of changing the allocation ratio, im completely lost. Im confident > it has to do with that improper calculation of the used value but how is it > being calculated if it isn’t being added up from fixed values in the > database as has been suggested? > > Thanks in advance! > -Jeff M > > > > > > The tl;dr on how the value is calculated is there's a table called > 'allocations' in the placement database that holds all the values for > resource providers and resource classes and it has a 'used' column. If you > add up all of the 'used' values for a resource class (VCPU) and resource > provider (compute node) then that will be the total used of that resource > on that resource provider. You can see this data by 'openstack resource > provider show --allocations' as well. > > The allocation ratio will not affect the value of 'used' but it will > affect the working value of 'total' to be considered higher than it > actually is in order to oversubscribe. If a compute node has 64 cores and > cpu_allocation ratio is 16 then 64 * 16 = 1024 cores will be allowed for > placement on that compute node. > > You likely have "orphaned" allocations for the compute node/resource > provider that are not mapped to instances any more and you can use > 'nova-manage placement audit' to find those and optionally delete them. > Doing that will cleanup your resource provider. First, I would run it > without specifying --delete just to see what it shows without modifying > anything. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From iwienand at redhat.com Thu Jul 1 05:09:31 2021 From: iwienand at redhat.com (Ian Wienand) Date: Thu, 1 Jul 2021 15:09:31 +1000 Subject: review.opendev.org server upgrade 18/19th July 2021 Message-ID: Hello, Around 2021-07-18 23:00 UTC we plan to provision an upgrade to the review.opendev.org server which hosts our Gerrit instance. We are aware that some users may require firewall updates or similar to access the new server. The new addresses will be: 199.204.45.33 2604:e100:1:0:f816:3eff:fe52:22de We hope the downtime will be limited to a few hours as we migrate and index the data. User-visible changes be should minimal. Due to incompatible schemas between the old hosted database and new database instance we will not be transitioning the table that holds the information about what files you have reviewed (gerrit shows a light-grey "Reviewed" when you have viewed a file). Note this is *not* review comments or code, but just the per-user flag in the UI if you have viewed a file. You can reach admins via email at service-discuss at lists.opendev.org or in OFTC #opendev for any further questions or queries. Thank you, -i From swogatpradhan22 at gmail.com Thu Jul 1 05:49:13 2021 From: swogatpradhan22 at gmail.com (Swogat Pradhan) Date: Thu, 1 Jul 2021 11:19:13 +0530 Subject: [openstack] [DC-DC Setup] [Replication] Correct approach to a DC-DC setup for Openstack (victoria) In-Reply-To: References: Message-ID: Can somebody confirm the proper way for Openstack DC DC setup ? 
On Sat, Jun 26, 2021 at 9:40 AM Swogat Pradhan wrote: > Hi, > I am trying to setup a DC-DC setup in openstack victoria, I have 2 numbers > of all in one setup (controller, compute) with shared mysql and rabbitmq > cluster and am using ceph image replication and configuring cinder > replication on top of it. > > So when the 1st node goes down then i will perform a 'cinder failover-host > node1' and then do a nova-evacuate to failover to the 2nd DC setup and > once the node 1 comes up i will use cinder failback and live migration from > node2 to node1. > > **i am facing some minor issues in this setup which I will ask once I know > this is the right approach to the DC-DC concept. > > Can you please shed some light on this being the right approach or not and > if not then how can i improve it? > > With regards > Swogat pradhan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkajinam at redhat.com Thu Jul 1 06:24:58 2021 From: tkajinam at redhat.com (Takashi Kajinami) Date: Thu, 1 Jul 2021 15:24:58 +0900 Subject: [TRIPLEO] - ZUN Support in TripleO In-Reply-To: <7b6df511.4193.17a5bc728d1.Coremail.kira034@163.com> References: <44c2d59d18916edab9ec1a6a2aeba129963ba2e8.camel@redhat.com> <7b6df511.4193.17a5bc728d1.Coremail.kira034@163.com> Message-ID: On Wed, Jun 30, 2021 at 4:22 PM Hongbin Lu wrote: > Hi, > > I am a maintainer of the Zun project. From my perspective, I would like to > have Zun supported by different deployment tools to minimize complexities > for installing and managing the service. If TripleO community interests to > add support for Zun, I am happy to provide necessary supports. > Because TripleO relies on modulus maintained by Puppet OpenStack project to manipulate configurations, you'd need to start with creating a new puppet module (puppet-zun) for it. Also, we need distro packages to support zun in Puppet OpenStack and TripleO. Puppet-openstack relies on packages to install services , and TripleO relies on RDO in CI. I quickly checked RDO but it seems there are no zun packages there. Also Ubuntu has zunclient packaged[2] but no packages available for server side. [1] https://packages.ubuntu.com/search?keywords=zun&searchon=names > > @Lokendra, if you are open to consider alternative tools, I would suggest Kolla [1]. Its support for Zun is quite solid and has CI to cover some basic senario. Another tool is OpenStack Ansible [2]. I didn't try that one myself but it has Zun support based on their document [3]. Lastly, the Zun community maintains an installation guide [4] which you can refer if you want to install it manually. > > > [1] https://docs.openstack.org/kolla-ansible/queens/reference/zun-guide.html > > [2] https://docs.openstack.org/openstack-ansible/latest/ > > [3] https://docs.openstack.org/openstack-ansible-os_zun/latest/ > > [4] https://docs.openstack.org/zun/latest/install/ > > > > Best regards, > Hongbin > > > > At 2021-06-30 00:06:50, "Marios Andreou" wrote: > >On Tue, Jun 29, 2021 at 4:45 PM Sean Mooney wrote: > >> > >> On Tue, 2021-06-29 at 15:52 +0530, Lokendra Rathour wrote: > >> > Hi Marios, > >> > Thank you for the information. > >> > > >> > With respect to the *second question*, please note: > >> > "is there any alternative to deploy containerize services in TripleO?" > >> > > >> > In our use case, in addition to having our workloads in VNFs, we also want > >> > to have containerized deployment of certain workloads on top of OpenStack. > >> > Zun service could give us that flexibility. 
Is there any reason that > >> > deployment of Zun with TripleO is not supported? And is there an > >> > alternative to Zun that the community is using in productions for deploying > >> > containerized workloads on top of OpenStack? > >> > >> i think marios's resopnce missed that zun is the containers as a service project > >> which provide an alternitive to nova or ironci to provision compute resouces as containers > >> directly on the physical hosts. in ooo term deploying tenant contaienrs directly on the overcloud host > >> with docker or podman. > > > >indeed I did as I got from Lokendra's reply earlier > > > >> > >> i dont think ooo currently supports this as it is not listed in https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/deployment > >> so to answer your orginal questrion this does not appear to be currently supported. > > > >and this isn't somethign I have seen brought up on the irc meetings, > >ptg or elsewhere. That isn't to say it's impossible (it may be, i just > >don't know enough about it to say right now ;)). Of course anyone is > >free to propose a spec about how this might look and get some > >feedback before working on it in tripleo. > > > > > >regards, marios > > > >> > > >> > please advise. > >> > > >> > Regards, > >> > Lokendra > >> > > >> > > >> > On Tue, Jun 29, 2021 at 3:16 PM Marios Andreou wrote: > >> > > >> > > > >> > > > >> > > On Tue, Jun 29, 2021 at 11:36 AM Lokendra Rathour < > >> > > lokendrarathour at gmail.com> wrote: > >> > > > >> > > > > >> > > > Hello Everyone, > >> > > > We are curious in understanding the usage of ZUN Sevice in TripleO with > >> > > > respect to which we have questions as below: > >> > > > > >> > > > 1. Does TripleO Support ZUN? > >> > > > > >> > > > > >> > > no > >> > > > >> > > > >> > > > > >> > > > 1. If not then, is there any alternative to deploy containerize > >> > > > services in TripleO? > >> > > > > >> > > > > >> > > yes we have been deploying services with containers since queens and this > >> > > is the default (in fact we have stopped supporting non containerized > >> > > services altogether for a few releases now). For the default list of > >> > > containers see [1] and information regarding the deployment can be found in > >> > > [2] (though note that is community best effort docs so beware it may be a > >> > > bit outdated in places). > >> > > > >> > > hope it helps for now > >> > > > >> > > regards, marios > >> > > > >> > > [1] > >> > > https://opendev.org/openstack/tripleo-common/src/commit/5836974cf216f5230843e0c63eea21194b527368/container-images/tripleo_containers.yaml > >> > > [2] > >> > > https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployment/install_overcloud.html#deploy-the-overcloud > >> > > > >> > > > >> > > Any support with respect to the questions raised will definitely help us > >> > > > in deciding the tripleO usage. > >> > > > > >> > > > -- > >> > > > ~ Lokendra > >> > > > skype: lokendrarathour > >> > > > > >> > > > > >> > > > > >> > > >> > >> > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
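On the Kolla Ansible route: enabling Zun there is mostly a handful of switches in /etc/kolla/globals.yml. A sketch along the lines of the zun-guide linked above (treat the exact flag names as assumptions and check the guide for your release):

  enable_zun: "yes"
  enable_kuryr: "yes"
  enable_etcd: "yes"
  docker_configure_for_zun: "yes"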
URL: From balazs.gibizer at est.tech Thu Jul 1 07:28:39 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 01 Jul 2021 09:28:39 +0200 Subject: [nova][placement] Openstack only building one VM per machine in cluster, then runs out of resources In-Reply-To: <6717C9A7-FBBD-4560-A636-6842A0B3C8EC@uchicago.edu> References: <7329E762-12FA-4F97-B85F-2D36E1E256C6@uchicago.edu> <6717C9A7-FBBD-4560-A636-6842A0B3C8EC@uchicago.edu> Message-ID: On Wed, Jun 30, 2021 at 17:32, Jeffrey Mazzone wrote: > Any other logs with Unable to create allocation for 'VCPU' on > resource provider? > > No, the 3 logs listed are the only logs where it is showing this > message and VCPU is the only thing it fails for. No memory or disk > allocation failures, always VCPU. > > > At this point if you list the resource provider usage on > 3f9d0deb-936c-474a-bdee-d3df049f073d again then do you still see 4 > VCPU used, or 8206 used? > > The usage shows everything correctly: > ~# openstack resource provider usage show > 3f9d0deb-936c-474a-bdee-d3df049f073d > +----------------+-------+ > | resource_class | usage | > +----------------+-------+ > | VCPU | 4 | > | MEMORY_MB | 8192 | > | DISK_GB | 10 | > +----------------+-------+ > > Allocations shows the same: > > ~# openstack resource provider show > 3f9d0deb-936c-474a-bdee-d3df049f073d --allocations > +-------------+--------------------------------------------------------------------------------------------------------+ > | Field | Value > | > +-------------+--------------------------------------------------------------------------------------------------------+ > | uuid | 3f9d0deb-936c-474a-bdee-d3df049f073d > | > | name | kh09-50 > | > | generation | 244 > | > | allocations | {'4a6fe4c2-ece4-45c2-b7a2-fdfd41308988': > {'resources': {'VCPU': 4, 'MEMORY_MB': 8192, 'DISK_GB': 10}}} | > +-------------+--------------------------------------------------------------------------------------------------------+ > > Allocation candidate list shows all 228 servers in the cluster > available: > > ~# openstack allocation candidate list --resource VCPU=4 -c "resource > provider" -f value | wc -l > 228 > > Starting a new vm on that host shows the following in the logs: > > Placement-api.log > 2021-06-30 12:27:21.335 4382 WARNING placement.objects.allocation > [req-f4d74abc-7b18-407a-85e7-f1c268bd5e53 > a770bde56c9d49e68facb792cf69088c 6da06417e0004cbb87c1e64fe1978de5 - > default default] Over capacity for VCPU on resource provider > 0e0d8ec8-bb31-4da5-a813-bd73560ff7d6. Needed: 4, Used: 8206, > Capacity: 1024.0 > You said "Starting a new vm on that host". How do you do that? Something is strange. Now placement points to other than 3f9d0deb-936c-474a-bdee-d3df049f073d, it points to 0e0d8ec8-bb31-4da5-a813-bd73560ff7d6. > nova-scheduler.log > 2021-06-30 12:27:21.429 6895 WARNING nova.scheduler.client.report > [req-3106f4da-1df9-4370-b56b-8ba6b62980dc > aacc7911abf349b783eed20ad176c034 23920ecfbf294e71ad558aa49cb17de8 - > default default] Failed to save allocation for > a9296e22-4b50-45b7-a442-1fce0a844bcd. Got HTTP 409: {"errors": > [{"status": 409, "title": "Conflict", "detail": "There was a conflict > when trying to complete your request.\n\n Unable to allocate > inventory: Unable to create allocation for 'VCPU' on resource > provider '3f9d0deb-936c-474a-bdee-d3df049f073d'. The requested amount > would exceed the capacity. 
", "code": "placement.undefined_code", > "request_id": "req-e9f12a3a-3136-4501-8bd6-4add31f0eb82"}]} > > But then the nova scheduler log still complains about 3f9d0deb-936c-474a-bdee-d3df049f073d instead of 0e0d8ec8-bb31-4da5-a813-bd73560ff7d6. I think we are looking at two different requests here as the request id in the nova-scheduler log req-3106f4da-1df9-4370-b56b-8ba6b62980dc does not match with the request id of the placement log req-f4d74abc-7b18-407a-85e7-f1c268bd5e53. > I really can’t figure out where this, what’s seems to be last > minute, calculation of used resources comes from. > > Given you also have an Ussuri deployment, you could call the > nova-audit command to see whether you would have orphaned allocations > : > nova-manage placement audit [--verbose] [--delete] > [--resource_provider ] > > When running this command, it says the UUID does not exist. > > > Thank you! I truly appreciate everyones help. > > -Jeff M > > From balazs.gibizer at est.tech Thu Jul 1 07:30:31 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 01 Jul 2021 09:30:31 +0200 Subject: [nova][placement] Openstack only building one VM per machine in cluster, then runs out of resources In-Reply-To: <24CEA8FD-BCDD-4953-8E55-01A4717F68EC@uchicago.edu> References: <7329E762-12FA-4F97-B85F-2D36E1E256C6@uchicago.edu> <6717C9A7-FBBD-4560-A636-6842A0B3C8EC@uchicago.edu> <24CEA8FD-BCDD-4953-8E55-01A4717F68EC@uchicago.edu> Message-ID: On Wed, Jun 30, 2021 at 21:06, Jeffrey Mazzone wrote: > Yes, this is almost exactly what I did. No, I am not running mysql > in a HA deployment and I have ran nova-manage api_db sync several > times throughout the process below. > > I think I found a work around but im not sure how feasible this is. > > I first, changed the reallocation ratio to 1:1. In the nova.conf on > the controller. Nova would not accept this for some reason and seemed > like it needed to be changed on the compute node. So I deleted the > hypervisor, resource provider, and compute service. Changed the > ratios on the compute node itself, and then re-added it back in. Now > the capacity changed to 64 which is the number of cores on the > systems. When starting a vm, it still gets the same number for > “used” in the placement-api.log: See below: > > New ratios > ~# openstack resource provider inventory list > 554f2a3b-924e-440c-9847-596064ea0f3f > +----------------+------------------+----------+----------+----------+-----------+--------+ > | resource_class | allocation_ratio | min_unit | max_unit | reserved > | step_size | total | > +----------------+------------------+----------+----------+----------+-----------+--------+ > | VCPU | 1.0 | 1 | 64 | 0 > | 1 | 64 | > | MEMORY_MB | 1.0 | 1 | 515655 | 512 > | 1 | 515655 | > | DISK_GB | 1.0 | 1 | 7096 | 0 > | 1 | 7096 | > +----------------+------------------+----------+----------+----------+-----------+--------+ > > Error from placement.log > 2021-06-30 13:49:24.877 4381 WARNING placement.objects.allocation > [req-7dc8930f-1eac-401a-ade7-af36e64c2ba8 > a770bde56c9d49e68facb792cf69088c 6da06417e0004cbb87c1e64fe1978de5 - > default default] Over capacity for VCPU on resource provider > c4199e84-8259-4d0e-9361-9b0d9e6e66b7. 
Needed: 4, Used: 8206, > Capacity: 64.0 > > With that in mind, I did the same procedure again but set the ratio > to 1024 > > New ratios > ~# openstack resource provider inventory list > 519c1e10-3546-4e3b-a017-3e831376cde8 > +----------------+------------------+----------+----------+----------+-----------+--------+ > | resource_class | allocation_ratio | min_unit | max_unit | reserved > | step_size | total | > +----------------+------------------+----------+----------+----------+-----------+--------+ > | VCPU | 1024.0 | 1 | 64 | 0 > | 1 | 64 | > | MEMORY_MB | 1.0 | 1 | 515655 | 512 > | 1 | 515655 | > | DISK_GB | 1.0 | 1 | 7096 | 0 > | 1 | 7096 | > +----------------+------------------+----------+----------+----------+-----------+--------+ > Your are collecting data from the compute RP 519c1e10-3546-4e3b-a017-3e831376cde8 but placement warns about another compute RP c4199e84-8259-4d0e-9361-9b0d9e6e66b7. > > Now I can spin up vms without issues. > > I have 1 test AZ with 2 hosts inside. I have set these hosts to the > ratio above. I was able to spin up approx 45 4x core VMs without > issues and no signs of it hitting an upper limit on the host. > > 120 | VCPU=64 | 519c1e10-3546-4e3b-a017-3e831376cde8 | > VCPU=88/65536 > 23 | VCPU=64 | 8f97a3ba-98a0-475e-a3cf-41425569b2cb | VCPU=96/65536 > > > I have 2 problems with this fix. > > 1) the overcommit is now super high and I have no way, besides > quotas, to guarantee the system won’t be over provisioned. > 2) I still don’t know how that “used” resources value is being > calculated. When this issue first started, the “used” resources > were a different number. Over the past two days, the used resources > for a 4 core virtual machine have remained at 8206 but I have no way > to guarantee this. > > My initial tests when this started was to compare the resource values > when building different size vms. Here is that list: > > 1 core - 4107 > 2 core - 4108 > 4 core- 4110 > 8 core - 4114 > 16 core - 4122 > 32 core - 8234 > > The number on the right is the number the “used” value used to > be. Yesterday and today, it has changed to 8206 for a 4 core vm, I > have not tested the rest. > > Before I commit to combing through the placement api source code to > figure out how the “used” value in the placement log is being > calculated, im hoping someone knows where and how that value is being > calculated. It does not seem to be a fixed value in the database and > it doesn’t seem to be effected by the allocation ratios. > > > Thank you in advance!! > -Jeff Mazzone > Senior Linux Systems Administrator > Center for Translational Data Science > University of Chicago. > > > >> On Jun 30, 2021, at 2:40 PM, Laurent Dumont >> wrote: >> >> In some cases, the DEBUG messages are a bit verbose but can really >> walk you through the allocation/scheduling process. You could >> increase it for nova and restart the api + scheduler on the >> controllers. I wonder if a desync of the DB could be in cause? Are >> you running an HA deployment for the mysql backend? 
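For reference, the check behind those "Over capacity" warnings boils down to the following (reserved is 0 for VCPU in the inventories shown above, so the numbers line up with the log lines):

  capacity = (total - reserved) * allocation_ratio
             e.g. (64 - 0) * 16.0 = 1024.0, or (64 - 0) * 1.0 = 64.0
  a request is refused when  used + needed > capacity
             e.g. 8206 + 4 > 64.0  ->  "Over capacity for VCPU"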
> From skaplons at redhat.com Thu Jul 1 07:36:33 2021 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 01 Jul 2021 09:36:33 +0200 Subject: [neutron] Drivers meeting agenda for 2.07.2021 Message-ID: <1732131.yON5WWAbQT@p1> Hi, We have 2 RFEs to discuss on tomorrow's meeting: https://bugs.launchpad.net/neutron/+bug/1933517 - [RFE][OVN] Create an intermediate OVS bridge between VM and intergration bridge to improve the live-migration process https://bugs.launchpad.net/neutron/+bug/1930866 - [RFE] preventing from deleting a port used by an instance (locked instance can be rendered broken by deleting port) We also have 2 not yet triaged RFEs: https://bugs.launchpad.net/neutron/+bug/1932154 https://bugs.launchpad.net/neutron/+bug/1933222 Please check them, ask questions/comments/whatever You need to triage them so we can discuss on next drivers meetings :) -- Slawek Kaplonski Principal Software Engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: This is a digitally signed message part. URL: From balazs.gibizer at est.tech Thu Jul 1 07:41:59 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Thu, 01 Jul 2021 09:41:59 +0200 Subject: [nova][placement] Openstack only building one VM per machine in cluster, then runs out of resources In-Reply-To: <47839686-4CAC-4854-8B37-1DB9E458BA6E@uchicago.edu> References: <7329E762-12FA-4F97-B85F-2D36E1E256C6@uchicago.edu> <6717C9A7-FBBD-4560-A636-6842A0B3C8EC@uchicago.edu> <24CEA8FD-BCDD-4953-8E55-01A4717F68EC@uchicago.edu> <47839686-4CAC-4854-8B37-1DB9E458BA6E@uchicago.edu> Message-ID: On Thu, Jul 1, 2021 at 01:13, Jeffrey Mazzone wrote: >> On Jun 30, 2021, at 5:06 PM, melanie witt wrote: >> > I suggest you run the 'openstack resource provider show > --allocations' command as Balazs mentioned earlier to show all of the > allocations (used resources) on the compute node. I also suggest you > run the 'nova-manage placement audit' tool [1] as Sylvain mentioned > earlier to show whether there are any orphaned allocations, i.e. > allocations that are for instances that no longer exist. The consumer > UUID is the instance UUID. > > I did both of those suggestions. "openstack resource provider show > —allocations" shows what is expected. No additional > orphaned vms and the resources used is correct. Here is an example of > a different set of hosts and zones. This host had 2x 16 core vms on > it before the cluster went into this state. You can see them both > below. The nova-manage audit commands do not show any orphans either. 
> > ~# openstack resource provider show > 41ecee2a-ec24-48e5-8b9d-24065d67238a --allocations > +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | Field | Value > > > > | > +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | uuid | 41ecee2a-ec24-48e5-8b9d-24065d67238a > > > > | > | name | kh09-56 > > > > | > | generation | 55 > > > > | > | root_provider_uuid | 41ecee2a-ec24-48e5-8b9d-24065d67238a > > > > | > | parent_provider_uuid | None > > > > | > | allocations | {'d6b9d19c-1ba9-44c2-97ab-90098509b872': > {'resources': {'DISK_GB': 50, 'MEMORY_MB': 16384, 'VCPU': 16}, > 'consumer_generation': 1}, 'e0a8401a-0bb6-4612-a496-6a794ebe6cd0': > {'resources': {'DISK_GB': 50, 'MEMORY_MB': 16384, 'VCPU': 16}, > 'consumer_generation': 1}} | > +----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > > Usage on the resource provider: > ~# openstack resource provider usage show > 41ecee2a-ec24-48e5-8b9d-24065d67238a > +----------------+-------+ > | resource_class | usage | > +----------------+-------+ > | VCPU | 32 | > | MEMORY_MB | 32768 | > | DISK_GB | 100 | > +----------------+-------+ > > All of that looks correct. Requesting it to check allocations for a 4 > VCPU vm also shows it as a candidate: > ~# openstack allocation candidate list --resource VCPU=4 | grep > 41ecee2a-ec24-48e5-8b9d-24065d67238a > | 41 | VCPU=4 | 41ecee2a-ec24-48e5-8b9d-24065d67238a | > VCPU=32/1024,MEMORY_MB=32768/772714,DISK_GB=100/7096 > > In the placement database, under the used column, also shows the > correct values for the information provided above with those 2 vms on > it: > +---------------------+------------+-------+----------------------+--------------------------------------+-------------------+-------+ > | created_at | updated_at | id | resource_provider_id | > consumer_id | resource_class_id | used | > +---------------------+------------+-------+----------------------+--------------------------------------+-------------------+-------+ > | 2021-06-02 18:45:05 | NULL | 4060 | 125 | > e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 2 | 50 | > | 2021-06-02 18:45:05 | NULL | 4061 | 125 | > e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 1 | 16384 | > | 2021-06-02 18:45:05 | NULL | 4062 | 125 | > e0a8401a-0bb6-4612-a496-6a794ebe6cd0 | 0 | 16 | > | 2021-06-04 18:39:13 | NULL | 7654 | 125 | > d6b9d19c-1ba9-44c2-97ab-90098509b872 | 2 | 50 | > | 2021-06-04 18:39:13 | NULL | 7655 | 125 | > d6b9d19c-1ba9-44c2-97ab-90098509b872 | 1 | 16384 | > | 2021-06-04 18:39:13 | NULL | 7656 | 125 | > d6b9d19c-1ba9-44c2-97ab-90098509b872 | 0 | 16 | > > > Trying to build a vm though.. I get the placement error with the > improperly calculated “Used” values. 
> > 2021-06-30 19:51:39.732 43832 WARNING placement.objects.allocation > [req-de225c66-8297-4b34-9380-26cf9385d658 > a770bde56c9d49e68facb792cf69088c 6da06417e0004cbb87c1e64fe1978de5 - > default default] Over capacity for VCPU on resource provider > b749130c-a368-4332-8a1f-8411851b4b2a. Needed: 4, Used: 18509, > Capacity: 1024.0 > Again you confirmed that the compute RP 41ecee2a-ec24-48e5-8b9d-24065d67238a has a consistent resource view but placement warns about another compute b749130c-a368-4332-8a1f-8411851b4b2a. Could you try to trace through one single situation? Try to boot a VM that results in the error with the placement over capacity warning. Then collect the resource view of the compute RP the placement warning points at. If the result of such tracing is not showing the reason then you can dig the placement code. The placement warning comes from https://github.com/openstack/placement/blob/f77a7f9928d1156450c48045c48597b2feec9cc1/placement/objects/allocation.py#L228 top of that function there is an SQL command you can try to apply to your DB and the resource provider placement warns about to see where the used value are coming from. Cheers, gibi > Outside of changing the allocation ratio, im completely lost. Im > confident it has to do with that improper calculation of the used > value but how is it being calculated if it isn’t being added up > from fixed values in the database as has been suggested? > > Thanks in advance! > -Jeff M > > > >> >> >> The tl;dr on how the value is calculated is there's a table called >> 'allocations' in the placement database that holds all the values >> for resource providers and resource classes and it has a 'used' >> column. If you add up all of the 'used' values for a resource class >> (VCPU) and resource provider (compute node) then that will be the >> total used of that resource on that resource provider. You can see >> this data by 'openstack resource provider show >> --allocations' as well. >> >> The allocation ratio will not affect the value of 'used' but it will >> affect the working value of 'total' to be considered higher than it >> actually is in order to oversubscribe. If a compute node has 64 >> cores and cpu_allocation ratio is 16 then 64 * 16 = 1024 cores will >> be allowed for placement on that compute node. >> >> You likely have "orphaned" allocations for the compute node/resource >> provider that are not mapped to instances any more and you can use >> 'nova-manage placement audit' to find those and optionally delete >> them. Doing that will cleanup your resource provider. First, I would >> run it without specifying --delete just to see what it shows without >> modifying anything. > From tonyppe at gmail.com Thu Jul 1 07:59:49 2021 From: tonyppe at gmail.com (Tony Pearce) Date: Thu, 1 Jul 2021 15:59:49 +0800 Subject: [kayobe][kolla-ansible][victoria] Message-ID: How do I configure the glance container to use a host-mounted NFS share? It is not covered in the kolla-ansible documentation [1]. What it does say is to set the file path using "glance_file_datadir_volume". I believe that this is relative to the container (not the host). Do I need to mount the local host (nfs) directory as a docker volume for glance as additional step? If so, how do I achieve this with kayobe / kolla-ansible? Some background: After a successful deployment of 3 controller nodes I have one glance service which is dependent on one of these controller nodes being available always. My intent is to provide a NFS share for glance that removes the dependency. 
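A minimal sketch of what I have in mind on each controller, assuming an export at nfs-server:/glance (server name and paths are illustrative):

  # mount the share on the host, e.g. via /etc/fstab, before the
  # glance_api container is deployed
  mount -t nfs nfs-server:/glance /var/lib/glance

  # then point the glance bind mount at that host path in globals.yml
  glance_file_datadir_volume: "/var/lib/glance"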
[1] OpenStack Docs: Glance - Image service Kind regards, Tony Pearce -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Thu Jul 1 08:26:31 2021 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 1 Jul 2021 09:26:31 +0100 Subject: [kayobe][kolla-ansible][victoria] In-Reply-To: References: Message-ID: On Thu, 1 Jul 2021 at 09:00, Tony Pearce wrote: > > How do I configure the glance container to use a host-mounted NFS share? It is not covered in the kolla-ansible documentation [1]. What it does say is to set the file path using "glance_file_datadir_volume". I believe that this is relative to the container (not the host). It's relative to the host, used in the host part of a bind mount for the glance-api container. > > Do I need to mount the local host (nfs) directory as a docker volume for glance as additional step? If so, how do I achieve this with kayobe / kolla-ansible? Yes. You could use a kayobe custom playbook https://docs.openstack.org/kayobe/latest/custom-ansible-playbooks.html > > Some background: > After a successful deployment of 3 controller nodes I have one glance service which is dependent on one of these controller nodes being available always. My intent is to provide a NFS share for glance that removes the dependency. The most common approach is to use Ceph for glance & cinder, but this adds significant complexity. NFS is probably a reasonable second best. > > [1] OpenStack Docs: Glance - Image service > > Kind regards, > > Tony Pearce > From mark at stackhpc.com Thu Jul 1 08:31:10 2021 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 1 Jul 2021 09:31:10 +0100 Subject: [kayobe][victoria] no module named docker - deploy fail after deploy successful In-Reply-To: References: Message-ID: On Tue, 29 Jun 2021 at 12:03, Tony Pearce wrote: > > I had a successful deployment of Openstack Victoria via Kayobe with an all-in-one node running controller and compute roles. I wanted to then add 2 controller nodes to make 3 controllers and one compute. The 2 additional controllers have a different interface naming so I needed to modify the inventory. I checked this ansible documentation to figure out the changes I’d need to make [1]. The first try, I misunderstood the layout because kayobe tried to configure the new nodes interfaces with incorrect naming. > > In my first try I tried this inventory layout: > > > Move existing configuration: > > > [kayobe config] / Inventory / * > > To > > [kayobe config] / Inventory / Ciscohosts / > > > Create new from copy of existing:[kayobe config] / Inventory / Ciscohosts / > > > [kayobe config] / Inventory / Otherhosts / > > > The Otherhosts had its own host file with the 2 controllers and group_vars/controllers network interface configuration as per these two hosts. 
But anyway it didnt achieve the desired result so I rechecked the ansible doc [1] and decided to do this another way as follows: > > > In my second try I first reversed the inventory change: > > > Delete the new dir: [kayobe config] / Inventory / Otherhosts / > > Move back the config: [kayobe config] / Inventory / Ciscohosts / * > [kayobe config] / Inventory / > > Delete the empty dir: [kayobe config] / Inventory / Ciscohosts / > > > Then create host_vars for the two individual hosts: > > [kayobe config] / Inventory / host_vars / cnode2 > > [kayobe config] / Inventory / host_vars / cnode3 > > > And updated the single hosts inventory file [kayobe config] / Inventory / hosts > > > This seemed to work fine, the “kayobe overcloud host configure” was successful and the hosts interfaces were set up as I desired. > > > The issue came when doing the “kayobe overcloud service deploy” and failed with "/usr/bin/python3", "-c", "import docker” = ModuleNotFoundError: No module named 'docker' for all three nodes, where previously it (the deployment) had been successful for the all-in-one node. I do not know if this task had run or skipped before but the task is run against "baremetal" group and controllers and compute are in this group so I assume that it had been ran successfully in previous deployments and this is the weird thing because no other changes have been made apart from as described here. Perhaps you somehow lost this file: https://opendev.org/openstack/kayobe-config/src/branch/master/etc/kayobe/inventory/group_vars/overcloud/ansible-python-interpreter > > Verbose error output: [3] > > > After the above error, I reverted the inventory back to the “working” state, which is basically to update the inventory hosts and removed the 2 controllers. As well as remove the whole host_vars directory. After doing this however, the same error is still seen /usr/bin/python3", "-c", "import docker” = ModuleNotFoundError. > > > I logged into the host and tried to run this manually on the CLI and I see the same output. What I don’t understand is why this error is occurring now after previous successful deployments. > > > To try and resolve /workaround this issue I have tried to no avail: > - recreating virtual environments on all-in-one node > > - recreating virtual environments on ACH > > - deleting the [kolla config] directory > > - deleting .ansible and /tmp/ caches > > - turning off pipelining > > > After doing the above I needed to do the control host bootstrap and host configure before service deploy however the same error persisted and I could not work around it with any of the above steps being performed. > > > As a test, I decided to turn off this task in the playbook [4] and the yml file runs as follows: [2]. This results in a (maybe pseudo) successful deployment again, in a sense that it deploys without failure because that task does not run. > > After this was successful in deploying once again as it had previously had been, I added the two controller nodes using the “host_vars” and then I was able to successfully deploy again with HA controllers. Well, it is successful apart from Designate issue due to Designate already having the config [5]. I can log in to the horizon dashboard and under system information I can see all three controllers there. > > > Could I ask the community for help with: > > Regarding the kayobe inventory, is anything wrong with the 2nd attempt in line with Kayobe? 
> > Has anyone come across this docker issue (or similar within this context of failing after being successful) and can suggest? > > > I repeatedly get these odd issues where successful deployments then fail in the future. This often occurs after making a config change and then rolling back but the roll back does not return to a working deployment state. The fix/workaround for me in these cases is to “kayobe overcloud service destroy --yes-i-really-really-mean-it” and also re-deploy the host. > > > [1] Best Practices — Ansible Documentation > > [2] modified Checking docker SDK version# command: "{{ ansible_python.execut - Pastebin.com > > [3] TASK [prechecks : Checking docker SDK version] ********************************* - Pastebin.com > > [4] /home/cv-user/kayobe-victoria/venvs/kolla-ansible/share/kolla-ansible/ansible/roles/prechecks/tasks/package_checks.yml > > [5] TASK [designate : Update DNS pools] ******************************************** - Pastebin.com > > > Kind regards, > > From tonyppe at gmail.com Thu Jul 1 09:44:29 2021 From: tonyppe at gmail.com (Tony Pearce) Date: Thu, 1 Jul 2021 17:44:29 +0800 Subject: [kayobe][kolla-ansible][victoria] In-Reply-To: References: Message-ID: Hi Mark, first thank you for replying so quickly to help me. >It's relative to the host, used in the host part of a bind mount for the glance-api container. This is great and that was the missing key info I needed. Have a great day! Kind regards, Tony Pearce On Thu, 1 Jul 2021 at 16:26, Mark Goddard wrote: > On Thu, 1 Jul 2021 at 09:00, Tony Pearce wrote: > > > > How do I configure the glance container to use a host-mounted NFS share? > It is not covered in the kolla-ansible documentation [1]. What it does say > is to set the file path using "glance_file_datadir_volume". I believe that > this is relative to the container (not the host). > > It's relative to the host, used in the host part of a bind mount for > the glance-api container. > > > > > Do I need to mount the local host (nfs) directory as a docker volume for > glance as additional step? If so, how do I achieve this with kayobe / > kolla-ansible? > > Yes. You could use a kayobe custom playbook > https://docs.openstack.org/kayobe/latest/custom-ansible-playbooks.html > > > > Some background: > > After a successful deployment of 3 controller nodes I have one glance > service which is dependent on one of these controller nodes being available > always. My intent is to provide a NFS share for glance that removes the > dependency. > > The most common approach is to use Ceph for glance & cinder, but this > adds significant complexity. NFS is probably a reasonable second best. > > > > > [1] OpenStack Docs: Glance - Image service > > > > Kind regards, > > > > Tony Pearce > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyppe at gmail.com Thu Jul 1 09:48:29 2021 From: tonyppe at gmail.com (Tony Pearce) Date: Thu, 1 Jul 2021 17:48:29 +0800 Subject: [kayobe][victoria] no module named docker - deploy fail after deploy successful In-Reply-To: References: Message-ID: > Perhaps you somehow lost this file: I thought that too, but I did not change those files at all. But also I duplicated the value from that file and inserted it (pasted the values) above the network config for the individual nodes. It did not have any change on the outcome at all, so I removed that. I was able to fix this issue by deleting the nodes and redeploying from the same ACH, in the end. 
Tony Pearce On Thu, 1 Jul 2021 at 16:31, Mark Goddard wrote: > On Tue, 29 Jun 2021 at 12:03, Tony Pearce wrote: > > > > I had a successful deployment of Openstack Victoria via Kayobe with an > all-in-one node running controller and compute roles. I wanted to then add > 2 controller nodes to make 3 controllers and one compute. The 2 additional > controllers have a different interface naming so I needed to modify the > inventory. I checked this ansible documentation to figure out the changes > I’d need to make [1]. The first try, I misunderstood the layout because > kayobe tried to configure the new nodes interfaces with incorrect naming. > > > > In my first try I tried this inventory layout: > > > > > > Move existing configuration: > > > > > > [kayobe config] / Inventory / * > > > > To > > > > [kayobe config] / Inventory / Ciscohosts / > > > > > > Create new from copy of existing:[kayobe config] / Inventory / > Ciscohosts / > > > > > [kayobe config] / Inventory / Otherhosts / > > > > > > The Otherhosts had its own host file with the 2 controllers and > group_vars/controllers network interface configuration as per these two > hosts. But anyway it didnt achieve the desired result so I rechecked the > ansible doc [1] and decided to do this another way as follows: > > > > > > In my second try I first reversed the inventory change: > > > > > > Delete the new dir: [kayobe config] / Inventory / Otherhosts / > > > > Move back the config: [kayobe config] / Inventory / Ciscohosts / * > > [kayobe config] / Inventory / > > > > Delete the empty dir: [kayobe config] / Inventory / Ciscohosts / > > > > > > Then create host_vars for the two individual hosts: > > > > [kayobe config] / Inventory / host_vars / cnode2 > > > > [kayobe config] / Inventory / host_vars / cnode3 > > > > > > And updated the single hosts inventory file [kayobe config] / Inventory > / hosts > > > > > > This seemed to work fine, the “kayobe overcloud host configure” was > successful and the hosts interfaces were set up as I desired. > > > > > > The issue came when doing the “kayobe overcloud service deploy” and > failed with "/usr/bin/python3", "-c", "import docker” = > ModuleNotFoundError: No module named 'docker' for all three nodes, where > previously it (the deployment) had been successful for the all-in-one node. > I do not know if this task had run or skipped before but the task is run > against "baremetal" group and controllers and compute are in this group so > I assume that it had been ran successfully in previous deployments and this > is the weird thing because no other changes have been made apart from as > described here. > > Perhaps you somehow lost this file: > > https://opendev.org/openstack/kayobe-config/src/branch/master/etc/kayobe/inventory/group_vars/overcloud/ansible-python-interpreter > > > > > Verbose error output: [3] > > > > > > After the above error, I reverted the inventory back to the “working” > state, which is basically to update the inventory hosts and removed the 2 > controllers. As well as remove the whole host_vars directory. After doing > this however, the same error is still seen /usr/bin/python3", "-c", "import > docker” = ModuleNotFoundError. > > > > > > I logged into the host and tried to run this manually on the CLI and I > see the same output. What I don’t understand is why this error is occurring > now after previous successful deployments. 
> > > > > > To try and resolve /workaround this issue I have tried to no avail: > > - recreating virtual environments on all-in-one node > > > > - recreating virtual environments on ACH > > > > - deleting the [kolla config] directory > > > > - deleting .ansible and /tmp/ caches > > > > - turning off pipelining > > > > > > After doing the above I needed to do the control host bootstrap and host > configure before service deploy however the same error persisted and I > could not work around it with any of the above steps being performed. > > > > > > As a test, I decided to turn off this task in the playbook [4] and the > yml file runs as follows: [2]. This results in a (maybe pseudo) successful > deployment again, in a sense that it deploys without failure because that > task does not run. > > > > After this was successful in deploying once again as it had previously > had been, I added the two controller nodes using the “host_vars” and then I > was able to successfully deploy again with HA controllers. Well, it is > successful apart from Designate issue due to Designate already having the > config [5]. I can log in to the horizon dashboard and under system > information I can see all three controllers there. > > > > > > Could I ask the community for help with: > > > > Regarding the kayobe inventory, is anything wrong with the 2nd attempt > in line with Kayobe? > > > > Has anyone come across this docker issue (or similar within this context > of failing after being successful) and can suggest? > > > > > > I repeatedly get these odd issues where successful deployments then fail > in the future. This often occurs after making a config change and then > rolling back but the roll back does not return to a working deployment > state. The fix/workaround for me in these cases is to “kayobe overcloud > service destroy --yes-i-really-really-mean-it” and also re-deploy the host. > > > > > > [1] Best Practices — Ansible Documentation > > > > [2] modified Checking docker SDK version# command: "{{ > ansible_python.execut - Pastebin.com > > > > [3] TASK [prechecks : Checking docker SDK version] > ********************************* - Pastebin.com > > > > [4] > /home/cv-user/kayobe-victoria/venvs/kolla-ansible/share/kolla-ansible/ansible/roles/prechecks/tasks/package_checks.yml > > > > [5] TASK [designate : Update DNS pools] > ******************************************** - Pastebin.com > > > > > > Kind regards, > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From malikobaidadil at gmail.com Thu Jul 1 10:51:58 2021 From: malikobaidadil at gmail.com (Malik Obaid) Date: Thu, 1 Jul 2021 15:51:58 +0500 Subject: [wallaby][neutron][ovn] Multiple VLAN ranges and FLAT network Message-ID: Hi, I am using Openstack Wallaby release on Ubuntu 20.04. I am configuring openstack neutron for production and just want to know is there a way to specify different vlan ranges with multiple physical networks. I would really appreciate any input in this regard. Thank you. Regards, Malik Obaid -------------- next part -------------- An HTML attachment was scrubbed... URL: From ebiibe82 at gmail.com Thu Jul 1 11:40:10 2021 From: ebiibe82 at gmail.com (Amit Mahajan) Date: Thu, 1 Jul 2021 17:10:10 +0530 Subject: [Kolla-Ansible] Regarding stability of Ironic deployment with Kolla-Ansible In-Reply-To: References: Message-ID: Thanks Mark! On Wed, Jun 30, 2021 at 10:05 PM Mark Goddard wrote: > On Wed, 30 Jun 2021 at 11:46, Amit Mahajan wrote: > > > > Thanks, much appreciate your responses. 
Definitely, based on our > requirements, we will be looking more into the support for various feature > sets, community support etc. for various tools before trying anything. > > Hi Amit, > > Some Kayobe resources: > > Openinfra london meetup session: > https://www.youtube.com/watch?v=0liqSO0SZ60&t=4842s > Slides for the above: > > https://docs.google.com/presentation/d/1wFittxJwRBH6IyCr4Ext2ZdBuNr42xfpanFdE2drP5Y/edit?usp=sharing > A universe from nothing (hands on workshop): > https://github.com/stackhpc/a-universe-from-nothing/ > > Hope that is enough to get you started. > > Mark > > > > > On Wed, Jun 30, 2021 at 3:57 PM Radosław Piliszek < > radoslaw.piliszek at gmail.com> wrote: > >> > >> On Wed, Jun 30, 2021 at 12:03 PM Amit Mahajan > wrote: > >> > > >> > Thanks, it's reassuring that Ironic support is there and it works > well. > >> > > >> > Could you or anyone else in the community point me to some link where > I can understand the benefit of using Kayobe over Kolla-Ansible. I came > across this link ( > https://www.slideshare.net/MarkGoddard2/to-kayobe-or-not-to-kayobe), but > slides are not available at it. > >> > >> Oh, Mark Goddard, our Project Team Lead (PTL), CC'ed, can likely help > >> with that content. > >> Sad that someone removed it from slideshare... > >> > >> Personally, I am not using Kayobe because I have my own approach to > >> baremetal provisioning and management and did not need it (this may > >> change, who knows). > >> If you have your own way too, then you might be better off using Kolla > >> Ansible directly (you will still get a lot). > >> If you need to come up with "the way" - then starting with Kayobe > >> might be more appropriate. > >> The above logic is obviously oversimplified as you would likely need > >> to evaluate Kayobe if it fits your needs as well. > >> Kayobe docs might help too: https://docs.openstack.org/kayobe/latest/ > >> Also, feel free to use this mailing list for related questions and/or > >> join us on OFTC IRC #openstack-kolla > >> > >> -yoctozepto > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amotoki at gmail.com Thu Jul 1 12:43:04 2021 From: amotoki at gmail.com (Akihiro Motoki) Date: Thu, 1 Jul 2021 21:43:04 +0900 Subject: [release][tc][neutron] how to EOL unofficial branches under offical repositories Message-ID: Hi, I have a question about how to mark stable branches as EOL when such branches were not part of official releases at the moment of corresponding releases but the repositories which contain them became official after the release. One example happens in a repository under the neutron team. networking-odl is now part of the neutron governance but as of Ocata release it was not official. Ocata branches under the neutron stadium were marked as EOL recently, but networking-odl was not marked as EOL as it was not an official repository as of Ocata. The openstack/releases repository handles official branches under a specific release, but all branches in official repositories are under control of the openstack/releases repo. I wonder how we can mark stable/ocata branch of networking-odl repo as EOL and delete stable/ocata branch. What is the recommended way to mark such branches described above as EOL? 
Thanks in advance, Akihiro Motoki (irc: amotoki) From hberaud at redhat.com Thu Jul 1 12:59:32 2021 From: hberaud at redhat.com (Herve Beraud) Date: Thu, 1 Jul 2021 14:59:32 +0200 Subject: [release][tc][neutron] how to EOL unofficial branches under offical repositories In-Reply-To: References: Message-ID: Hello, I think that you are more or less in the same situation as with `puppet-openstack-integration`. Please have a look to this patch https://review.opendev.org/c/openstack/releases/+/798282 I think that we could do something similar and then remove the stale branches. Let me know what you think about this. Le jeu. 1 juil. 2021 à 14:45, Akihiro Motoki a écrit : > Hi, > > I have a question about how to mark stable branches as EOL when such > branches > were not part of official releases at the moment of corresponding > releases but the > repositories which contain them became official after the release. > > One example happens in a repository under the neutron team. > networking-odl is now part of the neutron governance but as of Ocata > release it was not official. > Ocata branches under the neutron stadium were marked as EOL recently, > but networking-odl > was not marked as EOL as it was not an official repository as of Ocata. > > The openstack/releases repository handles official branches under a > specific release, > but all branches in official repositories are under control of the > openstack/releases repo. > > I wonder how we can mark stable/ocata branch of networking-odl repo as > EOL and delete stable/ocata branch. > What is the recommended way to mark such branches described above as EOL? > > Thanks in advance, > Akihiro Motoki (irc: amotoki) > > -- Hervé Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ https://twitter.com/4383hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Thu Jul 1 13:14:58 2021 From: elod.illes at est.tech (=?UTF-8?B?RWzFkWQgSWxsw6lz?=) Date: Thu, 1 Jul 2021 15:14:58 +0200 Subject: [release][tc][neutron] how to EOL unofficial branches under offical repositories In-Reply-To: References: Message-ID: <3fbabbc1-7e0e-609c-04cd-f3eced18662c@est.tech> Yes, what Hervé wrote is a good approach. (Actually, the same issue was already discussed in another mail thread [1]). I also talked about this with Lajos Katona and he is planning to prepare the EOL process and the release patches as far as I know. Thanks, Előd [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-June/thread.html#23347 On 2021. 07. 01. 14:59, Herve Beraud wrote: > Hello, > > I think that you are more or less in the same situation as with > `puppet-openstack-integration`. 
> > Please have a look to this patch > https://review.opendev.org/c/openstack/releases/+/798282 > > > I think that we could do something similar and then remove the stale > branches. > > Let me know what you think about this. > > Le jeu. 1 juil. 2021 à 14:45, Akihiro Motoki > a écrit : > > Hi, > > I have a question about how to mark stable branches as EOL when > such branches > were not part of official releases at the moment of corresponding > releases but the > repositories which contain them became official after the release. > > One example happens in a repository under the neutron team. > networking-odl is now part of the neutron governance but as of Ocata > release it was not official. > Ocata branches under the neutron stadium were marked as EOL recently, > but networking-odl > was not marked as EOL as it was not an official repository as of > Ocata. > > The openstack/releases repository handles official branches under a > specific release, > but all branches in official repositories are under control of the > openstack/releases repo. > > I wonder how we can mark stable/ocata branch of networking-odl repo as > EOL and delete stable/ocata branch. > What is the recommended way to mark such branches described above > as EOL? > > Thanks in advance, > Akihiro Motoki (irc: amotoki) > > > > -- > Hervé Beraud > Senior Software Engineer at Red Hat > irc: hberaud > https://github.com/4383/ > https://twitter.com/4383hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amotoki at gmail.com Thu Jul 1 15:41:12 2021 From: amotoki at gmail.com (Akihiro Motoki) Date: Fri, 2 Jul 2021 00:41:12 +0900 Subject: [release][tc][neutron] how to EOL unofficial branches under offical repositories In-Reply-To: <3fbabbc1-7e0e-609c-04cd-f3eced18662c@est.tech> References: <3fbabbc1-7e0e-609c-04cd-f3eced18662c@est.tech> Message-ID: Thanks Herve and Előd, The approach suggested totally makes sense. Also good to hear that Lajos is always preparing it :) Side note: When I am trying to address zuul configuration errors, I noticed the only way to address it in stable/ocata in networking-odl is to EOL the branch as we already EOL'ed the neutron stable/ocata branch and the jobs always fail. Thanks, Akihiro On Thu, Jul 1, 2021 at 10:17 PM Előd Illés wrote: > > Yes, what Hervé wrote is a good approach. (Actually, the same issue was already discussed in another mail thread [1]). I also talked about this with Lajos Katona and he is planning to prepare the EOL process and the release patches as far as I know. > > Thanks, > > Előd > > [1] http://lists.openstack.org/pipermail/openstack-discuss/2021-June/thread.html#23347 > > > > On 2021. 07. 01. 
14:59, Herve Beraud wrote: > > Hello, > > I think that you are more or less in the same situation as with `puppet-openstack-integration`. > > Please have a look to this patch https://review.opendev.org/c/openstack/releases/+/798282 > > I think that we could do something similar and then remove the stale branches. > > Let me know what you think about this. > > Le jeu. 1 juil. 2021 à 14:45, Akihiro Motoki a écrit : >> >> Hi, >> >> I have a question about how to mark stable branches as EOL when such branches >> were not part of official releases at the moment of corresponding >> releases but the >> repositories which contain them became official after the release. >> >> One example happens in a repository under the neutron team. >> networking-odl is now part of the neutron governance but as of Ocata >> release it was not official. >> Ocata branches under the neutron stadium were marked as EOL recently, >> but networking-odl >> was not marked as EOL as it was not an official repository as of Ocata. >> >> The openstack/releases repository handles official branches under a >> specific release, >> but all branches in official repositories are under control of the >> openstack/releases repo. >> >> I wonder how we can mark stable/ocata branch of networking-odl repo as >> EOL and delete stable/ocata branch. >> What is the recommended way to mark such branches described above as EOL? >> >> Thanks in advance, >> Akihiro Motoki (irc: amotoki) >> > > > -- > Hervé Beraud > Senior Software Engineer at Red Hat > irc: hberaud > https://github.com/4383/ > https://twitter.com/4383hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > From billy.olsen at canonical.com Thu Jul 1 20:26:10 2021 From: billy.olsen at canonical.com (Billy Olsen) Date: Thu, 1 Jul 2021 13:26:10 -0700 Subject: [charms] Trilio Charms 21.06 release is now available Message-ID: The 21.06 release of the Trilio charms are now available. Please see the Release Notes for full details: https://docs.openstack.org/charm-guide/latest/2106_Trilio.html == Highlights == * TrilioVault 4.1 HF1 The charms have added options to support new configurations presented in the TrilioVault 4.1 HF1 release. * S3 Support The charms now support using S3 storage as a backup target. == Thank you == Lots of thanks to the following people who contributed to this release via code changes, documentation updates and testing efforts. 
Alex Kavanagh Billy Olsen Chris MacNaughton Corey Bryant David Ames Liam Young Marian Gasparovic Peter Matulis -- OpenStack Charms Team From salman10sheikh at gmail.com Thu Jul 1 05:55:26 2021 From: salman10sheikh at gmail.com (Salman Sheikh) Date: Thu, 1 Jul 2021 11:25:26 +0530 Subject: Extension of disk in cinder Message-ID: Hi all, How we can add the more disk to cinder, as I have created the cinder-volumes on one disk (/sdb) the sdb is 500Gb, if i want to increase the the size of cinder-volumes then how we extend the size of cinder-volumes. how we can add more disk in order to increase the size of cinder-volume. Kindly advise -------------- next part -------------- An HTML attachment was scrubbed... URL: From mahnoor.asghar at xflowresearch.com Thu Jul 1 07:57:52 2021 From: mahnoor.asghar at xflowresearch.com (Mahnoor Asghar) Date: Thu, 1 Jul 2021 12:57:52 +0500 Subject: [Ironic] Vendor-neutral Disk names In-Reply-To: References: Message-ID: Thank you for the response, Mike! I agree that 'ServiceLabel' is a good starting point, but it would be preferable to have a more consistent format that represents a Drive resource uniquely, irrespective of the vendor. The idea is to let Ironic name the Drive resources using this logic, so that the baremetal operator can use the same, consistent method of specifying disks for RAID configuration. Feedback from the Ironic community is very welcome here, so that an informed proposal can be made. On Wed, Jun 30, 2021 at 6:46 PM Mike Raineri wrote: > > Hi Mahnoor, > > First, to answer your questions about the property values: > - "ServiceLabel" is supposed to always be something human friendly and matches an indicator that is used to reference a part in an enclosure. Ultimately the vendor dictates the how they construct their stickers/etching/silk-screens/other labels, but I would expect something along the lines of "Drive Bay 3", or "Slot 9", or something similar. > - The resource type will always be "#Drive.v1_X_X.Drive"; this is required by Redfish for representing a standard "Drive" resource. > - "Id" as defined in the Redfish Specification is going to be a unique identifier in the collection; there are no rules with what goes into this property other than uniqueness in terms of other collection members. I've seen some implementations use a simple numeric index and others use something that looks more like a service label. However, I've also seen a few use either a GUID or other globally unique identifier, which might be unfriendly for a user to look at. > > With that said, when I originally authored that logic in Tacklebox, I fully expected to revisit that logic to fine tune it to ensure it reliably gives something human readable. I haven't gone through the exercise to refine it further, but my initial impression is I'll be looking at more properties specific to the Drive resource to help build the string. I think "ServiceLabel" is still a good starting point, but having the appropriate fallback logic in its absence would be very useful to avoid construction based on "Id". > > Thanks. > > -Mike > > On Wed, Jun 30, 2021 at 9:01 AM Mahnoor Asghar wrote: >> >> Dear all, >> >> There is [a proposal][1] in the metal3-io/baremetal-operator repository to extend the hardware RAID configuration to support ‘physical_disks’ and ‘controller’ fields in the 'target_raid_config' section. >> The user should be able to specify the disks for RAID configuration, in a vendor-agnostic way. (This requirement comes from the Airship project.) 
The names of the disks should be indicative of the physical location of the disks within the server. An algorithm to construct disk names is therefore needed, for this purpose. >> >> One possible algorithm was found in the [inventory module][2] of the [Redfish Tacklebox scripts][3]. >> To construct a disk name, it uses Redfish properties, specifically the ‘Drive’ resource> ‘Physical Location’ object> ‘Part Location’ property> ‘Service Label’ property. ([Link][4] to code) ([Link][5] to Redfish ‘Drive’ resource) >> If this property is empty, the resource type (String uptil the first dot encountered in the @odata.type field), and the ‘Id’ properties of the Drive resource are used to construct the disk name. ([Link][6] to code) >> For example, if the 'Drive'.'Physical Location'.'Part Location'.'Service Label' field is ‘Disk.Bay1.Slot0’, this is what the disk name will be. If this field is empty, and the resource name is ‘Drive’ and the resource ‘Id’ is ‘Disk.Bay1.Slot0’, the disk name will be: ‘Drive: Disk.Bay1.Slot0’. >> >> We would like to understand the values different vendors give in: >> - ‘Drive’ resource> ‘Physical Location’ object> ‘Part Location’ property> ‘Service Label’ property >> - The resource type for a Drive (@odata.type field) >> - The ‘Id’ property of the Drive resource >> Also, it would be helpful to understand the existing logic used by vendors to construct the disk names, including the (Redfish or other) properties used. >> >> This is so that a consensus can be reached for an algorithm to construct disk names. Any suggestions are welcome, thank you so much! >> >> [1]: https://github.com/metal3-io/metal3-docs/pull/148 >> [2]: https://github.com/DMTF/Redfish-Tacklebox/blob/master/redfish_utilities/inventory.py >> [3]: https://github.com/DMTF/Redfish-Tacklebox >> [4]: https://github.com/DMTF/Redfish-Tacklebox/blob/e429f70a79cfe288756618498ce485ab4be37131/redfish_utilities/inventory.py#L192 >> [5]: https://redfish.dmtf.org/schemas/v1/Drive.v1_12_1.json >> [6]: https://github.com/DMTF/Redfish-Tacklebox/blob/e429f70a79cfe288756618498ce485ab4be37131/redfish_utilities/inventory.py#L199 >> >> Best regards, >> Mahnoor Asghar >> Software Design Engineer >> xFlow Research Inc. >> mahnoor.asghar at xflowresearch.com >> www.xflowresearch.com -- Mahnoor Asghar Software Design Engineer xFlow Research Inc. mahnoor.asghar at xflowresearch.com www.xflowresearch.com From akanevsk at redhat.com Thu Jul 1 23:17:28 2021 From: akanevsk at redhat.com (Arkady Kanevsky) Date: Thu, 1 Jul 2021 18:17:28 -0500 Subject: [Ironic] Vendor-neutral Disk names In-Reply-To: References: Message-ID: Mahnoor, How would your proposed schema work for extended storage? Think of the SCSI connector to JBOD. Thanks, Arkady On Thu, Jul 1, 2021 at 5:29 PM Mahnoor Asghar < mahnoor.asghar at xflowresearch.com> wrote: > Thank you for the response, Mike! > > I agree that 'ServiceLabel' is a good starting point, but it would be > preferable to have a more consistent format that represents a Drive > resource uniquely, irrespective of the vendor. > The idea is to let Ironic name the Drive resources using this logic, > so that the baremetal operator can use the same, consistent method of > specifying disks for RAID configuration. Feedback from the Ironic > community is very welcome here, so that an informed proposal can be > made. 
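To collect the ServiceLabel, @odata.type and Id values this discussion hinges on, a plain Redfish call against the BMC is usually enough. A rough sketch - the BMC address, credentials and drive URL below are placeholders, and the path to the Drives collection differs between vendors:

  curl -sk -u admin:password \
    "https://bmc.example.com/redfish/v1/Systems/1/Storage/1/Drives/Disk.Bay1.Slot0" \
    | jq '{service_label: .PhysicalLocation.PartLocation.ServiceLabel,
           odata_type: ."@odata.type",
           id: .Id}'

Output from a few different vendors' systems would make it much easier to judge how consistently ServiceLabel is populated in practice.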
> > > On Wed, Jun 30, 2021 at 6:46 PM Mike Raineri wrote: > > > > Hi Mahnoor, > > > > First, to answer your questions about the property values: > > - "ServiceLabel" is supposed to always be something human friendly and > matches an indicator that is used to reference a part in an enclosure. > Ultimately the vendor dictates the how they construct their > stickers/etching/silk-screens/other labels, but I would expect something > along the lines of "Drive Bay 3", or "Slot 9", or something similar. > > - The resource type will always be "#Drive.v1_X_X.Drive"; this is > required by Redfish for representing a standard "Drive" resource. > > - "Id" as defined in the Redfish Specification is going to be a unique > identifier in the collection; there are no rules with what goes into this > property other than uniqueness in terms of other collection members. I've > seen some implementations use a simple numeric index and others use > something that looks more like a service label. However, I've also seen a > few use either a GUID or other globally unique identifier, which might be > unfriendly for a user to look at. > > > > With that said, when I originally authored that logic in Tacklebox, I > fully expected to revisit that logic to fine tune it to ensure it reliably > gives something human readable. I haven't gone through the exercise to > refine it further, but my initial impression is I'll be looking at more > properties specific to the Drive resource to help build the string. I think > "ServiceLabel" is still a good starting point, but having the appropriate > fallback logic in its absence would be very useful to avoid construction > based on "Id". > > > > Thanks. > > > > -Mike > > > > On Wed, Jun 30, 2021 at 9:01 AM Mahnoor Asghar < > mahnoor.asghar at xflowresearch.com> wrote: > >> > >> Dear all, > >> > >> There is [a proposal][1] in the metal3-io/baremetal-operator repository > to extend the hardware RAID configuration to support ‘physical_disks’ and > ‘controller’ fields in the 'target_raid_config' section. > >> The user should be able to specify the disks for RAID configuration, in > a vendor-agnostic way. (This requirement comes from the Airship project.) > The names of the disks should be indicative of the physical location of the > disks within the server. An algorithm to construct disk names is therefore > needed, for this purpose. > >> > >> One possible algorithm was found in the [inventory module][2] of the > [Redfish Tacklebox scripts][3]. > >> To construct a disk name, it uses Redfish properties, specifically the > ‘Drive’ resource> ‘Physical Location’ object> ‘Part Location’ property> > ‘Service Label’ property. ([Link][4] to code) ([Link][5] to Redfish ‘Drive’ > resource) > >> If this property is empty, the resource type (String uptil the first > dot encountered in the @odata.type field), and the ‘Id’ properties of the > Drive resource are used to construct the disk name. ([Link][6] to code) > >> For example, if the 'Drive'.'Physical Location'.'Part > Location'.'Service Label' field is ‘Disk.Bay1.Slot0’, this is what the disk > name will be. If this field is empty, and the resource name is ‘Drive’ and > the resource ‘Id’ is ‘Disk.Bay1.Slot0’, the disk name will be: ‘Drive: > Disk.Bay1.Slot0’. 
> >> > >> We would like to understand the values different vendors give in: > >> - ‘Drive’ resource> ‘Physical Location’ object> ‘Part Location’ > property> ‘Service Label’ property > >> - The resource type for a Drive (@odata.type field) > >> - The ‘Id’ property of the Drive resource > >> Also, it would be helpful to understand the existing logic used by > vendors to construct the disk names, including the (Redfish or other) > properties used. > >> > >> This is so that a consensus can be reached for an algorithm to > construct disk names. Any suggestions are welcome, thank you so much! > >> > >> [1]: https://github.com/metal3-io/metal3-docs/pull/148 > >> [2]: > https://github.com/DMTF/Redfish-Tacklebox/blob/master/redfish_utilities/inventory.py > >> [3]: https://github.com/DMTF/Redfish-Tacklebox > >> [4]: > https://github.com/DMTF/Redfish-Tacklebox/blob/e429f70a79cfe288756618498ce485ab4be37131/redfish_utilities/inventory.py#L192 > >> [5]: https://redfish.dmtf.org/schemas/v1/Drive.v1_12_1.json > >> [6]: > https://github.com/DMTF/Redfish-Tacklebox/blob/e429f70a79cfe288756618498ce485ab4be37131/redfish_utilities/inventory.py#L199 > >> > >> Best regards, > >> Mahnoor Asghar > >> Software Design Engineer > >> xFlow Research Inc. > >> mahnoor.asghar at xflowresearch.com > >> www.xflowresearch.com > > > > -- > Mahnoor Asghar > Software Design Engineer > xFlow Research Inc. > mahnoor.asghar at xflowresearch.com > www.xflowresearch.com > > -- Arkady Kanevsky, Ph.D. Phone: 972 707-6456 Corporate Phone: 919 729-5744 ext. 8176456 -------------- next part -------------- An HTML attachment was scrubbed... URL: From manu.b at est.tech Fri Jul 2 06:24:43 2021 From: manu.b at est.tech (Manu B) Date: Fri, 2 Jul 2021 06:24:43 +0000 Subject: [neutron] IPv6 advertisement support for BGP neutron-dynamic-routing Message-ID: Hi everyone, This is a doubt regarding existing IPv6 advertisement support for BGP in the current n-d-r repo. >From the documentation, it is mentioned that to enable advertising IPv6 prefixes, create an address scope with ip_version=6 and a BGP speaker with ip_version=6. We have a use case where v4 and v6 routes must be advertised using a single v4 peer. 1. How can we advertise both v4 and v6 routes using single v4 peer? Do we have to create 2 BGP speakers ( one for v4 and another for v6) Then create a single BGP v4 peer and add the same peer to both the speakers? Is this the correct approach or any other way? 1. Is there any other significance of version field in current BGP speaker model? Thanks, Manu -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Fri Jul 2 08:30:16 2021 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 2 Jul 2021 10:30:16 +0200 Subject: Extension of disk in cinder In-Reply-To: References: Message-ID: <20210702083016.tiqv7mfixazpjalx@localhost> On 01/07, Salman Sheikh wrote: > Hi all, > > How we can add the more disk to cinder, as I have created the > cinder-volumes on one disk (/sdb) the sdb is 500Gb, if i want to increase > the the size of cinder-volumes then how we extend the size of > cinder-volumes. how we can add more disk in order to increase the size of > cinder-volume. Kindly advise Hi Salman, You don't mention the cinder driver or anything, so I'm going to assume you are using the LVM driver with default's cinder-volumes VG (Volume Group) on the bare disk (no partition). 
If that's the case, you'll need an additional disk (or loopback device), create a PV (Physical Volume) on it, and add it to the cinder-volumes VG. Fortunately the LVM command that extends a VG can do the PV creation automatically, so if for example you have added /dev/sdc to your system and you want to use it for Cinder volumes you can: - First check the current status of the VG: $ sudo vgs - Add the device: $ sudo vgextend cinder-volumes /dev/sdc --debug - Confirm the VG has grown: $ sudo vgs Cheers, Gorka. From geguileo at redhat.com Fri Jul 2 09:41:28 2021 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 2 Jul 2021 11:41:28 +0200 Subject: Extension of disk in cinder In-Reply-To: References: <20210702083016.tiqv7mfixazpjalx@localhost> Message-ID: <20210702094128.k62fwter4ag5uatn@localhost> On 02/07, Salman Sheikh wrote: > Hi Gorka, > > Thanks for the information. one more thing I need to know, how we get the > storage usage of object store (swift). suppose at the time of installation > through packstack we define the swift_storage 20G, if any other > person wants to know how much space is used in swift, then how we can get > the information how much space is available in the object storage size. > Hi, I don't know about Swift, I work on the Cinder project. I recommend you sending a new mail for that question with the subject starting with "[swift]" so knowledgeable people can see it at first glance. Cheers, Gorka. > On Fri, Jul 2, 2021 at 2:00 PM Gorka Eguileor wrote: > > > On 01/07, Salman Sheikh wrote: > > > Hi all, > > > > > > How we can add the more disk to cinder, as I have created the > > > cinder-volumes on one disk (/sdb) the sdb is 500Gb, if i want to increase > > > the the size of cinder-volumes then how we extend the size of > > > cinder-volumes. how we can add more disk in order to increase the size of > > > cinder-volume. Kindly advise > > > > Hi Salman, > > > > You don't mention the cinder driver or anything, so I'm going to assume > > you are using the LVM driver with default's cinder-volumes VG (Volume > > Group) on the bare disk (no partition). > > > > If that's the case, you'll need an additional disk (or loopback device), > > create a PV (Physical Volume) on it, and add it to the cinder-volumes > > VG. > > > > Fortunately the LVM command that extends a VG can do the PV creation > > automatically, so if for example you have added /dev/sdc to your system > > and you want to use it for Cinder volumes you can: > > > > - First check the current status of the VG: > > > > $ sudo vgs > > > > - Add the device: > > > > $ sudo vgextend cinder-volumes /dev/sdc --debug > > > > - Confirm the VG has grown: > > > > $ sudo vgs > > > > Cheers, > > Gorka. > > > > From smooney at redhat.com Fri Jul 2 11:18:48 2021 From: smooney at redhat.com (Sean Mooney) Date: Fri, 02 Jul 2021 12:18:48 +0100 Subject: [neutron] IPv6 advertisement support for BGP neutron-dynamic-routing In-Reply-To: References: Message-ID: <1f39b17b79ec4f9cae64deb04b291bf45a37bd7e.camel@redhat.com> On Fri, 2021-07-02 at 06:24 +0000, Manu B wrote: > Hi everyone, >   > This is a doubt regarding existing IPv6 advertisement support for BGP > in the current n-d-r repo. > From the documentation, it is mentioned that to enable advertising > IPv6 prefixes, > create an address scope with ip_version=6 and a BGP speaker with > ip_version=6. >   > We have a use case where v4 and v6 routes must be advertised using a > single v4 peer. >    1. How can we advertise both v4 and v6 routes using single v4 > peer? 
you cant as far as i am aware. i dont have neutron-dynamic-routing enabled in my home setup anymore but when i did i found you need 2 speakers. > Do we have to create 2 BGP speakers ( one for v4 and another for v6) yes although the issue i had with that was that you can only schdule one speaker to given host so on my singel node deployment i could only have one of the two speakser running. > Then create a single BGP v4 peer and add the same peer to both the > speakers? to get this working i had to assing two different asn to the two differen speaker and create two peering session to my home router. one for the ipv4 speaker and one for the ipv6 speaker. as i said above i could only have one of the two schdeld to the singe dragent i had running at a time but if i swapped between which one was runing i got either the ipv4 routes or ipv6 routes advertised. > Is this the correct approach or any other way? i have no idea what the correct approch is but that is why i foudn worked. if i had a second dragent runnign i coudl have had the cloud advertise both in parallel this way its a shame that it can just advertise ipv4 and ipv6 in the same speaker with one peering session. the limitation seams to be on the neutron-dynamic-routing side not the bgp protocol in this case. >    1. Is there any other significance of version field in current BGP > speaker model? >   > Thanks, > Manu From juliaashleykreger at gmail.com Fri Jul 2 14:46:29 2021 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 2 Jul 2021 07:46:29 -0700 Subject: [Ironic] Vendor-neutral Disk names In-Reply-To: References: Message-ID: +1 to Arkady's question. Also, Software raid won't be able to be leveraged the same way, so it would need to be explicitly noted as such as a limitation. It could make a lot of sense for there to be some sort of "device name translation" code that can pass the physical devices information through/to the raid drivers, or be used in the vendor OOB raid device management drivers, at least, that is what I would expect for any common vendor independent standard to really work for handling vendor's presentation of devices inside of their BMCs. On Thu, Jul 1, 2021 at 4:23 PM Arkady Kanevsky wrote: > > Mahnoor, > How would your proposed schema work for extended storage? > Think of the SCSI connector to JBOD. > Thanks, > Arkady > > > On Thu, Jul 1, 2021 at 5:29 PM Mahnoor Asghar wrote: >> >> Thank you for the response, Mike! >> >> I agree that 'ServiceLabel' is a good starting point, but it would be >> preferable to have a more consistent format that represents a Drive >> resource uniquely, irrespective of the vendor. >> The idea is to let Ironic name the Drive resources using this logic, >> so that the baremetal operator can use the same, consistent method of >> specifying disks for RAID configuration. Feedback from the Ironic >> community is very welcome here, so that an informed proposal can be >> made. >> >> >> On Wed, Jun 30, 2021 at 6:46 PM Mike Raineri wrote: >> > >> > Hi Mahnoor, >> > >> > First, to answer your questions about the property values: >> > - "ServiceLabel" is supposed to always be something human friendly and matches an indicator that is used to reference a part in an enclosure. Ultimately the vendor dictates the how they construct their stickers/etching/silk-screens/other labels, but I would expect something along the lines of "Drive Bay 3", or "Slot 9", or something similar. 
>> > - The resource type will always be "#Drive.v1_X_X.Drive"; this is required by Redfish for representing a standard "Drive" resource. >> > - "Id" as defined in the Redfish Specification is going to be a unique identifier in the collection; there are no rules with what goes into this property other than uniqueness in terms of other collection members. I've seen some implementations use a simple numeric index and others use something that looks more like a service label. However, I've also seen a few use either a GUID or other globally unique identifier, which might be unfriendly for a user to look at. >> > >> > With that said, when I originally authored that logic in Tacklebox, I fully expected to revisit that logic to fine tune it to ensure it reliably gives something human readable. I haven't gone through the exercise to refine it further, but my initial impression is I'll be looking at more properties specific to the Drive resource to help build the string. I think "ServiceLabel" is still a good starting point, but having the appropriate fallback logic in its absence would be very useful to avoid construction based on "Id". >> > >> > Thanks. >> > >> > -Mike >> > >> > On Wed, Jun 30, 2021 at 9:01 AM Mahnoor Asghar wrote: >> >> >> >> Dear all, >> >> >> >> There is [a proposal][1] in the metal3-io/baremetal-operator repository to extend the hardware RAID configuration to support ‘physical_disks’ and ‘controller’ fields in the 'target_raid_config' section. >> >> The user should be able to specify the disks for RAID configuration, in a vendor-agnostic way. (This requirement comes from the Airship project.) The names of the disks should be indicative of the physical location of the disks within the server. An algorithm to construct disk names is therefore needed, for this purpose. >> >> >> >> One possible algorithm was found in the [inventory module][2] of the [Redfish Tacklebox scripts][3]. >> >> To construct a disk name, it uses Redfish properties, specifically the ‘Drive’ resource> ‘Physical Location’ object> ‘Part Location’ property> ‘Service Label’ property. ([Link][4] to code) ([Link][5] to Redfish ‘Drive’ resource) >> >> If this property is empty, the resource type (String uptil the first dot encountered in the @odata.type field), and the ‘Id’ properties of the Drive resource are used to construct the disk name. ([Link][6] to code) >> >> For example, if the 'Drive'.'Physical Location'.'Part Location'.'Service Label' field is ‘Disk.Bay1.Slot0’, this is what the disk name will be. If this field is empty, and the resource name is ‘Drive’ and the resource ‘Id’ is ‘Disk.Bay1.Slot0’, the disk name will be: ‘Drive: Disk.Bay1.Slot0’. >> >> >> >> We would like to understand the values different vendors give in: >> >> - ‘Drive’ resource> ‘Physical Location’ object> ‘Part Location’ property> ‘Service Label’ property >> >> - The resource type for a Drive (@odata.type field) >> >> - The ‘Id’ property of the Drive resource >> >> Also, it would be helpful to understand the existing logic used by vendors to construct the disk names, including the (Redfish or other) properties used. >> >> >> >> This is so that a consensus can be reached for an algorithm to construct disk names. Any suggestions are welcome, thank you so much! 
>> >> >> >> [1]: https://github.com/metal3-io/metal3-docs/pull/148 >> >> [2]: https://github.com/DMTF/Redfish-Tacklebox/blob/master/redfish_utilities/inventory.py >> >> [3]: https://github.com/DMTF/Redfish-Tacklebox >> >> [4]: https://github.com/DMTF/Redfish-Tacklebox/blob/e429f70a79cfe288756618498ce485ab4be37131/redfish_utilities/inventory.py#L192 >> >> [5]: https://redfish.dmtf.org/schemas/v1/Drive.v1_12_1.json >> >> [6]: https://github.com/DMTF/Redfish-Tacklebox/blob/e429f70a79cfe288756618498ce485ab4be37131/redfish_utilities/inventory.py#L199 >> >> >> >> Best regards, >> >> Mahnoor Asghar >> >> Software Design Engineer >> >> xFlow Research Inc. >> >> mahnoor.asghar at xflowresearch.com >> >> www.xflowresearch.com >> >> >> >> -- >> Mahnoor Asghar >> Software Design Engineer >> xFlow Research Inc. >> mahnoor.asghar at xflowresearch.com >> www.xflowresearch.com >> > > > -- > Arkady Kanevsky, Ph.D. > Phone: 972 707-6456 > Corporate Phone: 919 729-5744 ext. 8176456 From salman10sheikh at gmail.com Fri Jul 2 08:40:28 2021 From: salman10sheikh at gmail.com (Salman Sheikh) Date: Fri, 2 Jul 2021 14:10:28 +0530 Subject: Extension of disk in cinder In-Reply-To: <20210702083016.tiqv7mfixazpjalx@localhost> References: <20210702083016.tiqv7mfixazpjalx@localhost> Message-ID: Hi Gorka, Thanks for the information. one more thing I need to know, how we get the storage usage of object store (swift). suppose at the time of installation through packstack we define the swift_storage 20G, if any other person wants to know how much space is used in swift, then how we can get the information how much space is available in the object storage size. On Fri, Jul 2, 2021 at 2:00 PM Gorka Eguileor wrote: > On 01/07, Salman Sheikh wrote: > > Hi all, > > > > How we can add the more disk to cinder, as I have created the > > cinder-volumes on one disk (/sdb) the sdb is 500Gb, if i want to increase > > the the size of cinder-volumes then how we extend the size of > > cinder-volumes. how we can add more disk in order to increase the size of > > cinder-volume. Kindly advise > > Hi Salman, > > You don't mention the cinder driver or anything, so I'm going to assume > you are using the LVM driver with default's cinder-volumes VG (Volume > Group) on the bare disk (no partition). > > If that's the case, you'll need an additional disk (or loopback device), > create a PV (Physical Volume) on it, and add it to the cinder-volumes > VG. > > Fortunately the LVM command that extends a VG can do the PV creation > automatically, so if for example you have added /dev/sdc to your system > and you want to use it for Cinder volumes you can: > > - First check the current status of the VG: > > $ sudo vgs > > - Add the device: > > $ sudo vgextend cinder-volumes /dev/sdc --debug > > - Confirm the VG has grown: > > $ sudo vgs > > Cheers, > Gorka. > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From piotr.mossakowski at tietoevry.com Fri Jul 2 10:19:29 2021 From: piotr.mossakowski at tietoevry.com (Piotr Mossakowski) Date: Fri, 2 Jul 2021 12:19:29 +0200 Subject: [rpm-packaging] logging override in openstack packages Message-ID: Hello, There is a logging override in almost all openstack packages, for example in nova: https://opendev.org/openstack/rpm-packaging/src/branch/stable/victoria/openstack/nova/openstack-nova.defaultconf It lands in /etc/nova/nova.conf.d/010-nova.conf as stated in the spec: https://opendev.org/openstack/rpm-packaging/src/branch/stable/victoria/openstack/nova/nova.spec.j2#L407 Let's assume following scenario: in kolla-ansible based deployment I want to disable logging globally using /etc/kolla/config/global.conf: [DEFAULT] log_dir = log_file = For kolla images built from RPMs, the file '/etc/nova/nova.conf.d/010-nova.conf' will override what I want to achieve by global.conf. We have a situation when kolla images log inside the container into '/var/log/nova' where there is no log rotation. There is no easy way to change that in kolla-ansible based deployments. Is there any specific reason why default logging has been hardcoded like that? Can we agree to change that situation to be able to override logging as expected? -- Pozdrawiam serdecznie / Best regards, *Piotr Mossakowski* Lead System Engineer, Product Development Services, Cloud Infra & Applications Email piotr.mossakowski at tietoevry.com , +48795515407 Aleja Piastów 30, 71-064 Szczecin, Poland, www.tietoevry.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4195 bytes Desc: S/MIME Cryptographic Signature URL: From salman10sheikh at gmail.com Fri Jul 2 10:46:15 2021 From: salman10sheikh at gmail.com (Salman Sheikh) Date: Fri, 2 Jul 2021 16:16:15 +0530 Subject: swift_storage storage space Message-ID: Hi experts, how we get the storage usage of object storage (swift). suppose at the time of installation through packstack we define the swift_storage 20G, if any other person wants to know how much space was used in swift, and how much space is available in the object storage. How we can get all this Regards salman -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.camilleri at zylacomputing.com Fri Jul 2 16:52:31 2021 From: luke.camilleri at zylacomputing.com (Luke Camilleri) Date: Fri, 2 Jul 2021 18:52:31 +0200 Subject: [victoria][barbican] - UI installation Message-ID: <86362a07-fdcf-ac3e-49db-d252371df66f@zylacomputing.com> Hi there, I have an openstack deployment installed as a bare-metal/manual installation and would like to install the Barbican UI. Barbican is already installed and running. In my case Horizon is already installed so I guess if I follow these steps https://github.com/openstack/barbican-ui#manual-installation I would not need to clone the horizon part and also my installation already contains the local_settings.py file with all the required settings mentioned. 
At this point when moving into the "Install Barbican UI with all dependencies in your virtual environment" step I do not have the tools folder so I cannot really proceed with the pip install (missing virtual environment )and copying the barbican-ui/barbican_ui/enabled/_90_project_barbican_panelgroup.py and barbican-ui/barbican_ui/enabled/_91_project_barbican_secrets_panel.py to the /usr/share/openstack-dashboard/openstack_dashboard/local/enabled obviously fails A similar issue is reported here https://storyboard.openstack.org/#!/story/2008259 Thanks in advance From kennelson11 at gmail.com Fri Jul 2 17:18:25 2021 From: kennelson11 at gmail.com (Kendall Nelson) Date: Fri, 2 Jul 2021 10:18:25 -0700 Subject: [all] [] Mentors Needed! Open Source Day Summer Message-ID: Hello! As you may or may not know, the Grace Hopper Conference is now running TWO Open Source Day events a year! The upcoming event is on July 15th[0]. If you are interested in mentoring a few individuals- i.e helping them work on some low hanging fruit for your project, please sign up here[1] and let me know! After you've done the signup and replied here, I will reach out to the organizers to make sure you are approved. At that point, please collect a few bugs you think would be good first patches for attendees and link them in the etherpad here[2]. Thank you so much! Can't wait to work with you :) -Kendall Nelson (diablo_rojo) [0] https://anitab-org.github.io/open-source-day/upcoming/ [1] https://anitab-org.github.io/open-source-day/become_a_mentor/ [2] https://etherpad.opendev.org/p/ghc-osd-summer-2021 -------------- next part -------------- An HTML attachment was scrubbed... URL: From akanevsk at redhat.com Fri Jul 2 20:43:04 2021 From: akanevsk at redhat.com (Arkady Kanevsky) Date: Fri, 2 Jul 2021 15:43:04 -0500 Subject: [all] [] Mentors Needed! Open Source Day Summer In-Reply-To: References: Message-ID: Kendall, I will be happy to mentor. I had also added a story into neebee to try. I am US CST. Let me know if I am approved for mentoring. Thanks, Arkady On Fri, Jul 2, 2021 at 12:23 PM Kendall Nelson wrote: > Hello! > > As you may or may not know, the Grace Hopper Conference is now running TWO > Open Source Day events a year! > > The upcoming event is on July 15th[0]. If you are interested in mentoring > a few individuals- i.e helping them work on some low hanging fruit for your > project, please sign up here[1] and let me know! > > After you've done the signup and replied here, I will reach out to the > organizers to make sure you are approved. At that point, please collect a > few bugs you think would be good first patches for attendees and link them > in the etherpad here[2]. > > Thank you so much! Can't wait to work with you :) > > -Kendall Nelson (diablo_rojo) > > [0] https://anitab-org.github.io/open-source-day/upcoming/ > [1] https://anitab-org.github.io/open-source-day/become_a_mentor/ > [2] https://etherpad.opendev.org/p/ghc-osd-summer-2021 > -- Arkady Kanevsky, Ph.D. Phone: 972 707-6456 Corporate Phone: 919 729-5744 ext. 8176456 -------------- next part -------------- An HTML attachment was scrubbed... URL: From akanevsk at redhat.com Fri Jul 2 21:43:52 2021 From: akanevsk at redhat.com (Arkady Kanevsky) Date: Fri, 2 Jul 2021 16:43:52 -0500 Subject: [Interop][Refstack] meeting time changed for the summer Message-ID: Team, The group meeting will be on Fridays 2 hours earlier at 14pm UTC. 
Meeting info and minutes are still the same https://meetpad.opendev.org/interop https://etherpad.opendev.org/p/interop -- Arkady Kanevsky, Ph.D. Phone: 972 707-6456 Corporate Phone: 919 729-5744 ext. 8176456 -------------- next part -------------- An HTML attachment was scrubbed... URL: From frickler at offenerstapel.de Sat Jul 3 07:02:14 2021 From: frickler at offenerstapel.de (Jens Harbott) Date: Sat, 3 Jul 2021 09:02:14 +0200 Subject: [neutron] IPv6 advertisement support for BGP neutron-dynamic-routing In-Reply-To: References: Message-ID: <851675e3-1d86-499b-fb53-64e0e59bd312@offenerstapel.de> On 7/2/21 8:24 AM, Manu B wrote: > Hi everyone, > >   > > This is a doubt regarding existing IPv6 advertisement support for BGP in > the current n-d-r repo. > > From the documentation, it is mentioned that to enable advertising IPv6 > prefixes, > > create an address scope with ip_version=6 and a BGP speaker with > ip_version=6. > >   > > We have a use case where v4 and v6 routes must be advertised using a > single v4 peer. > > 1. How can we advertise both v4 and v6 routes using single v4 peer? > > Do we have to create 2 BGP speakers ( one for v4 and another for v6) > > Then create a single BGP v4 peer and add the same peer to both the speakers? > > Is this the correct approach or any other way? MP-BGP support has been added with https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/608302, I never tested that myself, but from the release note it would seem that you configure only one speaker and one peer and then by setting proper configuration options on the peer it gets announced both v4 and v6 prefixes. From ignaziocassano at gmail.com Sat Jul 3 14:51:07 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Sat, 3 Jul 2021 16:51:07 +0200 Subject: [Openstack][kolla] updates check Message-ID: Hello, I am new user for openstack kolla because I installed openstack writing my ansible Playbooks. I wonder if it is a method to check if new images has been released . For example: I installed kolla wallaby 2 weeks ago, and now I want to check if new images could be updated. I want to check if new images solve some bugs. Is it possibile without trying to pull all images.? Thanks Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Sat Jul 3 19:00:22 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sat, 03 Jul 2021 14:00:22 -0500 Subject: [all][tc] What's happening in Technical Committee: summary 2nd July, 21: Reading: 5 min Message-ID: <17a6dbe3ee3.11ffc9ec3464056.762781257224623555@ghanshyammann.com> Hello Everyone, Here is last week's summary of the Technical Committee activities. 1. What we completed this week: ========================= * Deprecated OpenStack-Ansible nspawn repositories [1] * Retired puppet-openstack-specs [2] * Retired in-active governance repos [3] - openstack/arch-wg - openstack/openstack-specs - openstack/project-navigator-data - openstack/enterprise-wg - openstack/workload-ref-archs - openstack/ops-tags-team - openstack/scientific-wg - openstack/governance-uc - openstack/uc-recognition 2. TC Meetings: ============ * TC held this week meeting on Thursday; you can find the full meeting logs in the below link: - https://meetings.opendev.org/meetings/tc/2021/tc.2021-07-01-15.00.log.html * We will have next week's meeting on July 8th, Thursday 15:00 UTC[4]. 3. 
Activities In progress: ================== TC Tracker for Xena cycle ------------------------------ TC is using the etherpad[5] for Xena cycle working item. We will be checking and updating the status biweekly in the same etherpad. Open Reviews ----------------- * Four open review for ongoing activities[6]. ELK services next plan and help status --------------------------------------------- * The Technical Committee brought the ELK help to the Board meeting held on June 30th[7].This is the presentation[8]. Please bring this to your company to broadcast it and get some help. * Join us in next TC meeting on July 8th to discuss the next plan. Migration from Freenode to OFTC ----------------------------------------- * Not much progress on project side wiki/doc page which is only thing left for this migration. * All the required work for this migration is tracked in this etherpad[9] 'Y' release naming process ------------------------------- * Y release naming election is closed now. As a last step, the foundation is doing trademark checks on elected ranking. * We should have final name by next week. Test support for TLS default: ---------------------------------- * Rico has started a separate email thread over testing with tls-proxy enabled[10], we encourage projects to participate in that testing and help to enable the tls-proxy in gate testing. Adding Ceph Dashboard charm to OpenStack charms --------------------------------------------------------------- * Proposal to add Ceph Dashboard charm to OpenStack charms[11] Retiring js-openstack-lib ---------------------------- * Proposal to retire js-openstack-lib [12]. Other Changes ------------------ * Charter change to handle the vacant seat situation[13] * Add DPL model also in 'Appointing leaders' section[14] 4. How to contact the TC: ==================== If you would like to discuss or give feedback to TC, you can reach out to us in multiple ways: 1. Email: you can send the email with tag [tc] on openstack-discuss ML[15]. 2. Weekly meeting: The Technical Committee conduct a weekly meeting every Thursday 15 UTC [16] 3. Office hours: The Technical Committee offers a weekly office hour every Tuesday at 0100 UTC [17] 4. Ping us using 'tc-members' nickname on #openstack-tc IRC channel. 
[1] https://review.opendev.org/c/openstack/governance/+/797731 [2] https://review.opendev.org/c/openstack/governance/+/798711 [3] https://review.opendev.org/c/openstack/governance/+/797788 [4] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting [5] https://etherpad.opendev.org/p/tc-xena-tracker [6] https://review.opendev.org/q/project:openstack/governance+status:open [7] https://wiki.openstack.org/wiki/Governance/Foundation/30Jun2021BoardMeeting [8] https://docs.google.com/presentation/d/1ugdwMI2ZM2L8z1sobzHJwDpbvlyWKH02PH7Fi4tkyVc/edit#slide=id.ge1bdf71dac_0_0 [9] https://etherpad.opendev.org/p/openstack-irc-migration-to-oftc [10] http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023000.html [11] https://review.opendev.org/c/openstack/governance/+/797913 [12] https://review.opendev.org/c/openstack/governance/+/798540 [13] https://review.opendev.org/c/openstack/governance/+/797912 [14] https://review.opendev.org/c/openstack/governance/+/797985 [15] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss [16] http://eavesdrop.openstack.org/#Technical_Committee_Meeting [17] http://eavesdrop.openstack.org/#Technical_Committee_Office_hours -gmann From malikobaidadil at gmail.com Sun Jul 4 12:20:07 2021 From: malikobaidadil at gmail.com (Malik Obaid) Date: Sun, 4 Jul 2021 17:20:07 +0500 Subject: [wallaby][neutron][ovn] Low network performance between instances on different compute nodes using OVN Geneve tunnels Message-ID: Hi, I am using Openstack Wallaby release with OVN on Ubuntu 20.04. My environment consists of 2 compute nodes and 1 controller node. ovs_version: "2.15.0" Ubuntu Kernel Version: 5.4.0-77-generic I am observing Network performance between instances on different compute nodes is slow. The network uses geneve tunnels.The environment is using 10Gbps network interface cards. However, iperf between instances on different compute nodes attains only speeds between a few hundred Mbit/s and a few Gb/s. Both instances are in the same tenant network. Note: iperf results between both compute nodes (hypervisors) across the geneve tunnel endpoints is perfect 10 Gbps. Below are the results of iperf commands. 
*iperf server:* 2: ens3: mtu 8950 qdisc fq_codel state UP group default qlen 1000 link/ether fa:16:3e:4b:1d:29 brd ff:ff:ff:ff:ff:ff inet 192.168.100.111/24 brd 192.168.100.255 scope global dynamic ens3 valid_lft 42694sec preferred_lft 42694sec inet6 fe80::f816:3eff:fe4b:1d29/64 scope link valid_lft forever preferred_lft forever root at vm-01:~# iperf3 -s Server listening on 5201 Accepted connection from 192.168.100.69, port 45542 [ 5] local 192.168.100.111 port 5201 connected to 192.168.100.69 port 45544 [ 8] local 192.168.100.111 port 5201 connected to 192.168.100.69 port 45546 [ ID][Role] Interval Transfer Bitrate Retr Cwnd [ 5][RX-S] 0.00-1.00 sec 692 MBytes 5.81 Gbits/sec [ 8][TX-S] 0.00-1.00 sec 730 MBytes 6.12 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 1.00-2.00 sec 598 MBytes 5.01 Gbits/sec [ 8][TX-S] 1.00-2.00 sec 879 MBytes 7.37 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 2.00-3.00 sec 793 MBytes 6.65 Gbits/sec [ 8][TX-S] 2.00-3.00 sec 756 MBytes 6.34 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 3.00-4.00 sec 653 MBytes 5.48 Gbits/sec [ 8][TX-S] 3.00-4.00 sec 871 MBytes 7.31 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 4.00-5.00 sec 597 MBytes 5.01 Gbits/sec [ 8][TX-S] 4.00-5.00 sec 858 MBytes 7.20 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 5.00-6.00 sec 734 MBytes 6.16 Gbits/sec [ 8][TX-S] 5.00-6.00 sec 818 MBytes 6.86 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 6.00-7.00 sec 724 MBytes 6.06 Gbits/sec [ 8][TX-S] 6.00-7.00 sec 789 MBytes 6.60 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 7.00-8.00 sec 735 MBytes 6.18 Gbits/sec [ 8][TX-S] 7.00-8.00 sec 835 MBytes 7.02 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 8.00-9.00 sec 789 MBytes 6.62 Gbits/sec [ 8][TX-S] 8.00-9.00 sec 845 MBytes 7.09 Gbits/sec 0 3.14 MBytes [ 5][RX-S] 9.00-10.00 sec 599 MBytes 5.02 Gbits/sec [ 8][TX-S] 9.00-10.00 sec 806 MBytes 6.76 Gbits/sec 0 3.14 MBytes [ ID][Role] Interval Transfer Bitrate Retr [ 5][RX-S] 0.00-10.00 sec 6.75 GBytes 5.80 Gbits/sec receiver [ 8][TX-S] 0.00-10.00 sec 7.99 GBytes 6.87 Gbits/sec 0 sender Server listening on 5201 *Client side:* root at vm-03:~# iperf3 -c 192.168.100.111 --bidir Connecting to host 192.168.100.111, port 5201 [ 5] local 192.168.100.69 port 45544 connected to 192.168.100.111 port 5201 [ 7] local 192.168.100.69 port 45546 connected to 192.168.100.111 port 5201 [ ID][Role] Interval Transfer Bitrate Retr Cwnd [ 5][TX-C] 0.00-1.00 sec 700 MBytes 5.87 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 0.00-1.00 sec 722 MBytes 6.06 Gbits/sec [ 5][TX-C] 1.00-2.00 sec 594 MBytes 4.98 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 1.00-2.00 sec 883 MBytes 7.41 Gbits/sec [ 5][TX-C] 2.00-3.00 sec 796 MBytes 6.67 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 2.00-3.00 sec 752 MBytes 6.31 Gbits/sec [ 5][TX-C] 3.00-4.00 sec 654 MBytes 5.49 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 3.00-4.00 sec 876 MBytes 7.35 Gbits/sec [ 5][TX-C] 4.00-5.00 sec 598 MBytes 5.01 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 4.00-5.00 sec 853 MBytes 7.16 Gbits/sec [ 5][TX-C] 5.00-6.00 sec 734 MBytes 6.15 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 5.00-6.00 sec 818 MBytes 6.86 Gbits/sec [ 5][TX-C] 6.00-7.00 sec 726 MBytes 6.09 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 6.00-7.00 sec 793 MBytes 6.65 Gbits/sec [ 5][TX-C] 7.00-8.00 sec 734 MBytes 6.15 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 7.00-8.00 sec 831 MBytes 6.97 Gbits/sec [ 5][TX-C] 8.00-9.00 sec 788 MBytes 6.61 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 8.00-9.00 sec 845 MBytes 7.09 Gbits/sec [ 5][TX-C] 9.00-10.00 sec 600 MBytes 5.03 Gbits/sec 0 3.13 MBytes [ 7][RX-C] 9.00-10.00 sec 805 MBytes 6.76 Gbits/sec [ ID][Role] Interval Transfer Bitrate Retr [ 5][TX-C] 0.00-10.00 sec 6.76 GBytes 5.81 
Gbits/sec 0 sender [ 5][TX-C] 0.00-10.00 sec 6.75 GBytes 5.80 Gbits/sec receiver [ 7][RX-C] 0.00-10.00 sec 7.99 GBytes 6.87 Gbits/sec 0 sender [ 7][RX-C] 0.00-10.00 sec 7.99 GBytes 6.86 Gbits/sec receiver iperf Done. --------------------------------------------------------------------------------------------------------- *ovs-vsctl show on compute node1:* root at kvm01-a1-khi01:~# ovs-vsctl show 88e6b984-44dc-4f74-8a9a-891742dbbdbd Bridge br-eth1 Port ens224 Interface ens224 Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int Interface patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int type: patch options: {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9} Port br-eth1 Interface br-eth1 type: internal Bridge br-int fail_mode: secure datapath_type: system Port tapde98b2d4-a0 Interface tapde98b2d4-a0 Port ovn-f51ef9-0 Interface ovn-f51ef9-0 type: vxlan options: {csum="true", key=flow, remote_ip="172.16.30.3"} bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up} Port tap348fc6dc-3a Interface tap348fc6dc-3a Port br-int Interface br-int type: internal Port tap6d4d8e02-c0 Interface tap6d4d8e02-c0 error: "could not open network device tap6d4d8e02-c0 (No such device)" Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 Interface patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 type: patch options: {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int} Port tap247fe5b2-ff Interface tap247fe5b2-ff ------------------------------------------------------------------------------------------------------ *ovs-vsctl show on compute node2:* root at kvm03-a1-khi01:~# ovs-vsctl show 24ce6475-89bb-4df5-a5ff-4ce58f2c2f68 Bridge br-eth1 Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int Interface patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int type: patch options: {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9} Port br-eth1 Interface br-eth1 type: internal Port ens224 Interface ens224 Bridge br-int fail_mode: secure datapath_type: system Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 Interface patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 type: patch options: {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int} Port tap2b0bbf7b-59 Interface tap2b0bbf7b-59 Port ovn-650be8-0 Interface ovn-650be8-0 type: vxlan options: {csum="true", key=flow, remote_ip="172.16.30.1"} bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up} Port tap867d2174-83 Interface tap867d2174-83 Port tapde98b2d4-a0 Interface tapde98b2d4-a0 Port br-int Interface br-int type: internal -------------------------------------------------------------------------------------------------------- I would really appreciate any input in this regard. Thank you. Regards, Malik Obaid -------------- next part -------------- An HTML attachment was scrubbed... URL: From laurentfdumont at gmail.com Sun Jul 4 16:49:58 2021 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Sun, 4 Jul 2021 12:49:58 -0400 Subject: [wallaby][neutron][ovn] Low network performance between instances on different compute nodes using OVN Geneve tunnels In-Reply-To: References: Message-ID: Nothing super specific I can think of but : - Can you try running the same tests with two instances on the same compute? 
- How many cores are inside the sender/receiver VM? - Can you test in UDP mode? On Sun, Jul 4, 2021 at 8:27 AM Malik Obaid wrote: > Hi, > > I am using Openstack Wallaby release with OVN on Ubuntu 20.04. > > My environment consists of 2 compute nodes and 1 controller node. > ovs_version: "2.15.0" > Ubuntu Kernel Version: 5.4.0-77-generic > > > I am observing Network performance between instances on different compute > nodes is slow. The network uses geneve tunnels.The environment is using > 10Gbps network interface cards. However, iperf between instances on > different compute nodes attains only speeds between a few hundred Mbit/s > and a few Gb/s. Both instances are in the same tenant network. > > Note: iperf results between both compute nodes (hypervisors) across the > geneve tunnel endpoints is perfect 10 Gbps. > > Below are the results of iperf commands. > > *iperf server:* > > 2: ens3: mtu 8950 qdisc fq_codel state > UP group default qlen 1000 > link/ether fa:16:3e:4b:1d:29 brd ff:ff:ff:ff:ff:ff > inet 192.168.100.111/24 brd 192.168.100.255 scope global dynamic ens3 > valid_lft 42694sec preferred_lft 42694sec > inet6 fe80::f816:3eff:fe4b:1d29/64 scope link > valid_lft forever preferred_lft forever > > root at vm-01:~# iperf3 -s > Server listening on 5201 > > Accepted connection from 192.168.100.69, port 45542 > [ 5] local 192.168.100.111 port 5201 connected to 192.168.100.69 port > 45544 > [ 8] local 192.168.100.111 port 5201 connected to 192.168.100.69 port > 45546 > [ ID][Role] Interval Transfer Bitrate Retr Cwnd > [ 5][RX-S] 0.00-1.00 sec 692 MBytes 5.81 Gbits/sec > [ 8][TX-S] 0.00-1.00 sec 730 MBytes 6.12 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 1.00-2.00 sec 598 MBytes 5.01 Gbits/sec > [ 8][TX-S] 1.00-2.00 sec 879 MBytes 7.37 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 2.00-3.00 sec 793 MBytes 6.65 Gbits/sec > [ 8][TX-S] 2.00-3.00 sec 756 MBytes 6.34 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 3.00-4.00 sec 653 MBytes 5.48 Gbits/sec > [ 8][TX-S] 3.00-4.00 sec 871 MBytes 7.31 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 4.00-5.00 sec 597 MBytes 5.01 Gbits/sec > [ 8][TX-S] 4.00-5.00 sec 858 MBytes 7.20 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 5.00-6.00 sec 734 MBytes 6.16 Gbits/sec > [ 8][TX-S] 5.00-6.00 sec 818 MBytes 6.86 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 6.00-7.00 sec 724 MBytes 6.06 Gbits/sec > [ 8][TX-S] 6.00-7.00 sec 789 MBytes 6.60 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 7.00-8.00 sec 735 MBytes 6.18 Gbits/sec > [ 8][TX-S] 7.00-8.00 sec 835 MBytes 7.02 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 8.00-9.00 sec 789 MBytes 6.62 Gbits/sec > [ 8][TX-S] 8.00-9.00 sec 845 MBytes 7.09 Gbits/sec 0 3.14 > MBytes > [ 5][RX-S] 9.00-10.00 sec 599 MBytes 5.02 Gbits/sec > [ 8][TX-S] 9.00-10.00 sec 806 MBytes 6.76 Gbits/sec 0 3.14 > MBytes > > [ ID][Role] Interval Transfer Bitrate Retr > [ 5][RX-S] 0.00-10.00 sec 6.75 GBytes 5.80 Gbits/sec > receiver > [ 8][TX-S] 0.00-10.00 sec 7.99 GBytes 6.87 Gbits/sec 0 sender > > Server listening on 5201 > > *Client side:* > > root at vm-03:~# iperf3 -c 192.168.100.111 --bidir > Connecting to host 192.168.100.111, port 5201 > [ 5] local 192.168.100.69 port 45544 connected to 192.168.100.111 port > 5201 > [ 7] local 192.168.100.69 port 45546 connected to 192.168.100.111 port > 5201 > [ ID][Role] Interval Transfer Bitrate Retr Cwnd > [ 5][TX-C] 0.00-1.00 sec 700 MBytes 5.87 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 0.00-1.00 sec 722 MBytes 6.06 Gbits/sec > [ 5][TX-C] 1.00-2.00 sec 594 MBytes 4.98 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 1.00-2.00 sec 883 MBytes 7.41 Gbits/sec > [ 
5][TX-C] 2.00-3.00 sec 796 MBytes 6.67 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 2.00-3.00 sec 752 MBytes 6.31 Gbits/sec > [ 5][TX-C] 3.00-4.00 sec 654 MBytes 5.49 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 3.00-4.00 sec 876 MBytes 7.35 Gbits/sec > [ 5][TX-C] 4.00-5.00 sec 598 MBytes 5.01 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 4.00-5.00 sec 853 MBytes 7.16 Gbits/sec > [ 5][TX-C] 5.00-6.00 sec 734 MBytes 6.15 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 5.00-6.00 sec 818 MBytes 6.86 Gbits/sec > [ 5][TX-C] 6.00-7.00 sec 726 MBytes 6.09 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 6.00-7.00 sec 793 MBytes 6.65 Gbits/sec > [ 5][TX-C] 7.00-8.00 sec 734 MBytes 6.15 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 7.00-8.00 sec 831 MBytes 6.97 Gbits/sec > [ 5][TX-C] 8.00-9.00 sec 788 MBytes 6.61 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 8.00-9.00 sec 845 MBytes 7.09 Gbits/sec > [ 5][TX-C] 9.00-10.00 sec 600 MBytes 5.03 Gbits/sec 0 3.13 > MBytes > [ 7][RX-C] 9.00-10.00 sec 805 MBytes 6.76 Gbits/sec > > [ ID][Role] Interval Transfer Bitrate Retr > [ 5][TX-C] 0.00-10.00 sec 6.76 GBytes 5.81 Gbits/sec 0 > sender > [ 5][TX-C] 0.00-10.00 sec 6.75 GBytes 5.80 Gbits/sec > receiver > [ 7][RX-C] 0.00-10.00 sec 7.99 GBytes 6.87 Gbits/sec 0 > sender > [ 7][RX-C] 0.00-10.00 sec 7.99 GBytes 6.86 Gbits/sec > receiver > > iperf Done. > > > --------------------------------------------------------------------------------------------------------- > > *ovs-vsctl show on compute node1:* > > root at kvm01-a1-khi01:~# ovs-vsctl show > 88e6b984-44dc-4f74-8a9a-891742dbbdbd > Bridge br-eth1 > Port ens224 > Interface ens224 > Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int > Interface > patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int > type: patch > options: > {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9} > Port br-eth1 > Interface br-eth1 > type: internal > Bridge br-int > fail_mode: secure > datapath_type: system > Port tapde98b2d4-a0 > Interface tapde98b2d4-a0 > Port ovn-f51ef9-0 > Interface ovn-f51ef9-0 > type: vxlan > options: {csum="true", key=flow, remote_ip="172.16.30.3"} > bfd_status: {diagnostic="No Diagnostic", flap_count="1", > forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, > state=up} > Port tap348fc6dc-3a > Interface tap348fc6dc-3a > Port br-int > Interface br-int > type: internal > Port tap6d4d8e02-c0 > Interface tap6d4d8e02-c0 > error: "could not open network device tap6d4d8e02-c0 (No > such device)" > Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 > Interface > patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 > type: patch > options: > {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int} > Port tap247fe5b2-ff > Interface tap247fe5b2-ff > > > ------------------------------------------------------------------------------------------------------ > > *ovs-vsctl show on compute node2:* > > root at kvm03-a1-khi01:~# ovs-vsctl show > 24ce6475-89bb-4df5-a5ff-4ce58f2c2f68 > Bridge br-eth1 > Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int > Interface > patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int > type: patch > options: > {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9} > Port br-eth1 > Interface br-eth1 > type: internal > Port ens224 > Interface ens224 > Bridge br-int > fail_mode: secure > datapath_type: system > Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 > Interface > patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 > type: patch > 
options: > {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int} > Port tap2b0bbf7b-59 > Interface tap2b0bbf7b-59 > Port ovn-650be8-0 > Interface ovn-650be8-0 > type: vxlan > options: {csum="true", key=flow, remote_ip="172.16.30.1"} > bfd_status: {diagnostic="No Diagnostic", flap_count="1", > forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, > state=up} > Port tap867d2174-83 > Interface tap867d2174-83 > Port tapde98b2d4-a0 > Interface tapde98b2d4-a0 > Port br-int > Interface br-int > type: internal > > > -------------------------------------------------------------------------------------------------------- > > I would really appreciate any input in this regard. > > Thank you. > > Regards, > Malik Obaid > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Sun Jul 4 23:00:16 2021 From: miguel at mlavalle.com (Miguel Lavalle) Date: Sun, 4 Jul 2021 18:00:16 -0500 Subject: [neutron] bug deputy report June 28th to July 4th Message-ID: Hi, Here is this week's bugs deputy report: Needs attention or owner =================== https://bugs.launchpad.net/neutron/+bug/1934420 [OVN]ControllerAgent cannot be changed to ControllerGatewayAgent dynamically. In this case, an owner is required https://bugs.launchpad.net/neutron/+bug/1933813 OSError: Premature eof waiting for privileged process error in Ussuri release of Neutron. Submitter still debugging issue in his deployment with guidance from ralonsoh. Waiting for more information Medium ====== https://bugs.launchpad.net/neutron/+bug/1933638 neutron with ovn returns Conflict on security group rules delete https://bugs.launchpad.net/neutron/+bug/1934039 Neutron with noauth authentication strategy needs fake 'project_id' in request body. Proposed fix: https://review.opendev.org/c/openstack/neutron/+/799162 https://bugs.launchpad.net/neutron/+bug/1934115 List security groups by project admin may return 500. Proposed fix: https://review.opendev.org/c/openstack/neutron/+/798821 https://bugs.launchpad.net/neutron/+bug/1934466 Devstack install for ML2 OVS. Proposed fix: https://review.opendev.org/c/openstack/neutron/+/799159 Low === https://bugs.launchpad.net/neutron/+bug/1933802 missing global_request_id in neutron_lib context from_dict method. Proposed fix: https://review.opendev.org/c/openstack/neutron-lib/+/798815 https://bugs.launchpad.net/neutron/+bug/1934256 neutron-lib duplicates os-resource-classes constants. Proposed fix: https://review.opendev.org/c/openstack/neutron-lib/+/799034 https://bugs.launchpad.net/neutron/+bug/1934420 [OVN]ControllerAgent cannot be changed to ControllerGatewayAgent dynamically. Needs an owner https://bugs.launchpad.net/neutron/+bug/1934512 Remove SG RPC "use_enhanced_rpc" check -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark.kirkwood at catalyst.net.nz Mon Jul 5 04:51:44 2021 From: mark.kirkwood at catalyst.net.nz (Mark Kirkwood) Date: Mon, 5 Jul 2021 16:51:44 +1200 Subject: swift_storage storage space In-Reply-To: References: Message-ID: You can use 'swift-recon' to tell you that: $ sudo swift-recon -d (run from any of the storage or proxy nodes), it gives you a distribution graph and the aggregated total use/free space. regards Mark On 2/07/21 10:46 pm, Salman Sheikh wrote: > Hi experts, > > how we get the storage usage of object storage (swift). 
suppose at the > time of installation through packstack we define the swift_storage > 20G, if any other person wants to know how much space was used in > swift, and how much space is available in the object storage. How we > can get all this > > Regards > salman From malikobaidadil at gmail.com Mon Jul 5 06:19:05 2021 From: malikobaidadil at gmail.com (Malik Obaid) Date: Mon, 5 Jul 2021 11:19:05 +0500 Subject: [wallaby][neutron][ovn] Low network performance between instances on different compute nodes using OVN Geneve tunnels In-Reply-To: References: Message-ID: Hi Laurent, I am using 32 cores and 32GB RAM on VM. The compute nodes are EPYC 7532 dual socket baremetal servers with 1TB RAM with ubuntu 20.04 and network cards are Broadcom BCM57504 NetXtreme-E 10Gb cards. Below are the stats on different hosts. TCP bidirectional, on geneve network. [ ID][Role] Interval Transfer Bitrate Retr [ 5][RX-S] 0.00-10.01 sec 3.44 GBytes 2.95 Gbits/sec receiver [ 8][TX-S] 0.00-10.01 sec 3.48 GBytes 2.99 Gbits/sec 0 sender Unidirectional tcp, on geneve network. [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 6.29 GBytes 5.41 Gbits/sec 491 sender [ 5] 0.00-10.00 sec 6.29 GBytes 5.40 Gbits/sec receiver Unidirectional udp, on geneve network. [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-10.00 sec 3.45 GBytes 2.97 Gbits/sec 0.000 ms 0/2540389 (0%) sender [ 5] 0.00-10.00 sec 3.23 GBytes 2.77 Gbits/sec 0.009 ms 165868/2539198 (6.5%) receiver Below are the stats of bidirectional udp, on geneve network. [ ID][Role] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5][TX-C] 0.00-10.00 sec 2.00 GBytes 1.72 Gbits/sec 0.000 ms 0/1472357 (0%) sender [ 5][TX-C] 0.00-10.01 sec 1.99 GBytes 1.71 Gbits/sec 0.024 ms 7713/1471535 (0.52%) receiver [ 7][RX-C] 0.00-10.00 sec 2.00 GBytes 1.72 Gbits/sec 0.000 ms 0/1472450 (0%) sender [ 7][RX-C] 0.00-10.01 sec 1.98 GBytes 1.70 Gbits/sec 0.012 ms 17325/1470552 (1.2%) receiver ================================================== Below are the stats of VMs on same host. tcp unidirectional on geneve network. [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 19.5 GBytes 16.7 Gbits/sec 0 sender [ 5] 0.00-10.00 sec 19.5 GBytes 16.7 Gbits/sec receiver tcp bidirectional on geneve network. [ ID][Role] Interval Transfer Bitrate Retr [ 5][TX-C] 0.00-10.00 sec 10.7 GBytes 9.21 Gbits/sec 0 sender [ 5][TX-C] 0.00-10.00 sec 10.7 GBytes 9.21 Gbits/sec receiver [ 7][RX-C] 0.00-10.00 sec 9.95 GBytes 8.55 Gbits/sec 0 sender [ 7][RX-C] 0.00-10.00 sec 9.95 GBytes 8.54 Gbits/sec receiver udp unidirectional on geneve network. [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-10.00 sec 2.15 GBytes 1.85 Gbits/sec 0.000 ms 0/1584825 (0%) sender [ 5] 0.00-10.00 sec 2.15 GBytes 1.85 Gbits/sec 0.015 ms 0/1584825 (0%) receiver udp bidirectional on geneve network. [ ID][Role] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5][TX-C] 0.00-10.00 sec 2.17 GBytes 1.87 Gbits/sec 0.000 ms 0/1597563 (0%) sender [ 5][TX-C] 0.00-10.00 sec 1.37 GBytes 1.17 Gbits/sec 0.006 ms 590524/1595459 (37%) receiver [ 7][RX-C] 0.00-10.00 sec 1.37 GBytes 1.17 Gbits/sec 0.000 ms 0/1005024 (0%) sender [ 7][RX-C] 0.00-10.00 sec 1.37 GBytes 1.17 Gbits/sec 0.012 ms 0/1004983 (0%) receiver However the performance of network on OVN VLAN provider network is ~9.8Gbps bidirectional with VMs on different hosts. Below are the details of ovs-vsctl show command on compute node. 
Bridge br-int fail_mode: secure datapath_type: system Port tap131797c7-06 Interface tap131797c7-06 Port ovn-094381-0 Interface ovn-094381-0 type: geneve options: {csum="true", key=flow, remote_ip="172.16.40.2"} bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up} Port patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d Interface patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d type: patch options: {peer=patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int} Port br-int Interface br-int type: internal Port tap18ca5a79-10 Interface tap18ca5a79-10 Bridge br-vlan Port bond0 Interface bond0 Port patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int Interface patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int type: patch options: {peer=patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d} Port br-vlan Interface br-vlan type: internal ovs_version: "2.15.0" I have already tuned the BIOS for max performance. Any tuning required at OVS or OS level. I strongly believe that throughput should be 10Gbps without dpdk on geneve network. Regards, Malik Obaid On Sun, Jul 4, 2021 at 9:50 PM Laurent Dumont wrote: > Nothing super specific I can think of but : > > - Can you try running the same tests with two instances on the same > compute? > - How many cores are inside the sender/receiver VM? > - Can you test in UDP mode? > > > > On Sun, Jul 4, 2021 at 8:27 AM Malik Obaid > wrote: > >> Hi, >> >> I am using Openstack Wallaby release with OVN on Ubuntu 20.04. >> >> My environment consists of 2 compute nodes and 1 controller node. >> ovs_version: "2.15.0" >> Ubuntu Kernel Version: 5.4.0-77-generic >> >> >> I am observing Network performance between instances on different compute >> nodes is slow. The network uses geneve tunnels.The environment is using >> 10Gbps network interface cards. However, iperf between instances on >> different compute nodes attains only speeds between a few hundred Mbit/s >> and a few Gb/s. Both instances are in the same tenant network. >> >> Note: iperf results between both compute nodes (hypervisors) across the >> geneve tunnel endpoints is perfect 10 Gbps. >> >> Below are the results of iperf commands. 
>> >> *iperf server:* >> >> 2: ens3: mtu 8950 qdisc fq_codel state >> UP group default qlen 1000 >> link/ether fa:16:3e:4b:1d:29 brd ff:ff:ff:ff:ff:ff >> inet 192.168.100.111/24 brd 192.168.100.255 scope global dynamic ens3 >> valid_lft 42694sec preferred_lft 42694sec >> inet6 fe80::f816:3eff:fe4b:1d29/64 scope link >> valid_lft forever preferred_lft forever >> >> root at vm-01:~# iperf3 -s >> Server listening on 5201 >> >> Accepted connection from 192.168.100.69, port 45542 >> [ 5] local 192.168.100.111 port 5201 connected to 192.168.100.69 port >> 45544 >> [ 8] local 192.168.100.111 port 5201 connected to 192.168.100.69 port >> 45546 >> [ ID][Role] Interval Transfer Bitrate Retr Cwnd >> [ 5][RX-S] 0.00-1.00 sec 692 MBytes 5.81 Gbits/sec >> [ 8][TX-S] 0.00-1.00 sec 730 MBytes 6.12 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 1.00-2.00 sec 598 MBytes 5.01 Gbits/sec >> [ 8][TX-S] 1.00-2.00 sec 879 MBytes 7.37 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 2.00-3.00 sec 793 MBytes 6.65 Gbits/sec >> [ 8][TX-S] 2.00-3.00 sec 756 MBytes 6.34 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 3.00-4.00 sec 653 MBytes 5.48 Gbits/sec >> [ 8][TX-S] 3.00-4.00 sec 871 MBytes 7.31 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 4.00-5.00 sec 597 MBytes 5.01 Gbits/sec >> [ 8][TX-S] 4.00-5.00 sec 858 MBytes 7.20 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 5.00-6.00 sec 734 MBytes 6.16 Gbits/sec >> [ 8][TX-S] 5.00-6.00 sec 818 MBytes 6.86 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 6.00-7.00 sec 724 MBytes 6.06 Gbits/sec >> [ 8][TX-S] 6.00-7.00 sec 789 MBytes 6.60 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 7.00-8.00 sec 735 MBytes 6.18 Gbits/sec >> [ 8][TX-S] 7.00-8.00 sec 835 MBytes 7.02 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 8.00-9.00 sec 789 MBytes 6.62 Gbits/sec >> [ 8][TX-S] 8.00-9.00 sec 845 MBytes 7.09 Gbits/sec 0 3.14 >> MBytes >> [ 5][RX-S] 9.00-10.00 sec 599 MBytes 5.02 Gbits/sec >> [ 8][TX-S] 9.00-10.00 sec 806 MBytes 6.76 Gbits/sec 0 3.14 >> MBytes >> >> [ ID][Role] Interval Transfer Bitrate Retr >> [ 5][RX-S] 0.00-10.00 sec 6.75 GBytes 5.80 Gbits/sec >> receiver >> [ 8][TX-S] 0.00-10.00 sec 7.99 GBytes 6.87 Gbits/sec 0 sender >> >> Server listening on 5201 >> >> *Client side:* >> >> root at vm-03:~# iperf3 -c 192.168.100.111 --bidir >> Connecting to host 192.168.100.111, port 5201 >> [ 5] local 192.168.100.69 port 45544 connected to 192.168.100.111 port >> 5201 >> [ 7] local 192.168.100.69 port 45546 connected to 192.168.100.111 port >> 5201 >> [ ID][Role] Interval Transfer Bitrate Retr Cwnd >> [ 5][TX-C] 0.00-1.00 sec 700 MBytes 5.87 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 0.00-1.00 sec 722 MBytes 6.06 Gbits/sec >> [ 5][TX-C] 1.00-2.00 sec 594 MBytes 4.98 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 1.00-2.00 sec 883 MBytes 7.41 Gbits/sec >> [ 5][TX-C] 2.00-3.00 sec 796 MBytes 6.67 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 2.00-3.00 sec 752 MBytes 6.31 Gbits/sec >> [ 5][TX-C] 3.00-4.00 sec 654 MBytes 5.49 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 3.00-4.00 sec 876 MBytes 7.35 Gbits/sec >> [ 5][TX-C] 4.00-5.00 sec 598 MBytes 5.01 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 4.00-5.00 sec 853 MBytes 7.16 Gbits/sec >> [ 5][TX-C] 5.00-6.00 sec 734 MBytes 6.15 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 5.00-6.00 sec 818 MBytes 6.86 Gbits/sec >> [ 5][TX-C] 6.00-7.00 sec 726 MBytes 6.09 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 6.00-7.00 sec 793 MBytes 6.65 Gbits/sec >> [ 5][TX-C] 7.00-8.00 sec 734 MBytes 6.15 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 7.00-8.00 sec 831 MBytes 6.97 Gbits/sec >> [ 5][TX-C] 8.00-9.00 sec 788 MBytes 6.61 
Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 8.00-9.00 sec 845 MBytes 7.09 Gbits/sec >> [ 5][TX-C] 9.00-10.00 sec 600 MBytes 5.03 Gbits/sec 0 3.13 >> MBytes >> [ 7][RX-C] 9.00-10.00 sec 805 MBytes 6.76 Gbits/sec >> >> [ ID][Role] Interval Transfer Bitrate Retr >> [ 5][TX-C] 0.00-10.00 sec 6.76 GBytes 5.81 Gbits/sec 0 >> sender >> [ 5][TX-C] 0.00-10.00 sec 6.75 GBytes 5.80 Gbits/sec >> receiver >> [ 7][RX-C] 0.00-10.00 sec 7.99 GBytes 6.87 Gbits/sec 0 >> sender >> [ 7][RX-C] 0.00-10.00 sec 7.99 GBytes 6.86 Gbits/sec >> receiver >> >> iperf Done. >> >> >> --------------------------------------------------------------------------------------------------------- >> >> *ovs-vsctl show on compute node1:* >> >> root at kvm01-a1-khi01:~# ovs-vsctl show >> 88e6b984-44dc-4f74-8a9a-891742dbbdbd >> Bridge br-eth1 >> Port ens224 >> Interface ens224 >> Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int >> Interface >> patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int >> type: patch >> options: >> {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9} >> Port br-eth1 >> Interface br-eth1 >> type: internal >> Bridge br-int >> fail_mode: secure >> datapath_type: system >> Port tapde98b2d4-a0 >> Interface tapde98b2d4-a0 >> Port ovn-f51ef9-0 >> Interface ovn-f51ef9-0 >> type: vxlan >> options: {csum="true", key=flow, remote_ip="172.16.30.3"} >> bfd_status: {diagnostic="No Diagnostic", flap_count="1", >> forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, >> state=up} >> Port tap348fc6dc-3a >> Interface tap348fc6dc-3a >> Port br-int >> Interface br-int >> type: internal >> Port tap6d4d8e02-c0 >> Interface tap6d4d8e02-c0 >> error: "could not open network device tap6d4d8e02-c0 (No >> such device)" >> Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 >> Interface >> patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 >> type: patch >> options: >> {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int} >> Port tap247fe5b2-ff >> Interface tap247fe5b2-ff >> >> >> ------------------------------------------------------------------------------------------------------ >> >> *ovs-vsctl show on compute node2:* >> >> root at kvm03-a1-khi01:~# ovs-vsctl show >> 24ce6475-89bb-4df5-a5ff-4ce58f2c2f68 >> Bridge br-eth1 >> Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int >> Interface >> patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int >> type: patch >> options: >> {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9} >> Port br-eth1 >> Interface br-eth1 >> type: internal >> Port ens224 >> Interface ens224 >> Bridge br-int >> fail_mode: secure >> datapath_type: system >> Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 >> Interface >> patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 >> type: patch >> options: >> {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int} >> Port tap2b0bbf7b-59 >> Interface tap2b0bbf7b-59 >> Port ovn-650be8-0 >> Interface ovn-650be8-0 >> type: vxlan >> options: {csum="true", key=flow, remote_ip="172.16.30.1"} >> bfd_status: {diagnostic="No Diagnostic", flap_count="1", >> forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, >> state=up} >> Port tap867d2174-83 >> Interface tap867d2174-83 >> Port tapde98b2d4-a0 >> Interface tapde98b2d4-a0 >> Port br-int >> Interface br-int >> type: internal >> >> >> -------------------------------------------------------------------------------------------------------- >> >> I 
would really appreciate any input in this regard. >> >> Thank you. >> >> Regards, >> Malik Obaid >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Mon Jul 5 08:43:42 2021 From: zigo at debian.org (Thomas Goirand) Date: Mon, 5 Jul 2021 10:43:42 +0200 Subject: [neutron] IPv6 advertisement support for BGP neutron-dynamic-routing In-Reply-To: <851675e3-1d86-499b-fb53-64e0e59bd312@offenerstapel.de> References: <851675e3-1d86-499b-fb53-64e0e59bd312@offenerstapel.de> Message-ID: <7d8231c0-156a-d741-da41-24553581b032@debian.org> On 7/3/21 9:02 AM, Jens Harbott wrote: > On 7/2/21 8:24 AM, Manu B wrote: >> Hi everyone, >> >>   >> >> This is a doubt regarding existing IPv6 advertisement support for BGP in >> the current n-d-r repo. >> >> From the documentation, it is mentioned that to enable advertising IPv6 >> prefixes, >> >> create an address scope with ip_version=6 and a BGP speaker with >> ip_version=6. >> >>   >> >> We have a use case where v4 and v6 routes must be advertised using a >> single v4 peer. >> >> 1. How can we advertise both v4 and v6 routes using single v4 peer? >> >> Do we have to create 2 BGP speakers ( one for v4 and another for v6) >> >> Then create a single BGP v4 peer and add the same peer to both the speakers? >> >> Is this the correct approach or any other way? > > MP-BGP support has been added with > https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/608302, > I never tested that myself, but from the release note it would seem that > you configure only one speaker and one peer and then by setting proper > configuration options on the peer it gets announced both v4 and v6 prefixes. In our production environment, we found that that when OSKen advertises for IPv6 over an IPv4 session, it crashes after ~7 days. So yeah, that patches kind of works, but it's not stable. Cheers, Thomas Goirand (zigo) From mark at stackhpc.com Mon Jul 5 09:11:35 2021 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 5 Jul 2021 10:11:35 +0100 Subject: [Openstack][kolla] updates check In-Reply-To: References: Message-ID: Hi Ignazio, We publish images to dockerhub on a weekly basis, on Sundays. The release notes should include information about fixes in those images. For example, for wallaby: https://docs.openstack.org/releasenotes/kolla/wallaby.html. Note that fixes may take up to a week to get to dockerhub. Regards, Mark On Sat, 3 Jul 2021 at 15:52, Ignazio Cassano wrote: > > Hello, I am new user for openstack kolla because I installed openstack writing my ansible Playbooks. > I wonder if it is a method to check if new images has been released . > For example: > I installed kolla wallaby 2 weeks ago, and now I want to check if new images could be updated. > I want to check if new images solve some bugs. > Is it possibile without trying to pull all images.? > Thanks > Ignazio > From thierry at openstack.org Mon Jul 5 09:11:56 2021 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 5 Jul 2021 11:11:56 +0200 Subject: [largescale-sig] Next meeting: July 7, 15utc on #openstack-operators Message-ID: <2e31cadc-1467-2c5f-e1d7-079fe7b19721@openstack.org> Hi everyone, Our next Large Scale SIG meeting will be this Wednesday in #openstack-operators on OFTC IRC, at 15UTC. You can doublecheck how that time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20210707T15 A number of topics have already been added to the agenda, including discussing the contents of our next OpenInfra.Live show. 
Feel free to add other topics to our agenda at: https://etherpad.openstack.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From ignaziocassano at gmail.com Mon Jul 5 10:02:13 2021 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 5 Jul 2021 12:02:13 +0200 Subject: [Openstack][kolla] updates check In-Reply-To: References: Message-ID: Thanks. Another question, please. I Update an image in my local registry. The image is nova_compute. I inserted new manager.py file for patching it. Then I ran kolla-deploy but images on compute nodes does not have the new manager.py file . What about the correct procedure ? Ignazio Il giorno lun 5 lug 2021 alle ore 11:11 Mark Goddard ha scritto: > Hi Ignazio, > > We publish images to dockerhub on a weekly basis, on Sundays. The > release notes should include information about fixes in those images. > For example, for wallaby: > https://docs.openstack.org/releasenotes/kolla/wallaby.html. Note that > fixes may take up to a week to get to dockerhub. > > Regards, > Mark > > On Sat, 3 Jul 2021 at 15:52, Ignazio Cassano > wrote: > > > > Hello, I am new user for openstack kolla because I installed openstack > writing my ansible Playbooks. > > I wonder if it is a method to check if new images has been released . > > For example: > > I installed kolla wallaby 2 weeks ago, and now I want to check if new > images could be updated. > > I want to check if new images solve some bugs. > > Is it possibile without trying to pull all images.? > > Thanks > > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Jul 5 10:34:47 2021 From: smooney at redhat.com (Sean Mooney) Date: Mon, 05 Jul 2021 11:34:47 +0100 Subject: [neutron] IPv6 advertisement support for BGP neutron-dynamic-routing In-Reply-To: <851675e3-1d86-499b-fb53-64e0e59bd312@offenerstapel.de> References: <851675e3-1d86-499b-fb53-64e0e59bd312@offenerstapel.de> Message-ID: <719380af1c4384bd4deae6c411bf1782cca31228.camel@redhat.com> On Sat, 2021-07-03 at 09:02 +0200, Jens Harbott wrote: > On 7/2/21 8:24 AM, Manu B wrote: > > Hi everyone, > > > >   > > > > This is a doubt regarding existing IPv6 advertisement support for > > BGP in > > the current n-d-r repo. > > > > From the documentation, it is mentioned that to enable advertising > > IPv6 > > prefixes, > > > > create an address scope with ip_version=6 and a BGP speaker with > > ip_version=6. > > > >   > > > > We have a use case where v4 and v6 routes must be advertised using > > a > > single v4 peer. > > > >  1. How can we advertise both v4 and v6 routes using single v4 > > peer? > > > > Do we have to create 2 BGP speakers ( one for v4 and another for > > v6) > > > > Then create a single BGP v4 peer and add the same peer to both the > > speakers? > > > > Is this the correct approach or any other way? > > MP-BGP support has been added with > https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/608302 > , > I never tested that myself, but from the release note it would seem > that > you configure only one speaker and one peer and then by setting > proper > configuration options on the peer it gets announced both v4 and v6 > prefixes. 
you can advertiese ipv6 over ipv4 yes that is what i was doing but you still need to agents since you assocate the speaker instance with an allocation pool/subnet pools which containers only one version of ip adresses as a result i dont thing you can have a singel speaker advertise both ipv4 and ipv6 but either address space can be advertised over a ipv4 perring session. > > From hberaud at redhat.com Mon Jul 5 15:00:49 2021 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 5 Jul 2021 17:00:49 +0200 Subject: [release][murano] PTL validation Message-ID: Hello, Just a heads up about this patch https://review.opendev.org/c/openstack/releases/+/797996 Please can you validate this patch? Thanks in advance. -- Hervé Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ https://twitter.com/4383hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Mon Jul 5 15:58:09 2021 From: balazs.gibizer at est.tech (Balazs Gibizer) Date: Mon, 05 Jul 2021 17:58:09 +0200 Subject: [nova] spec review day In-Reply-To: References: Message-ID: Hi, This is a friendly reminder that nova has a spec review day tomorrow before the spec freeze next week. Cheers, gibi On Tue, Jun 15, 2021 at 18:44, Balazs Gibizer wrote: > Hi, > > Let's have another spec review day on 6th of July before the M2 spec > freeze that will happen on 15th of July. > > The rules are the usual. Let's use this day to focus on open specs, > trying to reach agreement on as many thing as possible with close > cooperation during the day. > > Let me know if the timing does not good for you. > > Cheers, > gibi > > > > From noonedeadpunk at ya.ru Mon Jul 5 17:57:12 2021 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Mon, 05 Jul 2021 20:57:12 +0300 Subject: [openstack-ansible][osa] Wallaby has been released with 23.0.0 Message-ID: <55711625506728@mail.yandex.ru> Hi everyone! We're happy to announce, that OpenStack-Ansible project has finally released stable/wallaby version with tag 23.0.0. This release includes more then 700 changes from over 50 contributors. And I'd love to thank everyone who made this release happen! 
The major changes since Victoria are: - Added experimental support of Debian Bullseye (11) and CentOS 8 Stream - Significantly improved workflow and status of Cloudkitty, Trove and Zun roles - Added PKI role, that generates local CA, so self-signed certificates will be trusted inside deployment - nspawn containers support has been removed For the full log of the changes, please reffer to the project Release Notes [1] [1] https://docs.openstack.org/releasenotes/openstack-ansible/wallaby.html -- Kind Regards, Dmitriy Rabotyagov From gmann at ghanshyammann.com Mon Jul 5 22:39:00 2021 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 05 Jul 2021 17:39:00 -0500 Subject: [all][policy] Disable & making policy rule's default change warning configurable Message-ID: <17a78d32327.bd56b6a4554068.6364173515112762764@ghanshyammann.com> Hello Everyone, While implementing the new secure RBAC (scope and new defaults), you might have noticed the lot of warnings in the log and sometime failing jobs also due to size of logs. Then you had to disable those via "suppress_default_change_warnings" variable on policy enforcer. The oslo policy log the warnings if the default value of policy rule (if not overridden) is changed, so there are warnings for every policy rule on every API request, everytime policy is initialized which end up a lot of warnings (thousands) in log. It might be happening in production also. Many projects have disabled it via hardcoded "suppress_default_change_warnings". But there is no way for the operator to disable/enable these warnings (enable in case they would like to check the new policy RBAC). To handle it on oslo policy side and generically for all the projects I am planning to: 1. Disable it by default in oslo policy side itself. 2. Make it configurable so that operator can enable it on need basis. NOTE: This proposal is about warnings for default value change, not for the policy name change. I have submitted this proposal in gerrit too - https://review.opendev.org/c/openstack/oslo.policy/+/799539 Please let me know your opinon on this? -gmann From laurentfdumont at gmail.com Tue Jul 6 00:18:33 2021 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Mon, 5 Jul 2021 20:18:33 -0400 Subject: [wallaby][neutron][ovn] Low network performance between instances on different compute nodes using OVN Geneve tunnels In-Reply-To: References: Message-ID: I think that you should be able to reach line rate as well for 10Gbit on Geneve/OVN. I don't have a setup to compare, but you might want to try to force the TCP Window with iperf3. There could be a case where the PPS (packet per second) is the issue and it cannot reach a sufficiently big window. On Mon, Jul 5, 2021 at 2:19 AM Malik Obaid wrote: > Hi Laurent, > > I am using 32 cores and 32GB RAM on VM. The compute nodes are EPYC 7532 > dual socket baremetal servers with 1TB RAM with ubuntu 20.04 and network > cards are Broadcom BCM57504 NetXtreme-E 10Gb cards. > > Below are the stats on different hosts. > > TCP bidirectional, on geneve network. > > [ ID][Role] Interval Transfer Bitrate Retr > [ 5][RX-S] 0.00-10.01 sec 3.44 GBytes 2.95 Gbits/sec > receiver > [ 8][TX-S] 0.00-10.01 sec 3.48 GBytes 2.99 Gbits/sec 0 > sender > > Unidirectional tcp, on geneve network. > > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-10.00 sec 6.29 GBytes 5.41 Gbits/sec 491 > sender > [ 5] 0.00-10.00 sec 6.29 GBytes 5.40 Gbits/sec > receiver > > Unidirectional udp, on geneve network. 
> > [ ID] Interval Transfer Bitrate Jitter Lost/Total > Datagrams > [ 5] 0.00-10.00 sec 3.45 GBytes 2.97 Gbits/sec 0.000 ms 0/2540389 > (0%) sender > [ 5] 0.00-10.00 sec 3.23 GBytes 2.77 Gbits/sec 0.009 ms > 165868/2539198 (6.5%) receiver > > Below are the stats of bidirectional udp, on geneve network. > > [ ID][Role] Interval Transfer Bitrate Jitter > Lost/Total Datagrams > [ 5][TX-C] 0.00-10.00 sec 2.00 GBytes 1.72 Gbits/sec 0.000 ms > 0/1472357 (0%) sender > [ 5][TX-C] 0.00-10.01 sec 1.99 GBytes 1.71 Gbits/sec 0.024 ms > 7713/1471535 (0.52%) receiver > [ 7][RX-C] 0.00-10.00 sec 2.00 GBytes 1.72 Gbits/sec 0.000 ms > 0/1472450 (0%) sender > [ 7][RX-C] 0.00-10.01 sec 1.98 GBytes 1.70 Gbits/sec 0.012 ms > 17325/1470552 (1.2%) receiver > > ================================================== > > Below are the stats of VMs on same host. > > tcp unidirectional on geneve network. > > [ ID] Interval Transfer Bitrate Retr > [ 5] 0.00-10.00 sec 19.5 GBytes 16.7 Gbits/sec 0 > sender > [ 5] 0.00-10.00 sec 19.5 GBytes 16.7 Gbits/sec > receiver > > tcp bidirectional on geneve network. > > [ ID][Role] Interval Transfer Bitrate Retr > [ 5][TX-C] 0.00-10.00 sec 10.7 GBytes 9.21 Gbits/sec 0 > sender > [ 5][TX-C] 0.00-10.00 sec 10.7 GBytes 9.21 Gbits/sec > receiver > [ 7][RX-C] 0.00-10.00 sec 9.95 GBytes 8.55 Gbits/sec 0 > sender > [ 7][RX-C] 0.00-10.00 sec 9.95 GBytes 8.54 Gbits/sec > receiver > > udp unidirectional on geneve network. > > [ ID] Interval Transfer Bitrate Jitter Lost/Total > Datagrams > [ 5] 0.00-10.00 sec 2.15 GBytes 1.85 Gbits/sec 0.000 ms 0/1584825 > (0%) sender > [ 5] 0.00-10.00 sec 2.15 GBytes 1.85 Gbits/sec 0.015 ms 0/1584825 > (0%) receiver > > udp bidirectional on geneve network. > > [ ID][Role] Interval Transfer Bitrate Jitter > Lost/Total Datagrams > [ 5][TX-C] 0.00-10.00 sec 2.17 GBytes 1.87 Gbits/sec 0.000 ms > 0/1597563 (0%) sender > [ 5][TX-C] 0.00-10.00 sec 1.37 GBytes 1.17 Gbits/sec 0.006 ms > 590524/1595459 (37%) receiver > [ 7][RX-C] 0.00-10.00 sec 1.37 GBytes 1.17 Gbits/sec 0.000 ms > 0/1005024 (0%) sender > [ 7][RX-C] 0.00-10.00 sec 1.37 GBytes 1.17 Gbits/sec 0.012 ms > 0/1004983 (0%) receiver > > However the performance of network on OVN VLAN provider network is > ~9.8Gbps bidirectional with VMs on different hosts. > > Below are the details of ovs-vsctl show command on compute node. > > Bridge br-int > fail_mode: secure > datapath_type: system > Port tap131797c7-06 > Interface tap131797c7-06 > Port ovn-094381-0 > Interface ovn-094381-0 > type: geneve > options: {csum="true", key=flow, remote_ip="172.16.40.2"} > bfd_status: {diagnostic="No Diagnostic", flap_count="1", > forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, > state=up} > Port patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d > Interface > patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d > type: patch > options: > {peer=patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int} > Port br-int > Interface br-int > type: internal > Port tap18ca5a79-10 > Interface tap18ca5a79-10 > Bridge br-vlan > Port bond0 > Interface bond0 > Port patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int > Interface > patch-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d-to-br-int > type: patch > options: > {peer=patch-br-int-to-provnet-4f09b820-b5c1-4006-bb6a-a14ecf7f776d} > Port br-vlan > Interface br-vlan > type: internal > ovs_version: "2.15.0" > > I have already tuned the BIOS for max performance. Any tuning required at > OVS or OS level. 
I strongly believe that throughput should be 10Gbps > without dpdk on geneve network. > > Regards, > Malik Obaid > > On Sun, Jul 4, 2021 at 9:50 PM Laurent Dumont > wrote: > >> Nothing super specific I can think of but : >> >> - Can you try running the same tests with two instances on the same >> compute? >> - How many cores are inside the sender/receiver VM? >> - Can you test in UDP mode? >> >> >> >> On Sun, Jul 4, 2021 at 8:27 AM Malik Obaid >> wrote: >> >>> Hi, >>> >>> I am using Openstack Wallaby release with OVN on Ubuntu 20.04. >>> >>> My environment consists of 2 compute nodes and 1 controller node. >>> ovs_version: "2.15.0" >>> Ubuntu Kernel Version: 5.4.0-77-generic >>> >>> >>> I am observing Network performance between instances on different >>> compute nodes is slow. The network uses geneve tunnels.The environment is >>> using 10Gbps network interface cards. However, iperf between instances on >>> different compute nodes attains only speeds between a few hundred Mbit/s >>> and a few Gb/s. Both instances are in the same tenant network. >>> >>> Note: iperf results between both compute nodes (hypervisors) across the >>> geneve tunnel endpoints is perfect 10 Gbps. >>> >>> Below are the results of iperf commands. >>> >>> *iperf server:* >>> >>> 2: ens3: mtu 8950 qdisc fq_codel state >>> UP group default qlen 1000 >>> link/ether fa:16:3e:4b:1d:29 brd ff:ff:ff:ff:ff:ff >>> inet 192.168.100.111/24 brd 192.168.100.255 scope global dynamic >>> ens3 >>> valid_lft 42694sec preferred_lft 42694sec >>> inet6 fe80::f816:3eff:fe4b:1d29/64 scope link >>> valid_lft forever preferred_lft forever >>> >>> root at vm-01:~# iperf3 -s >>> Server listening on 5201 >>> >>> Accepted connection from 192.168.100.69, port 45542 >>> [ 5] local 192.168.100.111 port 5201 connected to 192.168.100.69 port >>> 45544 >>> [ 8] local 192.168.100.111 port 5201 connected to 192.168.100.69 port >>> 45546 >>> [ ID][Role] Interval Transfer Bitrate Retr Cwnd >>> [ 5][RX-S] 0.00-1.00 sec 692 MBytes 5.81 Gbits/sec >>> [ 8][TX-S] 0.00-1.00 sec 730 MBytes 6.12 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 1.00-2.00 sec 598 MBytes 5.01 Gbits/sec >>> [ 8][TX-S] 1.00-2.00 sec 879 MBytes 7.37 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 2.00-3.00 sec 793 MBytes 6.65 Gbits/sec >>> [ 8][TX-S] 2.00-3.00 sec 756 MBytes 6.34 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 3.00-4.00 sec 653 MBytes 5.48 Gbits/sec >>> [ 8][TX-S] 3.00-4.00 sec 871 MBytes 7.31 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 4.00-5.00 sec 597 MBytes 5.01 Gbits/sec >>> [ 8][TX-S] 4.00-5.00 sec 858 MBytes 7.20 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 5.00-6.00 sec 734 MBytes 6.16 Gbits/sec >>> [ 8][TX-S] 5.00-6.00 sec 818 MBytes 6.86 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 6.00-7.00 sec 724 MBytes 6.06 Gbits/sec >>> [ 8][TX-S] 6.00-7.00 sec 789 MBytes 6.60 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 7.00-8.00 sec 735 MBytes 6.18 Gbits/sec >>> [ 8][TX-S] 7.00-8.00 sec 835 MBytes 7.02 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 8.00-9.00 sec 789 MBytes 6.62 Gbits/sec >>> [ 8][TX-S] 8.00-9.00 sec 845 MBytes 7.09 Gbits/sec 0 3.14 >>> MBytes >>> [ 5][RX-S] 9.00-10.00 sec 599 MBytes 5.02 Gbits/sec >>> [ 8][TX-S] 9.00-10.00 sec 806 MBytes 6.76 Gbits/sec 0 3.14 >>> MBytes >>> >>> [ ID][Role] Interval Transfer Bitrate Retr >>> [ 5][RX-S] 0.00-10.00 sec 6.75 GBytes 5.80 Gbits/sec >>> receiver >>> [ 8][TX-S] 0.00-10.00 sec 7.99 GBytes 6.87 Gbits/sec 0 sender >>> >>> Server listening on 5201 >>> >>> *Client side:* >>> >>> root at vm-03:~# iperf3 -c 192.168.100.111 --bidir >>> 
Connecting to host 192.168.100.111, port 5201 >>> [ 5] local 192.168.100.69 port 45544 connected to 192.168.100.111 port >>> 5201 >>> [ 7] local 192.168.100.69 port 45546 connected to 192.168.100.111 port >>> 5201 >>> [ ID][Role] Interval Transfer Bitrate Retr Cwnd >>> [ 5][TX-C] 0.00-1.00 sec 700 MBytes 5.87 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 0.00-1.00 sec 722 MBytes 6.06 Gbits/sec >>> [ 5][TX-C] 1.00-2.00 sec 594 MBytes 4.98 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 1.00-2.00 sec 883 MBytes 7.41 Gbits/sec >>> [ 5][TX-C] 2.00-3.00 sec 796 MBytes 6.67 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 2.00-3.00 sec 752 MBytes 6.31 Gbits/sec >>> [ 5][TX-C] 3.00-4.00 sec 654 MBytes 5.49 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 3.00-4.00 sec 876 MBytes 7.35 Gbits/sec >>> [ 5][TX-C] 4.00-5.00 sec 598 MBytes 5.01 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 4.00-5.00 sec 853 MBytes 7.16 Gbits/sec >>> [ 5][TX-C] 5.00-6.00 sec 734 MBytes 6.15 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 5.00-6.00 sec 818 MBytes 6.86 Gbits/sec >>> [ 5][TX-C] 6.00-7.00 sec 726 MBytes 6.09 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 6.00-7.00 sec 793 MBytes 6.65 Gbits/sec >>> [ 5][TX-C] 7.00-8.00 sec 734 MBytes 6.15 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 7.00-8.00 sec 831 MBytes 6.97 Gbits/sec >>> [ 5][TX-C] 8.00-9.00 sec 788 MBytes 6.61 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 8.00-9.00 sec 845 MBytes 7.09 Gbits/sec >>> [ 5][TX-C] 9.00-10.00 sec 600 MBytes 5.03 Gbits/sec 0 3.13 >>> MBytes >>> [ 7][RX-C] 9.00-10.00 sec 805 MBytes 6.76 Gbits/sec >>> >>> [ ID][Role] Interval Transfer Bitrate Retr >>> [ 5][TX-C] 0.00-10.00 sec 6.76 GBytes 5.81 Gbits/sec 0 >>> sender >>> [ 5][TX-C] 0.00-10.00 sec 6.75 GBytes 5.80 Gbits/sec >>> receiver >>> [ 7][RX-C] 0.00-10.00 sec 7.99 GBytes 6.87 Gbits/sec 0 >>> sender >>> [ 7][RX-C] 0.00-10.00 sec 7.99 GBytes 6.86 Gbits/sec >>> receiver >>> >>> iperf Done. 
>>> >>> >>> --------------------------------------------------------------------------------------------------------- >>> >>> *ovs-vsctl show on compute node1:* >>> >>> root at kvm01-a1-khi01:~# ovs-vsctl show >>> 88e6b984-44dc-4f74-8a9a-891742dbbdbd >>> Bridge br-eth1 >>> Port ens224 >>> Interface ens224 >>> Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int >>> Interface >>> patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int >>> type: patch >>> options: >>> {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9} >>> Port br-eth1 >>> Interface br-eth1 >>> type: internal >>> Bridge br-int >>> fail_mode: secure >>> datapath_type: system >>> Port tapde98b2d4-a0 >>> Interface tapde98b2d4-a0 >>> Port ovn-f51ef9-0 >>> Interface ovn-f51ef9-0 >>> type: vxlan >>> options: {csum="true", key=flow, remote_ip="172.16.30.3"} >>> bfd_status: {diagnostic="No Diagnostic", flap_count="1", >>> forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, >>> state=up} >>> Port tap348fc6dc-3a >>> Interface tap348fc6dc-3a >>> Port br-int >>> Interface br-int >>> type: internal >>> Port tap6d4d8e02-c0 >>> Interface tap6d4d8e02-c0 >>> error: "could not open network device tap6d4d8e02-c0 (No >>> such device)" >>> Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 >>> Interface >>> patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 >>> type: patch >>> options: >>> {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int} >>> Port tap247fe5b2-ff >>> Interface tap247fe5b2-ff >>> >>> >>> ------------------------------------------------------------------------------------------------------ >>> >>> *ovs-vsctl show on compute node2:* >>> >>> root at kvm03-a1-khi01:~# ovs-vsctl show >>> 24ce6475-89bb-4df5-a5ff-4ce58f2c2f68 >>> Bridge br-eth1 >>> Port patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int >>> Interface >>> patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int >>> type: patch >>> options: >>> {peer=patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9} >>> Port br-eth1 >>> Interface br-eth1 >>> type: internal >>> Port ens224 >>> Interface ens224 >>> Bridge br-int >>> fail_mode: secure >>> datapath_type: system >>> Port patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 >>> Interface >>> patch-br-int-to-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9 >>> type: patch >>> options: >>> {peer=patch-provnet-440be99a-c347-4458-b7c1-6c0e6155eee9-to-br-int} >>> Port tap2b0bbf7b-59 >>> Interface tap2b0bbf7b-59 >>> Port ovn-650be8-0 >>> Interface ovn-650be8-0 >>> type: vxlan >>> options: {csum="true", key=flow, remote_ip="172.16.30.1"} >>> bfd_status: {diagnostic="No Diagnostic", flap_count="1", >>> forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, >>> state=up} >>> Port tap867d2174-83 >>> Interface tap867d2174-83 >>> Port tapde98b2d4-a0 >>> Interface tapde98b2d4-a0 >>> Port br-int >>> Interface br-int >>> type: internal >>> >>> >>> -------------------------------------------------------------------------------------------------------- >>> >>> I would really appreciate any input in this regard. >>> >>> Thank you. >>> >>> Regards, >>> Malik Obaid >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marios at redhat.com Tue Jul 6 07:20:15 2021 From: marios at redhat.com (Marios Andreou) Date: Tue, 6 Jul 2021 10:20:15 +0300 Subject: [TripleO] next irc meeting Tuesday 06 July @ 1400 UTC in OFTC #tripleo Message-ID: Reminder the next TripleO irc meeting is today Tuesday 06 July 1400 UTC in OFTC irc channel #tripleo. agenda: https://wiki.openstack.org/wiki/Meetings/TripleO one-off items: https://etherpad.opendev.org/p/tripleo-meeting-items (feel free to add any tripleo status/ongoing work etc to the etherpad). Our last meeting was on Jun 22 - you can find logs @ http://eavesdrop.openstack.org/meetings/tripleo/2021/tripleo.2021-06-22-14.00.html Hope you can make it on Tuesday, regards, marios From ekuvaja at redhat.com Tue Jul 6 10:20:06 2021 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Tue, 6 Jul 2021 11:20:06 +0100 Subject: [all][policy] Disable & making policy rule's default change warning configurable In-Reply-To: <17a78d32327.bd56b6a4554068.6364173515112762764@ghanshyammann.com> References: <17a78d32327.bd56b6a4554068.6364173515112762764@ghanshyammann.com> Message-ID: On Mon, Jul 5, 2021 at 11:43 PM Ghanshyam Mann wrote: > Hello Everyone, > > While implementing the new secure RBAC (scope and new defaults), you might > have noticed > the lot of warnings in the log and sometime failing jobs also due to size > of logs. Then you had > to disable those via "suppress_default_change_warnings" variable on policy > enforcer. > > The oslo policy log the warnings if the default value of policy rule (if > not overridden) is changed, so > there are warnings for every policy rule on every API request, everytime > policy is initialized which > end up a lot of warnings (thousands) in log. It might be happening in > production also. > > Many projects have disabled it via hardcoded > "suppress_default_change_warnings". But there is no > way for the operator to disable/enable these warnings (enable in case they > would like to check the > new policy RBAC). > > To handle it on oslo policy side and generically for all the projects I am > planning to: > > 1. Disable it by default in oslo policy side itself. > > 2. Make it configurable so that operator can enable it on need basis. > > NOTE: This proposal is about warnings for default value change, not for > the policy name change. > > I have submitted this proposal in gerrit too - > https://review.opendev.org/c/openstack/oslo.policy/+/799539 > > Please let me know your opinon on this? > > -gmann > > Thanks Ganshyam! I left the same comments in the review itself but TL;DR: IMO we should have the warnings on by default. If the operator actually happens to read release notes it's an easy switch to flip it off, if not they would get notified of the change in the logs. What's the point of deprecations if we don't tell anyone about them? How big of a change would it be to emit the warnings only when the policy engine loads the rules at service start rather than spamming about them on every API request? Obviously we should turn them off on gate/tests. Thanks for tackling the spammyness of our logs. - jokke -------------- next part -------------- An HTML attachment was scrubbed... URL: From mahnoor.asghar at xflowresearch.com Tue Jul 6 10:25:28 2021 From: mahnoor.asghar at xflowresearch.com (Mahnoor Asghar) Date: Tue, 6 Jul 2021 15:25:28 +0500 Subject: [Ironic] Vendor-neutral Disk names In-Reply-To: References: Message-ID: Hello Arkady, The proposed schema will not work for extended storage, as the disk names would be based on physical location. 
Our original scheme of work only includes local storage; perhaps we can extend the work later to incorporate a naming mechanism for extended storage. Thank you for pointing this out, though! It’s a very valid point. Regards, Mahnoor Asghar On Fri, Jul 2, 2021 at 4:17 AM Arkady Kanevsky wrote: > Mahnoor, > How would your proposed schema work for extended storage? > Think of the SCSI connector to JBOD. > Thanks, > Arkady > > > On Thu, Jul 1, 2021 at 5:29 PM Mahnoor Asghar < > mahnoor.asghar at xflowresearch.com> wrote: > >> Thank you for the response, Mike! >> >> I agree that 'ServiceLabel' is a good starting point, but it would be >> preferable to have a more consistent format that represents a Drive >> resource uniquely, irrespective of the vendor. >> The idea is to let Ironic name the Drive resources using this logic, >> so that the baremetal operator can use the same, consistent method of >> specifying disks for RAID configuration. Feedback from the Ironic >> community is very welcome here, so that an informed proposal can be >> made. >> >> >> On Wed, Jun 30, 2021 at 6:46 PM Mike Raineri wrote: >> > >> > Hi Mahnoor, >> > >> > First, to answer your questions about the property values: >> > - "ServiceLabel" is supposed to always be something human friendly and >> matches an indicator that is used to reference a part in an enclosure. >> Ultimately the vendor dictates the how they construct their >> stickers/etching/silk-screens/other labels, but I would expect something >> along the lines of "Drive Bay 3", or "Slot 9", or something similar. >> > - The resource type will always be "#Drive.v1_X_X.Drive"; this is >> required by Redfish for representing a standard "Drive" resource. >> > - "Id" as defined in the Redfish Specification is going to be a unique >> identifier in the collection; there are no rules with what goes into this >> property other than uniqueness in terms of other collection members. I've >> seen some implementations use a simple numeric index and others use >> something that looks more like a service label. However, I've also seen a >> few use either a GUID or other globally unique identifier, which might be >> unfriendly for a user to look at. >> > >> > With that said, when I originally authored that logic in Tacklebox, I >> fully expected to revisit that logic to fine tune it to ensure it reliably >> gives something human readable. I haven't gone through the exercise to >> refine it further, but my initial impression is I'll be looking at more >> properties specific to the Drive resource to help build the string. I think >> "ServiceLabel" is still a good starting point, but having the appropriate >> fallback logic in its absence would be very useful to avoid construction >> based on "Id". >> > >> > Thanks. >> > >> > -Mike >> > >> > On Wed, Jun 30, 2021 at 9:01 AM Mahnoor Asghar < >> mahnoor.asghar at xflowresearch.com> wrote: >> >> >> >> Dear all, >> >> >> >> There is [a proposal][1] in the metal3-io/baremetal-operator >> repository to extend the hardware RAID configuration to support >> ‘physical_disks’ and ‘controller’ fields in the 'target_raid_config' >> section. >> >> The user should be able to specify the disks for RAID configuration, >> in a vendor-agnostic way. (This requirement comes from the Airship >> project.) The names of the disks should be indicative of the physical >> location of the disks within the server. An algorithm to construct disk >> names is therefore needed, for this purpose. 
>> >> >> >> One possible algorithm was found in the [inventory module][2] of the >> [Redfish Tacklebox scripts][3]. >> >> To construct a disk name, it uses Redfish properties, specifically the >> ‘Drive’ resource> ‘Physical Location’ object> ‘Part Location’ property> >> ‘Service Label’ property. ([Link][4] to code) ([Link][5] to Redfish ‘Drive’ >> resource) >> >> If this property is empty, the resource type (String uptil the first >> dot encountered in the @odata.type field), and the ‘Id’ properties of the >> Drive resource are used to construct the disk name. ([Link][6] to code) >> >> For example, if the 'Drive'.'Physical Location'.'Part >> Location'.'Service Label' field is ‘Disk.Bay1.Slot0’, this is what the disk >> name will be. If this field is empty, and the resource name is ‘Drive’ and >> the resource ‘Id’ is ‘Disk.Bay1.Slot0’, the disk name will be: ‘Drive: >> Disk.Bay1.Slot0’. >> >> >> >> We would like to understand the values different vendors give in: >> >> - ‘Drive’ resource> ‘Physical Location’ object> ‘Part Location’ >> property> ‘Service Label’ property >> >> - The resource type for a Drive (@odata.type field) >> >> - The ‘Id’ property of the Drive resource >> >> Also, it would be helpful to understand the existing logic used by >> vendors to construct the disk names, including the (Redfish or other) >> properties used. >> >> >> >> This is so that a consensus can be reached for an algorithm to >> construct disk names. Any suggestions are welcome, thank you so much! >> >> >> >> [1]: https://github.com/metal3-io/metal3-docs/pull/148 >> >> [2]: >> https://github.com/DMTF/Redfish-Tacklebox/blob/master/redfish_utilities/inventory.py >> >> [3]: https://github.com/DMTF/Redfish-Tacklebox >> >> [4]: >> https://github.com/DMTF/Redfish-Tacklebox/blob/e429f70a79cfe288756618498ce485ab4be37131/redfish_utilities/inventory.py#L192 >> >> [5]: https://redfish.dmtf.org/schemas/v1/Drive.v1_12_1.json >> >> [6]: >> https://github.com/DMTF/Redfish-Tacklebox/blob/e429f70a79cfe288756618498ce485ab4be37131/redfish_utilities/inventory.py#L199 >> >> >> >> Best regards, >> >> Mahnoor Asghar >> >> Software Design Engineer >> >> xFlow Research Inc. >> >> mahnoor.asghar at xflowresearch.com >> >> www.xflowresearch.com >> >> >> >> -- >> Mahnoor Asghar >> Software Design Engineer >> xFlow Research Inc. >> mahnoor.asghar at xflowresearch.com >> www.xflowresearch.com >> >> > > -- > Arkady Kanevsky, Ph.D. > Phone: 972 707-6456 > Corporate Phone: 919 729-5744 ext. 8176456 > -- *Mahnoor Asghar* Software Design Engineer xFlow Research Inc. mahnoor.asghar at xflowresearch.com www.xflowresearch.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From mahnoor.asghar at xflowresearch.com Tue Jul 6 10:27:10 2021 From: mahnoor.asghar at xflowresearch.com (Mahnoor Asghar) Date: Tue, 6 Jul 2021 15:27:10 +0500 Subject: [Ironic] Vendor-neutral Disk names In-Reply-To: References: Message-ID: Dear Julia, Software RAID is handled separately in metal3 BMO. You're right, we can specify that as a limitation, since this naming convention will not work for software RAID. Also, we evaluated the approach you mentioned: of using disk name translation logic in the vendor drivers. We discussed it at length with Richard Pioso, Dmitry Tantsur and other community members, using generic disk names like ‘Disk 0’ etc. [1] We couldn't figure out the exact recipe of how this translation should take place. 
One concern with this approach was that it would be impractical to convert a constructed disk name into the vendor-specific disk name, for all the vendors. However, maybe this approach would be viable using the more verbose disk names (instead of Disk 0/1/2 initially proposed)? Out of all the possible solutions, the one we arrived at is to construct this naming convention using the Redfish schema in order to avoid vendor sprawl. The objective of this proposal is to determine how all the vendors utilize certain fields in the Redfish specification, so that we can formulate a naming convention accordingly. [1] [RFE] RAID config by Operator using generic disk numbers, and vendor-specific RAID controller names | StoryBoard (openstack.org) Thank you for your response. It is very helpful. Regards, Mahnoor Asghar -------------- next part -------------- An HTML attachment was scrubbed... URL: From peiyong.zhang at salesforce.com Fri Jul 2 21:39:57 2021 From: peiyong.zhang at salesforce.com (Pete Zhang) Date: Fri, 2 Jul 2021 14:39:57 -0700 Subject: No server with a name or ID of "blabla' exists Message-ID: Hey, We are seeing this for the FIRST time once we rebuilt MariaDB/Galera and RabbitMQ cluster, after a period of failure/downtime. Probably there are some inconsistencies among components/services. How to clean up those hosts in BUILD state? thx. 1) There are multiple uuid of the same host stuck in BUILD 2) Deleting server failed with this error: No server with a name or ID of "blabal" exists. ERROR (CommandError): Unable to delete the specified server(s). [image: Screen Shot 2021-07-02 at 2.29.44 PM.png] [image: Screen Shot 2021-07-02 at 2.27.56 PM.png] -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2021-07-02 at 2.29.44 PM.png Type: image/png Size: 631318 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2021-07-02 at 2.27.56 PM.png Type: image/png Size: 615815 bytes Desc: not available URL: From peiyong.zhang at salesforce.com Fri Jul 2 22:05:44 2021 From: peiyong.zhang at salesforce.com (Pete Zhang) Date: Fri, 2 Jul 2021 15:05:44 -0700 Subject: No server with a name or ID of "blabla' exists Message-ID: Hey, We are seeing this for the FIRST time once we rebuilt MariaDB/Galera and RabbitMQ cluster, after a period of failure/downtime. Probably there are some inconsistencies among components/services. How to clean up those hosts in BUILD state? thx. 1) There are multiple uuid of the same host stuck in BUILD 2) Deleting server failed with this error: No server with a name or ID of "blabal" exists. ERROR (CommandError): Unable to delete the specified server(s). [image: Screen Shot 2021-07-02 at 3.04.34 PM.png] -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Screen Shot 2021-07-02 at 3.04.34 PM.png Type: image/png Size: 229652 bytes Desc: not available URL: From syedammad83 at gmail.com Tue Jul 6 05:48:46 2021 From: syedammad83 at gmail.com (Ammad Syed) Date: Tue, 6 Jul 2021 10:48:46 +0500 Subject: [wallaby][nova] CPU topology and NUMA Nodes In-Reply-To: References: <9b6d248665ced4f826fedddd2ccb4649dd148273.camel@redhat.com> <34ae50bc2491eaae52e6a340dc96424153fd9531.camel@redhat.com> Message-ID: Hi Sean, I have tested cpu topology, numa and cpu soft pinning on one of my production compute node with AMD EPYC CPUs. It works perfectly fine. Here is the xmldump. 33554432 33554432 32 32768 ......... EPYC-Rome AMD One thing I have noticed is with enabling numa nova use cpu_mode custom while without numa node, nova use host-model. Also I have enabled cpu soft pining it works also fine. [compute] cpu_shared_set = 4-122 I have kept 8 cores and 32GB RAM for OS and network. Will it be sufficient ? Ammad On Wed, Jun 30, 2021 at 5:13 PM Sean Mooney wrote: > On Wed, 2021-06-30 at 10:11 +0500, Ammad Syed wrote: > > - Is there any isolation done at the Kernel level of the compute OS? > > > > No, There is no changes made in kernel. > > > > - What does your flavor look like right now? Is the behavior different > > if you remove the numa constraint? > > > > My flavor has currently below specs set. > > > dec2e31d-e1e8-4ff2-90d5-955be7e9efd3 | RC-16G | 16384 | 5 | > > 0 | 8 | True | 0 | 1.0 | hw:cpu_cores='2', > > hw:cpu_sockets='4' | > > if these are the only extra you have added you have not enabled cpu > pinning in the falvor > > > > But I have set cpu pinning in compute node. > > [compute] > > cpu_shared_set = 2-7,10-15,18-23,26-31 > > > you do not enable cpu pinning in the compute node. > you can configure cores to be used for cpu pinning using > cpu_dedicated_set but vms will not use those cores unless you request them > the cpu_shared_set is used for unpinned vms. it defines the range of cpus > over which the vms can float > > more coments below > > > > If I remove hw:cpu_* from flavor and remove above config from nova.conf > of > > compute. Instance deployment works fine. > > > > You seem to have 4 NUMA as well, but only two physical sockets > > (8CPUx2threads - 16 vCPUs per socket x 2 = 32) > > > > This is my test compute node have 4 cpu sockets and 4 numa nodes. > > > > root at kvm10-a1-khi01:~# numactl -H > > available: 4 nodes (0-3) > > node 0 cpus: 0 1 2 3 4 5 6 7 > > node 0 size: 31675 MB > > node 0 free: 30135 MB > > node 1 cpus: 8 9 10 11 12 13 14 15 > > node 1 size: 64510 MB > > node 1 free: 63412 MB > > node 2 cpus: 16 17 18 19 20 21 22 23 > > node 2 size: 64510 MB > > node 2 free: 63255 MB > > node 3 cpus: 24 25 26 27 28 29 30 31 > > node 3 size: 64485 MB > > node 3 free: 63117 MB > > node distances: > > node 0 1 2 3 > > 0: 10 16 16 16 > > 1: 16 10 16 16 > > 2: 16 16 10 16 > > 3: 16 16 16 10 > > > > The generated XML seems to be set for a 4 socket topology? > > > > Yes I am testing this on my test compute node first. > > > > On Wed, Jun 30, 2021 at 7:15 AM Laurent Dumont > > > wrote: > > > > > > > > - Is there any isolation done at the Kernel level of the compute OS? > > > - What does your flavor look like right now? Is the behavior > different > > > if you remove the numa constraint? > > > > > > You seem to have 4 NUMA as well, but only two physical sockets > > > (8CPUx2threads - 16 vCPUs per socket x 2 = 32) > > > > > > The generated XML seems to be set for a 4 socket topology? 
> > > > > > > > > Opteron_G5 > > > > > > > > > > > > On Tue, Jun 29, 2021 at 12:41 PM Ammad Syed > wrote: > > > > > > > Hi Stephen, > > > > > > > > I have checked all cpus are online. > > > > > > > > root at kvm10-a1-khi01:/etc/nova# lscpu > > > > Architecture: x86_64 > > > > CPU op-mode(s): 32-bit, 64-bit > > > > Byte Order: Little Endian > > > > Address sizes: 48 bits physical, 48 bits virtual > > > > CPU(s): 32 > > > > On-line CPU(s) list: 0-31 > > > > Thread(s) per core: 2 > > > > Core(s) per socket: 8 > > > > Socket(s): 2 > > > > NUMA node(s): 4 > > > > Vendor ID: AuthenticAMD > > > > CPU family: 21 > > > > > > > > I have made below configuration in nova.conf. > > > > > > > > [compute] > > > > > > > > cpu_shared_set = 2-7,10-15,18-23,26-31 > > > > > > > > Below is the xml in nova logs that nova is trying to create domain. > > > > > the xml below is for an unpinend guest and has no numa toplogy defiend. > > > > > 2021-06-29 16:30:56.576 2819 ERROR nova.virt.libvirt.guest > > > > [req-c76c6809-1775-43a8-bfb1-70f6726cad9d > 2af528fdf3244e15b4f3f8fcfc0889c5 > > > > 890eb2b7d1b8488aa88de7c34d08817a - default default] Error launching a > > > > defined domain with XML: > > > > instance-0000026d > > > > 06ff4fd5-b21f-4f64-9dde-55e86dd15da6 > > > > > > > > > > > > > > > > cpu > > > > 2021-06-29 16:30:50 > > > > > > > > 16384 > > > > 5 > > > > 0 > > > > 0 > > > > 8 > > > > > > > > > > > > > > > uuid="2af528fdf3244e15b4f3f8fcfc0889c5">admin > > > > > > > uuid="890eb2b7d1b8488aa88de7c34d08817a">admin > > > > > > > > > > > uuid="1024317b-0db6-418c-bc39-de9b61d8ce59"/> > > > > > > > > > > > > ipVersion="4"/> > > > > > > > > > > > > > > > > > > > > 16777216 > > > > 16777216 > > > > 8 > > > > > here you can se we have copyied the cpu_share_set value into the set of > cpus > which this vm is allowed to float over. this will do soft pinning using > cgroups but the > guest cpu will still float over that range > > > > > > > > 8192 > > > > > > > > > > > > > > > > OpenStack Foundation > > > > OpenStack Nova > > > > 23.0.0 > > > > name='serial'>06ff4fd5-b21f-4f64-9dde-55e86dd15da6 > > > > 06ff4fd5-b21f-4f64-9dde-55e86dd15da6 > > > > Virtual Machine > > > > > > > > > > > > > > > > hvm > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Opteron_G5 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > destroy > > > > restart > > > > destroy > > > > > > > > /usr/bin/qemu-system-x86_64 > > > > > > > > > > > > > > > > file='/var/lib/nova/instances/06ff4fd5-b21f-4f64-9dde-55e86dd15da6/disk'/> > > > > > > > >
> > > > [remainder of the quoted libvirt domain XML lost to HTML scrubbing: only fragments survive — PCI address attributes (unit='0', function='0x0', function='0x2'), a network interface with interfaceid='dccf68a2-ec48-4985-94ce-b3487cfc99f3', and serial/console devices logging to /var/lib/nova/instances/06ff4fd5-b21f-4f64-9dde-55e86dd15da6/console.log]
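To make Sean's distinction above concrete: cpu_shared_set only soft-pins unpinned guests (it sets the cgroup cpuset they float over), while actual CPU pinning needs cpu_dedicated_set on the compute node plus a dedicated CPU policy on the flavor. A minimal sketch follows, with illustrative core ranges only — Ammad's hosts differ, the RC-16G flavor name is reused purely as an example, and nothing below is copied from a working config in this thread:

  # /etc/nova/nova.conf on the compute node (illustrative ranges)
  [compute]
  cpu_dedicated_set = 8-31      # cores handed out 1:1 to pinned guests
  cpu_shared_set = 2-7          # cores that unpinned guests float over (host keeps 0-1)

  # flavor side: request pinned CPUs explicitly
  openstack flavor set RC-16G --property hw:cpu_policy=dedicated

With that in place, the hw:cpu_sockets and hw:cpu_cores extra specs discussed earlier only shape the guest's visible topology; whether the vCPUs are actually pinned is decided by hw:cpu_policy on the flavor and cpu_dedicated_set on the host.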