From mrunge at matthias-runge.de Fri Sep 1 06:42:18 2023 From: mrunge at matthias-runge.de (Matthias Runge) Date: Fri, 1 Sep 2023 08:42:18 +0200 Subject: [Telemetry] Adding Erno Kuvaja (jokke_) to telemetry cores Message-ID: <05a688fe-b1ba-568d-3eed-888e92ea5fa1@matthias-runge.de> Hi there, a bit premature, but in-line with[1] I've added Erno to telemetry cores. Erno is not a stranger in the OpenStack world and has been deeply involved in glance for quite a while. Thank you Erno for stepping up and for helping with Telemetry! Matthias [1] https://governance.openstack.org/election/#ptl-candidates -- Matthias Runge From mrunge at matthias-runge.de Fri Sep 1 06:48:24 2023 From: mrunge at matthias-runge.de (Matthias Runge) Date: Fri, 1 Sep 2023 08:48:24 +0200 Subject: [all] osprofiler requirements on opentelemetry: is this reasonable? In-Reply-To: <71d1f143-c108-a1be-216b-fa7d1a4adf03@debian.org> References: <71d1f143-c108-a1be-216b-fa7d1a4adf03@debian.org> Message-ID: On 30/08/2023 11:11, Thomas Goirand wrote: > Hi, > > I just saw that the last version of osprofiler requires > opentelemetry-exporter-otlp and opentelemetry-sdk. Since osprofiler is > used in almost all projects, and that opentelemetry is really HUGE, I > wonder if this is all reasonable. This forces downstream distros to do a > lot of packaging only for that. Is this avoidable? > > I started doing the packaging, and then stopped to write this message, > seeing how much work that would represent. Hi zigo, from what I have seen here (and you may remember talking about this when we met at the OpenInfra Summit this year), the change makes sense to me. Quickly looking at [1], there may be the possibility to make it optional though, and that would certainly make sense for the other drivers too. Matthias [1] https://github.com/openstack/osprofiler/commit/908e7402320eb067db45aa9700d54d31c259f3ca -- Matthias Runge From noonedeadpunk at gmail.com Fri Sep 1 07:11:15 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Fri, 1 Sep 2023 09:11:15 +0200 Subject: [all] osprofiler requirements on opentelemetry: is this reasonable? In-Reply-To: References: <71d1f143-c108-a1be-216b-fa7d1a4adf03@debian.org> Message-ID: What is extremely weird for me, is that osprofiler is not I requirements at all - only in test-requirements, which should mean that it should not be required anywhere except for package testing? Though I see from code, that osprofiler will just fail with ImportError if opentelemetry is absent, which IMO is quite wrong from the beginning. And indeed that would make sense to me to use extras_require on setup.cfg to install only drivers that are needed, not * for each service. Though I am not sure who is motivated enough to invest time into that refactoring... On Fri, Sep 1, 2023, 08:54 Matthias Runge wrote: > On 30/08/2023 11:11, Thomas Goirand wrote: > > Hi, > > > > I just saw that the last version of osprofiler requires > > opentelemetry-exporter-otlp and opentelemetry-sdk. Since osprofiler is > > used in almost all projects, and that opentelemetry is really HUGE, I > > wonder if this is all reasonable. This forces downstream distros to do a > > lot of packaging only for that. Is this avoidable? > > > > I started doing the packaging, and then stopped to write this message, > > seeing how much work that would represent. > > Hi zigo, > > from what I have seen here (and you may remember talking about this when > we met at the OpenInfra Summit this year), the change makes sense to me. 
> > Quickly looking at [1], there may be the possibility to make it optional > though, and that would certainly make sense for the other drivers too. > > Matthias > > > [1] > > https://github.com/openstack/osprofiler/commit/908e7402320eb067db45aa9700d54d31c259f3ca > -- > Matthias Runge > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Fri Sep 1 07:13:08 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Fri, 1 Sep 2023 09:13:08 +0200 Subject: [all] osprofiler requirements on opentelemetry: is this reasonable? In-Reply-To: References: <71d1f143-c108-a1be-216b-fa7d1a4adf03@debian.org> Message-ID: > osprofiler is not I requirements at all I meant to say, is that opentelemetry is not in requirements of osprofiler On Fri, Sep 1, 2023, 09:11 Dmitriy Rabotyagov wrote: > What is extremely weird for me, is that osprofiler is not I requirements > at all - only in test-requirements, which should mean that it should not be > required anywhere except for package testing? Though I see from code, that > osprofiler will just fail with ImportError if opentelemetry is absent, > which IMO is quite wrong from the beginning. > > And indeed that would make sense to me to use extras_require on setup.cfg > to install only drivers that are needed, not * for each service. > > Though I am not sure who is motivated enough to invest time into that > refactoring... > > On Fri, Sep 1, 2023, 08:54 Matthias Runge > wrote: > >> On 30/08/2023 11:11, Thomas Goirand wrote: >> > Hi, >> > >> > I just saw that the last version of osprofiler requires >> > opentelemetry-exporter-otlp and opentelemetry-sdk. Since osprofiler is >> > used in almost all projects, and that opentelemetry is really HUGE, I >> > wonder if this is all reasonable. This forces downstream distros to do >> a >> > lot of packaging only for that. Is this avoidable? >> > >> > I started doing the packaging, and then stopped to write this message, >> > seeing how much work that would represent. >> >> Hi zigo, >> >> from what I have seen here (and you may remember talking about this when >> we met at the OpenInfra Summit this year), the change makes sense to me. >> >> Quickly looking at [1], there may be the possibility to make it optional >> though, and that would certainly make sense for the other drivers too. >> >> Matthias >> >> >> [1] >> >> https://github.com/openstack/osprofiler/commit/908e7402320eb067db45aa9700d54d31c259f3ca >> -- >> Matthias Runge >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonykarera at gmail.com Fri Sep 1 09:13:03 2023 From: tonykarera at gmail.com (Karera Tony) Date: Fri, 1 Sep 2023 11:13:03 +0200 Subject: Live migration error Message-ID: Hello Team, I am trying to migrate instances from one host to another but I am getting this error. *Error: *Failed to live migrate instance to host "compute1". Details Migration pre-check error: Unacceptable CPU info: CPU doesn't have compatibility. 0 Refer to http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult (HTTP 400) (Request-ID: req-9dfdfe2e-70fa-481a-bd3d-c2c76a6e93da) Any idea on how to fix this? Regards Tony Karera -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Danny.Webb at thehutgroup.com Fri Sep 1 10:37:22 2023 From: Danny.Webb at thehutgroup.com (Danny Webb) Date: Fri, 1 Sep 2023 10:37:22 +0000 Subject: Live migration error In-Reply-To: References: Message-ID: It means that you have CPUs with incompatible flags or you've got differing kernel versions that expose different flags or you've got differing libvirt versions that expose different flags. You either need to use a lowest common denominator cpu_model or do a cold migration. ________________________________ From: Karera Tony Sent: 01 September 2023 10:13 To: openstack-discuss Subject: Live migration error CAUTION: This email originates from outside THG ________________________________ Hello Team, I am trying to migrate instances from one host to another but I am getting this error. Error: Failed to live migrate instance to host "compute1". Details Migration pre-check error: Unacceptable CPU info: CPU doesn't have compatibility. 0 Refer to http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult (HTTP 400) (Request-ID: req-9dfdfe2e-70fa-481a-bd3d-c2c76a6e93da) Any idea on how to fix this? Regards Tony Karera -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Fri Sep 1 10:46:13 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Fri, 1 Sep 2023 16:16:13 +0530 Subject: [cinder] Cinder Midcycle - 2 (R-6) for 2023.2 (Bobcat) Summary Message-ID: Hello Argonauts, The summary for mid cycle -2 (R-6) held on 23rd August, 2023 from 1400 to 1600 UTC is available at the wiki page[1]. Please go through the etherpad and the recording for the entire discussion. [1] https://wiki.openstack.org/wiki/CinderBobcatMidCycleSummary Thanks Rajat Dhasmana -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Fri Sep 1 10:50:07 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 1 Sep 2023 12:50:07 +0200 Subject: [neutron] Neutron drivers meeting cancelled Message-ID: Hello Neutrinos: Due to the lack of agenda [1], today's meeting is cancelled. Have a nice weekend. [1]https://wiki.openstack.org/wiki/Meetings/NeutronDrivers -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Fri Sep 1 11:47:16 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Fri, 1 Sep 2023 17:17:16 +0530 Subject: [cinder][FFE] Granting FFE to features (8th September) Message-ID: Hi All, We had 6 feature requests during the 2023.2 (Bobcat) development cycle[1]. Few of the features had other patch dependencies amounting to significantly more patches. With limited review bandwidth, we were not able to review all the features on time, hence I would like to grant FFE to the following listed features. The reason so many changes are granted FFE are due to the following reasons: * It has gone through iterations of updates and most of them has +2s so it should just require another core to merge * The response time for addressing review comments is fast which is a good thing * The changes are driver specific and isolated to a particular driver that have been tested by the vendor team and third party CI 1. Fujitsu Driver: Add QoS support Patch: https://review.opendev.org/c/openstack/cinder/+/847730 2. 
NetApp ONTAP: Added support to Active/Active mode in NFS driver Patch: https://review.opendev.org/c/openstack/cinder/+/886869 Dependent patches: https://review.opendev.org/c/openstack/cinder/+/798384 https://review.opendev.org/c/openstack/cinder/+/889826 3. [NetApp] LUN space-allocation support for iSCSI Patch: https://review.opendev.org/c/openstack/cinder/+/893106 4. [Pure Storage] Replication-Enabled and Snapshot Consistency Groups Patch: https://review.opendev.org/c/openstack/cinder/+/891234 5. [HPE XP] Support HA and data deduplication Patch: https://review.opendev.org/c/openstack/cinder/+/892608 Dependent patches: https://review.opendev.org/c/openstack/cinder/+/885986 https://review.opendev.org/c/openstack/cinder/+/879830 https://review.opendev.org/c/openstack/cinder/+/877672 https://review.opendev.org/c/openstack/cinder/+/891937 [1] https://etherpad.opendev.org/p/cinder-2023.2-bobcat-features The deadline after FFE is going to be 8th September, 2023 but we will try to get the changes reviewed and merged at the earliest. Let me know if there are any queries or concerns. Thanks Rajat Dhasmana -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonykarera at gmail.com Fri Sep 1 12:56:32 2023 From: tonykarera at gmail.com (Karera Tony) Date: Fri, 1 Sep 2023 14:56:32 +0200 Subject: Live migration error In-Reply-To: References: Message-ID: Hello Danny, Thanks for the feedback. use a lowest common denominator cpu_model : Does this mean changing the servers ? Regards Tony Karera On Fri, Sep 1, 2023 at 12:37?PM Danny Webb wrote: > It means that you have CPUs with incompatible flags or you've got > differing kernel versions that expose different flags or you've got > differing libvirt versions that expose different flags. You either need to > use a lowest common denominator cpu_model or do a cold migration. > ------------------------------ > *From:* Karera Tony > *Sent:* 01 September 2023 10:13 > *To:* openstack-discuss > *Subject:* Live migration error > > > * CAUTION: This email originates from outside THG * > ------------------------------ > Hello Team, > > I am trying to migrate instances from one host to another but I am getting > this error. > > *Error: *Failed to live migrate instance to host "compute1". Details > > Migration pre-check error: Unacceptable CPU info: CPU doesn't have > compatibility. 0 Refer to > http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult > (HTTP 400) (Request-ID: req-9dfdfe2e-70fa-481a-bd3d-c2c76a6e93da) > > Any idea on how to fix this? > > Regards > > Tony Karera > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.rohmann at inovex.de Fri Sep 1 13:23:36 2023 From: christian.rohmann at inovex.de (Christian Rohmann) Date: Fri, 1 Sep 2023 15:23:36 +0200 Subject: [SDK][CLI][API][OpenAPI] Machine readable OpenStack API spec - time to do a next step? In-Reply-To: <30B316F0-81E3-417A-9A2C-87A31472D78C@gmail.com> References: <30B316F0-81E3-417A-9A2C-87A31472D78C@gmail.com> Message-ID: <949236db-e35f-4fae-8570-3dd331c7c0c7@inovex.de> On 31.08.23 16:39, Artem Goncharov wrote: > Once we have specs for the commands code generators can be updated to > consume this data and produce much better code. BTW, I have now also a > prototype of the new CLI written with Rust and this is a hypersonic > bullet compared to the current OSC. But please do not push me more on > that, it is still in a very early stages and depends heavily on the > available specs. 
I can only say that it is now designed in the way > that every API call can have a thin CLI coverage just by providing a > spec, when additional logic is desired - surely will require human > implementation. > Code generators in the pipe: OSC, AnsibleModules, RustSDK > (sync/async), RustCLI. Next thing that are on the radar: gopher cloud, > terraform modules, async python sdk, JS SDK(?) > > If all of that gets executed properly and with some community traction > we can all have following things covered: > > - improve standardisation of OpenStack internals and externals: glance > and nova (at least those 2) are already using jsonschema internally in > different areas to describe requests/responses. Why not to make this > standard reaching the service consumers? > - getting rid of api-ref work by updating our sphinx machinery to > consume our customised specs and produce nice docs matching the reality > - sharing specs between teams to improve interface (not like currently > we need to read the api-ref with tons of bugs plus source code to > understand how to cover new feature in service X). Maybe even a > central repo with the specs per release. > - there are plenty of code generators and server bindings for OpenAPI > specs so that we can potentially align frameworks used by different > teams to maintain less > - less work for all of us who needs services talking to each other > (not immediately right now, but once the code is switched on consuming > specs) > - request verification already on the client side not waiting for the > response > - finally show something to customers often annoying asking ?where are > your openapi specs? (no offence here ;-))? > > I know it is a long message. But I am pretty excited with the progress > and would like to hear community opinions. For the more detailed > discussion consider this as a pre-announcement of the topic for PTG in > sdk/cli slots. You have every reason to be excited - this sounds simply awesome! I am a huge fan of OpenStackSDK and Gophercloud and the progress of aligning the contact surface to OpenStack APIs. Moving away from all the individual clients and different usage patterns ... But your new goals bring things to a whole new level! This is how a modern API landscape should look like! Validated and standardized schemas + code auto-generation for all sorts of API clients! Let's go! Christian From smooney at redhat.com Fri Sep 1 13:41:53 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Fri, 01 Sep 2023 14:41:53 +0100 Subject: [SDK][CLI][API][OpenAPI] Machine readable OpenStack API spec - time to do a next step? In-Reply-To: <971BC87A-5D2F-41E2-88CE-2F91E009D8DB@gmail.com> References: <30B316F0-81E3-417A-9A2C-87A31472D78C@gmail.com> <971BC87A-5D2F-41E2-88CE-2F91E009D8DB@gmail.com> Message-ID: <0e1f14eb7aa0a8a24a9cc3e95bc352e6c1b9f1f9.camel@redhat.com> On Thu, 2023-08-31 at 18:36 +0200, Artem Goncharov wrote: > > > > Just an fyi we have jsonschema schmas stored in python dicts for every > > api in nova > > https://github.com/openstack/nova/blob/4490c8bc8461a56a96cdaa08020c9e25ebe8f087/nova/api/openstack/compute/schemas/migrate_server.py#L38-L74 > > > > we use them in our api tests with eh API scamples which are use both > > as test and to generate docs > > https://github.com/openstack/nova/tree/master/doc/api_samples/server-migrations > > > > every time we make an api change we add an api sample and add a secma > > for that micorversion. 
> > > > These schemas are also use in the api to validate the requests > > https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/migrate_server.py#L84-L90 > > > > so a lot of the data (boht on the spec side an confromance side) > > exits but not in a form that is directly usabel to export openapi > > formated documents. > > > Thanks Sean, I know that and because of that I mentioned that nova is already using jsonschema. That is more or less > exactly the point that I want to have a discussion on making this standard by the projects and do this generally with > the openapi spec instead of bunch of unconnected schemas so that our own lives become hopefully easier.

Yeah, I wanted to highlight that for you and others to see how they can be, and are, currently used.

If we were to standardise this across services, what I would want to see is the existing schemas ported to OpenAPI schema files which we would then import and use for our validation. That is a non-trivial amount of work, but at the end of the process we would have a set of schema files that could be used to generate clients. We could also add a middleware to oslo.middleware that would optionally be able to enumerate the schemas and serve them from the REST API if loaded.

I'm not against porting to a well-known standard provided we do not lose any of the test coverage or validation capabilities we have today as a result. In nova the schemas benefit us today by narrowing the domain of checks we need to perform in the python code directly, i.e. we know that if you get to the live_migrate handler in the nova api the body is syntactically valid due to the schema validation, and that you have permission to perform the operation due to the keystone auth and policy checks that have already been done. That means we only need to validate the semantics, i.e. does the instance you asked us to migrate exist? Is it in a migratable state? etc.

The schemas get rid of a lot of boilerplate in that regard, so I think they would be useful for others too. And if other projects were to add schema validation today it would make sense to align to a standard rather than inventing their own or copying the nova code. The nova code works, but it is kind of our own thing, even if it is jsonschema based, so it requires more knowledge to extend and use than following a standard.
>

From smooney at redhat.com Fri Sep 1 14:05:34 2023
From: smooney at redhat.com (smooney at redhat.com)
Date: Fri, 01 Sep 2023 15:05:34 +0100
Subject: Live migration error
In-Reply-To:
References:
Message-ID:

On Fri, 2023-09-01 at 14:56 +0200, Karera Tony wrote: > Hello Danny, > > Thanks for the feedback. > > use a lowest common denominator cpu_model : Does this mean changing the > servers ?

It means that if you have a mix of cpus in the deployment you should hardcode to the oldest of the different cpu models, i.e. skylake or whatever it may be in your case.

https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_models

[libvirt]
cpu_mode=custom
cpu_models=skylake

In more recent releases we replaced the older cpu_model option with cpu_models, which is a comma separated list of models in preference order, i.e. nova will use the first cpu model in the list that works on the current host. Note that while this gives extra flexibility vs just using the oldest cpu model, it does limit the set of hosts you can migrate to, but it means you can have better performance, so it is a tradeoff: performance vs portability.

The error you mentioned can also be caused by microcode updates.
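If you want to see exactly what differs between the two hosts before settling on a model, a quick check is to diff the flags each kernel exposes and then ask libvirt whether the source host cpu definition is still acceptable on the destination. The commands below are only an illustrative sketch (compute1/compute2 and the file paths are placeholders, adjust them to your environment):

# compare the cpu flags the kernel exposes on both hypervisors
diff <(ssh compute1 "grep -m1 '^flags' /proc/cpuinfo" | tr ' ' '\n' | sort) \
     <(ssh compute2 "grep -m1 '^flags' /proc/cpuinfo" | tr ' ' '\n' | sort)

# extract the host <cpu> element from the source and check it on the destination
ssh compute1 "virsh capabilities" | sed -n '/<cpu>/,/<\/cpu>/p' > cpu-src.xml
scp cpu-src.xml compute2:/tmp/cpu-src.xml
ssh compute2 "virsh cpu-compare /tmp/cpu-src.xml"

virsh cpu-compare tells you whether the cpu described in the file is compatible with the host it is run on; it is the same libvirt comparison (virCPUCompareResult) that the pre-check error above links to.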
Intel removed TSX and MPX, I believe in the last 2 years, from some of their cpus; that breaks live migration if a vm was created with access to those cpu instructions.

The only way to resolve that is to cold migrate the guest, i.e. if a vm is currently booted with cpu model X it cannot be modified while the guest is running, so you either need to update the config option and reboot all the vms or, more practically, update the config and then cold migrate them, which will allow you to move the instance (your original goal) while letting the new cpu model take effect.

Nova's default when cpu_model is not set is cpu_mode=host-model, meaning whatever model best matches the current host's cpu. That is the correct default in general, but if you have mixed cpu generations in your cloud it is not ideal.

Hopefully that helps.

> > Regards > > Tony Karera > > > > > On Fri, Sep 1, 2023 at 12:37 PM Danny Webb > > wrote: > > > It means that you have CPUs with incompatible flags or you've got > > differing kernel versions that expose different flags or you've got > > differing libvirt versions that expose different flags. You either need to > > use a lowest common denominator cpu_model or do a cold migration. > > ------------------------------ > > *From:* Karera Tony > > *Sent:* 01 September 2023 10:13 > > *To:* openstack-discuss > > *Subject:* Live migration error > > > > > > * CAUTION: This email originates from outside THG * > > ------------------------------ > > Hello Team, > > > > I am trying to migrate instances from one host to another but I am getting > > this error. > > > > *Error: *Failed to live migrate instance to host "compute1". Details > > > > Migration pre-check error: Unacceptable CPU info: CPU doesn't have > > compatibility. 0 Refer to > > http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult > > (HTTP 400) (Request-ID: req-9dfdfe2e-70fa-481a-bd3d-c2c76a6e93da) > > > > Any idea on how to fix this? > > > > Regards > > > > Tony Karera > > > > > >

From elod.illes at est.tech Fri Sep 1 15:30:37 2023
From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=)
Date: Fri, 1 Sep 2023 15:30:37 +0000
Subject: [release] Release countdown for week R-4, Sep 04-08
Message-ID:

Development Focus
-----------------
We just passed feature freeze! Until release branches are cut, you should stop accepting featureful changes to deliverables following the cycle-with-rc release model, or to libraries. Exceptions should be discussed on separate threads on the mailing-list, and feature freeze exceptions approved by the team's PTL. Focus should be on finding and fixing release-critical bugs, so that release candidates and final versions of the 2023.2 Bobcat deliverables can be proposed, well ahead of the final 2023.2 Bobcat release date.

General Information
-------------------
We are still finishing up processing a few release requests, but the 2023.2 Bobcat release requirements are now frozen. If new library releases are needed to fix release-critical bugs in 2023.2 Bobcat, you must request a Requirements Freeze Exception (RFE) from the requirements team before we can do a new release to avoid having something released in 2023.2 Bobcat that is not actually usable. This is done by posting to the openstack-discuss mailing list with a subject line similar to:

[$PROJECT][requirements] RFE requested for $PROJECT_LIB

Include justification/reasoning for why a RFE is needed for this lib. If/when the requirements team OKs the post-freeze update, we can then process a new release.
A soft String freeze is now in effect, in order to let the I18N team do the translation work in good conditions. In Horizon and the various dashboard plugins, you should stop accepting changes that modify user-visible strings. Exceptions should be discussed on the mailing-list. By September 28th, 2023 this will become a hard string freeze, with no changes in user-visible strings allowed. Actions ------- stable/2023.2 branches should be created soon for all not-already-branched libraries. You should expect 2-3 changes to be proposed for each: a .gitreview update, a reno update (skipped for projects not using reno), and a tox.ini constraints URL update. Please review those in priority so that the branch can be functional ASAP. The Prelude section of reno release notes is rendered as the top level overview for the release. Any important overall messaging for 2023.2 Bobcat changes should be added there to make sure the consumers of your release notes see them. Upcoming Deadlines & Dates -------------------------- RC1 deadline: September 14th, 2023 (R-3 week) Final RC deadline: September 28th, 2023 (R-1 week) Final 2023.2 Bobcat release: October 4th, 2023 2024.1 Caracal Virtual PTG - October 23-27, 2023 El?d Ill?s irc: elodilles @ #openstack-release -------------- next part -------------- An HTML attachment was scrubbed... URL: From artem.goncharov at gmail.com Fri Sep 1 17:14:12 2023 From: artem.goncharov at gmail.com (Artem Goncharov) Date: Fri, 1 Sep 2023 19:14:12 +0200 Subject: [SDK][CLI][API][OpenAPI] Machine readable OpenStack API spec - time to do a next step? In-Reply-To: <0e1f14eb7aa0a8a24a9cc3e95bc352e6c1b9f1f9.camel@redhat.com> References: <30B316F0-81E3-417A-9A2C-87A31472D78C@gmail.com> <971BC87A-5D2F-41E2-88CE-2F91E009D8DB@gmail.com> <0e1f14eb7aa0a8a24a9cc3e95bc352e6c1b9f1f9.camel@redhat.com> Message-ID: On Fri, Sep 1, 2023, 15:42 wrote: > On Thu, 2023-08-31 at 18:36 +0200, Artem Goncharov wrote: > > > > > > Just an fyi we have jsonschema schmas stored in python dicts for every > > > api in nova > > > > https://github.com/openstack/nova/blob/4490c8bc8461a56a96cdaa08020c9e25ebe8f087/nova/api/openstack/compute/schemas/migrate_server.py#L38-L74 > > > > > > we use them in our api tests with eh API scamples which are use both > > > as test and to generate docs > > > > https://github.com/openstack/nova/tree/master/doc/api_samples/server-migrations > > > > > > every time we make an api change we add an api sample and add a secma > > > for that micorversion. > > > > > > These schemas are also use in the api to validate the requests > > > > https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/migrate_server.py#L84-L90 > > > > > > so a lot of the data (boht on the spec side an confromance side) > > > exits but not in a form that is directly usabel to export openapi > > > formated documents. > > > > > > Thanks Sean, I know that and because of that I mentioned that nova is > already using jsonschema. That is more or less > > exactly the point that I want to have a discussion on making this > standard by the projects and do this generally with > > the openapi spec instead of bunch of unconnected schemas so that our own > lives become hopefully easier. > ya i wanted to highlight that for you and others to see how they can and > are currently used. > if we were to standatise this acroos services what i would want to see is > the exesitign schemas ported to openapi schema > files which we would then import and uses for our validation. 
> Actually I have done in my poc nearly that: taken schema (as example, not literally) from nova code and integrated it into the spec. Surely it would be possible as you suggested to take schemas from the current code and construct them into the spec while we learn on how to have our middleware to locate them from the overall spec. Generally openapi helps us (from 3.1) not to loose info, but vice versa extend and combine schemas into a full description of the operation how consumer is supposed to use it (what is the url, what are the parameters, what is the response(s)) with descriptions for each of those elements that are used for rendering docs and helpstrings. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jahns.jay at heb.com Fri Sep 1 18:59:02 2023 From: jahns.jay at heb.com (Jahns, Jay) Date: Fri, 1 Sep 2023 18:59:02 +0000 Subject: [cinder] Map Availability Zone to Volume Type Message-ID: Hi, We have a multi-AZ environment where we launch instances with bootable volumes. When we specify AZ, we need to have a specific volume type in that AZ selected to launch to. Right now, the __DEFAULT__ volume type is used. Normally, that would work, since there is 1 backend in each AZ; however, we need to be able to use the key/value RESKEY:availability_zones to pin a volume type to an AZ, so we can use specific key/values in creating the volume. It is 2 separate backends in the environment. I see this bug: https://bugs.launchpad.net/cinder/+bug/1999706 And this change: https://review.opendev.org/c/openstack/cinder/+/868539 Is there anything we can do to help expedite adding this functionality? It seems that it was supposed to already be in. Thanks, Jay -------------- next part -------------- An HTML attachment was scrubbed... URL: From roger.riverac at gmail.com Sat Sep 2 15:42:45 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Sat, 2 Sep 2023 11:42:45 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hello, We have deployed an openstack-ansible cluster to test it on_metal with OVN and defined *dedicated gateway hosts* connecting to the external network with the *network-gateway_hosts* host group. Unfortunately, we are not able to connect to the external/provider networks. It seems that traffic wants to reach external networks via the hypervisor nodes and not the gateway hosts. Any suggestions on changes needed to our configuration will be highly appreciated. Environment: -Openstack Antelope -Ubuntu 22 on all hosts -3 infra hosts - 1xNIC (ens1) -2 compute hosts - 1xNIC (ens1) -2 gateway hosts - 2xNIC (ens1 internal, ens2 external) -No linux bridges are created. The gateway hosts are the only ones physically connected to the external network via physical interface ens2. Therefore, we need all external provider network traffic to traverse via these gateway hosts. Tenant networks work fine and VMs can talk to each other. However, when a VM is spawned with a floating IP to the external network, they are unable to reach the outside network. Relevant content from openstack-ansible configuration files: =.=.=.=.=.=.=.= openstack_user_config.yml =.=.=.=.=.=.=.= ``` ... 
provider_networks:
  - network:
      container_bridge: "br-mgmt"
      container_type: "veth"
      container_interface: "ens1"
      ip_from_q: "management"
      type: "raw"
      group_binds:
        - all_containers
        - hosts
      is_management_address: true
  - network:
      container_bridge: "br-vxlan"
      container_type: "veth"
      container_interface: "ens1"
      ip_from_q: "tunnel"
      #type: "vxlan"
      type: "geneve"
      range: "1:1000"
      net_name: "geneve"
      group_binds:
        - neutron_ovn_controller
  - network:
      container_bridge: "br-flat"
      container_type: "veth"
      container_interface: "ens1"
      type: "flat"
      net_name: "flat"
      group_binds:
        - neutron_ovn_controller
  - network:
      container_bridge: "br-vlan"
      container_type: "veth"
      container_interface: "ens1"
      type: "vlan"
      range: "101:300,401:500"
      net_name: "vlan"
      group_binds:
        - neutron_ovn_controller
  - network:
      container_bridge: "br-storage"
      container_type: "veth"
      container_interface: "ens1"
      ip_from_q: "storage"
      type: "raw"
      group_binds:
        - glance_api
        - cinder_api
        - cinder_volume
        - nova_compute

...

compute-infra_hosts:
  inf1:
    ip: 172.16.0.1
  inf2:
    ip: 172.16.0.2
  inf3:
    ip: 172.16.0.3

compute_hosts:
  cmp4:
    ip: 172.16.0.21
  cmp3:
    ip: 172.16.0.22

network_hosts:
  inf1:
    ip: 172.16.0.1
  inf2:
    ip: 172.16.0.2
  inf3:
    ip: 172.16.0.3

network-gateway_hosts:
  net1:
    ip: 172.16.0.31
  net2:
    ip: 172.16.0.32
```

=.=.=.=.=.=.=.=
user_variables.yml
=.=.=.=.=.=.=.=
```
---
debug: false
install_method: source
rabbitmq_use_ssl: False
haproxy_use_keepalived: False
...
neutron_plugin_type: ml2.ovn
neutron_plugin_base:
  - neutron.services.ovn_l3.plugin.OVNL3RouterPlugin

neutron_ml2_drivers_type: geneve,vlan,flat
neutron_ml2_conf_ini_overrides:
  ml2:
    tenant_network_types: geneve

...
```

=.=.=.=.=.=.=.=
env.d/neutron.yml
=.=.=.=.=.=.=.=
```
component_skel:
  neutron_ovn_controller:
    belongs_to:
      - neutron_all
  neutron_ovn_northd:
    belongs_to:
      - neutron_all

container_skel:
  neutron_agents_container:
    contains: {}
    properties:
      is_metal: true
  neutron_ovn_northd_container:
    belongs_to:
      - network_containers
    contains:
      - neutron_ovn_northd
```

=.=.=.=.=.=.=.=
env.d/nova.yml
=.=.=.=.=.=.=.=
```
component_skel:
  nova_compute_container:
    belongs_to:
      - compute_containers
      - kvm-compute_containers
      - lxd-compute_containers
      - qemu-compute_containers
    contains:
      - neutron_ovn_controller
      - nova_compute
    properties:
      is_metal: true
```

=.=.=.=.=.=.=.=
group_vars/network_hosts
=.=.=.=.=.=.=.=
```
openstack_host_specific_kernel_modules:
  - name: "openvswitch"
    pattern: "CONFIG_OPENVSWITCH"
```

The nodes layout is like this:

[image: image.png]

Any guidance on what we have wrong or how to improve this configuration will be appreciated. We need to make external traffic for VMs to go out via the gateway nodes and not the compute/hypervisor nodes.

Thank you.

Roger
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png Type: image/png Size: 16574 bytes Desc: not available URL: From noonedeadpunk at gmail.com Sat Sep 2 16:08:23 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Sat, 2 Sep 2023 18:08:23 +0200 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hi, I think this is known issue which should be fixed with the following patch: https://review.opendev.org/c/openstack/openstack-ansible/+/892540 In the meanwhile you should be able to workaround the issue by creating /etc/openstack_deploy/env.d/nova.yml file with following content: nova_compute_container: belongs_to: - compute_containers - kvm-compute_containers - qemu-compute_containers contains: - neutron_sriov_nic_agent - neutron_ovn_controller - nova_compute properties: is_metal: true You might also need to remove computes from the inventory using /opt/openstack-ansible/scripts/inventory-manage.py -r cmp03 They will be re-added next time running openstack-ansible or dynamic-inventory.py. Removing them is needed to ensure that they're not part of ovn-gateway related group. You might also need to stop ovn-gateway service on these computes manually, but I'm not sure 100% about that. On Sat, Sep 2, 2023, 17:47 Roger Rivera wrote: > Hello, > > We have deployed an openstack-ansible cluster to test it on_metal with > OVN and defined *dedicated gateway hosts* connecting to the external > network with the *network-gateway_hosts* host group. Unfortunately, we > are not able to connect to the external/provider networks. It seems that > traffic wants to reach external networks via the hypervisor nodes and not > the gateway hosts. > > Any suggestions on changes needed to our configuration will be highly > appreciated. > > Environment: > -Openstack Antelope > -Ubuntu 22 on all hosts > -3 infra hosts - 1xNIC (ens1) > -2 compute hosts - 1xNIC (ens1) > -2 gateway hosts - 2xNIC (ens1 internal, ens2 external) > -No linux bridges are created. > > The gateway hosts are the only ones physically connected to the external > network via physical interface ens2. Therefore, we need all external > provider network traffic to traverse via these gateway hosts. > > Tenant networks work fine and VMs can talk to each other. However, when a > VM is spawned with a floating IP to the external network, they are unable > to reach the outside network. > > Relevant content from openstack-ansible configuration files: > > > =.=.=.=.=.=.=.= > openstack_user_config.yml > =.=.=.=.=.=.=.= > ``` > ... 
> provider_networks: > - network: > container_bridge: "br-mgmt" > container_type: "veth" > container_interface: "ens1" > ip_from_q: "management" > type: "raw" > group_binds: > - all_containers > - hosts > is_management_address: true > - network: > container_bridge: "br-vxlan" > container_type: "veth" > container_interface: "ens1" > ip_from_q: "tunnel" > #type: "vxlan" > type: "geneve" > range: "1:1000" > net_name: "geneve" > group_binds: > - neutron_ovn_controller > - network: > container_bridge: "br-flat" > container_type: "veth" > container_interface: "ens1" > type: "flat" > net_name: "flat" > group_binds: > - neutron_ovn_controller > - network: > container_bridge: "br-vlan" > container_type: "veth" > container_interface: "ens1" > type: "vlan" > range: "101:300,401:500" > net_name: "vlan" > group_binds: > - neutron_ovn_controller > - network: > container_bridge: "br-storage" > container_type: "veth" > container_interface: "ens1" > ip_from_q: "storage" > type: "raw" > group_binds: > - glance_api > - cinder_api > - cinder_volume > - nova_compute > > ... > > compute-infra_hosts: > inf1: > ip: 172.16.0.1 > inf2: > ip: 172.16.0.2 > inf3: > ip: 172.16.0.3 > > compute_hosts: > cmp4: > ip: 172.16.0.21 > cmp3: > ip: 172.16.0.22 > > network_hosts: > inf1: > ip: 172.16.0.1 > inf2: > ip: 172.16.0.2 > inf3: > ip: 172.16.0.3 > > network-gateway_hosts: > net1: > ip: 172.16.0.31 > net2: > ip: 172.16.0.32 > > ``` > > > =.=.=.=.=.=.=.= > user_variables.yml > =.=.=.=.=.=.=.= > ``` > --- > debug: false > install_method: source > rabbitmq_use_ssl: False > haproxy_use_keepalived: False > ... > neutron_plugin_type: ml2.ovn > neutron_plugin_base: > - neutron.services.ovn_l3.plugin.OVNL3RouterPlugin > > neutron_ml2_drivers_type: geneve,vlan,flat > neutron_ml2_conf_ini_overrides: > ml2: > tenant_network_types: geneve > > ... > ``` > > =.=.=.=.=.=.=.= > env.d/neutron.yml > =.=.=.=.=.=.=.= > ``` > component_skel: > neutron_ovn_controller: > belongs_to: > - neutron_all > neutron_ovn_northd: > belongs_to: > - neutron_all > > container_skel: > neutron_agents_container: > contains: {} > properties: > is_metal: true > neutron_ovn_northd_container: > belongs_to: > - network_containers > contains: > - neutron_ovn_northd > > ``` > > =.=.=.=.=.=.=.= > env.d/nova.yml > =.=.=.=.=.=.=.= > ``` > component_skel: > nova_compute_container: > belongs_to: > - compute_containers > - kvm-compute_containers > - lxd-compute_containers > - qemu-compute_containers > contains: > - neutron_ovn_controller > - nova_compute > properties: > is_metal: true > ``` > > =.=.=.=.=.=.=.= > group_vars/network_hosts > =.=.=.=.=.=.=.= > ``` > openstack_host_specific_kernel_modules: > - name: "openvswitch" > pattern: "CONFIG_OPENVSWITCH" > ``` > > The nodes layout is like this: > > [image: image.png] > > > Any guidance on what we have wrong or how to improve this configuration > will be appreciated. We need to make external traffic for VMs to go out via > the gateway nodes and not the compute/hypervisor nodes. > > Thank you. > > Roger > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 16574 bytes Desc: not available URL: From egarciar at redhat.com Sun Sep 3 23:26:47 2023 From: egarciar at redhat.com (Elvira Garcia Ruiz) Date: Mon, 4 Sep 2023 01:26:47 +0200 Subject: [neutron] Bug Deputy Report August 28 - September 3 Message-ID: Hi! I was the bug deputy last week. 
Find the summary below:

High
------
- https://bugs.launchpad.net/neutron/+bug/2033508 - [HA] "neutron-dynamic-routing" HA unittest failing
  Unassigned
- https://bugs.launchpad.net/neutron/+bug/2033493 - [ndr] Unit tests failing due to a missing patch, still unreleased, in Neutron
  Fix proposed: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/893189
  Assigned to Rodolfo Alonso Hernandez
- https://bugs.launchpad.net/neutron/+bug/2033305 - OVN metadata agent keeps restarting
  Assigned to Miro Tomaska
  Fix proposed: https://review.opendev.org/c/openstack/neutron/+/890986
- https://bugs.launchpad.net/neutron/+bug/2033556 - Documentation for DNS integration is incomplete
  Assigned to Dr. Jens Harbott
- https://bugs.launchpad.net/neutron/+bug/2033887 - [OVN][Trunk] The cold migration process is broken since patch 882581
  Assigned to Rodolfo
  Fix proposed: https://review.opendev.org/c/openstack/neutron/+/893447 + backports
- https://bugs.launchpad.net/neutron/+bug/2033752 - test_reboot_server_hard fails with AssertionError: time.struct_time() not greater than time.struct_time()
  This is affecting us but the failure is on Nova. Yatin will monitor this.
- https://bugs.launchpad.net/neutron/+bug/2033932 - Add support for OVN MAC_Binding aging
  Fix proposed: https://review.opendev.org/c/openstack/neutron/+/893575
  Assigned to Terry

Medium
----------
- https://bugs.launchpad.net/neutron/+bug/2033281 - [OVN] ovn_hash_ring table cleanup
  In progress. Fix merged on master: https://review.opendev.org/c/openstack/neutron/+/893030
  Assigned to Lucas

Low
------
- https://bugs.launchpad.net/neutron/+bug/2033651 - [fullstack] Reduce the CI job time
  Unassigned

Incomplete
---------------
- https://bugs.launchpad.net/neutron/+bug/2033293 - dns integration saying plugin does not match requirements
  Unassigned
  Left a comment since I think this might not be a bug. Also contacted mlavalle to get more information.

Undecided
---------------
https://bugs.launchpad.net/neutron/+bug/2033683 - openvswitch.agent.ovs_neutron_agent fails to Cmd: ['iptables-restore', '-n']
  I'm not very sure about how to confirm this one.
  Unassigned

Kind regards,
Elvira

From hberaud at redhat.com Mon Sep 4 09:07:25 2023
From: hberaud at redhat.com (Herve Beraud)
Date: Mon, 4 Sep 2023 11:07:25 +0200
Subject: [PTL][release] 2023.2 Bobcat Cycle Highlights
Message-ID:

Hi,

This is a reminder that Cycle highlights need to be added to deliverable yamls so that they can be included in release marketing preparations. (See the details about how to add them at the project team guide [1])

Note that the deadline was the Feature Freeze date in previous cycles, but due to requests of PTLs it was decided to postpone one week in the future (to R-4 week). This is better for PTLs as they are usually very busy around Feature Freeze, on the other hand gives less time to marketing team to process the highlights.

[1] https://docs.openstack.org/project-team-guide/release-management.html#cycle-highlights

Thanks,
--
Hervé Beraud
Senior Software Engineer at Red Hat
irc: hberaud
https://github.com/4383/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From thierry at openstack.org Mon Sep 4 10:28:21 2023 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 4 Sep 2023 12:28:21 +0200 Subject: [largescale-sig] Next meeting: Sept 6, 15utc Message-ID: <94831ce5-03e2-b08c-0c78-4b000be7a681@openstack.org> Hi everyone, The Large Scale SIG is back after the northern hemisphere summer break and will be meeting this Wednesday in #openstack-operators on OFTC IRC, at 15UTC, our EU+US-friendly time. I will be chairing. You can doublecheck how that UTC time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20230906T15 Feel free to add topics to the agenda: https://etherpad.opendev.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From tonykarera at gmail.com Mon Sep 4 10:59:59 2023 From: tonykarera at gmail.com (Karera Tony) Date: Mon, 4 Sep 2023 12:59:59 +0200 Subject: Live migration error In-Reply-To: References: Message-ID: Hello Danny, I am actually using Broadwell-noTSX-IBRS on all the servers but I have seen that some flags are not there in the older servers. The flags below to be specific. hwp hwp_act_window hwp_pkg_req Regards Tony Karera On Fri, Sep 1, 2023 at 4:05?PM wrote: > On Fri, 2023-09-01 at 14:56 +0200, Karera Tony wrote: > > Hello Danny, > > > > Thanks for the feedback. > > > > use a lowest common denominator cpu_model : Does this mean changing the > > servers ? > > it means tha tif you have a mix of cpus in the deployment you shoudl > hardcod > to the older of the diffent cpu models i.e skylake or whatever it may be > in your case. > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_models > > [libvirt] > cpu_mode=custom > cpu_models=skylake > > in more recent release we replaced the older cpu_model with cpu_models > with is a comma sperated > list of models in prefernce order. i.e. nova will use the first cpu_modle > in the list that work > on the current host. not that while this give extra flexiblity it vs just > using the oldest cpu model > it does limit the set of hosts you can migrate too but means you can have > better perfromance so its > a tradeoff. performance vs portablity. > > the error you mentioned can also be caused by micorocode updates. > intel remove TSX and MPX i beilve in the last 2 years form come of there > cpus > that breaks live migration if a vm was create with access to that cpu > instruction > > the only way to resolve that is to cold migrate the guest. > i.e. if a vm is currently booted with cpu_model X it cannot be modifed > while the guest is running > so you either need to update the config option and reboot all the vms or > more particlaly update the config > and then cold migrate them which will allow you to move the instnace(your > orginal goal) while allowing > the new cpu model to take effect. > > novas default for the cpu model when not set is with our default cpu_mode > is host_model > meaning whatever model best matches the current hosts cpu. > > that is the correct default in general but if you have mixed cpu > generatiosn in your cloud its not ideal. > > hopefuly that helps > > > > > Regards > > > > Tony Karera > > > > > > > > > > On Fri, Sep 1, 2023 at 12:37?PM Danny Webb > > wrote: > > > > > It means that you have CPUs with incompatible flags or you've got > > > differing kernel versions that expose different flags or you've got > > > differing libvirt versions that expose different flags. You either > need to > > > use a lowest common denominator cpu_model or do a cold migration. 
> > > ------------------------------ > > > *From:* Karera Tony > > > *Sent:* 01 September 2023 10:13 > > > *To:* openstack-discuss > > > *Subject:* Live migration error > > > > > > > > > * CAUTION: This email originates from outside THG * > > > ------------------------------ > > > Hello Team, > > > > > > I am trying to migrate instances from one host to another but I am > getting > > > this error. > > > > > > *Error: *Failed to live migrate instance to host "compute1". Details > > > > > > Migration pre-check error: Unacceptable CPU info: CPU doesn't have > > > compatibility. 0 Refer to > > > http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult > > > (HTTP 400) (Request-ID: req-9dfdfe2e-70fa-481a-bd3d-c2c76a6e93da) > > > > > > Any idea on how to fix this? > > > > > > Regards > > > > > > Tony Karera > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Sep 4 11:25:50 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Mon, 04 Sep 2023 12:25:50 +0100 Subject: Live migration error In-Reply-To: References: Message-ID: <9a8d05595db731f00e30066ef00fc4df04d84f14.camel@redhat.com> On Mon, 2023-09-04 at 12:59 +0200, Karera Tony wrote: > Hello Danny, > > I am actually using Broadwell-noTSX-IBRS on all the servers but I have seen > that some flags are not there in the older servers. > > The flags below to be specific. > > hwp hwp_act_window hwp_pkg_req so hwp is apprentlly intels pstate contol based on https://unix.stackexchange.com/a/43540 i dont belive this is normally exposed to a guest but if you are seeign differnce between hosts then that likely means you have disabled pstates also know as "intel speedstep" on some of the host and not others. > > Regards > > Tony Karera > > > > > On Fri, Sep 1, 2023 at 4:05?PM wrote: > > > On Fri, 2023-09-01 at 14:56 +0200, Karera Tony wrote: > > > Hello Danny, > > > > > > Thanks for the feedback. > > > > > > use a lowest common denominator cpu_model : Does this mean changing the > > > servers ? > > > > it means tha tif you have a mix of cpus in the deployment you shoudl > > hardcod > > to the older of the diffent cpu models i.e skylake or whatever it may be > > in your case. > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_models > > > > [libvirt] > > cpu_mode=custom > > cpu_models=skylake > > > > in more recent release we replaced the older cpu_model with cpu_models > > with is a comma sperated > > list of models in prefernce order. i.e. nova will use the first cpu_modle > > in the list that work > > on the current host. not that while this give extra flexiblity it vs just > > using the oldest cpu model > > it does limit the set of hosts you can migrate too but means you can have > > better perfromance so its > > a tradeoff. performance vs portablity. > > > > the error you mentioned can also be caused by micorocode updates. > > intel remove TSX and MPX i beilve in the last 2 years form come of there > > cpus > > that breaks live migration if a vm was create with access to that cpu > > instruction > > > > the only way to resolve that is to cold migrate the guest. > > i.e. 
if a vm is currently booted with cpu_model X it cannot be modifed > > while the guest is running > > so you either need to update the config option and reboot all the vms or > > more particlaly update the config > > and then cold migrate them which will allow you to move the instnace(your > > orginal goal) while allowing > > the new cpu model to take effect. > > > > novas default for the cpu model when not set is with our default cpu_mode > > is host_model > > meaning whatever model best matches the current hosts cpu. > > > > that is the correct default in general but if you have mixed cpu > > generatiosn in your cloud its not ideal. > > > > hopefuly that helps > > > > > > > > Regards > > > > > > Tony Karera > > > > > > > > > > > > > > > On Fri, Sep 1, 2023 at 12:37?PM Danny Webb > > > wrote: > > > > > > > It means that you have CPUs with incompatible flags or you've got > > > > differing kernel versions that expose different flags or you've got > > > > differing libvirt versions that expose different flags.? You either > > need to > > > > use a lowest common denominator cpu_model or do a cold migration. > > > > ------------------------------ > > > > *From:* Karera Tony > > > > *Sent:* 01 September 2023 10:13 > > > > *To:* openstack-discuss > > > > *Subject:* Live migration error > > > > > > > > > > > > * CAUTION: This email originates from outside THG * > > > > ------------------------------ > > > > Hello Team, > > > > > > > > I am trying to migrate instances from one host to another but I am > > getting > > > > this error. > > > > > > > > *Error: *Failed to live migrate instance to host "compute1". Details > > > > > > > > Migration pre-check error: Unacceptable CPU info: CPU doesn't have > > > > compatibility. 0 Refer to > > > > http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult > > > > (HTTP 400) (Request-ID: req-9dfdfe2e-70fa-481a-bd3d-c2c76a6e93da) > > > > > > > > Any idea on how to fix this? > > > > > > > > Regards > > > > > > > > Tony Karera > > > > > > > > > > > > > > > > From tonykarera at gmail.com Mon Sep 4 12:11:37 2023 From: tonykarera at gmail.com (Karera Tony) Date: Mon, 4 Sep 2023 14:11:37 +0200 Subject: Live migration error In-Reply-To: <9a8d05595db731f00e30066ef00fc4df04d84f14.camel@redhat.com> References: <9a8d05595db731f00e30066ef00fc4df04d84f14.camel@redhat.com> Message-ID: Hello Smooney, I can confirm that it is disabled on all the servers. Regards Tony Karera On Mon, Sep 4, 2023 at 1:25?PM wrote: > On Mon, 2023-09-04 at 12:59 +0200, Karera Tony wrote: > > Hello Danny, > > > > I am actually using Broadwell-noTSX-IBRS on all the servers but I have > seen > > that some flags are not there in the older servers. > > > > The flags below to be specific. > > > > hwp hwp_act_window hwp_pkg_req > so hwp is apprentlly intels pstate contol > based on https://unix.stackexchange.com/a/43540 > > i dont belive this is normally exposed to a guest but > if you are seeign differnce between hosts then that likely > means you have disabled pstates also know as "intel speedstep" on > some of the host and not others. > > > > > Regards > > > > Tony Karera > > > > > > > > > > On Fri, Sep 1, 2023 at 4:05?PM wrote: > > > > > On Fri, 2023-09-01 at 14:56 +0200, Karera Tony wrote: > > > > Hello Danny, > > > > > > > > Thanks for the feedback. > > > > > > > > use a lowest common denominator cpu_model : Does this mean changing > the > > > > servers ? 
> > > > > > it means tha tif you have a mix of cpus in the deployment you shoudl > > > hardcod > > > to the older of the diffent cpu models i.e skylake or whatever it may > be > > > in your case. > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_models > > > > > > [libvirt] > > > cpu_mode=custom > > > cpu_models=skylake > > > > > > in more recent release we replaced the older cpu_model with cpu_models > > > with is a comma sperated > > > list of models in prefernce order. i.e. nova will use the first > cpu_modle > > > in the list that work > > > on the current host. not that while this give extra flexiblity it vs > just > > > using the oldest cpu model > > > it does limit the set of hosts you can migrate too but means you can > have > > > better perfromance so its > > > a tradeoff. performance vs portablity. > > > > > > the error you mentioned can also be caused by micorocode updates. > > > intel remove TSX and MPX i beilve in the last 2 years form come of > there > > > cpus > > > that breaks live migration if a vm was create with access to that cpu > > > instruction > > > > > > the only way to resolve that is to cold migrate the guest. > > > i.e. if a vm is currently booted with cpu_model X it cannot be modifed > > > while the guest is running > > > so you either need to update the config option and reboot all the vms > or > > > more particlaly update the config > > > and then cold migrate them which will allow you to move the > instnace(your > > > orginal goal) while allowing > > > the new cpu model to take effect. > > > > > > novas default for the cpu model when not set is with our default > cpu_mode > > > is host_model > > > meaning whatever model best matches the current hosts cpu. > > > > > > that is the correct default in general but if you have mixed cpu > > > generatiosn in your cloud its not ideal. > > > > > > hopefuly that helps > > > > > > > > > > > Regards > > > > > > > > Tony Karera > > > > > > > > > > > > > > > > > > > > On Fri, Sep 1, 2023 at 12:37?PM Danny Webb < > Danny.Webb at thehutgroup.com> > > > > wrote: > > > > > > > > > It means that you have CPUs with incompatible flags or you've got > > > > > differing kernel versions that expose different flags or you've got > > > > > differing libvirt versions that expose different flags. You either > > > need to > > > > > use a lowest common denominator cpu_model or do a cold migration. > > > > > ------------------------------ > > > > > *From:* Karera Tony > > > > > *Sent:* 01 September 2023 10:13 > > > > > *To:* openstack-discuss > > > > > *Subject:* Live migration error > > > > > > > > > > > > > > > * CAUTION: This email originates from outside THG * > > > > > ------------------------------ > > > > > Hello Team, > > > > > > > > > > I am trying to migrate instances from one host to another but I am > > > getting > > > > > this error. > > > > > > > > > > *Error: *Failed to live migrate instance to host "compute1". > Details > > > > > > > > > > Migration pre-check error: Unacceptable CPU info: CPU doesn't have > > > > > compatibility. 0 Refer to > > > > > > http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult > > > > > (HTTP 400) (Request-ID: req-9dfdfe2e-70fa-481a-bd3d-c2c76a6e93da) > > > > > > > > > > Any idea on how to fix this? > > > > > > > > > > Regards > > > > > > > > > > Tony Karera > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jorgevisentini at gmail.com Mon Sep 4 14:51:07 2023 From: jorgevisentini at gmail.com (Jorge Visentini) Date: Mon, 4 Sep 2023 11:51:07 -0300 Subject: What is the difference between 4 and 8 virtual sockets to physical sockets? Message-ID: Hello Team, What is the difference between creating an instance with *4* or *8 virtual sockets*, since the hypervisor has only *4 physical sockets*. My question is where do sockets, cores and virtual threads fit into the physical hardware. I think this question is not just related to Openstack, but with any virtualization. My hypervisor configuration is as follows: CPU(s): 192 Online CPU(s) list: 0-191 Thread(s) per core: 2 Core(s) per socket: 24 Socket(s): 4 NUMA node(s): 4 Do you have any documentation that I can read and understand better? That we have a nice week! -- Att, Jorge Visentini +55 55 98432-9868 -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Mon Sep 4 15:26:39 2023 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Mon, 4 Sep 2023 15:26:39 +0000 Subject: [all][stable][ptl] Propose to EOL Stein series Message-ID: Hi, Core projects announced and deleted their stable/stein branches for some time already. Furthermore, most of the projects, where stable/stein is still open, have broken gates. So, the same way as we transitioned Rocky series to End of Life [1][2], now is the time to do the same with Stein. I'll propose the stein-eol transition patches for the rest of the open projects. I ask again the teams to signal their decision (with a +1 if the team is ready for the transition). Thanks in advance! (Please note, that the Extended Maintenance process is phasing out / transforming soon, see details in Technical Committee's resolution [3], which was accepted and merged in August.) Thanks, El?d Ill?s irc: elodilles @ #openstack-stable / #openstack-release [1] https://lists.openstack.org/pipermail/openstack-discuss/2023-January/031922.html [2] https://lists.openstack.org/pipermail/openstack-discuss/2023-April/033386.html [3] https://governance.openstack.org/tc/resolutions/20230724-unmaintained-branches.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Sep 4 16:10:52 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Mon, 04 Sep 2023 17:10:52 +0100 Subject: What is the difference between 4 and 8 virtual sockets to physical sockets? In-Reply-To: References: Message-ID: <961b08ba25c4eeffb430d3d60ef521e15e26df18.camel@redhat.com> in general the only parameter you want to align to the physical host is the number of thread so if the phsyical host has 2 thread per physical core then its best to also do that in the vm we generally recommend setting the number of virutal socket equal to the number of virutal numa nodes if the vm has no explict numa toploly then you should set sockets=1 else hw:cpu_sockets==hw:numa_nodes is our recomendation. for windows in partcalar the default config generated is suboptimal as windows client only supprot 1-2 sockets and windows serverver maxes out at 4 i believe. On Mon, 2023-09-04 at 11:51 -0300, Jorge Visentini wrote: > Hello Team, > > What is the difference between creating an instance with *4* or *8 virtual > sockets*, since the hypervisor has only *4 physical sockets*. > My question is where do sockets, cores and virtual threads fit into the > physical hardware. I think this question is not just related to Openstack, > but with any virtualization. 
> > My hypervisor configuration is as follows: > > CPU(s): 192 > Online CPU(s) list: 0-191 > Thread(s) per core: 2 > Core(s) per socket: 24 > Socket(s): 4 > NUMA node(s): 4 > > Do you have any documentation that I can read and understand better? > > That we have a nice week! From ygk.kmr at gmail.com Mon Sep 4 17:01:39 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Mon, 4 Sep 2023 22:31:39 +0530 Subject: Need assistance Message-ID: Hi, We have Yoga version setup of OSA. When a user is trying to use a certain flavor which has some extra specs, same as that which is mentioned for an aggregate, the vm creation is failing for the user. But I created a vm using that same flavor and it succeeded for a test user account. When I analyzed the logs, the nova scheduler is not filtering or checking the individual hosts from that target aggregate in the case of that user. But it is validating the target aggregate hosts in the case of a test user and able to select a host for vm creation. So what could be the reason that the scheduler is unable to find the hosts from that aggregate in the case of that user as opposed to that of the test user ? Is there any connection between a project and an aggregate ? Thanks Kumar -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Sep 4 17:50:47 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 04 Sep 2023 10:50:47 -0700 Subject: [all][qa] Devstack dropping support for Ubuntu Focal Message-ID: <18a61518602.fd2214f3355467.3969169943764214362@ghanshyammann.com> Hello Everyone, As you know, this release 2023.2 (bobcat) does not mandate testing the ubuntu Focal[1] and with Nova bumped the libvirt version, focal jobs are failing. Devstack was planning to drop the Focal support in the next cycle but we have to do it now only[2]. This email is a heads-up if you are still wondering about the Focal job failure. If your project still running any of the jobs on focal, please plan to move to Jammy. [1] https://governance.openstack.org/tc/reference/runtimes/2023.2.html [2] https://review.opendev.org/c/openstack/devstack/+/885468 -gmann From eblock at nde.ag Mon Sep 4 18:27:58 2023 From: eblock at nde.ag (Eugen Block) Date: Mon, 04 Sep 2023 18:27:58 +0000 Subject: Need assistance In-Reply-To: Message-ID: <20230904182758.Horde.GouaYgjRk4uB_PKPQvkqktm@webmail.nde.ag> I don?t mean to be rude but all your threads have the same subject ?need assistance?. There?s no way to distinguish those threads without reading the mails, I recommend to read the etiquette [https://wiki.openstack.org/wiki/MailingListEtiquette#Subjects] and add a proper subject to your threads. Zitat von Gk Gk : > Hi, > > We have Yoga version setup of OSA. When a user is trying to use a certain > flavor which has some extra specs, same as that which is mentioned for an > aggregate, the vm creation is failing for the user. But I created a vm > using that same flavor and it succeeded for a test user account. When I > analyzed the logs, the nova scheduler is not filtering or checking the > individual hosts from that target aggregate in the case of that user. But > it is validating the target aggregate hosts in the case of a test user and > able to select a host for vm creation. So what could be the reason that the > scheduler is unable to find the hosts from that aggregate in the case of > that user as opposed to that of the test user ? Is there any connection > between a project and an aggregate ? 
> > Thanks > Kumar From jorgevisentini at gmail.com Mon Sep 4 21:13:15 2023 From: jorgevisentini at gmail.com (Jorge Visentini) Date: Mon, 4 Sep 2023 18:13:15 -0300 Subject: What is the difference between 4 and 8 virtual sockets to physical sockets? In-Reply-To: <961b08ba25c4eeffb430d3d60ef521e15e26df18.camel@redhat.com> References: <961b08ba25c4eeffb430d3d60ef521e15e26df18.camel@redhat.com> Message-ID: Yes, yes, that I understand. I know that for example if I want to use host_passthrough then I must use cpu_sockets == numa_nodes. *My question is more conceptual, for me to understand.* *For example:* If I have a physical host with 1 physical processor (1 socket), can I define my instance with 2, 4, 8+ sockets? I mean, is it good practice? Is it correct to define the instance with 1 socket and increase the amount of socket colors? Not sure if I could explain my question... In short, what is the relationship between the socket, cores and virtual thread and the socket, cores and physical thread. PS: I'm not into the issue of passthrough or not. Em seg., 4 de set. de 2023 ?s 13:10, escreveu: > in general the only parameter you want to align to the physical host is > the number of thread > > so if the phsyical host has 2 thread per physical core then its best to > also do that in the vm > > we generally recommend setting the number of virutal socket equal to the > number of virutal numa nodes > if the vm has no explict numa toploly then you should set sockets=1 > else hw:cpu_sockets==hw:numa_nodes is our recomendation. > > for windows in partcalar the default config generated is suboptimal as > windows client only supprot 1-2 sockets > and windows serverver maxes out at 4 i believe. > > On Mon, 2023-09-04 at 11:51 -0300, Jorge Visentini wrote: > > Hello Team, > > > > What is the difference between creating an instance with *4* or *8 > virtual > > sockets*, since the hypervisor has only *4 physical sockets*. > > My question is where do sockets, cores and virtual threads fit into the > > physical hardware. I think this question is not just related to > Openstack, > > but with any virtualization. > > > > My hypervisor configuration is as follows: > > > > CPU(s): 192 > > Online CPU(s) list: 0-191 > > Thread(s) per core: 2 > > Core(s) per socket: 24 > > Socket(s): 4 > > NUMA node(s): 4 > > > > Do you have any documentation that I can read and understand better? > > > > That we have a nice week! > > -- Att, Jorge Visentini +55 55 98432-9868 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Tue Sep 5 08:12:18 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Tue, 5 Sep 2023 08:12:18 +0000 Subject: [tc] Technical Committee next weekly meeting Today on September 5 Message-ID: Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held on Tuesday, September 5, 2023 at 1800 UTC on Zoom https://us06web.zoom.us/j/87108541765?pwd=emlXVXg4QUxrUTlLNDZ2TTllWUM3Zz09 The agenda can be found at the link https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Agenda_Suggestions and attached below: ? Roll call ? Follow up on past action items ? No action items ? Gate health check ? OpenStack Elections ? #link https://governance.openstack.org/election/ ? Ballots go out tomorrow and remain open until September 20. ? We have 6 candidates out of 4 seats for TC :) ? Open Discussion and Reviews ? Register for the PTG ? #link https://openinfra.dev/ptg/ ? 
#link https://review.opendev.org/q/projects:openstack/governance+is:open Thank you, Kristi Nikolla From smooney at redhat.com Tue Sep 5 08:48:52 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Tue, 05 Sep 2023 09:48:52 +0100 Subject: What is the difference between 4 and 8 virtual sockets to physical sockets? In-Reply-To: References: <961b08ba25c4eeffb430d3d60ef521e15e26df18.camel@redhat.com> Message-ID: <7549fec42f89c31896076d93285730d6d787c54a.camel@redhat.com> On Mon, 2023-09-04 at 18:13 -0300, Jorge Visentini wrote: > Yes, yes, that I understand. > > I know that for example if I want to use host_passthrough then I must use > cpu_sockets == numa_nodes. no even without cpu_mode=host_passthough you shoudl use cpu_sockets == numa_nodes. The reason for that is that the windows and to a lesser degree linux scheduler tends to make better schduling desicions when the numa nodes and sockets are aligned. intel supported cluster on die since around sandybridge/ivybridge but it was not until very recentlly with amd epic platform and the move to chiplet design that the windows and linux kernels schdulers really got optimised to properly handel multipel numa nodes per socket. in the very early days fo openstack qemu/libvirt strongly prefer generating toplogies with 1 socket, 1 core and 1 thread per vcpu. that is still what we defautl to today in nova because 10+ years ago that outperformed other toplogies in some test done in a non openstack env i.e. with just libvirt/qemu when i got involed in openstack 10 years ago the cracks were already forming in that analasy and wehn we started to look at addign cpu pinnng and numa support the initall benchmarks we did show no benift to 1 socket per vcpu and infact if you had hyperthread on the host it actully perfromed worse. for legacy reason we did not change the defautl toplogy but our recomendation has been to 1 socket per numa node and 1 thread per host thread by default so 8 VCPU with a normal floatign vm with no special numa toplogy shoudl be 1 socket 2 thread and 4 cpus if the host has smt/hyperthreading enable or 1 socket 1 threadn and 8 cpus. not 8 sockets, 1 thread and 1 cpu which is the defualt toplogy. > *My question is more conceptual, for me to understand.* no worries > > *For example:* If I have a physical host with 1 physical processor (1 > socket), can I define my instance with 2, 4, 8+ sockets? I mean, is it good > practice? Is it correct to define the instance with 1 socket and increase > the amount of socket colors? you do not need to align the virtual toplogy to the hsot toplogy in anyway. so on a host with 1 socket and 1 thread per core and 64 cores you can create a vm with 16 sockets and 2 thread per core and 1 core per socket largely this wont impact performace vs 1 sockets and 1 thread per core and 32 core per socket which woudl be our recomended toplogy. what you are trying to do with the virtual toplogy is create a config that enabel the guest kernel schduler to make good choices historically kernel schduler were largely not numa aware but did consider moving a process between socket to be very costly and as a result it avoided it. that meant for old kernels (windows 2012 or linux (centos 6/linux 2.x era)) it was more effeicnt to have 1 socket per vcpu that change in the linux side many year ago adn windows a lot more recently. now its more imporant to the scudherl to knwo about the numa toplogy and thread topopligy. i.e. 
in the old model fo 1 socket per vcpu if you had hyperthreading enabld the guest kernel would expect that it can run n process in parralle when infact it really can only run n/2 so match the theread count to the host tread count allow the guest kernel to make better choices. setting thread=2 when the host has thread=1 also tends to have littel downside for what its worth. for what its worth on the linux side its my understanding that the kernel also does som extra work per socket that can be elimated if you use 1 socket instead of 32 but that is negligible unless your dealing with realtime workloads where it might matter. > > Not sure if I could explain my question... In short, what is the > relationship between the socket, cores and virtual thread and the socket, > cores and physical thread. > PS: I'm not into the issue of passthrough or not. yep so conceptually as a user o fopenstack you should have no awareness of the hsot platform. specirfcaly as an unprivladged user you are not even ment to know if its libvirt/kvm or vmware from an api prespective. form an openstack admin point of view its in yoru interest to craft your flavors and default images to be as power efficent as possible. the best way to acihve power effeicncy is often to make the workload perform better so that it can idle sooner and you can take one step in that direction by using some of your knowlage of the hardware to turn your images. as an end user because you are not ment to knwo the toplogy of the host systems you generally shoudl benchmark your workload but you shoudl not really expect a large delta regardless of what you choose today. openstack is not a virtualisation plathform, its a cloud plathform and while we supprot some enhanced paltform awareness features like cpu pinning this si largely intened to be someing the cloud admin configured and the end user just selects rather then something an end user should have to deeply understand. so tl;dr virtual cpu tpopligies are about optimisting the vm for the workload not the host. optimising for the host can give a minimal performance uplift for the vm but the virutal cpu toplogoy is intentionally decloupeld form the host toplogy. the virtual numa topology feature is purely a performance enhancement feature and is implictly tied to a host. ie. if you ask for 2 numa nodes we will pin each guest numa node to a seperate host numa node. virtual numa topligecs and cpu toplogies had two entrily diffent design goals. numa is tied to the host toplogy to optimise for hardware constraits, cpu topliges are not tied to the host hardware as they are optimising for guest software constraits? (i.e. windows server only support 4 sockets so if you need more the 4 cpus you cannot have 1 socket pre vcpu) > > Em seg., 4 de set. de 2023 ?s 13:10, escreveu: > > > in general the only parameter you want to align to the physical host is > > the number of thread > > > > so if the phsyical host has 2 thread per physical core then its best to > > also do that in the vm > > > > we generally recommend setting the number of virutal socket equal to the > > number of virutal numa nodes > > if the vm has no explict numa toploly then you should set sockets=1 > > else hw:cpu_sockets==hw:numa_nodes is our recomendation. > > > > for windows in partcalar the default config generated is suboptimal as > > windows client only supprot 1-2 sockets > > and windows serverver maxes out at 4 i believe. 
> > > > On Mon, 2023-09-04 at 11:51 -0300, Jorge Visentini wrote: > > > Hello Team, > > > > > > What is the difference between creating an instance with *4* or *8 > > virtual > > > sockets*, since the hypervisor has only *4 physical sockets*. > > > My question is where do sockets, cores and virtual threads fit into the > > > physical hardware. I think this question is not just related to > > Openstack, > > > but with any virtualization. > > > > > > My hypervisor configuration is as follows: > > > > > > CPU(s): 192 > > > Online CPU(s) list: 0-191 > > > Thread(s) per core: 2 > > > Core(s) per socket: 24 > > > Socket(s): 4 > > > NUMA node(s): 4 > > > > > > Do you have any documentation that I can read and understand better? > > > > > > That we have a nice week! > > > > > From hberaud at redhat.com Tue Sep 5 12:09:53 2023 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 5 Sep 2023 14:09:53 +0200 Subject: [QA][release] Cycle With Intermediary without recent deliverables Message-ID: Quick reminder that for deliverables following the cycle-with-intermediary model, the release team will use the latest $series release available on release week. The following deliverables have done a 2023.2 bobcat release, but it was not refreshed in the last two months: tempest You should consider making a new one very soon, so that we don't use an outdated version for the final release. Thanks -- Herv? Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ygk.kmr at gmail.com Tue Sep 5 12:49:11 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Tue, 5 Sep 2023 18:19:11 +0530 Subject: Host aggregates and placement aggregates Message-ID: Hi All, We have a yoga OSA setup. We have upgraded the setup from Xena to Yoga. We have found that some of the host aggregates have not synced to the placement aggregates. As a result , the vm creation on those aggregate nodes is failing even though they have the full capacity. I have tried the placement sync aggregates command but it did not help. Is this a known bug in Yoga ? Thanks Y.G -------------- next part -------------- An HTML attachment was scrubbed... URL: From asma.naz at techavenue.biz Tue Sep 5 04:16:01 2023 From: asma.naz at techavenue.biz (Asma Naz Shariq) Date: Tue, 5 Sep 2023 09:16:01 +0500 Subject: ***UNCHECKED*** openstack-discuss Digest, Vol 59, Issue 10 | Openstack Web Interface Issue In-Reply-To: References: Message-ID: <000201d9dfaf$b43df350$1cb9d9f0$@techavenue.biz> Hi Team, I have deployed Openstack through manual installation-Yoga release. I encountered the attached error and can't have access to Openstack dashboard. When I entered the login credentials it displays something went wrong and you don't have permission to use /horizon/. Can anyone have a solution for this error. 
-----Original Message----- From: openstack-discuss-request at lists.openstack.org Sent: Monday, September 4, 2023 8:27 PM To: openstack-discuss at lists.openstack.org Subject: ***UNCHECKED*** openstack-discuss Digest, Vol 59, Issue 10 Send openstack-discuss mailing list submissions to openstack-discuss at lists.openstack.org To subscribe or unsubscribe via the World Wide Web, visit https://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss or, via email, send a message with subject or body 'help' to openstack-discuss-request at lists.openstack.org You can reach the person managing the list at openstack-discuss-owner at lists.openstack.org When replying, please edit your Subject line so it is more specific than "Re: Contents of openstack-discuss digest..." Today's Topics: 1. Re: Live migration error (Karera Tony) 2. What is the difference between 4 and 8 virtual sockets to physical sockets? (Jorge Visentini) 3. [all][stable][ptl] Propose to EOL Stein series (El?d Ill?s) ---------------------------------------------------------------------- Message: 1 Date: Mon, 4 Sep 2023 14:11:37 +0200 From: Karera Tony To: smooney at redhat.com Cc: Danny Webb , openstack-discuss Subject: Re: Live migration error Message-ID: Content-Type: text/plain; charset="utf-8" Hello Smooney, I can confirm that it is disabled on all the servers. Regards Tony Karera On Mon, Sep 4, 2023 at 1:25?PM wrote: > On Mon, 2023-09-04 at 12:59 +0200, Karera Tony wrote: > > Hello Danny, > > > > I am actually using Broadwell-noTSX-IBRS on all the servers but I > > have > seen > > that some flags are not there in the older servers. > > > > The flags below to be specific. > > > > hwp hwp_act_window hwp_pkg_req > so hwp is apprentlly intels pstate contol based on > https://unix.stackexchange.com/a/43540 > > i dont belive this is normally exposed to a guest but if you are > seeign differnce between hosts then that likely means you have > disabled pstates also know as "intel speedstep" on some of the host > and not others. > > > > > Regards > > > > Tony Karera > > > > > > > > > > On Fri, Sep 1, 2023 at 4:05?PM wrote: > > > > > On Fri, 2023-09-01 at 14:56 +0200, Karera Tony wrote: > > > > Hello Danny, > > > > > > > > Thanks for the feedback. > > > > > > > > use a lowest common denominator cpu_model : Does this mean > > > > changing > the > > > > servers ? > > > > > > it means tha tif you have a mix of cpus in the deployment you > > > shoudl hardcod to the older of the diffent cpu models i.e skylake > > > or whatever it may > be > > > in your case. > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvi > rt.cpu_models > > > > > > [libvirt] > > > cpu_mode=custom > > > cpu_models=skylake > > > > > > in more recent release we replaced the older cpu_model with > > > cpu_models with is a comma sperated list of models in prefernce > > > order. i.e. nova will use the first > cpu_modle > > > in the list that work > > > on the current host. not that while this give extra flexiblity it > > > vs > just > > > using the oldest cpu model > > > it does limit the set of hosts you can migrate too but means you > > > can > have > > > better perfromance so its > > > a tradeoff. performance vs portablity. > > > > > > the error you mentioned can also be caused by micorocode updates. 
> > > intel remove TSX and MPX i beilve in the last 2 years form come of > there > > > cpus > > > that breaks live migration if a vm was create with access to that > > > cpu instruction > > > > > > the only way to resolve that is to cold migrate the guest. > > > i.e. if a vm is currently booted with cpu_model X it cannot be > > > modifed while the guest is running so you either need to update > > > the config option and reboot all the vms > or > > > more particlaly update the config > > > and then cold migrate them which will allow you to move the > instnace(your > > > orginal goal) while allowing > > > the new cpu model to take effect. > > > > > > novas default for the cpu model when not set is with our default > cpu_mode > > > is host_model > > > meaning whatever model best matches the current hosts cpu. > > > > > > that is the correct default in general but if you have mixed cpu > > > generatiosn in your cloud its not ideal. > > > > > > hopefuly that helps > > > > > > > > > > > Regards > > > > > > > > Tony Karera > > > > > > > > > > > > > > > > > > > > On Fri, Sep 1, 2023 at 12:37?PM Danny Webb < > Danny.Webb at thehutgroup.com> > > > > wrote: > > > > > > > > > It means that you have CPUs with incompatible flags or you've > > > > > got differing kernel versions that expose different flags or > > > > > you've got differing libvirt versions that expose different > > > > > flags. You either > > > need to > > > > > use a lowest common denominator cpu_model or do a cold migration. > > > > > ------------------------------ > > > > > *From:* Karera Tony > > > > > *Sent:* 01 September 2023 10:13 > > > > > *To:* openstack-discuss > > > > > > > > > > *Subject:* Live migration error > > > > > > > > > > > > > > > * CAUTION: This email originates from outside THG * > > > > > ------------------------------ Hello Team, > > > > > > > > > > I am trying to migrate instances from one host to another but > > > > > I am > > > getting > > > > > this error. > > > > > > > > > > *Error: *Failed to live migrate instance to host "compute1". > Details > > > > > > > > > > Migration pre-check error: Unacceptable CPU info: CPU doesn't > > > > > have compatibility. 0 Refer to > > > > > > http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult > > > > > (HTTP 400) (Request-ID: > > > > > req-9dfdfe2e-70fa-481a-bd3d-c2c76a6e93da) > > > > > > > > > > Any idea on how to fix this? > > > > > > > > > > Regards > > > > > > > > > > Tony Karera > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 2 Date: Mon, 4 Sep 2023 11:51:07 -0300 From: Jorge Visentini To: openstack-discuss at lists.openstack.org Subject: What is the difference between 4 and 8 virtual sockets to physical sockets? Message-ID: Content-Type: text/plain; charset="utf-8" Hello Team, What is the difference between creating an instance with *4* or *8 virtual sockets*, since the hypervisor has only *4 physical sockets*. My question is where do sockets, cores and virtual threads fit into the physical hardware. I think this question is not just related to Openstack, but with any virtualization. My hypervisor configuration is as follows: CPU(s): 192 Online CPU(s) list: 0-191 Thread(s) per core: 2 Core(s) per socket: 24 Socket(s): 4 NUMA node(s): 4 Do you have any documentation that I can read and understand better? That we have a nice week! 
-- Att, Jorge Visentini +55 55 98432-9868 -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 3 Date: Mon, 4 Sep 2023 15:26:39 +0000 From: El?d Ill?s To: "openstack-discuss at lists.openstack.org" Subject: [all][stable][ptl] Propose to EOL Stein series Message-ID: Content-Type: text/plain; charset="utf-8" Hi, Core projects announced and deleted their stable/stein branches for some time already. Furthermore, most of the projects, where stable/stein is still open, have broken gates. So, the same way as we transitioned Rocky series to End of Life [1][2], now is the time to do the same with Stein. I'll propose the stein-eol transition patches for the rest of the open projects. I ask again the teams to signal their decision (with a +1 if the team is ready for the transition). Thanks in advance! (Please note, that the Extended Maintenance process is phasing out / transforming soon, see details in Technical Committee's resolution [3], which was accepted and merged in August.) Thanks, El?d Ill?s irc: elodilles @ #openstack-stable / #openstack-release [1] https://lists.openstack.org/pipermail/openstack-discuss/2023-January/031922. html [2] https://lists.openstack.org/pipermail/openstack-discuss/2023-April/033386.ht ml [3] https://governance.openstack.org/tc/resolutions/20230724-unmaintained-branch es.html -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Subject: Digest Footer _______________________________________________ openstack-discuss mailing list openstack-discuss at lists.openstack.org ------------------------------ End of openstack-discuss Digest, Vol 59, Issue 10 ************************************************* -------------- next part -------------- A non-text attachment was scrubbed... Name: apache2.err log with neutron.PNG Type: image/png Size: 24954 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: neutron-server.error.PNG Type: image/png Size: 86403 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: openstack error_pic.PNG Type: image/png Size: 24672 bytes Desc: not available URL: From hanoi952022 at gmail.com Tue Sep 5 10:56:34 2023 From: hanoi952022 at gmail.com (Ha Noi) Date: Tue, 5 Sep 2023 17:56:34 +0700 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) Message-ID: Hi everyone, I'm using Openstack Train and Openvswitch for ML2 driver and GRE for tunnel type. I tested our network performance between two VMs and suffer packet loss as below. VM1: IP: 10.20.1.206 VM2: IP: 10.20.1.154 VM3: IP: 10.20.1.72 Using iperf3 to testing performance between VM1 and VM2. Run iperf3 client and server on both VMs. On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 Using VM3 ping into VM1, then the packet is lost and the latency is quite high. ping -i 0.1 10.20.1.206 PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. 
64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms ^C --- 10.20.1.206 ping statistics --- 34 packets transmitted, 28 received, 17.6471% packet loss, time 3328ms rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms Does any one get this issue ? Please help me. Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From roger.riverac at gmail.com Tue Sep 5 12:48:49 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Tue, 5 Sep 2023 08:48:49 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hello, We are noticing two issues with these changes: *1*. The overrides on the file /*etc/openstack_deploy/env.d/nova.yml* are not being honored: nova_compute_container: belongs_to: - compute_containers - kvm-compute_containers - qemu-compute_containers contains: - neutron_sriov_nic_agent - neutron_ovn_controller - nova_compute properties: is_metal: true The following block continues to be populated in with compute nodes in */etc/openstack_deploy/openstack_inventory.json* after deleting and recreating the inventory file with */opt/openstack-ansible/scripts/inventory-manage.py*: "neutron_ovn_gateway": { "children": [], "hosts": [ "cmp3", "cmp4", "net1", "net2" ] }, *2*. After changing *group_binds *to *neutron_ovn_gateway *instead of the previous *neutron_ovn_controller*, group binds for *provider_networks *in *openstack_user_config.yml*. Openstack-ansible still wants to create network mappings for compute nodes, which are not part of the *neutron_ovn_gateway *host group: =.=.=.=.=.=.=.=.= TASK [os_neutron : Setup Network Provider Bridges] ********************************************************************************************************************************************************************************************************************************************************************************************** fatal: [cmp4]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: list object has no element 1\n\nThe error appears to be in '/etc/ansible/roles/os_neutron/tasks/providers/setup_ovs_ovn.yml': line 55, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Setup Network Provider Bridges\n ^ here\n"} =.=.=.=.=.=.=.=.= I'll dig deeper to see if I can find anything that helps. But any assistance will be appreciated. 
Thanks On Sat, Sep 2, 2023 at 12:08?PM Dmitriy Rabotyagov wrote: > Hi, > > I think this is known issue which should be fixed with the following patch: > https://review.opendev.org/c/openstack/openstack-ansible/+/892540 > > In the meanwhile you should be able to workaround the issue by creating > /etc/openstack_deploy/env.d/nova.yml file with following content: > > nova_compute_container: > belongs_to: > - compute_containers > - kvm-compute_containers > - qemu-compute_containers > contains: > - neutron_sriov_nic_agent > - neutron_ovn_controller > - nova_compute > properties: > is_metal: true > > You might also need to remove computes from the inventory using > /opt/openstack-ansible/scripts/inventory-manage.py -r cmp03 > > They will be re-added next time running openstack-ansible or > dynamic-inventory.py. Removing them is needed to ensure that they're not > part of ovn-gateway related group. > You might also need to stop ovn-gateway service on these computes > manually, but I'm not sure 100% about that. > > On Sat, Sep 2, 2023, 17:47 Roger Rivera wrote: > >> Hello, >> >> We have deployed an openstack-ansible cluster to test it on_metal with >> OVN and defined *dedicated gateway hosts* connecting to the external >> network with the *network-gateway_hosts* host group. Unfortunately, we >> are not able to connect to the external/provider networks. It seems that >> traffic wants to reach external networks via the hypervisor nodes and not >> the gateway hosts. >> >> Any suggestions on changes needed to our configuration will be highly >> appreciated. >> >> Environment: >> -Openstack Antelope >> -Ubuntu 22 on all hosts >> -3 infra hosts - 1xNIC (ens1) >> -2 compute hosts - 1xNIC (ens1) >> -2 gateway hosts - 2xNIC (ens1 internal, ens2 external) >> -No linux bridges are created. >> >> The gateway hosts are the only ones physically connected to the external >> network via physical interface ens2. Therefore, we need all external >> provider network traffic to traverse via these gateway hosts. >> >> Tenant networks work fine and VMs can talk to each other. However, when a >> VM is spawned with a floating IP to the external network, they are unable >> to reach the outside network. >> >> Relevant content from openstack-ansible configuration files: >> >> >> =.=.=.=.=.=.=.= >> openstack_user_config.yml >> =.=.=.=.=.=.=.= >> ``` >> ... >> provider_networks: >> - network: >> container_bridge: "br-mgmt" >> container_type: "veth" >> container_interface: "ens1" >> ip_from_q: "management" >> type: "raw" >> group_binds: >> - all_containers >> - hosts >> is_management_address: true >> - network: >> container_bridge: "br-vxlan" >> container_type: "veth" >> container_interface: "ens1" >> ip_from_q: "tunnel" >> #type: "vxlan" >> type: "geneve" >> range: "1:1000" >> net_name: "geneve" >> group_binds: >> - neutron_ovn_controller >> - network: >> container_bridge: "br-flat" >> container_type: "veth" >> container_interface: "ens1" >> type: "flat" >> net_name: "flat" >> group_binds: >> - neutron_ovn_controller >> - network: >> container_bridge: "br-vlan" >> container_type: "veth" >> container_interface: "ens1" >> type: "vlan" >> range: "101:300,401:500" >> net_name: "vlan" >> group_binds: >> - neutron_ovn_controller >> - network: >> container_bridge: "br-storage" >> container_type: "veth" >> container_interface: "ens1" >> ip_from_q: "storage" >> type: "raw" >> group_binds: >> - glance_api >> - cinder_api >> - cinder_volume >> - nova_compute >> >> ... 
>> >> compute-infra_hosts: >> inf1: >> ip: 172.16.0.1 >> inf2: >> ip: 172.16.0.2 >> inf3: >> ip: 172.16.0.3 >> >> compute_hosts: >> cmp4: >> ip: 172.16.0.21 >> cmp3: >> ip: 172.16.0.22 >> >> network_hosts: >> inf1: >> ip: 172.16.0.1 >> inf2: >> ip: 172.16.0.2 >> inf3: >> ip: 172.16.0.3 >> >> network-gateway_hosts: >> net1: >> ip: 172.16.0.31 >> net2: >> ip: 172.16.0.32 >> >> ``` >> >> >> =.=.=.=.=.=.=.= >> user_variables.yml >> =.=.=.=.=.=.=.= >> ``` >> --- >> debug: false >> install_method: source >> rabbitmq_use_ssl: False >> haproxy_use_keepalived: False >> ... >> neutron_plugin_type: ml2.ovn >> neutron_plugin_base: >> - neutron.services.ovn_l3.plugin.OVNL3RouterPlugin >> >> neutron_ml2_drivers_type: geneve,vlan,flat >> neutron_ml2_conf_ini_overrides: >> ml2: >> tenant_network_types: geneve >> >> ... >> ``` >> >> =.=.=.=.=.=.=.= >> env.d/neutron.yml >> =.=.=.=.=.=.=.= >> ``` >> component_skel: >> neutron_ovn_controller: >> belongs_to: >> - neutron_all >> neutron_ovn_northd: >> belongs_to: >> - neutron_all >> >> container_skel: >> neutron_agents_container: >> contains: {} >> properties: >> is_metal: true >> neutron_ovn_northd_container: >> belongs_to: >> - network_containers >> contains: >> - neutron_ovn_northd >> >> ``` >> >> =.=.=.=.=.=.=.= >> env.d/nova.yml >> =.=.=.=.=.=.=.= >> ``` >> component_skel: >> nova_compute_container: >> belongs_to: >> - compute_containers >> - kvm-compute_containers >> - lxd-compute_containers >> - qemu-compute_containers >> contains: >> - neutron_ovn_controller >> - nova_compute >> properties: >> is_metal: true >> ``` >> >> =.=.=.=.=.=.=.= >> group_vars/network_hosts >> =.=.=.=.=.=.=.= >> ``` >> openstack_host_specific_kernel_modules: >> - name: "openvswitch" >> pattern: "CONFIG_OPENVSWITCH" >> ``` >> >> The nodes layout is like this: >> >> [image: image.png] >> >> >> Any guidance on what we have wrong or how to improve this configuration >> will be appreciated. We need to make external traffic for VMs to go out via >> the gateway nodes and not the compute/hypervisor nodes. >> >> Thank you. >> >> Roger >> > -- *Roger Rivera* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 16574 bytes Desc: not available URL: From gmann at ghanshyammann.com Tue Sep 5 15:49:19 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 05 Sep 2023 08:49:19 -0700 Subject: [QA][release] Cycle With Intermediary without recent deliverables In-Reply-To: References: Message-ID: <18a6608af79.127dbda1c450285.5437835893139603985@ghanshyammann.com> ---- On Tue, 05 Sep 2023 05:09:53 -0700 Herve Beraud wrote --- > Quick reminder that for deliverables following the cycle-with-intermediarymodel, the release team will use the latest $series release available onrelease week.The following deliverables have done a 2023.2 bobcat release, but it was notrefreshed in the last two months: tempestYou should consider making a new one very soon, so that we don't use anoutdated version for the final release. Ack, I will check what changes we need to merge for 2023.2 and push the latest release. -gmann > Thanks > -- > Herv? 
BeraudSenior Software Engineer at Red Hatirc: hberaudhttps://github.com/4383/ > > From noonedeadpunk at gmail.com Tue Sep 5 16:31:39 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Tue, 5 Sep 2023 18:31:39 +0200 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hey, 1. Sorry, my bad, was copying you from my phone, so the extra section ( container_skel) that is required has slipped my paste. So /etc/openstack_deploy/env.d/nova.yml should look like this: container_skel: nova_compute_container: belongs_to: - compute_containers - kvm-compute_containers - qemu-compute_containers contains: - neutron_sriov_nic_agent - neutron_ovn_controller - nova_compute properties: is_metal: true 2. Now I actually see more issues in defined openstack_user_config. I'm not sure if that is the reason of the error or not, but it still must be adjusted: a) replace network_hosts with network-infra_hosts. Defining network_hosts also adds infra servers to neutron_l3_agent (and other agents) which has in fact no effect, but triggers a bug, where run_once is treated wrongly. But this will cause failure down the line and I assume that's not it yet. You might need to clean up inventory as a result. b) also define network-northd_hosts - this usually is usually set to infra nodes, and spawns inside LXC. I would also suggest to check out doc on OVN configuration: https://docs.openstack.org/openstack-ansible-os_neutron/latest/app-ovn.html c) For the issue itself. Most likely, it is looking for `container_bridge` or `host_bind_override` key for some network. As one of these keys are expected in order to create a mapping and ovs bridges for you. It does combine net_name and one of these keys. So it would be interesting to see adjusted openstack_user_config once the above issues are sorted out. I can also suggest defining mappings in neutron_provider_networks directly, like mentioned in the documentation above. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Tue Sep 5 19:09:51 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Tue, 5 Sep 2023 12:09:51 -0700 Subject: [election] [tc] Running for a second year on the TC Message-ID: Hey all, Around a year ago, I announced my candidacy for the TC after a call for more contributors from a diverse employer base was made. I think a year later, it's more obvious than ever how important it is to have a diverse, independent technical committee. We have made huge progress on this front. We have six candidates for the four open TC positions, representing different independent companies utilizing OpenStack. This is a major improvement year over year. If you've been in the community a while, you've probably worked with me and have an idea if you want to vote for me or not in this position. Whether elected or not, I'll continue my vocal support of OpenStack technologies in private meetings and on social media, I'll continue to invite people I meet interested in cloud to join our community -- not because it's my job (it is), but also because I'm proud of what we've built. 
When I have those conversations with potential users or contributors, I highlight two things I admire about the OpenStack community: longevity -- there are not many large, distributed open source projects that have survived as long as we have, and the four opens -- putting openness out front as our face to the world is why I'll continue to be proud to represent OpenStack in the future -- as a TC member or not (you decide). If someone does want to talk about a particular policy or thought process of mine, I'm happy to respond to comments here or via private email/IRC. Thanks, Jay Faulkner -------------- next part -------------- An HTML attachment was scrubbed... URL: From ianyrchoi at gmail.com Tue Sep 5 19:44:59 2023 From: ianyrchoi at gmail.com (Ian Y. Choi) Date: Wed, 6 Sep 2023 04:44:59 +0900 Subject: [all][elections][ptl][tc] Opt in to CIVS voting system Message-ID: The election uses the Condorcet Internet Voting Service (CIVS). Due to CIVS policy, to vote in private CIVS polls, you must opt in to email communication. To opt in, please enter your Gerrit email address in the following page before the start of the election Sep 06, 2023 23:45 UTC, and confirm with the code that will sent to you via email. https://civs1.civs.us/cgi-bin/opt_in.pl If you have any question, please contact the election officials. https://governance.openstack.org/election/#election-officials From satish.txt at gmail.com Tue Sep 5 22:50:03 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 5 Sep 2023 18:50:03 -0400 Subject: [kolla-ansible][octavia] octavia management network setup for vlan provider Message-ID: Folks, I have setup kolla-ansible and configured octavia using the o-hm0 interface with the tenant and it works. For production I think I should use VLAN based provider for octavia management network so this is what I did I have created a bond0.41 dedicated interface on all 3 controller nodes and created vlan 41 on all network switches. This is what my global.yml looks like ## Octivia enable_octavia: "yes" octavia_network_interface: "bond0.41" octavia_amp_flavor: name: "amphora" is_public: no vcpus: 2 ram: 2048 disk: 5 octavia_amp_network: name: lb-mgmt-net provider_network_type: vlan provider_segmentation_id: 41 provider_physical_network: physnet1 external: false shared: false subnet: name: lb-mgmt-subnet cidr: "192.168.41.0/24" allocation_pool_start: "192.168.41.100" allocation_pool_end: "192.168.41.200" enable_dhcp: yes After running the playbook all get setup as per document. When I create loadbalancer it just get stuck in PENDING status. [1] Document saying make sure your octavia_network_interface is connected to openvswitch. Do I need to connect manually or will kolla-ansible do that for me? If I am going to do that then on which bridge I should attach br-ex or br-int ? [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliver.weinmann at me.com Wed Sep 6 04:31:04 2023 From: oliver.weinmann at me.com (Oliver Weinmann) Date: Wed, 6 Sep 2023 06:31:04 +0200 Subject: [kolla-ansible][octavia] octavia management network setup for vlan provider In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: From tkajinam at redhat.com Wed Sep 6 12:38:13 2023 From: tkajinam at redhat.com (Takashi Kajinami) Date: Wed, 6 Sep 2023 21:38:13 +0900 Subject: [puppet] Transitioning train, ussuri and victoria to EOL Message-ID: Hello, As we agreed in the PTG discussions at Vancouver, we'll EOL the old stable branches of Puppet OpenStack projects. As the first step I'll start transitioning stable/train, ussuri and victoria to EOL early next week. Please let me know in case anyone has any concerns about it. Thank you, Takashi -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed Sep 6 12:59:04 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 6 Sep 2023 08:59:04 -0400 Subject: [kolla-ansible][octavia] octavia management network setup for vlan provider In-Reply-To: References: Message-ID: Hi Oliver, Thank you for your reply, That is an awesome blog and we should add multiple scenarios or examples to kolla-ansible official doc page to help out people :) By the way, Last night I figured out how to handle veth and wire up with lb-mgmt-net and soon I will create a blog to make it easier for others to understand the logic behind it. On Wed, Sep 6, 2023 at 12:31?AM Oliver Weinmann wrote: > Hi Satish, > > I got stuck at the very same issue when I first set up Octavia. The > control. Does need to have an interface on VLAN 41, since they need to > communicate with the amphora instances. So you need to create a VLAN 41 > interface on all control nodes with an IP of the LB-MGMT-NET outside of > your defined allocation pool. If you have a free interface in your control > nodes use that, if not you can try to create VETH interfaces as explained > in the following article: > > *https://cloudbase.it/openstack-on-arm64-lbaas/* > > > > Cheers, > > Oliver > > Von meinem iPhone gesendet > > Am 06.09.2023 um 00:52 schrieb Satish Patel : > > ? > Folks, > > I have setup kolla-ansible and configured octavia using the o-hm0 > interface with the tenant and it works. For production I think I should use > VLAN based provider for octavia management network so this is what I did > > I have created a bond0.41 dedicated interface on all 3 controller nodes > and created vlan 41 on all network switches. > > This is what my global.yml looks like > > ## Octivia > enable_octavia: "yes" > octavia_network_interface: "bond0.41" > > octavia_amp_flavor: > name: "amphora" > is_public: no > vcpus: 2 > ram: 2048 > disk: 5 > > octavia_amp_network: > name: lb-mgmt-net > provider_network_type: vlan > provider_segmentation_id: 41 > provider_physical_network: physnet1 > external: false > shared: false > subnet: > name: lb-mgmt-subnet > cidr: "192.168.41.0/24" > allocation_pool_start: "192.168.41.100" > allocation_pool_end: "192.168.41.200" > enable_dhcp: yes > > After running the playbook all get setup as per document. When I create > loadbalancer it just get stuck in PENDING status. > > [1] Document saying make sure your octavia_network_interface is connected > to openvswitch. Do I need to connect manually or will kolla-ansible do that > for me? If I am going to do that then on which bridge I should attach > br-ex or br-int ? > > [1] > https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Wed Sep 6 13:18:52 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 6 Sep 2023 09:18:52 -0400 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: Message-ID: Hi, This is normal because OVS or LinuxBridge wire up VMs using TAP interface which runs on kernel space and that drives higher interrupt and that makes the kernel so busy working on handling packets. Standard OVS/LinuxBridge are not meant for higher PPS. If you want to handle higher PPS then look for DPDK or SRIOV deployment. ( We are running everything in SRIOV because of high PPS requirement) On Tue, Sep 5, 2023 at 11:11?AM Ha Noi wrote: > Hi everyone, > > I'm using Openstack Train and Openvswitch for ML2 driver and GRE for > tunnel type. I tested our network performance between two VMs and suffer > packet loss as below. > > VM1: IP: 10.20.1.206 > > VM2: IP: 10.20.1.154 > > VM3: IP: 10.20.1.72 > > > Using iperf3 to testing performance between VM1 and VM2. > > Run iperf3 client and server on both VMs. > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 > > > > Using VM3 ping into VM1, then the packet is lost and the latency is quite > high. > > > ping -i 0.1 10.20.1.206 > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms > > ^C > > --- 10.20.1.206 ping statistics --- > > 34 packets transmitted, 28 received, 17.6471% packet loss, time 3328ms > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms > > > > Does any one get this issue ? > > Please help me. Thanks > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanoi952022 at gmail.com Wed Sep 6 13:21:14 2023 From: hanoi952022 at gmail.com (Ha Noi) Date: Wed, 6 Sep 2023 20:21:14 +0700 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: Message-ID: Hi Satish, Actually, our customer get this issue when the tx/rx above only 40k pps. So what is the threshold of this throughput for OvS? Thanks and regards On Wed, 6 Sep 2023 at 20:19 Satish Patel wrote: > Hi, > > This is normal because OVS or LinuxBridge wire up VMs using TAP interface > which runs on kernel space and that drives higher interrupt and that makes > the kernel so busy working on handling packets. Standard OVS/LinuxBridge > are not meant for higher PPS. > > If you want to handle higher PPS then look for DPDK or SRIOV deployment. 
( > We are running everything in SRIOV because of high PPS requirement) > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi wrote: > >> Hi everyone, >> >> I'm using Openstack Train and Openvswitch for ML2 driver and GRE for >> tunnel type. I tested our network performance between two VMs and suffer >> packet loss as below. >> >> VM1: IP: 10.20.1.206 >> >> VM2: IP: 10.20.1.154 >> >> VM3: IP: 10.20.1.72 >> >> >> Using iperf3 to testing performance between VM1 and VM2. >> >> Run iperf3 client and server on both VMs. >> >> On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 >> >> On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 >> >> >> >> Using VM3 ping into VM1, then the packet is lost and the latency is quite >> high. >> >> >> ping -i 0.1 10.20.1.206 >> >> PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. >> >> 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >> >> 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >> >> ^C >> >> --- 10.20.1.206 ping statistics --- >> >> 34 packets transmitted, 28 received, 17.6471% packet loss, time 3328ms >> >> rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >> >> >> >> Does any one get this issue ? >> >> Please help me. Thanks >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jobernar at redhat.com Wed Sep 6 13:57:19 2023 From: jobernar at redhat.com (Jon Bernard) Date: Wed, 6 Sep 2023 13:57:19 +0000 Subject: Cinder Bug Report 2023-09-06 Message-ID: Hello Argonauts, Cinder Bug Meeting Etherpad Undecided - Weak Cryptographic Algorithm (MD5) used - Status: New - group list doesn't parse 'all_tenants' parameter correctly - Status: In Progress - Documentation check and correction for PowerStore NFS driver - Status: New Thanks -- Jon From Danny.Webb at thehutgroup.com Wed Sep 6 14:40:18 2023 From: Danny.Webb at thehutgroup.com (Danny Webb) Date: Wed, 6 Sep 2023 14:40:18 +0000 Subject: [kolla-ansible][octavia] octavia management network setup for vlan provider In-Reply-To: References: Message-ID: The one downside of using the veth method vs a vlan tagged interface on the host is making it persistent after reboot. It's possible, but it's far more of a faff than just using a standard tagged interface. 
________________________________ From: Satish Patel Sent: 06 September 2023 13:59 To: Oliver Weinmann Cc: OpenStack Discuss Subject: Re: [kolla-ansible][octavia] octavia management network setup for vlan provider CAUTION: This email originates from outside THG ________________________________ Hi Oliver, Thank you for your reply, That is an awesome blog and we should add multiple scenarios or examples to kolla-ansible official doc page to help out people :) By the way, Last night I figured out how to handle veth and wire up with lb-mgmt-net and soon I will create a blog to make it easier for others to understand the logic behind it. On Wed, Sep 6, 2023 at 12:31?AM Oliver Weinmann > wrote: Hi Satish, I got stuck at the very same issue when I first set up Octavia. The control. Does need to have an interface on VLAN 41, since they need to communicate with the amphora instances. So you need to create a VLAN 41 interface on all control nodes with an IP of the LB-MGMT-NET outside of your defined allocation pool. If you have a free interface in your control nodes use that, if not you can try to create VETH interfaces as explained in the following article: *https://cloudbase.it/openstack-on-arm64-lbaas/* Cheers, Oliver Von meinem iPhone gesendet Am 06.09.2023 um 00:52 schrieb Satish Patel >: ? Folks, I have setup kolla-ansible and configured octavia using the o-hm0 interface with the tenant and it works. For production I think I should use VLAN based provider for octavia management network so this is what I did I have created a bond0.41 dedicated interface on all 3 controller nodes and created vlan 41 on all network switches. This is what my global.yml looks like ## Octivia enable_octavia: "yes" octavia_network_interface: "bond0.41" octavia_amp_flavor: name: "amphora" is_public: no vcpus: 2 ram: 2048 disk: 5 octavia_amp_network: name: lb-mgmt-net provider_network_type: vlan provider_segmentation_id: 41 provider_physical_network: physnet1 external: false shared: false subnet: name: lb-mgmt-subnet cidr: "192.168.41.0/24" allocation_pool_start: "192.168.41.100" allocation_pool_end: "192.168.41.200" enable_dhcp: yes After running the playbook all get setup as per document. When I create loadbalancer it just get stuck in PENDING status. [1] Document saying make sure your octavia_network_interface is connected to openvswitch. Do I need to connect manually or will kolla-ansible do that for me? If I am going to do that then on which bridge I should attach br-ex or br-int ? [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Wed Sep 6 15:22:59 2023 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 6 Sep 2023 17:22:59 +0200 Subject: [largescale-sig] Next meeting: Sept 6, 15utc In-Reply-To: <94831ce5-03e2-b08c-0c78-4b000be7a681@openstack.org> References: <94831ce5-03e2-b08c-0c78-4b000be7a681@openstack.org> Message-ID: <1671f647-90cb-d328-f58f-7ac0e5cf060f@openstack.org> Here is the summary of our SIG meeting today. We discussed hosts for our next OpenInfra Live episode, a deep dive into NIPA Cloud deployment on Sept 21. You can read the detailed meeting logs at: https://meetings.opendev.org/meetings/large_scale_sig/2023/large_scale_sig.2023-09-06-15.00.html Our next IRC meeting will be Sept 20, 8:00UTC on #openstack-operators on OFTC. 
Regards, -- Thierry Carrez (ttx) From satish.txt at gmail.com Wed Sep 6 15:43:42 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 6 Sep 2023 11:43:42 -0400 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: Message-ID: Damn! We have noticed the same issue around 40k to 55k PPS. Trust me nothing is wrong in your config. This is just a limitation of the software stack and kernel itself. On Wed, Sep 6, 2023 at 9:21?AM Ha Noi wrote: > Hi Satish, > > Actually, our customer get this issue when the tx/rx above only 40k pps. > So what is the threshold of this throughput for OvS? > > > Thanks and regards > > On Wed, 6 Sep 2023 at 20:19 Satish Patel wrote: > >> Hi, >> >> This is normal because OVS or LinuxBridge wire up VMs using TAP interface >> which runs on kernel space and that drives higher interrupt and that makes >> the kernel so busy working on handling packets. Standard OVS/LinuxBridge >> are not meant for higher PPS. >> >> If you want to handle higher PPS then look for DPDK or SRIOV deployment. >> ( We are running everything in SRIOV because of high PPS requirement) >> >> On Tue, Sep 5, 2023 at 11:11?AM Ha Noi wrote: >> >>> Hi everyone, >>> >>> I'm using Openstack Train and Openvswitch for ML2 driver and GRE for >>> tunnel type. I tested our network performance between two VMs and suffer >>> packet loss as below. >>> >>> VM1: IP: 10.20.1.206 >>> >>> VM2: IP: 10.20.1.154 >>> >>> VM3: IP: 10.20.1.72 >>> >>> >>> Using iperf3 to testing performance between VM1 and VM2. >>> >>> Run iperf3 client and server on both VMs. >>> >>> On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 >>> >>> On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 >>> >>> >>> >>> Using VM3 ping into VM1, then the packet is lost and the latency is >>> quite high. >>> >>> >>> ping -i 0.1 10.20.1.206 >>> >>> PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. >>> >>> 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >>> >>> 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >>> >>> ^C >>> >>> --- 10.20.1.206 ping statistics --- >>> >>> 34 packets transmitted, 28 received, 17.6471% packet loss, time 3328ms >>> >>> rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >>> >>> >>> >>> Does any one get this issue ? >>> >>> Please help me. Thanks >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ces.eduardo98 at gmail.com Wed Sep 6 17:06:17 2023 From: ces.eduardo98 at gmail.com (Carlos Silva) Date: Wed, 6 Sep 2023 14:06:17 -0300 Subject: [manila] Cancelling Sep 7 weekly meeting Message-ID: Hello, Zorillas! 
As discussed in the previous meeting, we are cancelling tomorrow's weekly meeting, as a big part of the audience is on PTO tomorrow/this week. The next Manila weekly meeting will be on Sep 14th.

If you have something urgent to bring up, please let me know on the #openstack-manila IRC channel.

Regards,
carloss
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From smooney at redhat.com Wed Sep 6 17:41:24 2023
From: smooney at redhat.com (smooney at redhat.com)
Date: Wed, 06 Sep 2023 18:41:24 +0100
Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps)
In-Reply-To:
References:
Message-ID: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com>

On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote:
> Damn! We have noticed the same issue around 40k to 55k PPS. Trust me
> nothing is wrong in your config. This is just a limitation of the software
> stack and kernel itself.

It's partly determined by your CPU frequency. Kernel OVS of yesteryear could handle about 1 Mpps total on a ~4GHz CPU, with per-port throughput being lower depending on what QoS/firewall rules were applied.

Moving from the iptables firewall to the OVS firewall can help to some degree, but you are partly trading connection setup time for steady-state throughput with the overhead of the connection tracker in OVS.

Using stateless security groups can help.

We also recently fixed a regression caused by changes in newer versions of OVS. This was notable in going from RHEL 8 to RHEL 9, where it literally reduced small-packet performance to 1/10th and jumbo frames to about 1/2. On master we have a config option that will set the default qos on a port to linux-noop:
https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125

The backports are proposed upstream
https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43
and we have backported this downstream to address that performance regression. The upstream backport is semi-stalled just because we wanted to discuss whether we should make it opt-in by default upstream while backporting, but it might be helpful for you if this is related to your current issues.

40-55 kpps is kind of low for kernel OVS, but if you have a low clock-rate CPU, hybrid_plug + an incorrect qos, then I could see you hitting such a bottleneck.

One workaround, by the way, without the os-vif workaround backported, is to set /proc/sys/net/core/default_qdisc to not apply any qos, or a low-overhead qos type, i.e.

sudo sysctl -w net.core.default_qdisc=pfifo_fast

That may or may not help, but I would ensure that you are not using something like fq_codel or cake for net.core.default_qdisc, and if you are, try changing it to pfifo_fast and see if that helps.

There isn't much you can do about the CPU clock rate, but ^ is something you can try for free. Note it won't actually take effect on an existing VM if you just change the default, but you can use tc to also change the qdisc for testing. Hard rebooting the VM should also make the default take effect.

The only other advice I can give, assuming kernel OVS is the only option you have, is to look at
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size
and
https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled

If the bottleneck is actually in QEMU or the guest kernel rather than OVS, adjusting the rx/tx queue size and using multi-queue can help.
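A rough sketch of those knobs, assuming an example flavor name, an example tap device name and 1024 as the queue depth (none of these values come from this thread):

  # host side: check the default qdisc, and override it on an existing tap for testing
  sysctl net.core.default_qdisc
  sudo tc qdisc replace dev tap1234abcd-56 root pfifo_fast   # hypothetical tap name

  # /etc/nova/nova.conf on the compute node (restart nova-compute afterwards)
  [libvirt]
  rx_queue_size = 1024
  tx_queue_size = 1024

  # multiqueue via a flavor extra spec (the hw_vif_multiqueue_enabled image property also works)
  openstack flavor set m1.large --property hw:vif_multiqueue_enabled=true

  # inside the guest, spread the virtio queues across vCPUs (up to the vCPU count)
  ethtool -L eth0 combined 4

The queue-size and multiqueue settings only apply to instances created or hard rebooted after the change.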
it will have no effect if ovs is the bottel neck. > > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi wrote: > > > Hi Satish, > > > > Actually, our customer get this issue when the tx/rx above only 40k pps. > > So what is the threshold of this throughput for OvS? > > > > > > Thanks and regards > > > > On Wed, 6 Sep 2023 at 20:19 Satish Patel wrote: > > > > > Hi, > > > > > > This is normal because OVS or LinuxBridge wire up VMs using TAP interface > > > which runs on kernel space and that drives higher interrupt and that makes > > > the kernel so busy working on handling packets. Standard OVS/LinuxBridge > > > are not meant for higher PPS. > > > > > > If you want to handle higher PPS then look for DPDK or SRIOV deployment. > > > ( We are running everything in SRIOV because of high PPS requirement) > > > > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi wrote: > > > > > > > Hi everyone, > > > > > > > > I'm using Openstack Train and Openvswitch for ML2 driver and GRE for > > > > tunnel type. I tested our network performance between two VMs and suffer > > > > packet loss as below. > > > > > > > > VM1: IP: 10.20.1.206 > > > > > > > > VM2: IP: 10.20.1.154 > > > > > > > > VM3: IP: 10.20.1.72 > > > > > > > > > > > > Using iperf3 to testing performance between VM1 and VM2. > > > > > > > > Run iperf3 client and server on both VMs. > > > > > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 > > > > > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 > > > > > > > > > > > > > > > > Using VM3 ping into VM1, then the packet is lost and the latency is > > > > quite high. > > > > > > > > > > > > ping -i 0.1 10.20.1.206 > > > > > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms > > > > > > > > ^C > > > > > > > > --- 10.20.1.206 ping statistics --- > > > > > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, time 3328ms > > > > > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms > > > > > > > > > > > > > > > > Does any one get this issue ? > > > > > > > > Please help me. Thanks > > > > > > > From roger.riverac at gmail.com Wed Sep 6 20:25:44 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Wed, 6 Sep 2023 16:25:44 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hello, I appreciate the prompt feedback. Unfortunately, after making multiple changes, we cannot make external networks to connect via gateway-hosts. 
Our follow up investigation has shown the following: 1. Removed flat and vlan provider_networks from /etc/openstack_deploy/openstack_user_config.yml. Only management provider_networks was defined here : provider_networks: - network: container_bridge: "br-mgmt" container_type: "veth" container_interface: "ens4" ip_from_q: "management" type: "raw" group_binds: - all_containers - hosts is_management_address: true 2. Defined ML2 information and network types in /etc/openstack_deploy/user_variables.yml: neutron_ml2_conf_ini_overrides: ml2: tenant_network_types: geneve ml2_type_flat: flat_networks: flat ml2_type_geneve: vni_ranges: 1:1000 max_header_size: 38 3. Moved neutron_provider_networks configuration on a per-host basis and removed network_mappings and network_interface_mappings for compute hosts in /etc/openstack_deploy/host_vars/ compute node /etc/openstack_deploy/host_vars/cmp3: neutron_provider_networks: network_types: "geneve" network_geneve_ranges: "1:1000" gateway node /etc/openstack_deploy/host_vars/net1: neutron_provider_networks: network_types: "geneve" network_geneve_ranges: "1:1000" network_mappings: "flat:br-flat" network_interface_mappings: "br-flat:ens2" 4. Upon checking the new recreated inventory targets the correct neutron_ovn_gateway hosts /etc/openstack_deploy/openstack_inventory.json ? "component": "neutron_ovn_gateway", "container_name": "net1", "container_networks": { "management_address": { "address": "172.16.0.31", "bridge": "br-mgmt", -- "component": "neutron_ovn_gateway", "container_name": "net2", "container_networks": { "management_address": { "address": "172.16.0.32", "bridge": "br-mgmt", -- "neutron_ovn_gateway": { "children": [], "hosts": [ "net1", "net2" ? 5. The correct ovn-cms-options=enable-chassis-as-gw is set on gateway nodes only: ovn-sbctl list chassis | grep 'hostname\|ovn-cms-options' hostname : net2 other_config : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="flat:br-flat", ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw, ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"} hostname : net1 other_config : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="flat:br-flat", ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw, ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"} hostname : cmp3 other_config : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", 
ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"} hostname : cmp4 other_config : {ct-no-masked-label="true", datapath-type=system, iface-types="afxdp,afxdp-nonpmd,bareudp,erspan,geneve,gre,gtpu,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", is-interconn="false", mac-binding-timestamp="true", ovn-bridge-mappings="", ovn-chassis-mac-mappings="", ovn-cms-options="", ovn-ct-lb-related="true", ovn-enable-lflow-cache="true", ovn-limit-lflow-cache="", ovn-memlimit-lflow-cache-kb="", ovn-monitor-all="false", ovn-trim-limit-lflow-cache="", ovn-trim-timeout-ms="", ovn-trim-wmark-perc-lflow-cache="", port-up-notif="true"} RESULT: VMs fail to launch with external network (flat). Error logs "Binding failed for port": Sep 6 19:42:37 net1 nova-conductor[4270]: 2023-09-06 19:42:37.599 4270 ERROR nova.scheduler.utils [None req-25a6a8d6-8122-4621-a2c2-8ca0be5e594c 52059c7247434072b6823d1701fec23e 116579f970b242b996ac717fa7580311 - - default default] [instance: 8760706e-d38f-454d-b90f-b9d5d322ba99] Error from last host: dev-usc1-ost-cmp4 (node dev-usc1-ost-cmp4.openstack.local): ['Traceback (most recent call last):\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2607, in _build_and_run_instance\n self.driver.spawn(context, instance, image_meta,\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 4383, in spawn\n xml = self._get_guest_xml(context, instance, network_info,\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/virt/libvirt/driver.py", line 7516, in _get_guest_xml\n network_info_str = str(network_info)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 620, in __str__\n return self._sync_wrapper(fn, *args, **kwargs)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 603, in _sync_wrapper\n self.wait()\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/model.py", line 635, in wait\n self[:] = self._gt.wait()\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/greenthread.py", line 181, in wait\n return self._exit_event.wait()\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/event.py", line 132, in wait\n current.throw(*self._exc)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/eventlet/greenthread.py", line 221, in main\n result = function(*args, **kwargs)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/utils.py", line 654, in context_wrapper\n return func(*args, **kwargs)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 1987, in _allocate_network_async\n raise e\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 1965, in _allocate_network_async\n nwinfo = self.network_api.allocate_for_instance(\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1216, in allocate_for_instance\n created_port_ids = self._update_ports_for_instance(\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1352, in _update_ports_for_instance\n with excutils.save_and_reraise_exception():\n', ' File 
"/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/oslo_utils/excutils.py", line 227, in __exit__\n self.force_reraise()\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/oslo_utils/excutils.py", line 200, in force_reraise\n raise self.value\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 1327, in _update_ports_for_instance\n updated_port = self._update_port(\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 585, in _update_port\n _ensure_no_port_binding_failure(port)\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/network/neutron.py", line 294, in _ensure_no_port_binding_failure\n raise exception.PortBindingFailed(port_id=port[\'id\'])\n', 'nova.exception.PortBindingFailed: Binding failed for port b82f4518-ecba-49d9-a21d-2646d3f33efd, please check neutron logs for more information.\n', '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback (most recent call last):\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2428, in _do_build_and_run_instance\n self._build_and_run_instance(context, instance, image,\n', ' File "/openstack/venvs/nova-0.1.0.dev8112/lib/python3.10/site-packages/nova/compute/manager.py", line 2703, in _build_and_run_instance\n raise exception.RescheduledException(\n', 'nova.exception.RescheduledException: Build of instance 8760706e-d38f-454d-b90f-b9d5d322ba99 was re-scheduled: Binding failed for port b82f4518-ecba-49d9-a21d-2646d3f33efd, please check neutron logs for more information.\n'] All we need is to make sure external networks are routed via gateway-hosts and not via compute nodes. In our case, compute nodes have only one physical interface with an IP address and no connectivity to the flat network. No layer 2 connectivity is available on compute nodes either. That's the reason why we must traverse external traffic via gateway nodes only. It is worth noting that tenant/internal networks work fine. What are we doing wrong? Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ianyrchoi at gmail.com Wed Sep 6 23:46:51 2023 From: ianyrchoi at gmail.com (Ian Y. Choi) Date: Thu, 7 Sep 2023 08:46:51 +0900 Subject: [all][elections][ptl][tc] Conbined PTL/TC 2024.1 cycle Election Voting Kickoff Message-ID: <8b4cf482-7598-48c6-bb42-6110bb134970@gmail.com> Polls for PTL and TC elections are now open and will remain open for you to cast your vote until Sep 20, 2023 23:45 UTC. We are selecting 4 TC members, and are having PTL elections for OpenStack_Helm. Please rank all candidates in your order of preference. You are eligible to vote in the TC election if you are a Foundation individual member[0] that also has committed to any official project team's deliverable repositories[1] over the Sep 16, 2022 00:00 UTC - Aug 30, 2023 00:00 UTC timeframe (2023.1 to 2023.2) or if you are in the list of extra-atcs[2] for any official project team. You are eligible to vote in a PTL election if you are a Foundation individual member[0] and had a commit in one of that team's deliverable repositories[1] over the Sep 16, 2022 00:00 UTC - Aug 30, 2023 00:00 UTC timeframe (2023.1 to 2023.2) or if you are in that team's list of extra-atcs[2]. 
If you are eligible to vote in an election, you should find your email with a link to the Condorcet Internet Voting Service (CIVS) page to cast your vote in the inbox of your Gerrit preferred email[3].

What to do if you don't see the email and have a commit in at least one of the projects having an election:

* due to a new CIVS policy, to get the email from poll supervisors, you must opt in to email communication; opt in with your Gerrit email address at: https://civs1.civs.us/cgi-bin/opt_in.pl
* eligible voters who opt in to email communication late can see a pending poll invitation and vote until Sep 20, 2023 23:45 UTC.
* check the trash or spam folders of your Gerrit Preferred Email address, in case it went into trash or spam
* wait a bit and check again, in case your email server is a bit slow
* find the sha of at least one commit from the project's deliverable repos[1] and email the election officials[4].

If we can confirm that you are entitled to vote, we will add you to the voters list for the appropriate election.

Our democratic process is important to the health of OpenStack, please exercise your right to vote!

Candidate statements/platforms can be found linked to Candidate names on this page:
https://governance.openstack.org/election/

Happy voting,

[0] https://www.openstack.org/community/members/
[1] The list of the repositories eligible for electoral status: https://opendev.org/openstack/governance/src/tag/0.15.0/reference/projects.yaml
[2] Look for the extra-atcs element in [1]
[3] Sign into review.openstack.org: Go to Settings > Contact Information. Look at the email listed as your preferred email. That is where the ballot has been sent.
[4] https://governance.openstack.org/election/#election-officials

From ianyrchoi at gmail.com Thu Sep 7 00:45:19 2023
From: ianyrchoi at gmail.com (Ian Y. Choi)
Date: Thu, 7 Sep 2023 09:45:19 +0900
Subject: [i18n] New I18n SIG Core
Message-ID:

Hello,

I would like to happily announce a new i18n core: Seongsoo Cho, considering his notable contributions on I18n with volunteering:

1. PDF Docs contribution with I18n issues (font issues, latex with Asian languages) [1]
2. Zanata to Weblate migration volunteering effort [2]
3. Participation of I18n SIG PTG on March [3] and Forum on May [4]

Also, he is active on #openstack-i18n IRC channel (nick name: "seongsoocho").

All, please welcome Seongsoo as I18n-core.

I am looking forward to more active contributions from him and interactions with other wonderful I18n members and OpenStack team!

Thank you,

/Ian

[1] https://review.opendev.org/q/topic:bp%252Fbuild-pdf-from-rst-guides
[2] https://lists.openstack.org/pipermail/openstack-i18n/2022-October/003568.html
[3] https://etherpad.opendev.org/p/march2023-ptg-i18n
[4] https://etherpad.opendev.org/p/vancouver-2023-i18n-forum

From ppiyakk2 at printf.kr Thu Sep 7 01:55:44 2023
From: ppiyakk2 at printf.kr (Seongsoo Cho)
Date: Thu, 7 Sep 2023 10:55:44 +0900
Subject: [OpenStack-I18n] [i18n] New I18n SIG Core
In-Reply-To:
References:
Message-ID:

Hello OpenStack Community and i18n team

Thank you for adding me as a new i18n core team member.
As Ian said, I am now working on the i18n SIG to migrate the translation platform from zanata to weblate.

I will do my best to internationalize OpenInfra in the future.

P.S. Thanks to Ian for always supporting me.

Best regards
Seongsoo

On Thu, Sep 7, 2023 at 9:45 AM Ian Y.
Choi wrote: > > Hello, > > I would like to happily announce a new i18n core: Seongsoo Cho, > considering his notable contributions on I18n with volunteering: > > 1. PDF Docs contribution with I18n issues (font issues, latex with Asian > languages) [1] > 2. Zanata to Weblate migration volunteering effort [2] > 3. Participation of I18n SIG PTG on March [3] and Forum on May [4] > > Also, he is active on #openstack-i18n IRC channel (nick name: > "seongsoocho"). > > All, please welcome Seongsoo as I18n-core. > > I am looking forward to more active contributions from him and > interactions with other wonderful I18n members and OpenStack team! > > > Thank you, > > /Ian > > [1] https://review.opendev.org/q/topic:bp%252Fbuild-pdf-from-rst-guides > [2] > https://lists.openstack.org/pipermail/openstack-i18n/2022-October/003568.html > [3] https://etherpad.opendev.org/p/march2023-ptg-i18n > [4] https://etherpad.opendev.org/p/vancouver-2023-i18n-forum > > > _______________________________________________ > OpenStack-I18n mailing list > OpenStack-I18n at lists.openstack.org -- Seongsoo Cho OpenStack Korea User Group / Community Leader IRC #seongsoocho From noonedeadpunk at gmail.com Thu Sep 7 06:20:14 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Thu, 7 Sep 2023 08:20:14 +0200 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hey, If you are using standalone gateway hosts and compute nodes do not have access to external networks, I think it is expected that you can not bind a port from the external network to the VM. In this case access to the external network is done only through L3 routers. So the idea is the following: You attach only a private (geneve) networks to the VMs. Then, you create a neutron router, which acts as a gateway for the private network, and is attached to an external network as well. Then you create a floating IP from the external network and attach it to the port of the VM from the internal one. This way you make the external network routed via gateway hosts. Also, regarding your original issue "The task includes an option with an undefined variable. The error was: list object has no element 1" - we had similar case in IRC yesterday, and James Denton has found the workaround there by defining the br-ex bridge instead of naming it as br-flat, here was his paste that worked out for the folk: https://paste.opendev.org/show/bjw3b5ncP6dbhj34ltJU/ He also found an issue in our logic that caused this issue and has proposed a patch for that: https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/893924 -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.buregeya at bsc.rw Thu Sep 7 08:29:18 2023 From: richard.buregeya at bsc.rw (Richard Buregeya) Date: Thu, 7 Sep 2023 08:29:18 +0000 Subject: Windows Image failed to access network Message-ID: Hello Team, I created the windows image and added the N/w drivers, but it can't be able to get from DHCP. Any idea? Regards Richard. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Thu Sep 7 09:00:41 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Thu, 7 Sep 2023 11:00:41 +0200 Subject: [neutron][release] Proposing transition to EOL Train (Neutron and neutron-lib) Message-ID: Hello: I'm sending this mail in advance to propose transitioning the Neutron and the neutron-lib Train branch to EOL. 
This topic was proposed and approved during the last Neutron team meeting [1]. These are the only two Neutron related projects still in EM. The announcement is the first step [2] to transition a stable branch to EOL. The patch to mark these branches as EOL will be pushed in two weeks. If you have any inconvenience, please let me know in this mail chain or in IRC (ralonsoh, #openstack-neutron channel). You can also contact any Neutron core reviewer in the IRC channel. Regards. [2] https://meetings.opendev.org/meetings/networking/2023/networking.2023-09-05-14.00.log.html#l-144 [1] https://docs.openstack.org/project-team-guide/stable-branches.html#end-of-life -------------- next part -------------- An HTML attachment was scrubbed... URL: From udaydikshit2007 at gmail.com Thu Sep 7 09:36:24 2023 From: udaydikshit2007 at gmail.com (Uday Dikshit) Date: Thu, 7 Sep 2023 15:06:24 +0530 Subject: Windows Image failed to access network In-Reply-To: References: Message-ID: Hello Richard It will be helpful if you can share some logs from Nova and neutron. On Thu, Sep 7, 2023, 15:00 Richard Buregeya wrote: > Hello Team, > > > > I created the windows image and added the N/w drivers, but it can?t be > able to get from DHCP. > > > > Any idea? > > > > Regards > > Richard. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Thu Sep 7 12:03:27 2023 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 7 Sep 2023 08:03:27 -0400 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: I totally agreed with Sean on all his points but trust me, I have tried everything possible to tune OS, Network stack, multi-queue, NUMA, CPU pinning and name it.. but I didn't get any significant improvement. You may gain 2 to 5% gain with all those tweek. I am running the entire workload on sriov and life is happy except no LACP bonding. I am very interesting is this project https://docs.openvswitch.org/en/latest/intro/install/afxdp/ On Thu, Sep 7, 2023 at 6:07?AM Ha Noi wrote: > Dear Smoney, > > > > On Thu, Sep 7, 2023 at 12:41?AM wrote: > >> On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: >> > Damn! We have noticed the same issue around 40k to 55k PPS. Trust me >> > nothing is wrong in your config. This is just a limitation of the >> software >> > stack and kernel itself. >> its partly determined by your cpu frequency. >> kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ >> cpu. with per port troughpuyt being lower dependin on what qos/firewall >> rules that were apllied. >> >> > > My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. I think > the problem is tuning in the compute node inside. But I cannot find any > guide or best practices for it. > > > >> moving form iptables firewall to ovs firewall can help to some degree >> but your partly trading connection setup time for statead state troughput >> with the overhead of the connection tracker in ovs. >> >> using stateless security groups can help >> >> we also recently fixed a regression cause by changes in newer versions of >> ovs. 
>> this was notable in goign form rhel 8 to rhel 9 where litrally it reduced >> small packet performce to 1/10th and jumboframes to about 1/2 >> on master we have a config option that will set the default qos on a port >> to linux-noop >> >> https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 >> >> the backports are propsoed upstream >> https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 >> and we have backported this downstream to adress that performance >> regression. >> the upstram backport is semi stalled just ebcasue we wanted to disucss if >> we shoudl make ti opt in >> by default upstream while backporting but it might be helpful for you if >> this is related to yoru current >> issues. >> >> 40-55 kpps is kind of low for kernel ovs but if you have a low clockrate >> cpu, hybrid_plug + incorrect qos >> then i could see you hitting such a bottelneck. >> >> one workaround by the way without the os-vif workaround backported is to >> set >> /proc/sys/net/core/default_qdisc to not apply any qos or a low overhead >> qos type >> i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast >> >> > >> that may or may not help but i would ensure that your are not usign >> somting like fqdel or cake >> for net.core.default_qdisc and if you are try changing it to pfifo_fast >> and see if that helps. >> >> there isnet much you can do about the cpu clock rate but ^ is somethign >> you can try for free >> note it wont actully take effect on an exsitng vm if you jsut change the >> default but you can use >> tc to also chagne the qdisk for testing. hard rebooting the vm shoudl >> also make the default take effect. >> >> the only other advice i can give assuming kernel ovs is the only option >> you have is >> >> to look at >> >> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size >> >> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size >> and >> >> https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled >> >> if the bottelneck is actully in qemu or the guest kernel rather then ovs >> adjusting the rx/tx queue size and >> using multi queue can help. it will have no effect if ovs is the bottel >> neck. >> >> >> > I have set this option to 1024, and enable multiqueue as well. But it did > not help. > > >> > >> > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi wrote: >> > >> > > Hi Satish, >> > > >> > > Actually, our customer get this issue when the tx/rx above only 40k >> pps. >> > > So what is the threshold of this throughput for OvS? >> > > >> > > >> > > Thanks and regards >> > > >> > > On Wed, 6 Sep 2023 at 20:19 Satish Patel >> wrote: >> > > >> > > > Hi, >> > > > >> > > > This is normal because OVS or LinuxBridge wire up VMs using TAP >> interface >> > > > which runs on kernel space and that drives higher interrupt and >> that makes >> > > > the kernel so busy working on handling packets. Standard >> OVS/LinuxBridge >> > > > are not meant for higher PPS. >> > > > >> > > > If you want to handle higher PPS then look for DPDK or SRIOV >> deployment. >> > > > ( We are running everything in SRIOV because of high PPS >> requirement) >> > > > >> > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi >> wrote: >> > > > >> > > > > Hi everyone, >> > > > > >> > > > > I'm using Openstack Train and Openvswitch for ML2 driver and GRE >> for >> > > > > tunnel type. I tested our network performance between two VMs and >> suffer >> > > > > packet loss as below. 
>> > > > > >> > > > > VM1: IP: 10.20.1.206 >> > > > > >> > > > > VM2: IP: 10.20.1.154 >> > > > > >> > > > > VM3: IP: 10.20.1.72 >> > > > > >> > > > > >> > > > > Using iperf3 to testing performance between VM1 and VM2. >> > > > > >> > > > > Run iperf3 client and server on both VMs. >> > > > > >> > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 >> > > > > >> > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 >> > > > > >> > > > > >> > > > > >> > > > > Using VM3 ping into VM1, then the packet is lost and the latency >> is >> > > > > quite high. >> > > > > >> > > > > >> > > > > ping -i 0.1 10.20.1.206 >> > > > > >> > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >> > > > > >> > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >> > > > > >> > > > > ^C >> > > > > >> > > > > --- 10.20.1.206 ping statistics --- >> > > > > >> > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, time >> 3328ms >> > > > > >> > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >> > > > > >> > > > > >> > > > > >> > > > > Does any one get this issue ? >> > > > > >> > > > > Please help me. Thanks >> > > > > >> > > > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From remo at rm.ht Thu Sep 7 02:59:57 2023 From: remo at rm.ht (Remo Mattei) Date: Wed, 6 Sep 2023 19:59:57 -0700 Subject: [OpenStack-I18n] [i18n] New I18n SIG Core In-Reply-To: References: Message-ID: <1cf874e2-5ace-4702-91d9-ae526f3ba3d6@Canary> Well that?s nice!! Remo > On Wednesday, Sep 06, 2023 at 18:58, Seongsoo Cho wrote: > Hello OpenStack Community and i18n team > > Thank you for adding me as a new i18n core team member. > As Ian said, I am now working on the i18n SIG to migrate the > translation platform from zanata to weblate. > > I will do my best to internationalize OpenInfra in the future. > > P.S. Thanks to Ian for always supporting me. > > Best regards > Seongsoo > > On Thu, Sep 7, 2023 at 9:45?AM Ian Y. Choi wrote: > > > > Hello, > > > > I would like to happily announce a new i18n core: Seongsoo Cho, > > considering his notable contributions on I18n with volunteering: > > > > 1. PDF Docs contribution with I18n issues (font issues, latex with Asian > > languages) [1] > > 2. Zanata to Weblate migration volunteering effort [2] > > 3. Participation of I18n SIG PTG on March [3] and Forum on May [4] > > > > Also, he is active on #openstack-i18n IRC channel (nick name: > > "seongsoocho"). 
> > > > All, please welcome Seongsoo as I18n-core. > > > > I am looking forward to more active contributions from him and > > interactions with other wonderful I18n members and OpenStack team! > > > > > > Thank you, > > > > /Ian > > > > [1] https://review.opendev.org/q/topic:bp%252Fbuild-pdf-from-rst-guides > > [2] > > https://lists.openstack.org/pipermail/openstack-i18n/2022-October/003568.html > > [3] https://etherpad.opendev.org/p/march2023-ptg-i18n > > [4] https://etherpad.opendev.org/p/vancouver-2023-i18n-forum > > > > > > _______________________________________________ > > OpenStack-I18n mailing list > > OpenStack-I18n at lists.openstack.org > > > > -- > Seongsoo Cho > OpenStack Korea User Group / Community Leader > IRC #seongsoocho > > _______________________________________________ > OpenStack-I18n mailing list > OpenStack-I18n at lists.openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanoi952022 at gmail.com Thu Sep 7 10:07:45 2023 From: hanoi952022 at gmail.com (Ha Noi) Date: Thu, 7 Sep 2023 17:07:45 +0700 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: Dear Smoney, On Thu, Sep 7, 2023 at 12:41?AM wrote: > On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: > > Damn! We have noticed the same issue around 40k to 55k PPS. Trust me > > nothing is wrong in your config. This is just a limitation of the > software > > stack and kernel itself. > its partly determined by your cpu frequency. > kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ > cpu. with per port troughpuyt being lower dependin on what qos/firewall > rules that were apllied. > > My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. I think the problem is tuning in the compute node inside. But I cannot find any guide or best practices for it. > moving form iptables firewall to ovs firewall can help to some degree > but your partly trading connection setup time for statead state troughput > with the overhead of the connection tracker in ovs. > > using stateless security groups can help > > we also recently fixed a regression cause by changes in newer versions of > ovs. > this was notable in goign form rhel 8 to rhel 9 where litrally it reduced > small packet performce to 1/10th and jumboframes to about 1/2 > on master we have a config option that will set the default qos on a port > to linux-noop > > https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 > > the backports are propsoed upstream > https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 > and we have backported this downstream to adress that performance > regression. > the upstram backport is semi stalled just ebcasue we wanted to disucss if > we shoudl make ti opt in > by default upstream while backporting but it might be helpful for you if > this is related to yoru current > issues. > > 40-55 kpps is kind of low for kernel ovs but if you have a low clockrate > cpu, hybrid_plug + incorrect qos > then i could see you hitting such a bottelneck. > > one workaround by the way without the os-vif workaround backported is to > set > /proc/sys/net/core/default_qdisc to not apply any qos or a low overhead > qos type > i.e. 
sudo sysctl -w net.core.default_qdisc=pfifo_fast > > > that may or may not help but i would ensure that your are not usign > somting like fqdel or cake > for net.core.default_qdisc and if you are try changing it to pfifo_fast > and see if that helps. > > there isnet much you can do about the cpu clock rate but ^ is somethign > you can try for free > note it wont actully take effect on an exsitng vm if you jsut change the > default but you can use > tc to also chagne the qdisk for testing. hard rebooting the vm shoudl also > make the default take effect. > > the only other advice i can give assuming kernel ovs is the only option > you have is > > to look at > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size > and > > https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled > > if the bottelneck is actully in qemu or the guest kernel rather then ovs > adjusting the rx/tx queue size and > using multi queue can help. it will have no effect if ovs is the bottel > neck. > > > I have set this option to 1024, and enable multiqueue as well. But it did not help. > > > > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi wrote: > > > > > Hi Satish, > > > > > > Actually, our customer get this issue when the tx/rx above only 40k > pps. > > > So what is the threshold of this throughput for OvS? > > > > > > > > > Thanks and regards > > > > > > On Wed, 6 Sep 2023 at 20:19 Satish Patel wrote: > > > > > > > Hi, > > > > > > > > This is normal because OVS or LinuxBridge wire up VMs using TAP > interface > > > > which runs on kernel space and that drives higher interrupt and that > makes > > > > the kernel so busy working on handling packets. Standard > OVS/LinuxBridge > > > > are not meant for higher PPS. > > > > > > > > If you want to handle higher PPS then look for DPDK or SRIOV > deployment. > > > > ( We are running everything in SRIOV because of high PPS requirement) > > > > > > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi > wrote: > > > > > > > > > Hi everyone, > > > > > > > > > > I'm using Openstack Train and Openvswitch for ML2 driver and GRE > for > > > > > tunnel type. I tested our network performance between two VMs and > suffer > > > > > packet loss as below. > > > > > > > > > > VM1: IP: 10.20.1.206 > > > > > > > > > > VM2: IP: 10.20.1.154 > > > > > > > > > > VM3: IP: 10.20.1.72 > > > > > > > > > > > > > > > Using iperf3 to testing performance between VM1 and VM2. > > > > > > > > > > Run iperf3 client and server on both VMs. > > > > > > > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 > > > > > > > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 > > > > > > > > > > > > > > > > > > > > Using VM3 ping into VM1, then the packet is lost and the latency is > > > > > quite high. > > > > > > > > > > > > > > > ping -i 0.1 10.20.1.206 > > > > > > > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. 
> > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms > > > > > > > > > > ^C > > > > > > > > > > --- 10.20.1.206 ping statistics --- > > > > > > > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, time > 3328ms > > > > > > > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms > > > > > > > > > > > > > > > > > > > > Does any one get this issue ? > > > > > > > > > > Please help me. Thanks > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanoi952022 at gmail.com Thu Sep 7 12:06:02 2023 From: hanoi952022 at gmail.com (Ha Noi) Date: Thu, 7 Sep 2023 19:06:02 +0700 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: Hi Satish, Why dont you use DPDK? Thanks On Thu, 7 Sep 2023 at 19:03 Satish Patel wrote: > I totally agreed with Sean on all his points but trust me, I have tried > everything possible to tune OS, Network stack, multi-queue, NUMA, CPU > pinning and name it.. but I didn't get any significant improvement. You may > gain 2 to 5% gain with all those tweek. I am running the entire workload on > sriov and life is happy except no LACP bonding. > > I am very interesting is this project > https://docs.openvswitch.org/en/latest/intro/install/afxdp/ > > On Thu, Sep 7, 2023 at 6:07?AM Ha Noi wrote: > >> Dear Smoney, >> >> >> >> On Thu, Sep 7, 2023 at 12:41?AM wrote: >> >>> On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: >>> > Damn! We have noticed the same issue around 40k to 55k PPS. Trust me >>> > nothing is wrong in your config. This is just a limitation of the >>> software >>> > stack and kernel itself. >>> its partly determined by your cpu frequency. >>> kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ >>> cpu. with per port troughpuyt being lower dependin on what qos/firewall >>> rules that were apllied. >>> >>> >> >> My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. I think >> the problem is tuning in the compute node inside. But I cannot find any >> guide or best practices for it. >> >> >> >>> moving form iptables firewall to ovs firewall can help to some degree >>> but your partly trading connection setup time for statead state troughput >>> with the overhead of the connection tracker in ovs. 
>>> >>> using stateless security groups can help >>> >>> we also recently fixed a regression cause by changes in newer versions >>> of ovs. >>> this was notable in goign form rhel 8 to rhel 9 where litrally it reduced >>> small packet performce to 1/10th and jumboframes to about 1/2 >>> on master we have a config option that will set the default qos on a >>> port to linux-noop >>> >>> https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 >>> >>> the backports are propsoed upstream >>> https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 >>> and we have backported this downstream to adress that performance >>> regression. >>> the upstram backport is semi stalled just ebcasue we wanted to disucss >>> if we shoudl make ti opt in >>> by default upstream while backporting but it might be helpful for you if >>> this is related to yoru current >>> issues. >>> >>> 40-55 kpps is kind of low for kernel ovs but if you have a low clockrate >>> cpu, hybrid_plug + incorrect qos >>> then i could see you hitting such a bottelneck. >>> >>> one workaround by the way without the os-vif workaround backported is to >>> set >>> /proc/sys/net/core/default_qdisc to not apply any qos or a low overhead >>> qos type >>> i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast >>> >>> >> >>> that may or may not help but i would ensure that your are not usign >>> somting like fqdel or cake >>> for net.core.default_qdisc and if you are try changing it to pfifo_fast >>> and see if that helps. >>> >>> there isnet much you can do about the cpu clock rate but ^ is somethign >>> you can try for free >>> note it wont actully take effect on an exsitng vm if you jsut change the >>> default but you can use >>> tc to also chagne the qdisk for testing. hard rebooting the vm shoudl >>> also make the default take effect. >>> >>> the only other advice i can give assuming kernel ovs is the only option >>> you have is >>> >>> to look at >>> >>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size >>> >>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size >>> and >>> >>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled >>> >>> if the bottelneck is actully in qemu or the guest kernel rather then ovs >>> adjusting the rx/tx queue size and >>> using multi queue can help. it will have no effect if ovs is the bottel >>> neck. >>> >>> >>> >> I have set this option to 1024, and enable multiqueue as well. But it did >> not help. >> >> >>> > >>> > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi wrote: >>> > >>> > > Hi Satish, >>> > > >>> > > Actually, our customer get this issue when the tx/rx above only 40k >>> pps. >>> > > So what is the threshold of this throughput for OvS? >>> > > >>> > > >>> > > Thanks and regards >>> > > >>> > > On Wed, 6 Sep 2023 at 20:19 Satish Patel >>> wrote: >>> > > >>> > > > Hi, >>> > > > >>> > > > This is normal because OVS or LinuxBridge wire up VMs using TAP >>> interface >>> > > > which runs on kernel space and that drives higher interrupt and >>> that makes >>> > > > the kernel so busy working on handling packets. Standard >>> OVS/LinuxBridge >>> > > > are not meant for higher PPS. >>> > > > >>> > > > If you want to handle higher PPS then look for DPDK or SRIOV >>> deployment. 
>>> > > > ( We are running everything in SRIOV because of high PPS >>> requirement) >>> > > > >>> > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi >>> wrote: >>> > > > >>> > > > > Hi everyone, >>> > > > > >>> > > > > I'm using Openstack Train and Openvswitch for ML2 driver and GRE >>> for >>> > > > > tunnel type. I tested our network performance between two VMs >>> and suffer >>> > > > > packet loss as below. >>> > > > > >>> > > > > VM1: IP: 10.20.1.206 >>> > > > > >>> > > > > VM2: IP: 10.20.1.154 >>> > > > > >>> > > > > VM3: IP: 10.20.1.72 >>> > > > > >>> > > > > >>> > > > > Using iperf3 to testing performance between VM1 and VM2. >>> > > > > >>> > > > > Run iperf3 client and server on both VMs. >>> > > > > >>> > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 >>> > > > > >>> > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 >>> > > > > >>> > > > > >>> > > > > >>> > > > > Using VM3 ping into VM1, then the packet is lost and the latency >>> is >>> > > > > quite high. >>> > > > > >>> > > > > >>> > > > > ping -i 0.1 10.20.1.206 >>> > > > > >>> > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >>> > > > > >>> > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >>> > > > > >>> > > > > ^C >>> > > > > >>> > > > > --- 10.20.1.206 ping statistics --- >>> > > > > >>> > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, time >>> 3328ms >>> > > > > >>> > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >>> > > > > >>> > > > > >>> > > > > >>> > > > > Does any one get this issue ? >>> > > > > >>> > > > > Please help me. Thanks >>> > > > > >>> > > > >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pdeore at redhat.com Thu Sep 7 13:30:21 2023 From: pdeore at redhat.com (Pranali Deore) Date: Thu, 7 Sep 2023 19:00:21 +0530 Subject: [Glance] Cancelling Weekly Meeting 7th Sept Message-ID: Hello, I won't be able to chair today's meeting, my kid is not well, going to take him to the doctor. so cancelling the meeting for this week. Let's meet next week ! Thanks, Pranali D -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Thu Sep 7 15:13:16 2023 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 7 Sep 2023 11:13:16 -0400 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: Because DPDK required DPDK support inside guest VM. It's not suitable for general purpose workload. You need your guest VM network to support DPDK to get 100% throughput. On Thu, Sep 7, 2023 at 8:06?AM Ha Noi wrote: > Hi Satish, > > Why dont you use DPDK? > > Thanks > > On Thu, 7 Sep 2023 at 19:03 Satish Patel wrote: > >> I totally agreed with Sean on all his points but trust me, I have tried >> everything possible to tune OS, Network stack, multi-queue, NUMA, CPU >> pinning and name it.. but I didn't get any significant improvement. You may >> gain 2 to 5% gain with all those tweek. I am running the entire workload on >> sriov and life is happy except no LACP bonding. >> >> I am very interesting is this project >> https://docs.openvswitch.org/en/latest/intro/install/afxdp/ >> >> On Thu, Sep 7, 2023 at 6:07?AM Ha Noi wrote: >> >>> Dear Smoney, >>> >>> >>> >>> On Thu, Sep 7, 2023 at 12:41?AM wrote: >>> >>>> On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: >>>> > Damn! We have noticed the same issue around 40k to 55k PPS. Trust me >>>> > nothing is wrong in your config. This is just a limitation of the >>>> software >>>> > stack and kernel itself. >>>> its partly determined by your cpu frequency. >>>> kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ >>>> cpu. with per port troughpuyt being lower dependin on what qos/firewall >>>> rules that were apllied. >>>> >>>> >>> >>> My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. I >>> think the problem is tuning in the compute node inside. But I cannot find >>> any guide or best practices for it. >>> >>> >>> >>>> moving form iptables firewall to ovs firewall can help to some degree >>>> but your partly trading connection setup time for statead state >>>> troughput >>>> with the overhead of the connection tracker in ovs. >>>> >>>> using stateless security groups can help >>>> >>>> we also recently fixed a regression cause by changes in newer versions >>>> of ovs. >>>> this was notable in goign form rhel 8 to rhel 9 where litrally it >>>> reduced >>>> small packet performce to 1/10th and jumboframes to about 1/2 >>>> on master we have a config option that will set the default qos on a >>>> port to linux-noop >>>> >>>> https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 >>>> >>>> the backports are propsoed upstream >>>> https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 >>>> and we have backported this downstream to adress that performance >>>> regression. >>>> the upstram backport is semi stalled just ebcasue we wanted to disucss >>>> if we shoudl make ti opt in >>>> by default upstream while backporting but it might be helpful for you >>>> if this is related to yoru current >>>> issues. >>>> >>>> 40-55 kpps is kind of low for kernel ovs but if you have a low >>>> clockrate cpu, hybrid_plug + incorrect qos >>>> then i could see you hitting such a bottelneck. >>>> >>>> one workaround by the way without the os-vif workaround backported is >>>> to set >>>> /proc/sys/net/core/default_qdisc to not apply any qos or a low overhead >>>> qos type >>>> i.e. 
sudo sysctl -w net.core.default_qdisc=pfifo_fast >>>> >>>> >>> >>>> that may or may not help but i would ensure that your are not usign >>>> somting like fqdel or cake >>>> for net.core.default_qdisc and if you are try changing it to pfifo_fast >>>> and see if that helps. >>>> >>>> there isnet much you can do about the cpu clock rate but ^ is somethign >>>> you can try for free >>>> note it wont actully take effect on an exsitng vm if you jsut change >>>> the default but you can use >>>> tc to also chagne the qdisk for testing. hard rebooting the vm shoudl >>>> also make the default take effect. >>>> >>>> the only other advice i can give assuming kernel ovs is the only option >>>> you have is >>>> >>>> to look at >>>> >>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size >>>> >>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size >>>> and >>>> >>>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled >>>> >>>> if the bottelneck is actully in qemu or the guest kernel rather then >>>> ovs adjusting the rx/tx queue size and >>>> using multi queue can help. it will have no effect if ovs is the bottel >>>> neck. >>>> >>>> >>>> >>> I have set this option to 1024, and enable multiqueue as well. But it >>> did not help. >>> >>> >>>> > >>>> > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi wrote: >>>> > >>>> > > Hi Satish, >>>> > > >>>> > > Actually, our customer get this issue when the tx/rx above only 40k >>>> pps. >>>> > > So what is the threshold of this throughput for OvS? >>>> > > >>>> > > >>>> > > Thanks and regards >>>> > > >>>> > > On Wed, 6 Sep 2023 at 20:19 Satish Patel >>>> wrote: >>>> > > >>>> > > > Hi, >>>> > > > >>>> > > > This is normal because OVS or LinuxBridge wire up VMs using TAP >>>> interface >>>> > > > which runs on kernel space and that drives higher interrupt and >>>> that makes >>>> > > > the kernel so busy working on handling packets. Standard >>>> OVS/LinuxBridge >>>> > > > are not meant for higher PPS. >>>> > > > >>>> > > > If you want to handle higher PPS then look for DPDK or SRIOV >>>> deployment. >>>> > > > ( We are running everything in SRIOV because of high PPS >>>> requirement) >>>> > > > >>>> > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi >>>> wrote: >>>> > > > >>>> > > > > Hi everyone, >>>> > > > > >>>> > > > > I'm using Openstack Train and Openvswitch for ML2 driver and >>>> GRE for >>>> > > > > tunnel type. I tested our network performance between two VMs >>>> and suffer >>>> > > > > packet loss as below. >>>> > > > > >>>> > > > > VM1: IP: 10.20.1.206 >>>> > > > > >>>> > > > > VM2: IP: 10.20.1.154 >>>> > > > > >>>> > > > > VM3: IP: 10.20.1.72 >>>> > > > > >>>> > > > > >>>> > > > > Using iperf3 to testing performance between VM1 and VM2. >>>> > > > > >>>> > > > > Run iperf3 client and server on both VMs. >>>> > > > > >>>> > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 >>>> > > > > >>>> > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 >>>> > > > > >>>> > > > > >>>> > > > > >>>> > > > > Using VM3 ping into VM1, then the packet is lost and the >>>> latency is >>>> > > > > quite high. >>>> > > > > >>>> > > > > >>>> > > > > ping -i 0.1 10.20.1.206 >>>> > > > > >>>> > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. 
>>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >>>> > > > > >>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >>>> > > > > >>>> > > > > ^C >>>> > > > > >>>> > > > > --- 10.20.1.206 ping statistics --- >>>> > > > > >>>> > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, time >>>> 3328ms >>>> > > > > >>>> > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >>>> > > > > >>>> > > > > >>>> > > > > >>>> > > > > Does any one get this issue ? >>>> > > > > >>>> > > > > Please help me. Thanks >>>> > > > > >>>> > > > >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Thu Sep 7 15:35:47 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 07 Sep 2023 08:35:47 -0700 Subject: [OpenStack-I18n] [i18n] New I18n SIG Core In-Reply-To: References: Message-ID: <18a70490102.ff8e772d643566.6427525976911462058@ghanshyammann.com> Thanks, Seongsoo, for help here and Ian too -gmann ---- On Wed, 06 Sep 2023 18:55:44 -0700 Seongsoo Cho wrote --- > Hello OpenStack Community and i18n team > > Thank you for adding me as a new i18n core team member. > As Ian said, I am now working on the i18n SIG to migrate the > translation platform from zanata to weblate. > > I will do my best to internationalize OpenInfra in the future. > > P.S. Thanks to Ian for always supporting me. > > Best regards > Seongsoo > > On Thu, Sep 7, 2023 at 9:45?AM Ian Y. Choi ianyrchoi at gmail.com> wrote: > > > > Hello, > > > > I would like to happily announce a new i18n core: Seongsoo Cho, > > considering his notable contributions on I18n with volunteering: > > > > 1. PDF Docs contribution with I18n issues (font issues, latex with Asian > > languages) [1] > > 2. Zanata to Weblate migration volunteering effort [2] > > 3. Participation of I18n SIG PTG on March [3] and Forum on May [4] > > > > Also, he is active on #openstack-i18n IRC channel (nick name: > > "seongsoocho"). > > > > All, please welcome Seongsoo as I18n-core. > > > > I am looking forward to more active contributions from him and > > interactions with other wonderful I18n members and OpenStack team! 
> > > > > > Thank you, > > > > /Ian > > > > [1] https://review.opendev.org/q/topic:bp%252Fbuild-pdf-from-rst-guides > > [2] > > https://lists.openstack.org/pipermail/openstack-i18n/2022-October/003568.html > > [3] https://etherpad.opendev.org/p/march2023-ptg-i18n > > [4] https://etherpad.opendev.org/p/vancouver-2023-i18n-forum > > > > > > _______________________________________________ > > OpenStack-I18n mailing list > > OpenStack-I18n at lists.openstack.org > > > > -- > Seongsoo Cho > OpenStack Korea User Group / Community Leader > IRC #seongsoocho > > From gmann at ghanshyammann.com Thu Sep 7 16:03:52 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 07 Sep 2023 09:03:52 -0700 Subject: [all][tc] python 3.11 testing plan In-Reply-To: <18a19500dda.1179e4341294220.8475553822244831650@ghanshyammann.com> References: <18a19500dda.1179e4341294220.8475553822244831650@ghanshyammann.com> Message-ID: <18a7062b69d.d3f024e6645938.1764325247310912451@ghanshyammann.com> ---- On Mon, 21 Aug 2023 11:16:31 -0700 Ghanshyam Mann wrote --- > Hi All, > > Some of you are part of discussion for python 3.11 testing but If you are not aware of it, > below is the plan for python 3.11 testing in OpenStack. > > Non voting in 2023.2 > ------------------------- > You might have seen that python 3.11 job is now running as non voting in all projects[1]. > Idea is to run it as non voting for this (2023.2) cycle which will give projects to fix the issue and make > it green. As it is running on debian (frickler mentioned the reason of running it in debian in gerrit[2]), it > need some changes in bindep.txt file to pass. Here is the example of fix[3] which you can do in your > project also. Hello Everyone, Many projects still need to fix the py3.11 job[1]. I started fixing a few of them, so changes are up for review of those projects. NOTE: The deadline to fix is the 2023.2 release (Oct 6th); after that, this job will become voting on the master (2024.1 dev cycle but remain non-voting on stable/2023.2) and will block the master gate. [1] https://zuul.openstack.org/builds?job_name=openstack-tox-py311+&result=RETRY_LIMIT&result=RETRY&result=CONFIG_ERROR&result=FAILURE&skip=0&limit=100 -gmann > > Voting in 2024.1 > -------------------- > In next cycle (2024.1), I am proposing to make py3.11 testing mandatory [4] and voting (automatically > via common python job template). You need to fix the failure in this cycle otherwise it will block the > gate once the next cycle development start (basically once 891238 is merged). > > [1] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/891227/5 > [2] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/891146/1 > [3] https://review.opendev.org/c/openstack/nova/+/891256 > [4] https://review.opendev.org/c/openstack/governance/+/891225 > [5] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/891238 > > -gmann > > From mihalis68 at gmail.com Thu Sep 7 17:26:08 2023 From: mihalis68 at gmail.com (Chris Morgan) Date: Thu, 7 Sep 2023 13:26:08 -0400 Subject: [ops] Message-ID: I'm trying to reanimate the openstack ops meetups. I moved the social media account used for notifications from X to mastodon (using the fosstodon instance). Please follow there if interested (link below). I also posted a similar announcement to the openstack group on linkedin. One of the first tasks (besides reforming the team) is finding which communications channels actually work. 
https://fosstodon.org/@osopsmeetup The old account was here https://twitter.com/osopsmeetup now "sepia-tinted" because it's just historical. Cheers, Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From roger.riverac at gmail.com Thu Sep 7 17:30:03 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Thu, 7 Sep 2023 13:30:03 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hello Dimitry, Thanks again for your help. Unfortunately, we've tried everything that's been suggested to no avail. And it seems plausible that external connectivity will not be achieved on the compute nodes if there are no bridges mapped to the external network on those hosts. Keep in mind these compute hosts do not have the ens2 physical interface to bind the ext-br or br-flat bridges to. Having said that, we would have loved to see a complete OVN scenario reference configuration with dedicated networking/gateway nodes. The documentation we have reviewed assumes compute nodes as gateways and that bridges can be set up on compute nodes, which is not our case. We are relying 100% on a single L3 interface on compute nodes with GENEVE as a tunneling protocol. And it is because of GENEVE that private east/west traffic works without a problem. Only networking nodes have that second ens2 network interface that physically connects to the external network, hence the need to make those chassis as gateway nodes. Again, our setup has the following configuration: -Compute nodes with x1 L3 NIC and IP. -Network/gateway nodes with x1 L3 NIC and x1 L2 NIC with connection to external network. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Thu Sep 7 17:45:59 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Thu, 7 Sep 2023 19:45:59 +0200 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: I'm not a huge expert in OVN, but I believe this specific part works in pretty much the same way for OVS and LXB. We have exactly same usecase as you do, but with OVS for now. And the only way to get external connectivity is to create neutron router, which will be used as a gateway to public networks. And router should be created on OVN gateway nodes from what I know. So your VMs always have only geneve network, that is passed inside the router, and then router connected to external network on gateway nodes. Floating IP is kind of 1-to-1 NAT on the router, which allows to access your VM through external network (and router). Attaching public network to the VM directly in your scenario should not be possible by design. Feel free to join us on #openstack-ansible channel on OFTC IRC network and we will be glad to answer your questions. On Thu, Sep 7, 2023, 19:30 Roger Rivera wrote: > Hello Dimitry, > > Thanks again for your help. Unfortunately, we've tried everything that's > been suggested to no avail. And it seems plausible that external > connectivity will not be achieved on the compute nodes if there are no > bridges mapped to the external network on those hosts. Keep in mind these > compute hosts do not have the ens2 physical interface to bind the ext-br or > br-flat bridges to. 
> > Having said that, we would have loved to see a complete OVN scenario > reference configuration with dedicated networking/gateway nodes. > > The documentation we have reviewed assumes compute nodes as gateways and > that bridges can be set up on compute nodes, which is not our case. We are > relying 100% on a single L3 interface on compute nodes with GENEVE as a > tunneling protocol. And it is because of GENEVE that private east/west > traffic works without a problem. > > Only networking nodes have that second ens2 network interface that > physically connects to the external network, hence the need to make those > chassis as gateway nodes. > > Again, our setup has the following configuration: > > -Compute nodes with x1 L3 NIC and IP. > -Network/gateway nodes with x1 L3 NIC and x1 L2 NIC with connection to > external network. > > > Thank you. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.denton at rackspace.com Thu Sep 7 19:17:48 2023 From: james.denton at rackspace.com (James Denton) Date: Thu, 7 Sep 2023 19:17:48 +0000 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hi Roger, I believe there is an expectation that if a compute node will host a router or instance connected to a VLAN (provider or tenant network), it should have the provider network interface plumbed to it (and mapped accordingly). On the compute, you can get look at the external_ids field of the ?ovs-vsctl list open_vswitch? output and see ovn-bridge-mappings populated. If it?s also a gateway node, you?d see ?ovn-cms-options=enable-chassis-as-gw?. The consensus among those I?ve talked to in the past is that network nodes should be gateway nodes, rather than enabling the compute nodes to also be gateway nodes. Others might feel differently. There are some things you can do with the neutron provider setup in OSA to treat network/gateway nodes differently from compute nodes from a plumbing POV; heterogenous vs homogenous network and bridge configuration. This doc, https://docs.openstack.org/openstack-ansible/latest/user/prod/provnet_groups.html, might help ? but don?t hesitate to ask for more help if that?s what you?re looking for. -- James Denton Principal Architect Rackspace Private Cloud - OpenStack james.denton at rackspace.com From: Dmitriy Rabotyagov Date: Thursday, September 7, 2023 at 12:46 PM To: Roger Rivera Cc: openstack-discuss Subject: Re: [openstack-ansible] Dedicated gateway hosts not working with OVN CAUTION: This message originated externally, please use caution when clicking on links or opening attachments! I'm not a huge expert in OVN, but I believe this specific part works in pretty much the same way for OVS and LXB. We have exactly same usecase as you do, but with OVS for now. And the only way to get external connectivity is to create neutron router, which will be used as a gateway to public networks. And router should be created on OVN gateway nodes from what I know. So your VMs always have only geneve network, that is passed inside the router, and then router connected to external network on gateway nodes. Floating IP is kind of 1-to-1 NAT on the router, which allows to access your VM through external network (and router). Attaching public network to the VM directly in your scenario should not be possible by design. Feel free to join us on #openstack-ansible channel on OFTC IRC network and we will be glad to answer your questions. 
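To make the workflow described above concrete, here is a minimal CLI sketch (the network names, the "physnet1" provider physical network, the VLAN segment and the address ranges are illustrative assumptions, not values taken from this deployment):

  # External (provider) network and subnet, handled by the gateway nodes
  openstack network create --external --provider-network-type vlan \
      --provider-physical-network physnet1 --provider-segment 100 public
  openstack subnet create --network public --no-dhcp \
      --subnet-range 203.0.113.0/24 --gateway 203.0.113.1 public-subnet

  # Tenant (geneve) network that the VMs actually attach to
  openstack network create private
  openstack subnet create --network private --subnet-range 192.0.2.0/24 private-subnet

  # Router provides external connectivity; its gateway port typically lands
  # on a chassis flagged with ovn-cms-options=enable-chassis-as-gw
  openstack router create router1
  openstack router set --external-gateway public router1
  openstack router add subnet router1 private-subnet

  # Floating IP = 1-to-1 NAT on the router towards the VM
  openstack floating ip create public
  openstack server add floating ip <server> <floating-ip-address>

The instance itself only ever sees the geneve-backed private network; the public network is attached to the router, never to the VM port.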
On Thu, Sep 7, 2023, 19:30 Roger Rivera > wrote: Hello Dimitry, Thanks again for your help. Unfortunately, we've tried everything that's been suggested to no avail. And it seems plausible that external connectivity will not be achieved on the compute nodes if there are no bridges mapped to the external network on those hosts. Keep in mind these compute hosts do not have the ens2 physical interface to bind the ext-br or br-flat bridges to. Having said that, we would have loved to see a complete OVN scenario reference configuration with dedicated networking/gateway nodes. The documentation we have reviewed assumes compute nodes as gateways and that bridges can be set up on compute nodes, which is not our case. We are relying 100% on a single L3 interface on compute nodes with GENEVE as a tunneling protocol. And it is because of GENEVE that private east/west traffic works without a problem. Only networking nodes have that second ens2 network interface that physically connects to the external network, hence the need to make those chassis as gateway nodes. Again, our setup has the following configuration: -Compute nodes with x1 L3 NIC and IP. -Network/gateway nodes with x1 L3 NIC and x1 L2 NIC with connection to external network. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanoi952022 at gmail.com Thu Sep 7 21:27:42 2023 From: hanoi952022 at gmail.com (Ha Noi) Date: Fri, 8 Sep 2023 04:27:42 +0700 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: Oh. I heard from someone on the reddit said that Ovs-dpdk is transparent with user? So It?s not correct? On Thu, 7 Sep 2023 at 22:13 Satish Patel wrote: > Because DPDK required DPDK support inside guest VM. It's not suitable for > general purpose workload. You need your guest VM network to support DPDK to > get 100% throughput. > > On Thu, Sep 7, 2023 at 8:06?AM Ha Noi wrote: > >> Hi Satish, >> >> Why dont you use DPDK? >> >> Thanks >> >> On Thu, 7 Sep 2023 at 19:03 Satish Patel wrote: >> >>> I totally agreed with Sean on all his points but trust me, I have tried >>> everything possible to tune OS, Network stack, multi-queue, NUMA, CPU >>> pinning and name it.. but I didn't get any significant improvement. You may >>> gain 2 to 5% gain with all those tweek. I am running the entire workload on >>> sriov and life is happy except no LACP bonding. >>> >>> I am very interesting is this project >>> https://docs.openvswitch.org/en/latest/intro/install/afxdp/ >>> >>> On Thu, Sep 7, 2023 at 6:07?AM Ha Noi wrote: >>> >>>> Dear Smoney, >>>> >>>> >>>> >>>> On Thu, Sep 7, 2023 at 12:41?AM wrote: >>>> >>>>> On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: >>>>> > Damn! We have noticed the same issue around 40k to 55k PPS. Trust me >>>>> > nothing is wrong in your config. This is just a limitation of the >>>>> software >>>>> > stack and kernel itself. >>>>> its partly determined by your cpu frequency. >>>>> kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ >>>>> cpu. with per port troughpuyt being lower dependin on what qos/firewall >>>>> rules that were apllied. >>>>> >>>>> >>>> >>>> My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. I >>>> think the problem is tuning in the compute node inside. But I cannot find >>>> any guide or best practices for it. 
>>>> >>>> >>>> >>>>> moving form iptables firewall to ovs firewall can help to some degree >>>>> but your partly trading connection setup time for statead state >>>>> troughput >>>>> with the overhead of the connection tracker in ovs. >>>>> >>>>> using stateless security groups can help >>>>> >>>>> we also recently fixed a regression cause by changes in newer versions >>>>> of ovs. >>>>> this was notable in goign form rhel 8 to rhel 9 where litrally it >>>>> reduced >>>>> small packet performce to 1/10th and jumboframes to about 1/2 >>>>> on master we have a config option that will set the default qos on a >>>>> port to linux-noop >>>>> >>>>> https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 >>>>> >>>>> the backports are propsoed upstream >>>>> https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 >>>>> and we have backported this downstream to adress that performance >>>>> regression. >>>>> the upstram backport is semi stalled just ebcasue we wanted to disucss >>>>> if we shoudl make ti opt in >>>>> by default upstream while backporting but it might be helpful for you >>>>> if this is related to yoru current >>>>> issues. >>>>> >>>>> 40-55 kpps is kind of low for kernel ovs but if you have a low >>>>> clockrate cpu, hybrid_plug + incorrect qos >>>>> then i could see you hitting such a bottelneck. >>>>> >>>>> one workaround by the way without the os-vif workaround backported is >>>>> to set >>>>> /proc/sys/net/core/default_qdisc to not apply any qos or a low >>>>> overhead qos type >>>>> i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast >>>>> >>>>> >>>> >>>>> that may or may not help but i would ensure that your are not usign >>>>> somting like fqdel or cake >>>>> for net.core.default_qdisc and if you are try changing it to >>>>> pfifo_fast and see if that helps. >>>>> >>>>> there isnet much you can do about the cpu clock rate but ^ is >>>>> somethign you can try for free >>>>> note it wont actully take effect on an exsitng vm if you jsut change >>>>> the default but you can use >>>>> tc to also chagne the qdisk for testing. hard rebooting the vm shoudl >>>>> also make the default take effect. >>>>> >>>>> the only other advice i can give assuming kernel ovs is the only >>>>> option you have is >>>>> >>>>> to look at >>>>> >>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size >>>>> >>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size >>>>> and >>>>> >>>>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled >>>>> >>>>> if the bottelneck is actully in qemu or the guest kernel rather then >>>>> ovs adjusting the rx/tx queue size and >>>>> using multi queue can help. it will have no effect if ovs is the >>>>> bottel neck. >>>>> >>>>> >>>>> >>>> I have set this option to 1024, and enable multiqueue as well. But it >>>> did not help. >>>> >>>> >>>>> > >>>>> > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi wrote: >>>>> > >>>>> > > Hi Satish, >>>>> > > >>>>> > > Actually, our customer get this issue when the tx/rx above only >>>>> 40k pps. >>>>> > > So what is the threshold of this throughput for OvS? 
>>>>> > > >>>>> > > >>>>> > > Thanks and regards >>>>> > > >>>>> > > On Wed, 6 Sep 2023 at 20:19 Satish Patel >>>>> wrote: >>>>> > > >>>>> > > > Hi, >>>>> > > > >>>>> > > > This is normal because OVS or LinuxBridge wire up VMs using TAP >>>>> interface >>>>> > > > which runs on kernel space and that drives higher interrupt and >>>>> that makes >>>>> > > > the kernel so busy working on handling packets. Standard >>>>> OVS/LinuxBridge >>>>> > > > are not meant for higher PPS. >>>>> > > > >>>>> > > > If you want to handle higher PPS then look for DPDK or SRIOV >>>>> deployment. >>>>> > > > ( We are running everything in SRIOV because of high PPS >>>>> requirement) >>>>> > > > >>>>> > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi >>>>> wrote: >>>>> > > > >>>>> > > > > Hi everyone, >>>>> > > > > >>>>> > > > > I'm using Openstack Train and Openvswitch for ML2 driver and >>>>> GRE for >>>>> > > > > tunnel type. I tested our network performance between two VMs >>>>> and suffer >>>>> > > > > packet loss as below. >>>>> > > > > >>>>> > > > > VM1: IP: 10.20.1.206 >>>>> > > > > >>>>> > > > > VM2: IP: 10.20.1.154 >>>>> > > > > >>>>> > > > > VM3: IP: 10.20.1.72 >>>>> > > > > >>>>> > > > > >>>>> > > > > Using iperf3 to testing performance between VM1 and VM2. >>>>> > > > > >>>>> > > > > Run iperf3 client and server on both VMs. >>>>> > > > > >>>>> > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 >>>>> > > > > >>>>> > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 >>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> > > > > Using VM3 ping into VM1, then the packet is lost and the >>>>> latency is >>>>> > > > > quite high. >>>>> > > > > >>>>> > > > > >>>>> > > > > ping -i 0.1 10.20.1.206 >>>>> > > > > >>>>> > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >>>>> > > > > >>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >>>>> > > > > >>>>> > > > > ^C >>>>> > > > > >>>>> > > > > --- 10.20.1.206 ping statistics --- >>>>> > > > > >>>>> > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, >>>>> time 3328ms >>>>> > > > > >>>>> > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> > > > > Does any one get this issue ? >>>>> > > > > >>>>> > > > > Please help me. Thanks >>>>> > > > > >>>>> > > > >>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From roger.riverac at gmail.com Thu Sep 7 23:08:37 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Thu, 7 Sep 2023 19:08:37 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hello James, Thank you for the information. Unfortunately nothing seems to work in this scenario/environment. We have worked on deployments where compute nodes have a direct interface to external networks. but this dedicated network/gateway scenario has proven difficult to implement with OVN. The main issue is displayed when we remove all bridge mappings from compute nodes (neutron_ovn_controller host group). Leaving the provider network mappings exclusively on the network/gateway hosts (neutron_ovn_gateway host group). Upon attempting to attach an interface to the external network, neutron complains about a PortBindingFailed error. Our relevant configuration files/content: https://paste.opendev.org/show/821535/ I have run out of ideas here. Any help would be appreciated. Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri Sep 8 01:53:38 2023 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 7 Sep 2023 21:53:38 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hi Roger, Since last 1 week I have been watching this thread and it just keeps getting interesting. As James and Dimitry mentioned earlier that wherever "ovn-cms-options=enable-chassis-as-gw" is set that node will act like gateway of all VM and It will handle all external in/out traffic. Can you post output of ovs-vsctl show and ovs-vsctl list Open_vSwitch of your gateway node? I will see if I can set up my lab to create a scenario similar to you to untangle this mistry. -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri Sep 8 01:57:36 2023 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 7 Sep 2023 21:57:36 -0400 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: I would say let's run your same benchmark with OVS-DPDK and tell me if you see better performance. I doubt you will see significant performance boot but lets see. Please prove me wrong :) On Thu, Sep 7, 2023 at 9:45?PM Ha Noi wrote: > Hi Satish, > > Actually, the guess interface is not using tap anymore. > > > > mode='server'/> > > > >
function='0x0'/> > > > It's totally bypass the kernel stack ? > > > > > On Fri, Sep 8, 2023 at 5:02?AM Satish Patel wrote: > >> I did test OVS-DPDK and it helps offload the packet process on compute >> nodes, But what about VMs it will still use a tap interface to attach from >> compute to vm and bottleneck will be in vm. I strongly believe that we have >> to run DPDK based guest to pass through the kernel stack. >> >> I love to hear from other people if I am missing something here. >> >> On Thu, Sep 7, 2023 at 5:27?PM Ha Noi wrote: >> >>> Oh. I heard from someone on the reddit said that Ovs-dpdk is transparent >>> with user? >>> >>> So It?s not correct? >>> >>> On Thu, 7 Sep 2023 at 22:13 Satish Patel wrote: >>> >>>> Because DPDK required DPDK support inside guest VM. It's not >>>> suitable for general purpose workload. You need your guest VM network to >>>> support DPDK to get 100% throughput. >>>> >>>> On Thu, Sep 7, 2023 at 8:06?AM Ha Noi wrote: >>>> >>>>> Hi Satish, >>>>> >>>>> Why dont you use DPDK? >>>>> >>>>> Thanks >>>>> >>>>> On Thu, 7 Sep 2023 at 19:03 Satish Patel wrote: >>>>> >>>>>> I totally agreed with Sean on all his points but trust me, I have >>>>>> tried everything possible to tune OS, Network stack, multi-queue, NUMA, CPU >>>>>> pinning and name it.. but I didn't get any significant improvement. You may >>>>>> gain 2 to 5% gain with all those tweek. I am running the entire workload on >>>>>> sriov and life is happy except no LACP bonding. >>>>>> >>>>>> I am very interesting is this project >>>>>> https://docs.openvswitch.org/en/latest/intro/install/afxdp/ >>>>>> >>>>>> On Thu, Sep 7, 2023 at 6:07?AM Ha Noi wrote: >>>>>> >>>>>>> Dear Smoney, >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Sep 7, 2023 at 12:41?AM wrote: >>>>>>> >>>>>>>> On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: >>>>>>>> > Damn! We have noticed the same issue around 40k to 55k PPS. Trust >>>>>>>> me >>>>>>>> > nothing is wrong in your config. This is just a limitation of the >>>>>>>> software >>>>>>>> > stack and kernel itself. >>>>>>>> its partly determined by your cpu frequency. >>>>>>>> kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ >>>>>>>> cpu. with per port troughpuyt being lower dependin on what >>>>>>>> qos/firewall >>>>>>>> rules that were apllied. >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. I >>>>>>> think the problem is tuning in the compute node inside. But I cannot find >>>>>>> any guide or best practices for it. >>>>>>> >>>>>>> >>>>>>> >>>>>>>> moving form iptables firewall to ovs firewall can help to some >>>>>>>> degree >>>>>>>> but your partly trading connection setup time for statead state >>>>>>>> troughput >>>>>>>> with the overhead of the connection tracker in ovs. >>>>>>>> >>>>>>>> using stateless security groups can help >>>>>>>> >>>>>>>> we also recently fixed a regression cause by changes in newer >>>>>>>> versions of ovs. 
>>>>>>>> this was notable in goign form rhel 8 to rhel 9 where litrally it >>>>>>>> reduced >>>>>>>> small packet performce to 1/10th and jumboframes to about 1/2 >>>>>>>> on master we have a config option that will set the default qos on >>>>>>>> a port to linux-noop >>>>>>>> >>>>>>>> https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 >>>>>>>> >>>>>>>> the backports are propsoed upstream >>>>>>>> https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 >>>>>>>> and we have backported this downstream to adress that performance >>>>>>>> regression. >>>>>>>> the upstram backport is semi stalled just ebcasue we wanted to >>>>>>>> disucss if we shoudl make ti opt in >>>>>>>> by default upstream while backporting but it might be helpful for >>>>>>>> you if this is related to yoru current >>>>>>>> issues. >>>>>>>> >>>>>>>> 40-55 kpps is kind of low for kernel ovs but if you have a low >>>>>>>> clockrate cpu, hybrid_plug + incorrect qos >>>>>>>> then i could see you hitting such a bottelneck. >>>>>>>> >>>>>>>> one workaround by the way without the os-vif workaround backported >>>>>>>> is to set >>>>>>>> /proc/sys/net/core/default_qdisc to not apply any qos or a low >>>>>>>> overhead qos type >>>>>>>> i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast >>>>>>>> >>>>>>>> >>>>>>> >>>>>>>> that may or may not help but i would ensure that your are not usign >>>>>>>> somting like fqdel or cake >>>>>>>> for net.core.default_qdisc and if you are try changing it to >>>>>>>> pfifo_fast and see if that helps. >>>>>>>> >>>>>>>> there isnet much you can do about the cpu clock rate but ^ is >>>>>>>> somethign you can try for free >>>>>>>> note it wont actully take effect on an exsitng vm if you jsut >>>>>>>> change the default but you can use >>>>>>>> tc to also chagne the qdisk for testing. hard rebooting the vm >>>>>>>> shoudl also make the default take effect. >>>>>>>> >>>>>>>> the only other advice i can give assuming kernel ovs is the only >>>>>>>> option you have is >>>>>>>> >>>>>>>> to look at >>>>>>>> >>>>>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size >>>>>>>> >>>>>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size >>>>>>>> and >>>>>>>> >>>>>>>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled >>>>>>>> >>>>>>>> if the bottelneck is actully in qemu or the guest kernel rather >>>>>>>> then ovs adjusting the rx/tx queue size and >>>>>>>> using multi queue can help. it will have no effect if ovs is the >>>>>>>> bottel neck. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> I have set this option to 1024, and enable multiqueue as well. But >>>>>>> it did not help. >>>>>>> >>>>>>> >>>>>>>> > >>>>>>>> > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi >>>>>>>> wrote: >>>>>>>> > >>>>>>>> > > Hi Satish, >>>>>>>> > > >>>>>>>> > > Actually, our customer get this issue when the tx/rx above only >>>>>>>> 40k pps. >>>>>>>> > > So what is the threshold of this throughput for OvS? >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > Thanks and regards >>>>>>>> > > >>>>>>>> > > On Wed, 6 Sep 2023 at 20:19 Satish Patel >>>>>>>> wrote: >>>>>>>> > > >>>>>>>> > > > Hi, >>>>>>>> > > > >>>>>>>> > > > This is normal because OVS or LinuxBridge wire up VMs using >>>>>>>> TAP interface >>>>>>>> > > > which runs on kernel space and that drives higher interrupt >>>>>>>> and that makes >>>>>>>> > > > the kernel so busy working on handling packets. 
Standard >>>>>>>> OVS/LinuxBridge >>>>>>>> > > > are not meant for higher PPS. >>>>>>>> > > > >>>>>>>> > > > If you want to handle higher PPS then look for DPDK or SRIOV >>>>>>>> deployment. >>>>>>>> > > > ( We are running everything in SRIOV because of high PPS >>>>>>>> requirement) >>>>>>>> > > > >>>>>>>> > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi >>>>>>>> wrote: >>>>>>>> > > > >>>>>>>> > > > > Hi everyone, >>>>>>>> > > > > >>>>>>>> > > > > I'm using Openstack Train and Openvswitch for ML2 driver >>>>>>>> and GRE for >>>>>>>> > > > > tunnel type. I tested our network performance between two >>>>>>>> VMs and suffer >>>>>>>> > > > > packet loss as below. >>>>>>>> > > > > >>>>>>>> > > > > VM1: IP: 10.20.1.206 >>>>>>>> > > > > >>>>>>>> > > > > VM2: IP: 10.20.1.154 >>>>>>>> > > > > >>>>>>>> > > > > VM3: IP: 10.20.1.72 >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> > > > > Using iperf3 to testing performance between VM1 and VM2. >>>>>>>> > > > > >>>>>>>> > > > > Run iperf3 client and server on both VMs. >>>>>>>> > > > > >>>>>>>> > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c >>>>>>>> 10.20.1.206 >>>>>>>> > > > > >>>>>>>> > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c >>>>>>>> 10.20.1.154 >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> > > > > Using VM3 ping into VM1, then the packet is lost and the >>>>>>>> latency is >>>>>>>> > > > > quite high. >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> > > > > ping -i 0.1 10.20.1.206 >>>>>>>> > > > > >>>>>>>> > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >>>>>>>> > > > > >>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >>>>>>>> > > > > >>>>>>>> > > > > ^C >>>>>>>> > > > > >>>>>>>> > > > > --- 10.20.1.206 ping statistics --- >>>>>>>> > > > > >>>>>>>> > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, >>>>>>>> time 3328ms >>>>>>>> > > > > >>>>>>>> > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> > > > > Does any one get this issue ? >>>>>>>> > > > > >>>>>>>> > > > > Please help me. Thanks >>>>>>>> > > > > >>>>>>>> > > > >>>>>>>> >>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Fri Sep 8 02:05:54 2023 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 7 Sep 2023 22:05:54 -0400 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: Do one thing, use test-pmd base benchmark and see because test-pmd application is DPDK aware. with test-pmd you will have a 1000% better performance :) On Thu, Sep 7, 2023 at 9:59?PM Ha Noi wrote: > I run the performance test using iperf3. But the performance is not > increased as theory. I don't know which configuration is not correct. > > On Fri, Sep 8, 2023 at 8:57?AM Satish Patel wrote: > >> I would say let's run your same benchmark with OVS-DPDK and tell me if >> you see better performance. I doubt you will see significant performance >> boot but lets see. Please prove me wrong :) >> >> On Thu, Sep 7, 2023 at 9:45?PM Ha Noi wrote: >> >>> Hi Satish, >>> >>> Actually, the guess interface is not using tap anymore. >>> >>> >>> >>> >> mode='server'/> >>> >>> >>> >>>
>> function='0x0'/> >>> >>> >>> It's totally bypass the kernel stack ? >>> >>> >>> >>> >>> On Fri, Sep 8, 2023 at 5:02?AM Satish Patel >>> wrote: >>> >>>> I did test OVS-DPDK and it helps offload the packet process on compute >>>> nodes, But what about VMs it will still use a tap interface to attach from >>>> compute to vm and bottleneck will be in vm. I strongly believe that we have >>>> to run DPDK based guest to pass through the kernel stack. >>>> >>>> I love to hear from other people if I am missing something here. >>>> >>>> On Thu, Sep 7, 2023 at 5:27?PM Ha Noi wrote: >>>> >>>>> Oh. I heard from someone on the reddit said that Ovs-dpdk is >>>>> transparent with user? >>>>> >>>>> So It?s not correct? >>>>> >>>>> On Thu, 7 Sep 2023 at 22:13 Satish Patel wrote: >>>>> >>>>>> Because DPDK required DPDK support inside guest VM. It's not >>>>>> suitable for general purpose workload. You need your guest VM network to >>>>>> support DPDK to get 100% throughput. >>>>>> >>>>>> On Thu, Sep 7, 2023 at 8:06?AM Ha Noi wrote: >>>>>> >>>>>>> Hi Satish, >>>>>>> >>>>>>> Why dont you use DPDK? >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> On Thu, 7 Sep 2023 at 19:03 Satish Patel >>>>>>> wrote: >>>>>>> >>>>>>>> I totally agreed with Sean on all his points but trust me, I have >>>>>>>> tried everything possible to tune OS, Network stack, multi-queue, NUMA, CPU >>>>>>>> pinning and name it.. but I didn't get any significant improvement. You may >>>>>>>> gain 2 to 5% gain with all those tweek. I am running the entire workload on >>>>>>>> sriov and life is happy except no LACP bonding. >>>>>>>> >>>>>>>> I am very interesting is this project >>>>>>>> https://docs.openvswitch.org/en/latest/intro/install/afxdp/ >>>>>>>> >>>>>>>> On Thu, Sep 7, 2023 at 6:07?AM Ha Noi >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Dear Smoney, >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Sep 7, 2023 at 12:41?AM wrote: >>>>>>>>> >>>>>>>>>> On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: >>>>>>>>>> > Damn! We have noticed the same issue around 40k to 55k PPS. >>>>>>>>>> Trust me >>>>>>>>>> > nothing is wrong in your config. This is just a limitation of >>>>>>>>>> the software >>>>>>>>>> > stack and kernel itself. >>>>>>>>>> its partly determined by your cpu frequency. >>>>>>>>>> kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ >>>>>>>>>> cpu. with per port troughpuyt being lower dependin on what >>>>>>>>>> qos/firewall >>>>>>>>>> rules that were apllied. >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. >>>>>>>>> I think the problem is tuning in the compute node inside. But I cannot find >>>>>>>>> any guide or best practices for it. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> moving form iptables firewall to ovs firewall can help to some >>>>>>>>>> degree >>>>>>>>>> but your partly trading connection setup time for statead state >>>>>>>>>> troughput >>>>>>>>>> with the overhead of the connection tracker in ovs. >>>>>>>>>> >>>>>>>>>> using stateless security groups can help >>>>>>>>>> >>>>>>>>>> we also recently fixed a regression cause by changes in newer >>>>>>>>>> versions of ovs. 
>>>>>>>>>> this was notable in goign form rhel 8 to rhel 9 where litrally it >>>>>>>>>> reduced >>>>>>>>>> small packet performce to 1/10th and jumboframes to about 1/2 >>>>>>>>>> on master we have a config option that will set the default qos >>>>>>>>>> on a port to linux-noop >>>>>>>>>> >>>>>>>>>> https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 >>>>>>>>>> >>>>>>>>>> the backports are propsoed upstream >>>>>>>>>> https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 >>>>>>>>>> and we have backported this downstream to adress that performance >>>>>>>>>> regression. >>>>>>>>>> the upstram backport is semi stalled just ebcasue we wanted to >>>>>>>>>> disucss if we shoudl make ti opt in >>>>>>>>>> by default upstream while backporting but it might be helpful for >>>>>>>>>> you if this is related to yoru current >>>>>>>>>> issues. >>>>>>>>>> >>>>>>>>>> 40-55 kpps is kind of low for kernel ovs but if you have a low >>>>>>>>>> clockrate cpu, hybrid_plug + incorrect qos >>>>>>>>>> then i could see you hitting such a bottelneck. >>>>>>>>>> >>>>>>>>>> one workaround by the way without the os-vif workaround >>>>>>>>>> backported is to set >>>>>>>>>> /proc/sys/net/core/default_qdisc to not apply any qos or a low >>>>>>>>>> overhead qos type >>>>>>>>>> i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>>> that may or may not help but i would ensure that your are not >>>>>>>>>> usign somting like fqdel or cake >>>>>>>>>> for net.core.default_qdisc and if you are try changing it to >>>>>>>>>> pfifo_fast and see if that helps. >>>>>>>>>> >>>>>>>>>> there isnet much you can do about the cpu clock rate but ^ is >>>>>>>>>> somethign you can try for free >>>>>>>>>> note it wont actully take effect on an exsitng vm if you jsut >>>>>>>>>> change the default but you can use >>>>>>>>>> tc to also chagne the qdisk for testing. hard rebooting the vm >>>>>>>>>> shoudl also make the default take effect. >>>>>>>>>> >>>>>>>>>> the only other advice i can give assuming kernel ovs is the only >>>>>>>>>> option you have is >>>>>>>>>> >>>>>>>>>> to look at >>>>>>>>>> >>>>>>>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size >>>>>>>>>> >>>>>>>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size >>>>>>>>>> and >>>>>>>>>> >>>>>>>>>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled >>>>>>>>>> >>>>>>>>>> if the bottelneck is actully in qemu or the guest kernel rather >>>>>>>>>> then ovs adjusting the rx/tx queue size and >>>>>>>>>> using multi queue can help. it will have no effect if ovs is the >>>>>>>>>> bottel neck. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> I have set this option to 1024, and enable multiqueue as well. But >>>>>>>>> it did not help. >>>>>>>>> >>>>>>>>> >>>>>>>>>> > >>>>>>>>>> > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi >>>>>>>>>> wrote: >>>>>>>>>> > >>>>>>>>>> > > Hi Satish, >>>>>>>>>> > > >>>>>>>>>> > > Actually, our customer get this issue when the tx/rx above >>>>>>>>>> only 40k pps. >>>>>>>>>> > > So what is the threshold of this throughput for OvS? 
>>>>>>>>>> > > >>>>>>>>>> > > >>>>>>>>>> > > Thanks and regards >>>>>>>>>> > > >>>>>>>>>> > > On Wed, 6 Sep 2023 at 20:19 Satish Patel < >>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>> > > >>>>>>>>>> > > > Hi, >>>>>>>>>> > > > >>>>>>>>>> > > > This is normal because OVS or LinuxBridge wire up VMs using >>>>>>>>>> TAP interface >>>>>>>>>> > > > which runs on kernel space and that drives higher interrupt >>>>>>>>>> and that makes >>>>>>>>>> > > > the kernel so busy working on handling packets. Standard >>>>>>>>>> OVS/LinuxBridge >>>>>>>>>> > > > are not meant for higher PPS. >>>>>>>>>> > > > >>>>>>>>>> > > > If you want to handle higher PPS then look for DPDK or >>>>>>>>>> SRIOV deployment. >>>>>>>>>> > > > ( We are running everything in SRIOV because of high PPS >>>>>>>>>> requirement) >>>>>>>>>> > > > >>>>>>>>>> > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi < >>>>>>>>>> hanoi952022 at gmail.com> wrote: >>>>>>>>>> > > > >>>>>>>>>> > > > > Hi everyone, >>>>>>>>>> > > > > >>>>>>>>>> > > > > I'm using Openstack Train and Openvswitch for ML2 driver >>>>>>>>>> and GRE for >>>>>>>>>> > > > > tunnel type. I tested our network performance between two >>>>>>>>>> VMs and suffer >>>>>>>>>> > > > > packet loss as below. >>>>>>>>>> > > > > >>>>>>>>>> > > > > VM1: IP: 10.20.1.206 >>>>>>>>>> > > > > >>>>>>>>>> > > > > VM2: IP: 10.20.1.154 >>>>>>>>>> > > > > >>>>>>>>>> > > > > VM3: IP: 10.20.1.72 >>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>>> > > > > Using iperf3 to testing performance between VM1 and VM2. >>>>>>>>>> > > > > >>>>>>>>>> > > > > Run iperf3 client and server on both VMs. >>>>>>>>>> > > > > >>>>>>>>>> > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c >>>>>>>>>> 10.20.1.206 >>>>>>>>>> > > > > >>>>>>>>>> > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c >>>>>>>>>> 10.20.1.154 >>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>>> > > > > Using VM3 ping into VM1, then the packet is lost and the >>>>>>>>>> latency is >>>>>>>>>> > > > > quite high. >>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>>> > > > > ping -i 0.1 10.20.1.206 >>>>>>>>>> > > > > >>>>>>>>>> > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. 
>>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 >>>>>>>>>> ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 >>>>>>>>>> ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 >>>>>>>>>> ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 >>>>>>>>>> ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 >>>>>>>>>> ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 >>>>>>>>>> ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 >>>>>>>>>> ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > ^C >>>>>>>>>> > > > > >>>>>>>>>> > > > > --- 10.20.1.206 ping statistics --- >>>>>>>>>> > > > > >>>>>>>>>> > > > > 34 packets transmitted, 28 received, 17.6471% packet >>>>>>>>>> loss, time 3328ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>>> > > > > Does any one get this issue ? >>>>>>>>>> > > > > >>>>>>>>>> > > > > Please help me. Thanks >>>>>>>>>> > > > > >>>>>>>>>> > > > >>>>>>>>>> >>>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkajinam at redhat.com Fri Sep 8 12:19:13 2023 From: tkajinam at redhat.com (Takashi Kajinami) Date: Fri, 8 Sep 2023 21:19:13 +0900 Subject: [ptl][tc][winstackers] Final call for Winstackers PTL and maintainers In-Reply-To: <188ea6f2a53.ee65931886747.6600003948918544663@ghanshyammann.com> References: <188ea6f2a53.ee65931886747.6600003948918544663@ghanshyammann.com> Message-ID: Let me bump this old thread because we still need some follow-up about the retirement of os-win. I noticed that some projects have not yet deprecated the implementations dependent on os-win. I submitted a few patches to make these implementations deprecated so that we can smoothly remove these in the future. https://review.opendev.org/c/openstack/cinder/+/894237 https://review.opendev.org/c/openstack/glance/+/894236 https://review.opendev.org/c/openstack/ceilometer/+/894296 It'd be nice if the relevant teams can review these. My remaining question is whether we should mark all implementations for Windows support, which are not directly dependent on os-win[1]. Technically we can go through individual projects and add warning logs and release notes about the deprecation. However I'm not too sure if that's worth the effort. If we can agree that we remove support for running OpenStack on Windows Operating System at a specific release, then I tend to leave the ones independent from os-win, unless it has any impact on user-facing items like config options[1]. 
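To make the testpmd suggestion at the top of this message concrete, here is a rough sketch of a DPDK-aware benchmark run inside the guest. It assumes DPDK is installed in the guest and that its virtio-net (vhost-user backed) device sits at PCI address 0000:00:05.0; the PCI address and core list are illustrative assumptions only:

  # Inside the guest: bind the NIC to a DPDK-compatible driver
  # (vfio-pci or uio_pci_generic), then poll it with testpmd
  dpdk-devbind.py --bind=vfio-pci 0000:00:05.0
  dpdk-testpmd -l 0-1 -n 4 -a 0000:00:05.0 -- --forward-mode=rxonly --stats-period 1

Unlike iperf3, which still traverses the guest kernel network stack, testpmd polls the device directly, so it exercises the vhost-user/DPDK datapath end to end.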
I'd like to hear any thoughts about this plan, as well as any target release to remove Windows "host" support globally from OpenStack (maybe after 2024.1 release ?). [1] Some examples https://github.com/openstack/ceilometer/blob/d31d4ed3574a5d19fe4b09ab2c227dba64da170a/ceilometer/cmd/polling.py#L95-L96 https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L624-L625 [2] event_log option in oslo.log is one good example https://review.opendev.org/c/openstack/oslo.log/+/894235 On Sat, Jun 24, 2023 at 7:50?AM Ghanshyam Mann wrote: > As there is no volunteer to maintain this project, I have proposed the > retirement > > - https://review.opendev.org/c/openstack/governance/+/886880 > > -gmann > > ---- On Thu, 13 Apr 2023 07:54:12 -0700 James Page wrote --- > > Hi All > > > > As announced by Lucian last November (see [0]) Cloudbase Solutions are > no longer in a position to maintain support for running OpenStack on > Windows and have also ceased operation of their 3rd party CI for the > windows support across a number of OpenStack projects. > > This situation has resulted in the Winstackers project becoming > PTL-less for the 2023.2 cycle with no volunteers responding to the TC's > call to fill this role and take this feature in OpenStack forward (see [1]). > > This is the final call for any maintainers to step forward if this > feature is important to them in OpenStack. > > The last user survey in 2022 indicated that 2% of respondents were > running on Hyper-V so this might be important enough to warrant a > commitment from someone operating OpenStack on Windows to maintain these > features going forward. > > Here is a reminder from Lucian's original email on the full list of > projects which are impacted in some way: * nova hyper-v driver - in-tree > plus out-of-tree compute-hyperv driver* os-win - common Windows library for > Openstack* neutron hyperv ml2 plugin and agent* ovs on Windows and neutron > ovs agent support* cinder drivers - SMB and Windows iSCSI* os-brick Windows > connectors - iSCSI, FC, SMB, RBD* ceilometer Windows poller* manila Windows > driver* glance Windows support* freerdp gateway > > The lack of 3rd party CI for testing all of this really needs to be > addressed as well. > > If no maintainers are forthcoming between now and the next PTG in June > the TC will need to officially retire the project and start the process of > removing support for Windows across the various projects that support this > operating system in some way - either directly or through the use of os-win. > > For clarity this call refers to the use of the Hyper-V virtualisation > driver and associated Windows server components to provide WIndows based > OpenStack Hypervisors and does not relate to the ability to run Windows > images as guests on OpenStack. > > Regards > > James > > [0] > https://lists.openstack.org/pipermail/openstack-discuss/2022-November/031044.html[1] > > https://lists.openstack.org/pipermail/openstack-discuss/2023-March/032888.html > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Fri Sep 8 12:37:30 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 8 Sep 2023 14:37:30 +0200 Subject: [neutron] Neutron drivers meeting cancelled Message-ID: Hello Neutrinos: Due to the lack of agenda [1], today's meeting is cancelled. Have a nice weekend. [1]https://wiki.openstack.org/wiki/Meetings/NeutronDrivers -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From roger.riverac at gmail.com Fri Sep 8 13:11:06 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Fri, 8 Sep 2023 09:11:06 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hello Satish, I appreciate your feedback and any help will be greatly appreciated. Please find the requested outputs pasted here: https://paste.opendev.org/show/bHWvGMUYW35sU43zUxem/ I've included outputs for one compute and one network/gateway node. As a recap, among other nodes, the environment includes: -2x compute - 1x NIC ens1 with IPv4 (geneve) - no bridges -2x network/gateway nodes - 2x NICs - ens1 with IPv4 (geneve), ens2 as external net interface, br-vlan connected to ens2 bridge. Let me know if you need further information. Much appreciated. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.denton at rackspace.com Fri Sep 8 13:31:00 2023 From: james.denton at rackspace.com (James Denton) Date: Fri, 8 Sep 2023 13:31:00 +0000 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hi Roger, That output looks as I would expect, thank you. Can you please provide the output for ?openstack network show? for the network being attached to the VM? Thanks, James Get Outlook for iOS ________________________________ From: Roger Rivera Sent: Friday, September 8, 2023 8:11:06 AM To: Satish Patel Cc: James Denton ; Dmitriy Rabotyagov ; openstack-discuss Subject: Re: [openstack-ansible] Dedicated gateway hosts not working with OVN CAUTION: This message originated externally, please use caution when clicking on links or opening attachments! Hello Satish, I appreciate your feedback and any help will be greatly appreciated. Please find the requested outputs pasted here: https://paste.opendev.org/show/bHWvGMUYW35sU43zUxem/ I've included outputs for one compute and one network/gateway node. As a recap, among other nodes, the environment includes: -2x compute - 1x NIC ens1 with IPv4 (geneve) - no bridges -2x network/gateway nodes - 2x NICs - ens1 with IPv4 (geneve), ens2 as external net interface, br-vlan connected to ens2 bridge. Let me know if you need further information. Much appreciated. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From roger.riverac at gmail.com Fri Sep 8 13:43:19 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Fri, 8 Sep 2023 09:43:19 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hello James, I appreciate the prompt response. Please see the output for openstack network show , pasted at https://paste.opendev.org/show/bIhYhu6fDWoMaIiyaRMJ/ Thanks you On Fri, Sep 8, 2023 at 9:31?AM James Denton wrote: > Hi Roger, > > That output looks as I would expect, thank you. > > Can you please provide the output for ?openstack network show? for the > network being attached to the VM? 
> > Thanks, > James > > Get Outlook for iOS > ------------------------------ > *From:* Roger Rivera > *Sent:* Friday, September 8, 2023 8:11:06 AM > *To:* Satish Patel > *Cc:* James Denton ; Dmitriy Rabotyagov < > noonedeadpunk at gmail.com>; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-ansible] Dedicated gateway hosts not working > with OVN > > CAUTION: This message originated externally, please use caution when > clicking on links or opening attachments! > > Hello Satish, > > I appreciate your feedback and any help will be greatly appreciated. > Please find the requested outputs pasted here: > https://paste.opendev.org/show/bHWvGMUYW35sU43zUxem/ > > I've included outputs for one compute and one network/gateway node. > > As a recap, among other nodes, the environment includes: > > -2x compute - 1x NIC ens1 with IPv4 (geneve) - no bridges > -2x network/gateway nodes - 2x NICs - ens1 with IPv4 (geneve), ens2 as > external net interface, br-vlan connected to ens2 bridge. > > Let me know if you need further information. Much appreciated. > > Thank you. > -- *Roger Rivera* -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanoi952022 at gmail.com Fri Sep 8 01:45:05 2023 From: hanoi952022 at gmail.com (Ha Noi) Date: Fri, 8 Sep 2023 08:45:05 +0700 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: Hi Satish, Actually, the guess interface is not using tap anymore.
It's totally bypass the kernel stack ? On Fri, Sep 8, 2023 at 5:02?AM Satish Patel wrote: > I did test OVS-DPDK and it helps offload the packet process on compute > nodes, But what about VMs it will still use a tap interface to attach from > compute to vm and bottleneck will be in vm. I strongly believe that we have > to run DPDK based guest to pass through the kernel stack. > > I love to hear from other people if I am missing something here. > > On Thu, Sep 7, 2023 at 5:27?PM Ha Noi wrote: > >> Oh. I heard from someone on the reddit said that Ovs-dpdk is transparent >> with user? >> >> So It?s not correct? >> >> On Thu, 7 Sep 2023 at 22:13 Satish Patel wrote: >> >>> Because DPDK required DPDK support inside guest VM. It's not >>> suitable for general purpose workload. You need your guest VM network to >>> support DPDK to get 100% throughput. >>> >>> On Thu, Sep 7, 2023 at 8:06?AM Ha Noi wrote: >>> >>>> Hi Satish, >>>> >>>> Why dont you use DPDK? >>>> >>>> Thanks >>>> >>>> On Thu, 7 Sep 2023 at 19:03 Satish Patel wrote: >>>> >>>>> I totally agreed with Sean on all his points but trust me, I have >>>>> tried everything possible to tune OS, Network stack, multi-queue, NUMA, CPU >>>>> pinning and name it.. but I didn't get any significant improvement. You may >>>>> gain 2 to 5% gain with all those tweek. I am running the entire workload on >>>>> sriov and life is happy except no LACP bonding. >>>>> >>>>> I am very interesting is this project >>>>> https://docs.openvswitch.org/en/latest/intro/install/afxdp/ >>>>> >>>>> On Thu, Sep 7, 2023 at 6:07?AM Ha Noi wrote: >>>>> >>>>>> Dear Smoney, >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Sep 7, 2023 at 12:41?AM wrote: >>>>>> >>>>>>> On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: >>>>>>> > Damn! We have noticed the same issue around 40k to 55k PPS. Trust >>>>>>> me >>>>>>> > nothing is wrong in your config. This is just a limitation of the >>>>>>> software >>>>>>> > stack and kernel itself. >>>>>>> its partly determined by your cpu frequency. >>>>>>> kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ >>>>>>> cpu. with per port troughpuyt being lower dependin on what >>>>>>> qos/firewall >>>>>>> rules that were apllied. >>>>>>> >>>>>>> >>>>>> >>>>>> My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. I >>>>>> think the problem is tuning in the compute node inside. But I cannot find >>>>>> any guide or best practices for it. >>>>>> >>>>>> >>>>>> >>>>>>> moving form iptables firewall to ovs firewall can help to some degree >>>>>>> but your partly trading connection setup time for statead state >>>>>>> troughput >>>>>>> with the overhead of the connection tracker in ovs. >>>>>>> >>>>>>> using stateless security groups can help >>>>>>> >>>>>>> we also recently fixed a regression cause by changes in newer >>>>>>> versions of ovs. >>>>>>> this was notable in goign form rhel 8 to rhel 9 where litrally it >>>>>>> reduced >>>>>>> small packet performce to 1/10th and jumboframes to about 1/2 >>>>>>> on master we have a config option that will set the default qos on a >>>>>>> port to linux-noop >>>>>>> >>>>>>> https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 >>>>>>> >>>>>>> the backports are propsoed upstream >>>>>>> https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 >>>>>>> and we have backported this downstream to adress that performance >>>>>>> regression. 
>>>>>>> the upstram backport is semi stalled just ebcasue we wanted to >>>>>>> disucss if we shoudl make ti opt in >>>>>>> by default upstream while backporting but it might be helpful for >>>>>>> you if this is related to yoru current >>>>>>> issues. >>>>>>> >>>>>>> 40-55 kpps is kind of low for kernel ovs but if you have a low >>>>>>> clockrate cpu, hybrid_plug + incorrect qos >>>>>>> then i could see you hitting such a bottelneck. >>>>>>> >>>>>>> one workaround by the way without the os-vif workaround backported >>>>>>> is to set >>>>>>> /proc/sys/net/core/default_qdisc to not apply any qos or a low >>>>>>> overhead qos type >>>>>>> i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast >>>>>>> >>>>>>> >>>>>> >>>>>>> that may or may not help but i would ensure that your are not usign >>>>>>> somting like fqdel or cake >>>>>>> for net.core.default_qdisc and if you are try changing it to >>>>>>> pfifo_fast and see if that helps. >>>>>>> >>>>>>> there isnet much you can do about the cpu clock rate but ^ is >>>>>>> somethign you can try for free >>>>>>> note it wont actully take effect on an exsitng vm if you jsut change >>>>>>> the default but you can use >>>>>>> tc to also chagne the qdisk for testing. hard rebooting the vm >>>>>>> shoudl also make the default take effect. >>>>>>> >>>>>>> the only other advice i can give assuming kernel ovs is the only >>>>>>> option you have is >>>>>>> >>>>>>> to look at >>>>>>> >>>>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size >>>>>>> >>>>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size >>>>>>> and >>>>>>> >>>>>>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled >>>>>>> >>>>>>> if the bottelneck is actully in qemu or the guest kernel rather then >>>>>>> ovs adjusting the rx/tx queue size and >>>>>>> using multi queue can help. it will have no effect if ovs is the >>>>>>> bottel neck. >>>>>>> >>>>>>> >>>>>>> >>>>>> I have set this option to 1024, and enable multiqueue as well. But it >>>>>> did not help. >>>>>> >>>>>> >>>>>>> > >>>>>>> > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi >>>>>>> wrote: >>>>>>> > >>>>>>> > > Hi Satish, >>>>>>> > > >>>>>>> > > Actually, our customer get this issue when the tx/rx above only >>>>>>> 40k pps. >>>>>>> > > So what is the threshold of this throughput for OvS? >>>>>>> > > >>>>>>> > > >>>>>>> > > Thanks and regards >>>>>>> > > >>>>>>> > > On Wed, 6 Sep 2023 at 20:19 Satish Patel >>>>>>> wrote: >>>>>>> > > >>>>>>> > > > Hi, >>>>>>> > > > >>>>>>> > > > This is normal because OVS or LinuxBridge wire up VMs using >>>>>>> TAP interface >>>>>>> > > > which runs on kernel space and that drives higher interrupt >>>>>>> and that makes >>>>>>> > > > the kernel so busy working on handling packets. Standard >>>>>>> OVS/LinuxBridge >>>>>>> > > > are not meant for higher PPS. >>>>>>> > > > >>>>>>> > > > If you want to handle higher PPS then look for DPDK or SRIOV >>>>>>> deployment. >>>>>>> > > > ( We are running everything in SRIOV because of high PPS >>>>>>> requirement) >>>>>>> > > > >>>>>>> > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi >>>>>>> wrote: >>>>>>> > > > >>>>>>> > > > > Hi everyone, >>>>>>> > > > > >>>>>>> > > > > I'm using Openstack Train and Openvswitch for ML2 driver and >>>>>>> GRE for >>>>>>> > > > > tunnel type. I tested our network performance between two >>>>>>> VMs and suffer >>>>>>> > > > > packet loss as below. 
>>>>>>> > > > > >>>>>>> > > > > VM1: IP: 10.20.1.206 >>>>>>> > > > > >>>>>>> > > > > VM2: IP: 10.20.1.154 >>>>>>> > > > > >>>>>>> > > > > VM3: IP: 10.20.1.72 >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > Using iperf3 to testing performance between VM1 and VM2. >>>>>>> > > > > >>>>>>> > > > > Run iperf3 client and server on both VMs. >>>>>>> > > > > >>>>>>> > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.206 >>>>>>> > > > > >>>>>>> > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c 10.20.1.154 >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > Using VM3 ping into VM1, then the packet is lost and the >>>>>>> latency is >>>>>>> > > > > quite high. >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > ping -i 0.1 10.20.1.206 >>>>>>> > > > > >>>>>>> > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >>>>>>> > > > > >>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >>>>>>> > > > > >>>>>>> > > > > ^C >>>>>>> > > > > >>>>>>> > > > > --- 10.20.1.206 ping statistics --- >>>>>>> > > > > >>>>>>> > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, >>>>>>> time 3328ms >>>>>>> > > > > >>>>>>> > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > Does any one get this issue ? >>>>>>> > > > > >>>>>>> > > > > Please help me. Thanks >>>>>>> > > > > >>>>>>> > > > >>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanoi952022 at gmail.com Fri Sep 8 01:59:14 2023 From: hanoi952022 at gmail.com (Ha Noi) Date: Fri, 8 Sep 2023 08:59:14 +0700 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: I run the performance test using iperf3. But the performance is not increased as theory. I don't know which configuration is not correct. On Fri, Sep 8, 2023 at 8:57?AM Satish Patel wrote: > I would say let's run your same benchmark with OVS-DPDK and tell me if you > see better performance. I doubt you will see significant performance boot > but lets see. Please prove me wrong :) > > On Thu, Sep 7, 2023 at 9:45?PM Ha Noi wrote: > >> Hi Satish, >> >> Actually, the guess interface is not using tap anymore. 
>> [quoted libvirt <interface type='vhostuser'> XML mangled in transit by the HTML-to-text conversion; only stray attribute fragments such as mode='server'/> remain]
> function='0x0'/> >> >> >> It's totally bypass the kernel stack ? >> >> >> >> >> On Fri, Sep 8, 2023 at 5:02?AM Satish Patel wrote: >> >>> I did test OVS-DPDK and it helps offload the packet process on compute >>> nodes, But what about VMs it will still use a tap interface to attach from >>> compute to vm and bottleneck will be in vm. I strongly believe that we have >>> to run DPDK based guest to pass through the kernel stack. >>> >>> I love to hear from other people if I am missing something here. >>> >>> On Thu, Sep 7, 2023 at 5:27?PM Ha Noi wrote: >>> >>>> Oh. I heard from someone on the reddit said that Ovs-dpdk is >>>> transparent with user? >>>> >>>> So It?s not correct? >>>> >>>> On Thu, 7 Sep 2023 at 22:13 Satish Patel wrote: >>>> >>>>> Because DPDK required DPDK support inside guest VM. It's not >>>>> suitable for general purpose workload. You need your guest VM network to >>>>> support DPDK to get 100% throughput. >>>>> >>>>> On Thu, Sep 7, 2023 at 8:06?AM Ha Noi wrote: >>>>> >>>>>> Hi Satish, >>>>>> >>>>>> Why dont you use DPDK? >>>>>> >>>>>> Thanks >>>>>> >>>>>> On Thu, 7 Sep 2023 at 19:03 Satish Patel >>>>>> wrote: >>>>>> >>>>>>> I totally agreed with Sean on all his points but trust me, I have >>>>>>> tried everything possible to tune OS, Network stack, multi-queue, NUMA, CPU >>>>>>> pinning and name it.. but I didn't get any significant improvement. You may >>>>>>> gain 2 to 5% gain with all those tweek. I am running the entire workload on >>>>>>> sriov and life is happy except no LACP bonding. >>>>>>> >>>>>>> I am very interesting is this project >>>>>>> https://docs.openvswitch.org/en/latest/intro/install/afxdp/ >>>>>>> >>>>>>> On Thu, Sep 7, 2023 at 6:07?AM Ha Noi wrote: >>>>>>> >>>>>>>> Dear Smoney, >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Sep 7, 2023 at 12:41?AM wrote: >>>>>>>> >>>>>>>>> On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: >>>>>>>>> > Damn! We have noticed the same issue around 40k to 55k PPS. >>>>>>>>> Trust me >>>>>>>>> > nothing is wrong in your config. This is just a limitation of >>>>>>>>> the software >>>>>>>>> > stack and kernel itself. >>>>>>>>> its partly determined by your cpu frequency. >>>>>>>>> kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ >>>>>>>>> cpu. with per port troughpuyt being lower dependin on what >>>>>>>>> qos/firewall >>>>>>>>> rules that were apllied. >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. I >>>>>>>> think the problem is tuning in the compute node inside. But I cannot find >>>>>>>> any guide or best practices for it. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> moving form iptables firewall to ovs firewall can help to some >>>>>>>>> degree >>>>>>>>> but your partly trading connection setup time for statead state >>>>>>>>> troughput >>>>>>>>> with the overhead of the connection tracker in ovs. >>>>>>>>> >>>>>>>>> using stateless security groups can help >>>>>>>>> >>>>>>>>> we also recently fixed a regression cause by changes in newer >>>>>>>>> versions of ovs. 
>>>>>>>>> this was notable in goign form rhel 8 to rhel 9 where litrally it >>>>>>>>> reduced >>>>>>>>> small packet performce to 1/10th and jumboframes to about 1/2 >>>>>>>>> on master we have a config option that will set the default qos on >>>>>>>>> a port to linux-noop >>>>>>>>> >>>>>>>>> https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 >>>>>>>>> >>>>>>>>> the backports are propsoed upstream >>>>>>>>> https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 >>>>>>>>> and we have backported this downstream to adress that performance >>>>>>>>> regression. >>>>>>>>> the upstram backport is semi stalled just ebcasue we wanted to >>>>>>>>> disucss if we shoudl make ti opt in >>>>>>>>> by default upstream while backporting but it might be helpful for >>>>>>>>> you if this is related to yoru current >>>>>>>>> issues. >>>>>>>>> >>>>>>>>> 40-55 kpps is kind of low for kernel ovs but if you have a low >>>>>>>>> clockrate cpu, hybrid_plug + incorrect qos >>>>>>>>> then i could see you hitting such a bottelneck. >>>>>>>>> >>>>>>>>> one workaround by the way without the os-vif workaround backported >>>>>>>>> is to set >>>>>>>>> /proc/sys/net/core/default_qdisc to not apply any qos or a low >>>>>>>>> overhead qos type >>>>>>>>> i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>>> that may or may not help but i would ensure that your are not >>>>>>>>> usign somting like fqdel or cake >>>>>>>>> for net.core.default_qdisc and if you are try changing it to >>>>>>>>> pfifo_fast and see if that helps. >>>>>>>>> >>>>>>>>> there isnet much you can do about the cpu clock rate but ^ is >>>>>>>>> somethign you can try for free >>>>>>>>> note it wont actully take effect on an exsitng vm if you jsut >>>>>>>>> change the default but you can use >>>>>>>>> tc to also chagne the qdisk for testing. hard rebooting the vm >>>>>>>>> shoudl also make the default take effect. >>>>>>>>> >>>>>>>>> the only other advice i can give assuming kernel ovs is the only >>>>>>>>> option you have is >>>>>>>>> >>>>>>>>> to look at >>>>>>>>> >>>>>>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size >>>>>>>>> >>>>>>>>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size >>>>>>>>> and >>>>>>>>> >>>>>>>>> https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled >>>>>>>>> >>>>>>>>> if the bottelneck is actully in qemu or the guest kernel rather >>>>>>>>> then ovs adjusting the rx/tx queue size and >>>>>>>>> using multi queue can help. it will have no effect if ovs is the >>>>>>>>> bottel neck. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> I have set this option to 1024, and enable multiqueue as well. But >>>>>>>> it did not help. >>>>>>>> >>>>>>>> >>>>>>>>> > >>>>>>>>> > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi >>>>>>>>> wrote: >>>>>>>>> > >>>>>>>>> > > Hi Satish, >>>>>>>>> > > >>>>>>>>> > > Actually, our customer get this issue when the tx/rx above >>>>>>>>> only 40k pps. >>>>>>>>> > > So what is the threshold of this throughput for OvS? 
>>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> > > Thanks and regards >>>>>>>>> > > >>>>>>>>> > > On Wed, 6 Sep 2023 at 20:19 Satish Patel >>>>>>>>> wrote: >>>>>>>>> > > >>>>>>>>> > > > Hi, >>>>>>>>> > > > >>>>>>>>> > > > This is normal because OVS or LinuxBridge wire up VMs using >>>>>>>>> TAP interface >>>>>>>>> > > > which runs on kernel space and that drives higher interrupt >>>>>>>>> and that makes >>>>>>>>> > > > the kernel so busy working on handling packets. Standard >>>>>>>>> OVS/LinuxBridge >>>>>>>>> > > > are not meant for higher PPS. >>>>>>>>> > > > >>>>>>>>> > > > If you want to handle higher PPS then look for DPDK or SRIOV >>>>>>>>> deployment. >>>>>>>>> > > > ( We are running everything in SRIOV because of high PPS >>>>>>>>> requirement) >>>>>>>>> > > > >>>>>>>>> > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi < >>>>>>>>> hanoi952022 at gmail.com> wrote: >>>>>>>>> > > > >>>>>>>>> > > > > Hi everyone, >>>>>>>>> > > > > >>>>>>>>> > > > > I'm using Openstack Train and Openvswitch for ML2 driver >>>>>>>>> and GRE for >>>>>>>>> > > > > tunnel type. I tested our network performance between two >>>>>>>>> VMs and suffer >>>>>>>>> > > > > packet loss as below. >>>>>>>>> > > > > >>>>>>>>> > > > > VM1: IP: 10.20.1.206 >>>>>>>>> > > > > >>>>>>>>> > > > > VM2: IP: 10.20.1.154 >>>>>>>>> > > > > >>>>>>>>> > > > > VM3: IP: 10.20.1.72 >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > Using iperf3 to testing performance between VM1 and VM2. >>>>>>>>> > > > > >>>>>>>>> > > > > Run iperf3 client and server on both VMs. >>>>>>>>> > > > > >>>>>>>>> > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c >>>>>>>>> 10.20.1.206 >>>>>>>>> > > > > >>>>>>>>> > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c >>>>>>>>> 10.20.1.154 >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > Using VM3 ping into VM1, then the packet is lost and the >>>>>>>>> latency is >>>>>>>>> > > > > quite high. >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > ping -i 0.1 10.20.1.206 >>>>>>>>> > > > > >>>>>>>>> > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. 
>>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 ms >>>>>>>>> > > > > >>>>>>>>> > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 ms >>>>>>>>> > > > > >>>>>>>>> > > > > ^C >>>>>>>>> > > > > >>>>>>>>> > > > > --- 10.20.1.206 ping statistics --- >>>>>>>>> > > > > >>>>>>>>> > > > > 34 packets transmitted, 28 received, 17.6471% packet loss, >>>>>>>>> time 3328ms >>>>>>>>> > > > > >>>>>>>>> > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > Does any one get this issue ? >>>>>>>>> > > > > >>>>>>>>> > > > > Please help me. Thanks >>>>>>>>> > > > > >>>>>>>>> > > > >>>>>>>>> >>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Fri Sep 8 14:20:01 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Fri, 08 Sep 2023 15:20:01 +0100 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> Message-ID: <1273520546bf8174f04012ef6c442439e9dd84da.camel@redhat.com> On Thu, 2023-09-07 at 22:05 -0400, Satish Patel wrote: > Do one thing, use test-pmd base benchmark and see because test-pmd > application is DPDK aware. with test-pmd you will have a 1000% better > performance :) actully test-pmd is not DPDK aware its a dpdk applciation so it is faster because it remove the overhead of kernel networking in the guest not because it has any dpdk awareness. testpmd cannot tell that ovs-dpdk is in use form a gust perspecive you cannot tell if your using ovs-dpdk or kernel ovs as there is no viable diffence in the virtio-net-pci device which is presented to the guest kernel by qemu. iper3 with a single core cant actully saturate a virtio-net-interface when its backed by vhost-user/dpdk or something like a macvtap sriov port. you can reach line rate with larger packet sizes or multipel cores but if you wanted too test small packet io then testpmd, dpdk packetgen or tgen are better tools in that regard. they can eaiclly saturate a link into the 10s of gibitits per second using 64byte packets. > > On Thu, Sep 7, 2023 at 9:59?PM Ha Noi wrote: > > > I run the performance test using iperf3. But the performance is not > > increased as theory. 
I don't know which configuration is not correct. > > > > On Fri, Sep 8, 2023 at 8:57?AM Satish Patel wrote: > > > > > I would say let's run your same benchmark with OVS-DPDK and tell me if > > > you see better performance. I doubt you will see significant performance > > > boot but lets see. Please prove me wrong :) > > > > > > On Thu, Sep 7, 2023 at 9:45?PM Ha Noi wrote: > > > > > > > Hi Satish, > > > > > > > > Actually, the guess interface is not using tap anymore. > > > > > > > > ??? > > > > ????? > > > > ????? > > > mode='server'/> > > > > ????? > > > > ????? > > > > ????? > > > > ?????
> > > function='0x0'/> > > > > ??? > > > > > > > > It's totally bypass the kernel stack ? yep dpdk is userspace networkign and it gets its performace boost form that so the data is "trasnproted" by doding a direct mmap of the virtio ring buffers between the DPDK pool mode driver and the qemu process. > > > > > > > > > > > > > > > > > > > > On Fri, Sep 8, 2023 at 5:02?AM Satish Patel > > > > wrote: > > > > > > > > > I did test OVS-DPDK and it helps offload the packet process on compute > > > > > nodes, But what about VMs it will still use a tap interface to attach from > > > > > compute to vm and bottleneck will be in vm. I strongly believe that we have > > > > > to run DPDK based guest to pass through the kernel stack. > > > > > > > > > > I love to hear from other people if I am missing something here. > > > > > > > > > > On Thu, Sep 7, 2023 at 5:27?PM Ha Noi wrote: > > > > > > > > > > > Oh. I heard from someone on the reddit said that Ovs-dpdk is > > > > > > transparent with user? > > > > > > > > > > > > So It?s not correct? > > > > > > > > > > > > On Thu, 7 Sep 2023 at 22:13 Satish Patel wrote: > > > > > > > > > > > > > Because DPDK required DPDK support inside guest VM. It's not > > > > > > > suitable for general purpose workload. You need your guest VM network to > > > > > > > support DPDK to get 100% throughput. > > > > > > > > > > > > > > On Thu, Sep 7, 2023 at 8:06?AM Ha Noi wrote: > > > > > > > > > > > > > > > Hi Satish, > > > > > > > > > > > > > > > > Why dont you use DPDK? > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > On Thu, 7 Sep 2023 at 19:03 Satish Patel > > > > > > > > wrote: > > > > > > > > > > > > > > > > > I totally agreed with Sean on all his points but trust me, I have > > > > > > > > > tried everything possible to tune OS, Network stack, multi-queue, NUMA, CPU > > > > > > > > > pinning and name it.. but I didn't get any significant improvement. You may > > > > > > > > > gain 2 to 5% gain with all those tweek. I am running the entire workload on > > > > > > > > > sriov and life is happy except no LACP bonding. > > > > > > > > > > > > > > > > > > I am very interesting is this project > > > > > > > > > https://docs.openvswitch.org/en/latest/intro/install/afxdp/ > > > > > > > > > > > > > > > > > > On Thu, Sep 7, 2023 at 6:07?AM Ha Noi > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Dear Smoney, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 7, 2023 at 12:41?AM wrote: > > > > > > > > > > > > > > > > > > > > > On Wed, 2023-09-06 at 11:43 -0400, Satish Patel wrote: > > > > > > > > > > > > Damn! We have noticed the same issue around 40k to 55k PPS. > > > > > > > > > > > Trust me > > > > > > > > > > > > nothing is wrong in your config. This is just a limitation of > > > > > > > > > > > the software > > > > > > > > > > > > stack and kernel itself. > > > > > > > > > > > its partly determined by your cpu frequency. > > > > > > > > > > > kernel ovs of yesteryear could handel about 1mpps total on a ~4GHZ > > > > > > > > > > > cpu. with per port troughpuyt being lower dependin on what > > > > > > > > > > > qos/firewall > > > > > > > > > > > rules that were apllied. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > My CPU frequency is 3Ghz and using CPU Intel Gold 2nd generation. > > > > > > > > > > I think the problem is tuning in the compute node inside. But I cannot find > > > > > > > > > > any guide or best practices for it. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > moving form iptables firewall to ovs firewall can help to some > > > > > > > > > > > degree > > > > > > > > > > > but your partly trading connection setup time for statead state > > > > > > > > > > > troughput > > > > > > > > > > > with the overhead of the connection tracker in ovs. > > > > > > > > > > > > > > > > > > > > > > using stateless security groups can help > > > > > > > > > > > > > > > > > > > > > > we also recently fixed a regression cause by changes in newer > > > > > > > > > > > versions of ovs. > > > > > > > > > > > this was notable in goign form rhel 8 to rhel 9 where litrally it > > > > > > > > > > > reduced > > > > > > > > > > > small packet performce to 1/10th and jumboframes to about 1/2 > > > > > > > > > > > on master we have a config option that will set the default qos > > > > > > > > > > > on a port to linux-noop > > > > > > > > > > > > > > > > > > > > > > https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 > > > > > > > > > > > > > > > > > > > > > > the backports are propsoed upstream > > > > > > > > > > > https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 > > > > > > > > > > > and we have backported this downstream to adress that performance > > > > > > > > > > > regression. > > > > > > > > > > > the upstram backport is semi stalled just ebcasue we wanted to > > > > > > > > > > > disucss if we shoudl make ti opt in > > > > > > > > > > > by default upstream while backporting but it might be helpful for > > > > > > > > > > > you if this is related to yoru current > > > > > > > > > > > issues. > > > > > > > > > > > > > > > > > > > > > > 40-55 kpps is kind of low for kernel ovs but if you have a low > > > > > > > > > > > clockrate cpu, hybrid_plug + incorrect qos > > > > > > > > > > > then i could see you hitting such a bottelneck. > > > > > > > > > > > > > > > > > > > > > > one workaround by the way without the os-vif workaround > > > > > > > > > > > backported is to set > > > > > > > > > > > /proc/sys/net/core/default_qdisc to not apply any qos or a low > > > > > > > > > > > overhead qos type > > > > > > > > > > > i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > that may or may not help but i would ensure that your are not > > > > > > > > > > > usign somting like fqdel or cake > > > > > > > > > > > for net.core.default_qdisc and if you are try changing it to > > > > > > > > > > > pfifo_fast and see if that helps. > > > > > > > > > > > > > > > > > > > > > > there isnet much you can do about the cpu clock rate but ^ is > > > > > > > > > > > somethign you can try for free > > > > > > > > > > > note it wont actully take effect on an exsitng vm if you jsut > > > > > > > > > > > change the default but you can use > > > > > > > > > > > tc to also chagne the qdisk for testing. hard rebooting the vm > > > > > > > > > > > shoudl also make the default take effect. 
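(To make the qdisc workaround above concrete, assuming kernel OVS with tap ports; the tap name below is a placeholder for the VM's real interface on the compute node:)

# check which qdisc the existing port is using
tc qdisc show dev tap0123abcd-ef
# swap it for the low-overhead pfifo_fast qdisc without rebooting the guest
sudo tc qdisc replace dev tap0123abcd-ef root pfifo_fast
# and make it the default for ports created after this point
sudo sysctl -w net.core.default_qdisc=pfifo_fast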
> > > > > > > > > > > > > > > > > > > > > > the only other advice i can give assuming kernel ovs is the only > > > > > > > > > > > option you have is > > > > > > > > > > > > > > > > > > > > > > to look at > > > > > > > > > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size > > > > > > > > > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled > > > > > > > > > > > > > > > > > > > > > > if the bottelneck is actully in qemu or the guest kernel rather > > > > > > > > > > > then ovs adjusting the rx/tx queue size and > > > > > > > > > > > using multi queue can help. it will have no effect if ovs is the > > > > > > > > > > > bottel neck. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have set this option to 1024, and enable multiqueue as well. But > > > > > > > > > > it did not help. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Hi Satish, > > > > > > > > > > > > > > > > > > > > > > > > > > Actually, our customer get this issue when the tx/rx above > > > > > > > > > > > only 40k pps. > > > > > > > > > > > > > So what is the threshold of this throughput for OvS? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks and regards > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, 6 Sep 2023 at 20:19 Satish Patel < > > > > > > > > > > > satish.txt at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > > > > > > > > > This is normal because OVS or LinuxBridge wire up VMs using > > > > > > > > > > > TAP interface > > > > > > > > > > > > > > which runs on kernel space and that drives higher interrupt > > > > > > > > > > > and that makes > > > > > > > > > > > > > > the kernel so busy working on handling packets. Standard > > > > > > > > > > > OVS/LinuxBridge > > > > > > > > > > > > > > are not meant for higher PPS. > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you want to handle higher PPS then look for DPDK or > > > > > > > > > > > SRIOV deployment. > > > > > > > > > > > > > > ( We are running everything in SRIOV because of high PPS > > > > > > > > > > > requirement) > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi < > > > > > > > > > > > hanoi952022 at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm using Openstack Train and Openvswitch for ML2 driver > > > > > > > > > > > and GRE for > > > > > > > > > > > > > > > tunnel type. I tested our network performance between two > > > > > > > > > > > VMs and suffer > > > > > > > > > > > > > > > packet loss as below. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > VM1: IP: 10.20.1.206 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > VM2: IP: 10.20.1.154 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > VM3: IP: 10.20.1.72 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Using iperf3 to testing performance between VM1 and VM2. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Run iperf3 client and server on both VMs. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c > > > > > > > > > > > 10.20.1.206 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 -u -c > > > > > > > > > > > 10.20.1.154 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Using VM3 ping into VM1, then the packet is lost and the > > > > > > > > > > > latency is > > > > > > > > > > > > > > > quite high. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ping -i 0.1 10.20.1.206 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes of data. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=1 ttl=64 time=7.70 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=2 ttl=64 time=6.90 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=3 ttl=64 time=7.71 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=4 ttl=64 time=7.98 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=6 ttl=64 time=8.58 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=7 ttl=64 time=8.34 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=8 ttl=64 time=8.09 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=10 ttl=64 time=4.57 > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=11 ttl=64 time=8.74 > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=12 ttl=64 time=9.37 > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=14 ttl=64 time=9.59 > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=15 ttl=64 time=7.97 > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=16 ttl=64 time=8.72 > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=17 ttl=64 time=9.23 > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ^C > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --- 10.20.1.206 ping statistics --- > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 34 packets transmitted, 28 received, 17.6471% packet > > > > > > > > > > > loss, time 3328ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > rtt min/avg/max/mdev = 1.396/6.266/9.590/2.805 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Does any one get this issue ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please help me. 
Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From hberaud at redhat.com Fri Sep 8 14:30:07 2023 From: hberaud at redhat.com (Herve Beraud) Date: Fri, 8 Sep 2023 16:30:07 +0200 Subject: [release] Release countdown for week R-3, Sep 11-15 Message-ID: Development Focus ----------------- The Release Candidate (RC) deadline is next Thursday, September 14. Work should be focused on fixing any release-critical bugs. General Information ------------------- All deliverables released under a cycle-with-rc model should have a first release candidate by the end of the week, from which a stable/2023.2 branch will be cut. This branch will track the 2023.2 Bobcat release. Once stable/2023.2 has been created, the master branch will be ready to switch to 2024.1 Caracal development. While the master branch will no longer be feature-frozen, please prioritize any work necessary for completing 2023.2 Bobcat plans. Release-critical bugfixes will need to be merged in the master branch first, then backported to the stable/2023.2 branch before a new release candidate can be proposed. Actions ------- Early in the week, the release team will be proposing RC1 patches for all cycle-with-rc projects, using the latest commit from the master branch. If your team is ready to go for cutting RC1, please let us know by leaving a +1 on these patches. If there are still a few more patches needed before RC1, you can -1 the patch and update it later in the week with the new commit hash you would like to use. Remember, stable/2023.2 branches will be created with this, so you will want to make sure you have what you need included to avoid needing to backport changes from the master branch (which will technically then be 2024.1 Bobcat) to this stable branch for any additional RCs before the final release. The release team will also be proposing releases for any deliverable following a cycle-with-intermediary model that has not produced any 2023.2 Bobcat release so far. Finally, now is a good time to finalize release highlights. Release highlights help shape the messaging around the release and make sure that your work is properly represented. Upcoming Deadlines & Dates -------------------------- RC1 deadline: September 14 (R-3 week) Final RC deadline: September 28 (R-1 week) Final 2023.2 Bobcat release: October 4 -- Herv? Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.denton at rackspace.com Fri Sep 8 15:28:14 2023 From: james.denton at rackspace.com (James Denton) Date: Fri, 8 Sep 2023 15:28:14 +0000 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Thanks, Roger. Super helpful. If you?re attempting to launch the VM on that network it will fail, since that network is only plumbed to the net nodes as an 'external' network. For the VM, you will want to create a non-provider (tenant) network, which would likely be geneve, and then create a neutron router that connects to the external network and the tenant network. Your VM traffic would then traverse that router from compute->net node and out over the geneve overlay. Keep us posted. 
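For reference, a minimal version of that workflow with the CLI would look roughly like this (the network, router and CIDR names are placeholders, and 'public' stands for whatever your existing external/provider network is called):

openstack network create tenant-net
openstack subnet create --network tenant-net --subnet-range 192.168.100.0/24 tenant-subnet
openstack router create tenant-router
openstack router set --external-gateway public tenant-router
openstack router add subnet tenant-router tenant-subnet
openstack server create --image <image> --flavor <flavor> --network tenant-net test-vm

With no --provider-* options the new network should come up as your default tenant network type (geneve in this setup), and the router's gateway port should get scheduled on the gateway/net nodes.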
James Get Outlook for iOS ________________________________ From: Roger Rivera Sent: Friday, September 8, 2023 8:43:19 AM To: James Denton Cc: Satish Patel ; Dmitriy Rabotyagov ; openstack-discuss Subject: Re: [openstack-ansible] Dedicated gateway hosts not working with OVN CAUTION: This message originated externally, please use caution when clicking on links or opening attachments! Hello James, I appreciate the prompt response. Please see the output for openstack network show , pasted at https://paste.opendev.org/show/bIhYhu6fDWoMaIiyaRMJ/ Thanks you On Fri, Sep 8, 2023 at 9:31?AM James Denton > wrote: Hi Roger, That output looks as I would expect, thank you. Can you please provide the output for ?openstack network show? for the network being attached to the VM? Thanks, James Get Outlook for iOS ________________________________ From: Roger Rivera > Sent: Friday, September 8, 2023 8:11:06 AM To: Satish Patel > Cc: James Denton >; Dmitriy Rabotyagov >; openstack-discuss > Subject: Re: [openstack-ansible] Dedicated gateway hosts not working with OVN CAUTION: This message originated externally, please use caution when clicking on links or opening attachments! Hello Satish, I appreciate your feedback and any help will be greatly appreciated. Please find the requested outputs pasted here: https://paste.opendev.org/show/bHWvGMUYW35sU43zUxem/ I've included outputs for one compute and one network/gateway node. As a recap, among other nodes, the environment includes: -2x compute - 1x NIC ens1 with IPv4 (geneve) - no bridges -2x network/gateway nodes - 2x NICs - ens1 with IPv4 (geneve), ens2 as external net interface, br-vlan connected to ens2 bridge. Let me know if you need further information. Much appreciated. Thank you. -- Roger Rivera -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri Sep 8 20:44:06 2023 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 8 Sep 2023 16:44:06 -0400 Subject: [Skyline] error username or password incorrect In-Reply-To: References: Message-ID: Hi, I have shared all the information in this bug report https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 On Tue, Aug 29, 2023 at 5:09?AM Nguy?n H?u Kh?i wrote: > Hello, > > Can u share me OS and your file configure? > > > Nguyen Huu Khoi > > > On Mon, Aug 28, 2023 at 9:37?PM Satish Patel wrote: > >> Zed stable. >> >> On Mon, Aug 28, 2023 at 10:26?AM Nguy?n H?u Kh?i < >> nguyenhuukhoinw at gmail.com> wrote: >> >>> Which kolla version you deployed. >>> >>> >>> On Mon, Aug 28, 2023, 9:22 PM Satish Patel wrote: >>> >>>> No kidding, that is the doc I am following line by line and no error at >>>> all during installation. Problems start when you try to get into the UI >>>> interface and it throws username/password errors. Could you please share >>>> your skyline.yml config file (ofc hide your password :) ) >>>> >>>> ~S >>>> >>>> On Mon, Aug 28, 2023 at 9:47?AM Nguy?n H?u Kh?i < >>>> nguyenhuukhoinw at gmail.com> wrote: >>>> >>>>> No. >>>>> I set it seperately. >>>>> >>>>> >>>>> https://docs.openstack.org/skyline-console/latest/install/docker-install-ubuntu.html >>>>> >>>>> On Mon, Aug 28, 2023, 8:41 PM Satish Patel >>>>> wrote: >>>>> >>>>>> Can you tell me how you set it with kolla-ansible? >>>>>> >>>>>> Did you set this in globals.yml and just playbook? 
Trying to >>>>>> understand what are the differences between my method and yours :) >>>>>> enable_skyline: yes >>>>>> >>>>>> On Mon, Aug 28, 2023 at 9:15?AM Nguy?n H?u Kh?i < >>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>> >>>>>>> It is ok with docker. I use it with my exist cloud by kolla ansible. >>>>>>> >>>>>>> On Mon, Aug 28, 2023, 8:09 PM Satish Patel >>>>>>> wrote: >>>>>>> >>>>>>>> This is very odd, I am following line to line doc from skyline >>>>>>>> official page and its docker container but still getting the same error on >>>>>>>> multiple machines. Same time the developer says it's working for them. How >>>>>>>> to get help and move forward from here? >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Aug 25, 2023 at 12:53?AM Nguy?n H?u Kh?i < >>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> I got same problem but with mariadb. >>>>>>>>> Nguyen Huu Khoi >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Aug 8, 2023 at 12:17?PM Satish Patel >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Update: >>>>>>>>>> >>>>>>>>>> After switching DB from sqlite to mysql DB it works. Now admin >>>>>>>>>> account works but when I login with _member_ users or normal account and >>>>>>>>>> trying to create instance then pop up windows throwing error: >>>>>>>>>> >>>>>>>>>> { >>>>>>>>>> "message": "You don't have access to get instances.", >>>>>>>>>> "status": 401 >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Mon, Aug 7, 2023 at 11:51?PM Satish Patel < >>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Skyline Team, >>>>>>>>>>> >>>>>>>>>>> I found similar issue in BUG Report but no solution yet >>>>>>>>>>> >>>>>>>>>>> https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 >>>>>>>>>>> >>>>>>>>>>> On Mon, Aug 7, 2023 at 7:25?PM Satish Patel < >>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Folks, >>>>>>>>>>>> >>>>>>>>>>>> Try to install skyline UI to replace horizon using doc: >>>>>>>>>>>> https://docs.openstack.org/skyline-apiserver/latest/install/docker-install-ubuntu.html >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Everything went well and I got a login page on >>>>>>>>>>>> http://x.x.x.x:9999 also it pulled Region/Domains. When I am >>>>>>>>>>>> trying to login with my account, I get an error: Username or Password is >>>>>>>>>>>> incorrect. >>>>>>>>>>>> >>>>>>>>>>>> I am using sqlite DB for skyline as per documents. >>>>>>>>>>>> >>>>>>>>>>>> No errors in logs command >>>>>>>>>>>> $ docker logs skyline >>>>>>>>>>>> >>>>>>>>>>>> When I use Chrome Developer Tools then it was indicating an >>>>>>>>>>>> error in these URLs. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/profile >>>>>>>>>>>> >>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/policies >>>>>>>>>>>> >>>>>>>>>>>> 401 Unauthorized ( {"detail":"no such table: revoked_token"} ) >>>>>>>>>>>> >>>>>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri Sep 8 21:17:11 2023 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 8 Sep 2023 17:17:11 -0400 Subject: [Skyline] error username or password incorrect In-Reply-To: References: Message-ID: I found similar bug here https://bugs.launchpad.net/skyline-apiserver/+bug/1942284 On Fri, Sep 8, 2023 at 5:11?PM Satish Patel wrote: > This is my fill configuration of skyline.yaml, my problem is I can get > into the interface with the admin account and everything works. 
But when I > login as normal user account and try to do anything it throwing error > > { > "message": "You don't have access to get instances.", > "status": 401 > } > > Find attached screenshot. > > > # cat /etc/skyline/skyline.yaml > default: > access_token_expire: 3600 > access_token_renew: 1800 > cors_allow_origins: [] > #database_url: sqlite:////tmp/skyline.db > database_url: mysql://skyline:myskylinedb123 at localhost:3306/skyline > debug: true > log_dir: /var/log > log_file: skyline.log > prometheus_basic_auth_password: '' > prometheus_basic_auth_user: '' > prometheus_enable_basic_auth: false > prometheus_endpoint: http://localhost:9091 > secret_key: aCtmgbcUqYUy_HNVg5BDXCaeJgJQzHJXwqbXr0Nmb2o > session_name: session > ssl_enabled: true > openstack: > base_domains: > - heat_user_domain > default_region: RegionOne > enforce_new_defaults: true > extension_mapping: > floating-ip-port-forwarding: neutron_port_forwarding > fwaas_v2: neutron_firewall > qos: neutron_qos > vpnaas: neutron_vpn > interface_type: public > keystone_url: http://10.30.50.10:5000/v3/ > nginx_prefix: /api/openstack > reclaim_instance_interval: 604800 > service_mapping: > baremetal: ironic > compute: nova > container: zun > container-infra: magnum > database: trove > identity: keystone > image: glance > key-manager: barbican > load-balancer: octavia > network: neutron > object-store: swift > orchestration: heat > placement: placement > sharev2: manilav2 > volumev3: cinder > sso_enabled: false > sso_protocols: > - openid > sso_region: RegionOne > system_admin_roles: > - admin > - system_admin > system_project: service > system_project_domain: Default > system_reader_roles: > - system_reader > system_user_domain: Default > system_user_name: skyline > system_user_password: 'skyline123' > setting: > base_settings: > - flavor_families > - gpu_models > - usb_models > flavor_families: > - architecture: x86_architecture > categories: > - name: general_purpose > properties: [] > - name: compute_optimized > properties: [] > - name: memory_optimized > properties: [] > - name: high_clock_speed > properties: [] > - architecture: heterogeneous_computing > categories: > - name: compute_optimized_type_with_gpu > properties: [] > - name: visualization_compute_optimized_type_with_gpu > properties: [] > gpu_models: > - nvidia_t4 > usb_models: > - usb_c > > On Fri, Sep 8, 2023 at 4:44?PM Satish Patel wrote: > >> Hi, >> >> I have shared all the information in this bug report >> https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 >> >> On Tue, Aug 29, 2023 at 5:09?AM Nguy?n H?u Kh?i < >> nguyenhuukhoinw at gmail.com> wrote: >> >>> Hello, >>> >>> Can u share me OS and your file configure? >>> >>> >>> Nguyen Huu Khoi >>> >>> >>> On Mon, Aug 28, 2023 at 9:37?PM Satish Patel >>> wrote: >>> >>>> Zed stable. >>>> >>>> On Mon, Aug 28, 2023 at 10:26?AM Nguy?n H?u Kh?i < >>>> nguyenhuukhoinw at gmail.com> wrote: >>>> >>>>> Which kolla version you deployed. >>>>> >>>>> >>>>> On Mon, Aug 28, 2023, 9:22 PM Satish Patel >>>>> wrote: >>>>> >>>>>> No kidding, that is the doc I am following line by line and no error >>>>>> at all during installation. Problems start when you try to get into the UI >>>>>> interface and it throws username/password errors. Could you please share >>>>>> your skyline.yml config file (ofc hide your password :) ) >>>>>> >>>>>> ~S >>>>>> >>>>>> On Mon, Aug 28, 2023 at 9:47?AM Nguy?n H?u Kh?i < >>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>> >>>>>>> No. >>>>>>> I set it seperately. 
>>>>>>> >>>>>>> >>>>>>> https://docs.openstack.org/skyline-console/latest/install/docker-install-ubuntu.html >>>>>>> >>>>>>> On Mon, Aug 28, 2023, 8:41 PM Satish Patel >>>>>>> wrote: >>>>>>> >>>>>>>> Can you tell me how you set it with kolla-ansible? >>>>>>>> >>>>>>>> Did you set this in globals.yml and just playbook? Trying to >>>>>>>> understand what are the differences between my method and yours :) >>>>>>>> enable_skyline: yes >>>>>>>> >>>>>>>> On Mon, Aug 28, 2023 at 9:15?AM Nguy?n H?u Kh?i < >>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>> >>>>>>>>> It is ok with docker. I use it with my exist cloud by kolla >>>>>>>>> ansible. >>>>>>>>> >>>>>>>>> On Mon, Aug 28, 2023, 8:09 PM Satish Patel >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> This is very odd, I am following line to line doc from skyline >>>>>>>>>> official page and its docker container but still getting the same error on >>>>>>>>>> multiple machines. Same time the developer says it's working for them. How >>>>>>>>>> to get help and move forward from here? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, Aug 25, 2023 at 12:53?AM Nguy?n H?u Kh?i < >>>>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> I got same problem but with mariadb. >>>>>>>>>>> Nguyen Huu Khoi >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Aug 8, 2023 at 12:17?PM Satish Patel < >>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Update: >>>>>>>>>>>> >>>>>>>>>>>> After switching DB from sqlite to mysql DB it works. Now admin >>>>>>>>>>>> account works but when I login with _member_ users or normal account and >>>>>>>>>>>> trying to create instance then pop up windows throwing error: >>>>>>>>>>>> >>>>>>>>>>>> { >>>>>>>>>>>> "message": "You don't have access to get instances.", >>>>>>>>>>>> "status": 401 >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Aug 7, 2023 at 11:51?PM Satish Patel < >>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Skyline Team, >>>>>>>>>>>>> >>>>>>>>>>>>> I found similar issue in BUG Report but no solution yet >>>>>>>>>>>>> >>>>>>>>>>>>> https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Aug 7, 2023 at 7:25?PM Satish Patel < >>>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Folks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Try to install skyline UI to replace horizon using doc: >>>>>>>>>>>>>> https://docs.openstack.org/skyline-apiserver/latest/install/docker-install-ubuntu.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Everything went well and I got a login page on >>>>>>>>>>>>>> http://x.x.x.x:9999 also it pulled Region/Domains. When I am >>>>>>>>>>>>>> trying to login with my account, I get an error: Username or Password is >>>>>>>>>>>>>> incorrect. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am using sqlite DB for skyline as per documents. >>>>>>>>>>>>>> >>>>>>>>>>>>>> No errors in logs command >>>>>>>>>>>>>> $ docker logs skyline >>>>>>>>>>>>>> >>>>>>>>>>>>>> When I use Chrome Developer Tools then it was indicating an >>>>>>>>>>>>>> error in these URLs. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/profile >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/policies >>>>>>>>>>>>>> >>>>>>>>>>>>>> 401 Unauthorized ( {"detail":"no such table: revoked_token"} >>>>>>>>>>>>>> ) >>>>>>>>>>>>>> >>>>>>>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Fri Sep 8 21:33:47 2023 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 8 Sep 2023 17:33:47 -0400 Subject: [Skyline] error username or password incorrect In-Reply-To: References: Message-ID: I found my issue and reported here https://bugs.launchpad.net/skyline-apiserver/+bug/2034976 On Fri, Sep 8, 2023 at 5:17?PM Satish Patel wrote: > I found similar bug here > https://bugs.launchpad.net/skyline-apiserver/+bug/1942284 > > On Fri, Sep 8, 2023 at 5:11?PM Satish Patel wrote: > >> This is my fill configuration of skyline.yaml, my problem is I can get >> into the interface with the admin account and everything works. But when I >> login as normal user account and try to do anything it throwing error >> >> { >> "message": "You don't have access to get instances.", >> "status": 401 >> } >> >> Find attached screenshot. >> >> >> # cat /etc/skyline/skyline.yaml >> default: >> access_token_expire: 3600 >> access_token_renew: 1800 >> cors_allow_origins: [] >> #database_url: sqlite:////tmp/skyline.db >> database_url: mysql://skyline:myskylinedb123 at localhost:3306/skyline >> debug: true >> log_dir: /var/log >> log_file: skyline.log >> prometheus_basic_auth_password: '' >> prometheus_basic_auth_user: '' >> prometheus_enable_basic_auth: false >> prometheus_endpoint: http://localhost:9091 >> secret_key: aCtmgbcUqYUy_HNVg5BDXCaeJgJQzHJXwqbXr0Nmb2o >> session_name: session >> ssl_enabled: true >> openstack: >> base_domains: >> - heat_user_domain >> default_region: RegionOne >> enforce_new_defaults: true >> extension_mapping: >> floating-ip-port-forwarding: neutron_port_forwarding >> fwaas_v2: neutron_firewall >> qos: neutron_qos >> vpnaas: neutron_vpn >> interface_type: public >> keystone_url: http://10.30.50.10:5000/v3/ >> nginx_prefix: /api/openstack >> reclaim_instance_interval: 604800 >> service_mapping: >> baremetal: ironic >> compute: nova >> container: zun >> container-infra: magnum >> database: trove >> identity: keystone >> image: glance >> key-manager: barbican >> load-balancer: octavia >> network: neutron >> object-store: swift >> orchestration: heat >> placement: placement >> sharev2: manilav2 >> volumev3: cinder >> sso_enabled: false >> sso_protocols: >> - openid >> sso_region: RegionOne >> system_admin_roles: >> - admin >> - system_admin >> system_project: service >> system_project_domain: Default >> system_reader_roles: >> - system_reader >> system_user_domain: Default >> system_user_name: skyline >> system_user_password: 'skyline123' >> setting: >> base_settings: >> - flavor_families >> - gpu_models >> - usb_models >> flavor_families: >> - architecture: x86_architecture >> categories: >> - name: general_purpose >> properties: [] >> - name: compute_optimized >> properties: [] >> - name: memory_optimized >> properties: [] >> - name: high_clock_speed >> properties: [] >> - architecture: heterogeneous_computing >> categories: >> - name: compute_optimized_type_with_gpu >> properties: [] >> - name: visualization_compute_optimized_type_with_gpu >> properties: [] >> gpu_models: >> - nvidia_t4 >> usb_models: >> - usb_c >> >> On Fri, Sep 8, 2023 at 4:44?PM Satish Patel wrote: >> >>> Hi, >>> >>> I have shared all the information in this bug report >>> https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 >>> >>> On Tue, Aug 29, 2023 at 5:09?AM Nguy?n H?u Kh?i < >>> nguyenhuukhoinw at gmail.com> wrote: >>> >>>> Hello, >>>> >>>> Can u share me OS and your file configure? 
>>>> >>>> >>>> Nguyen Huu Khoi >>>> >>>> >>>> On Mon, Aug 28, 2023 at 9:37?PM Satish Patel >>>> wrote: >>>> >>>>> Zed stable. >>>>> >>>>> On Mon, Aug 28, 2023 at 10:26?AM Nguy?n H?u Kh?i < >>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>> >>>>>> Which kolla version you deployed. >>>>>> >>>>>> >>>>>> On Mon, Aug 28, 2023, 9:22 PM Satish Patel >>>>>> wrote: >>>>>> >>>>>>> No kidding, that is the doc I am following line by line and no error >>>>>>> at all during installation. Problems start when you try to get into the UI >>>>>>> interface and it throws username/password errors. Could you please share >>>>>>> your skyline.yml config file (ofc hide your password :) ) >>>>>>> >>>>>>> ~S >>>>>>> >>>>>>> On Mon, Aug 28, 2023 at 9:47?AM Nguy?n H?u Kh?i < >>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>> >>>>>>>> No. >>>>>>>> I set it seperately. >>>>>>>> >>>>>>>> >>>>>>>> https://docs.openstack.org/skyline-console/latest/install/docker-install-ubuntu.html >>>>>>>> >>>>>>>> On Mon, Aug 28, 2023, 8:41 PM Satish Patel >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Can you tell me how you set it with kolla-ansible? >>>>>>>>> >>>>>>>>> Did you set this in globals.yml and just playbook? Trying to >>>>>>>>> understand what are the differences between my method and yours :) >>>>>>>>> enable_skyline: yes >>>>>>>>> >>>>>>>>> On Mon, Aug 28, 2023 at 9:15?AM Nguy?n H?u Kh?i < >>>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> It is ok with docker. I use it with my exist cloud by kolla >>>>>>>>>> ansible. >>>>>>>>>> >>>>>>>>>> On Mon, Aug 28, 2023, 8:09 PM Satish Patel >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> This is very odd, I am following line to line doc from skyline >>>>>>>>>>> official page and its docker container but still getting the same error on >>>>>>>>>>> multiple machines. Same time the developer says it's working for them. How >>>>>>>>>>> to get help and move forward from here? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Fri, Aug 25, 2023 at 12:53?AM Nguy?n H?u Kh?i < >>>>>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> I got same problem but with mariadb. >>>>>>>>>>>> Nguyen Huu Khoi >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Aug 8, 2023 at 12:17?PM Satish Patel < >>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Update: >>>>>>>>>>>>> >>>>>>>>>>>>> After switching DB from sqlite to mysql DB it works. 
Now admin >>>>>>>>>>>>> account works but when I login with _member_ users or normal account and >>>>>>>>>>>>> trying to create instance then pop up windows throwing error: >>>>>>>>>>>>> >>>>>>>>>>>>> { >>>>>>>>>>>>> "message": "You don't have access to get instances.", >>>>>>>>>>>>> "status": 401 >>>>>>>>>>>>> } >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Aug 7, 2023 at 11:51?PM Satish Patel < >>>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Skyline Team, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I found similar issue in BUG Report but no solution yet >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Aug 7, 2023 at 7:25?PM Satish Patel < >>>>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Folks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Try to install skyline UI to replace horizon using doc: >>>>>>>>>>>>>>> https://docs.openstack.org/skyline-apiserver/latest/install/docker-install-ubuntu.html >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Everything went well and I got a login page on >>>>>>>>>>>>>>> http://x.x.x.x:9999 also it pulled Region/Domains. When I >>>>>>>>>>>>>>> am trying to login with my account, I get an error: Username or Password is >>>>>>>>>>>>>>> incorrect. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am using sqlite DB for skyline as per documents. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> No errors in logs command >>>>>>>>>>>>>>> $ docker logs skyline >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> When I use Chrome Developer Tools then it was indicating an >>>>>>>>>>>>>>> error in these URLs. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/profile >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/policies >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 401 Unauthorized ( {"detail":"no such table: >>>>>>>>>>>>>>> revoked_token"} ) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri Sep 8 21:11:36 2023 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 8 Sep 2023 17:11:36 -0400 Subject: [Skyline] error username or password incorrect In-Reply-To: References: Message-ID: This is my fill configuration of skyline.yaml, my problem is I can get into the interface with the admin account and everything works. But when I login as normal user account and try to do anything it throwing error { "message": "You don't have access to get instances.", "status": 401 } Find attached screenshot. 
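One quick sanity check that may help with this kind of 401 for non-admin accounts, before digging into the skyline.yaml that follows below: Skyline relies on Keystone role assignments to scope the user's project list, so a user that authenticates fine but has no role on the selected project will fail on project-scoped calls such as listing instances. A minimal sketch, with "demo" and "demo-project" as purely hypothetical names:

$ openstack role assignment list --user demo --names
# if nothing is listed for the target project, grant the member role
# (or the legacy _member_ role on older clouds):
$ openstack role add --user demo --project demo-project member

After that, logging out and back in to Skyline forces a freshly scoped token.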
# cat /etc/skyline/skyline.yaml default: access_token_expire: 3600 access_token_renew: 1800 cors_allow_origins: [] #database_url: sqlite:////tmp/skyline.db database_url: mysql://skyline:myskylinedb123 at localhost:3306/skyline debug: true log_dir: /var/log log_file: skyline.log prometheus_basic_auth_password: '' prometheus_basic_auth_user: '' prometheus_enable_basic_auth: false prometheus_endpoint: http://localhost:9091 secret_key: aCtmgbcUqYUy_HNVg5BDXCaeJgJQzHJXwqbXr0Nmb2o session_name: session ssl_enabled: true openstack: base_domains: - heat_user_domain default_region: RegionOne enforce_new_defaults: true extension_mapping: floating-ip-port-forwarding: neutron_port_forwarding fwaas_v2: neutron_firewall qos: neutron_qos vpnaas: neutron_vpn interface_type: public keystone_url: http://10.30.50.10:5000/v3/ nginx_prefix: /api/openstack reclaim_instance_interval: 604800 service_mapping: baremetal: ironic compute: nova container: zun container-infra: magnum database: trove identity: keystone image: glance key-manager: barbican load-balancer: octavia network: neutron object-store: swift orchestration: heat placement: placement sharev2: manilav2 volumev3: cinder sso_enabled: false sso_protocols: - openid sso_region: RegionOne system_admin_roles: - admin - system_admin system_project: service system_project_domain: Default system_reader_roles: - system_reader system_user_domain: Default system_user_name: skyline system_user_password: 'skyline123' setting: base_settings: - flavor_families - gpu_models - usb_models flavor_families: - architecture: x86_architecture categories: - name: general_purpose properties: [] - name: compute_optimized properties: [] - name: memory_optimized properties: [] - name: high_clock_speed properties: [] - architecture: heterogeneous_computing categories: - name: compute_optimized_type_with_gpu properties: [] - name: visualization_compute_optimized_type_with_gpu properties: [] gpu_models: - nvidia_t4 usb_models: - usb_c On Fri, Sep 8, 2023 at 4:44?PM Satish Patel wrote: > Hi, > > I have shared all the information in this bug report > https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 > > On Tue, Aug 29, 2023 at 5:09?AM Nguy?n H?u Kh?i > wrote: > >> Hello, >> >> Can u share me OS and your file configure? >> >> >> Nguyen Huu Khoi >> >> >> On Mon, Aug 28, 2023 at 9:37?PM Satish Patel >> wrote: >> >>> Zed stable. >>> >>> On Mon, Aug 28, 2023 at 10:26?AM Nguy?n H?u Kh?i < >>> nguyenhuukhoinw at gmail.com> wrote: >>> >>>> Which kolla version you deployed. >>>> >>>> >>>> On Mon, Aug 28, 2023, 9:22 PM Satish Patel >>>> wrote: >>>> >>>>> No kidding, that is the doc I am following line by line and no error >>>>> at all during installation. Problems start when you try to get into the UI >>>>> interface and it throws username/password errors. Could you please share >>>>> your skyline.yml config file (ofc hide your password :) ) >>>>> >>>>> ~S >>>>> >>>>> On Mon, Aug 28, 2023 at 9:47?AM Nguy?n H?u Kh?i < >>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>> >>>>>> No. >>>>>> I set it seperately. >>>>>> >>>>>> >>>>>> https://docs.openstack.org/skyline-console/latest/install/docker-install-ubuntu.html >>>>>> >>>>>> On Mon, Aug 28, 2023, 8:41 PM Satish Patel >>>>>> wrote: >>>>>> >>>>>>> Can you tell me how you set it with kolla-ansible? >>>>>>> >>>>>>> Did you set this in globals.yml and just playbook? 
Trying to >>>>>>> understand what are the differences between my method and yours :) >>>>>>> enable_skyline: yes >>>>>>> >>>>>>> On Mon, Aug 28, 2023 at 9:15?AM Nguy?n H?u Kh?i < >>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>> >>>>>>>> It is ok with docker. I use it with my exist cloud by kolla ansible. >>>>>>>> >>>>>>>> On Mon, Aug 28, 2023, 8:09 PM Satish Patel >>>>>>>> wrote: >>>>>>>> >>>>>>>>> This is very odd, I am following line to line doc from skyline >>>>>>>>> official page and its docker container but still getting the same error on >>>>>>>>> multiple machines. Same time the developer says it's working for them. How >>>>>>>>> to get help and move forward from here? >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Aug 25, 2023 at 12:53?AM Nguy?n H?u Kh?i < >>>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> I got same problem but with mariadb. >>>>>>>>>> Nguyen Huu Khoi >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Aug 8, 2023 at 12:17?PM Satish Patel < >>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Update: >>>>>>>>>>> >>>>>>>>>>> After switching DB from sqlite to mysql DB it works. Now admin >>>>>>>>>>> account works but when I login with _member_ users or normal account and >>>>>>>>>>> trying to create instance then pop up windows throwing error: >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> "message": "You don't have access to get instances.", >>>>>>>>>>> "status": 401 >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Mon, Aug 7, 2023 at 11:51?PM Satish Patel < >>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> Skyline Team, >>>>>>>>>>>> >>>>>>>>>>>> I found similar issue in BUG Report but no solution yet >>>>>>>>>>>> >>>>>>>>>>>> https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Aug 7, 2023 at 7:25?PM Satish Patel < >>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Folks, >>>>>>>>>>>>> >>>>>>>>>>>>> Try to install skyline UI to replace horizon using doc: >>>>>>>>>>>>> https://docs.openstack.org/skyline-apiserver/latest/install/docker-install-ubuntu.html >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Everything went well and I got a login page on >>>>>>>>>>>>> http://x.x.x.x:9999 also it pulled Region/Domains. When I am >>>>>>>>>>>>> trying to login with my account, I get an error: Username or Password is >>>>>>>>>>>>> incorrect. >>>>>>>>>>>>> >>>>>>>>>>>>> I am using sqlite DB for skyline as per documents. >>>>>>>>>>>>> >>>>>>>>>>>>> No errors in logs command >>>>>>>>>>>>> $ docker logs skyline >>>>>>>>>>>>> >>>>>>>>>>>>> When I use Chrome Developer Tools then it was indicating an >>>>>>>>>>>>> error in these URLs. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/profile >>>>>>>>>>>>> >>>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/policies >>>>>>>>>>>>> >>>>>>>>>>>>> 401 Unauthorized ( {"detail":"no such table: revoked_token"} >>>>>>>>>>>>> ) >>>>>>>>>>>>> >>>>>>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2023-09-08 at 5.11.07 PM.png Type: image/png Size: 362611 bytes Desc: not available URL: From kjme001 at gmail.com Sat Sep 9 05:26:43 2023 From: kjme001 at gmail.com (-) Date: Sat, 9 Sep 2023 07:26:43 +0200 Subject: No valid host was found. 
There are not enough hosts available Message-ID: I added a new node to the OpenStack cluster and when creating a new instance I have this error. Maybe someone can tell me what else I can check? No valid host was found. There are not enough hosts available. No valid host was found. There are not enough hosts available. Kod 500 Traceback (most recent call last): File "/usr/lib/python3/dist-packages/nova/conductor/manager.py", line 1548, in schedule_and_build_instances host_lists = self._schedule_instances(context, request_specs[0], File "/usr/lib/python3/dist-packages/nova/conductor/manager.py", line 908, in _schedule_instances host_lists = self.query_client.select_destinations( File "/usr/lib/python3/dist-packages/nova/scheduler/client/query.py", line 41, in select_destinations return self.scheduler_rpcapi.select_destinations(context, spec_obj, File "/usr/lib/python3/dist-packages/nova/scheduler/rpcapi.py", line 160, in select_destinations return cctxt.call(ctxt, 'select_destinations', **msg_args) File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/client.py", line 189, in call result = self.transport._send( File "/usr/lib/python3/dist-packages/oslo_messaging/transport.py", line 123, in _send return self._driver.send(target, ctxt, message, File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 689, in send return self._send(target, ctxt, message, wait_for_reply, timeout, File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 681, in _send raise result nova.exception_Remote.NoValidHost_Remote: No valid host was found. There are not enough hosts available. Traceback (most recent call last): File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 241, in inner return func(*args, **kwargs) File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 223, in select_destinations selections = self._select_destinations( File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 250, in _select_destinations selections = self._schedule( File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 416, in _schedule self._ensure_sufficient_hosts( File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 455, in _ensure_sufficient_hosts raise exception.NoValidHost(reason=reason) nova.exception.NoValidHost: No valid host was found. There are not enough hosts available. Utworzono -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Sun Sep 10 03:09:18 2023 From: satish.txt at gmail.com (Satish Patel) Date: Sat, 9 Sep 2023 23:09:18 -0400 Subject: [kolla-ansible][ovn] availability zone for OVN Message-ID: Folks, I am trying to set an availability zone for OVN because skyline UI has a mandatory requirement to have an availability zone otherwise you are not allowed to create a network from skyline GUI. for OVN based deployment only option to set AZ is in ovn-cms-options # ovs-vsctl set open_vswitch . external_ids:ovn-cms-options="enable-chassis-as-gw,availability-zones=az-0" Question: 1. How do I configure in kolla-ansible to override or set ovn-cms-options on only gateway chassis? 2. How does AZ work in OVN because OVN is anyway distributed as per documents, What if I just give foobar AZ name to meet Skyline requirement? Is it going to break anything? -------------- next part -------------- An HTML attachment was scrubbed... 
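As a rough illustration of 1. and 2. (hedged, since I am not certain kolla-ansible exposes ovn-cms-options as a first-class variable; the assumption here is that the ovs-vsctl step is run out of band on the gateway chassis only, and "az-nova" is just a placeholder name):

# on each gateway chassis only:
$ ovs-vsctl set open_vswitch . external_ids:ovn-cms-options="enable-chassis-as-gw,availability-zones=az-nova"

# then verify what Neutron reports:
$ openstack availability zone list --network

The AZ name itself is arbitrary, so a placeholder value should be enough to satisfy a UI that insists on one; as far as I can tell it only influences scheduling of router gateway ports and OVN external ports, so it should not break anything else.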
URL: From bshephar at redhat.com Sun Sep 10 09:44:24 2023 From: bshephar at redhat.com (Brendan Shephard) Date: Sun, 10 Sep 2023 19:44:24 +1000 Subject: No valid host was found. There are not enough hosts available In-Reply-To: References: Message-ID: Hey, This is a scheduling issue. It means that Nova scheduler was unable to find any nodes matching the requirements for your VM. You can enable debug in nova.conf and re-try. The scheduler debug logs will tell you which filter isn?t returning any hosts. Some helpful info for you here: https://stackoverflow.com/a/50447850? NoValidHost: No valid host was found. There are not enough hosts available stackoverflow.com Regards, Brendan Shephard Senior Software Engineer Red Hat Australia > On 9 Sep 2023, at 3:26?pm, - wrote: > > I added a new node to the OpenStack cluster and when creating a new instance I have this error. Maybe someone can tell me what else I can check? No valid host was found. There are not enough hosts available. > > No valid host was found. There are not enough hosts available. > Kod > 500 > > Traceback (most recent call last): File "/usr/lib/python3/dist-packages/nova/conductor/manager.py", line 1548, in schedule_and_build_instances host_lists = self._schedule_instances(context, request_specs[0], File "/usr/lib/python3/dist-packages/nova/conductor/manager.py", line 908, in _schedule_instances host_lists = self.query_client.select_destinations( File "/usr/lib/python3/dist-packages/nova/scheduler/client/query.py", line 41, in select_destinations return self.scheduler_rpcapi.select_destinations(context, spec_obj, File "/usr/lib/python3/dist-packages/nova/scheduler/rpcapi.py", line 160, in select_destinations return cctxt.call(ctxt, 'select_destinations', **msg_args) File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/client.py", line 189, in call result = self.transport._send( File "/usr/lib/python3/dist-packages/oslo_messaging/transport.py", line 123, in _send return self._driver.send(target, ctxt, message, File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 689, in send return self._send(target, ctxt, message, wait_for_reply, timeout, File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 681, in _send raise result nova.exception_Remote.NoValidHost_Remote: No valid host was found. There are not enough hosts available. Traceback (most recent call last): File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 241, in inner return func(*args, **kwargs) File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 223, in select_destinations selections = self._select_destinations( File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 250, in _select_destinations selections = self._schedule( File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 416, in _schedule self._ensure_sufficient_hosts( File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 455, in _ensure_sufficient_hosts raise exception.NoValidHost(reason=reason) nova.exception.NoValidHost: No valid host was found. There are not enough hosts available. > Utworzono > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
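As a concrete sketch of the debug step (the option is a standard nova.conf setting; the log path is an assumption and varies per deployment):

# /etc/nova/nova.conf on the node running nova-scheduler
[DEFAULT]
debug = True

# after restarting nova-scheduler and retrying the boot:
$ grep -i "returned 0 hosts" /var/log/nova/nova-scheduler.log

Each scheduler filter logs how many hosts it kept, so the first filter that drops to 0 (ComputeFilter, AggregateInstanceExtraSpecsFilter, and so on) points at the real cause, be it disabled services, a resource shortage or a flavor/aggregate mismatch.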
Name: apple-touch-icon at 2.png Type: image/png Size: 6562 bytes Desc: not available URL: From lucasagomes at gmail.com Sun Sep 10 13:51:42 2023 From: lucasagomes at gmail.com (Lucas Alvares Gomes) Date: Sun, 10 Sep 2023 14:51:42 +0100 Subject: [kolla-ansible][ovn] availability zone for OVN In-Reply-To: References: Message-ID: On Sun, Sep 10, 2023 at 4:13?AM Satish Patel wrote: > > Folks, > > I am trying to set an availability zone for OVN because skyline UI has a mandatory requirement to have an availability zone otherwise you are not allowed to create a network from skyline GUI. > > for OVN based deployment only option to set AZ is in ovn-cms-options > > # ovs-vsctl set open_vswitch . external_ids:ovn-cms-options="enable-chassis-as-gw,availability-zones=az-0" > > Question: > > 1. How do I configure in kolla-ansible to override or set ovn-cms-options on only gateway chassis? > 2. How does AZ work in OVN because OVN is anyway distributed as per documents, What if I just give foobar AZ name to meet Skyline requirement? Is it going to break anything? Answering 2. There are two types of availability zones in Neutron: Routers and Networks. For Routers, ML2/OVN will schedule the gateway router ports onto the nodes belonging to the availability zones provided by the --availability-zone-hint parameter, for example: $ openstack router create --availability-zone-hint az-0 --availability-zone-hint az-1 router-0 For Networks. Since DHCP is distributed in ML2/OVN as you pointed out we do not care about scheduling DHCP agents (like ML2/OVS). But there are few cases for Network AZ in ML2/OVN that are related to external ports [0], these are ports that live on a different host than the instance to address use cases such as SR-IOV and Baremetal with ML2/OVN. You can read more ML2/OVN AZs here: https://docs.openstack.org/neutron/latest/admin/ovn/availability_zones.html [0] https://docs.openstack.org/neutron/latest/admin/ovn/external_ports.html Cheers, Lucas From hanoi952022 at gmail.com Sun Sep 10 08:23:09 2023 From: hanoi952022 at gmail.com (Ha Noi) Date: Sun, 10 Sep 2023 15:23:09 +0700 Subject: [openstack][neutron][openvswitch] Openvswitch Packet loss when high throughput (pps) In-Reply-To: <1273520546bf8174f04012ef6c442439e9dd84da.camel@redhat.com> References: <0278613d4d6f217482014c08e50cfbbcf4acc5c6.camel@redhat.com> <1273520546bf8174f04012ef6c442439e9dd84da.camel@redhat.com> Message-ID: Thanks Smooney, I'm trying to test performance between 2 VMs using DPDK. And the high latency does not appear any more. But the packet is still lost. I will tune my system to get the highest throughput. thanks you guys On Fri, Sep 8, 2023 at 9:20?PM wrote: > On Thu, 2023-09-07 at 22:05 -0400, Satish Patel wrote: > > Do one thing, use test-pmd base benchmark and see because test-pmd > > application is DPDK aware. with test-pmd you will have a 1000% better > > performance :) > > actully test-pmd is not DPDK aware > its a dpdk applciation so it is faster because it remove the overhead of > kernel networking in the guest > not because it has any dpdk awareness. testpmd cannot tell that ovs-dpdk > is in use > form a gust perspecive you cannot tell if your using ovs-dpdk or kernel > ovs as there is no viable diffence > in the virtio-net-pci device which is presented to the guest kernel by > qemu. > > iper3 with a single core cant actully saturate a virtio-net-interface when > its backed > by vhost-user/dpdk or something like a macvtap sriov port. 
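One practical note on that point, hedged because it depends on the iperf3 version: releases before 3.16 are single threaded even with -P, so all parallel streams still share one client CPU. A workaround sometimes used is several independent client/server pairs on different ports, pinned to separate cores, roughly:

# on the server VM (ports are arbitrary)
$ for p in 5201 5202 5203 5204; do iperf3 -s -p $p -D; done
# on the client VM (10.20.1.206 taken from the earlier test)
$ for i in 0 1 2 3; do taskset -c $i iperf3 -u -b 0 -l 442 -p $((5201+i)) -c 10.20.1.206 -t 60 & done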
> you can reach line rate with larger packet sizes or multipel cores but > if you wanted too test small packet io then testpmd, dpdk packetgen or > tgen > are better tools in that regard. they can eaiclly saturate a link into the > 10s of gibitits per second using > 64byte packets. > > > > > On Thu, Sep 7, 2023 at 9:59?PM Ha Noi wrote: > > > > > I run the performance test using iperf3. But the performance is not > > > increased as theory. I don't know which configuration is not correct. > > > > > > On Fri, Sep 8, 2023 at 8:57?AM Satish Patel > wrote: > > > > > > > I would say let's run your same benchmark with OVS-DPDK and tell me > if > > > > you see better performance. I doubt you will see significant > performance > > > > boot but lets see. Please prove me wrong :) > > > > > > > > On Thu, Sep 7, 2023 at 9:45?PM Ha Noi wrote: > > > > > > > > > Hi Satish, > > > > > > > > > > Actually, the guess interface is not using tap anymore. > > > > > > > > > > > > > > > > > > > > path='/var/run/openvswitch/vhu3766ee8a-86' > > > > > mode='server'/> > > > > > > > > > > > > > > > > > > > >
> > > > function='0x0'/> > > > > > > > > > > > > > > > It's totally bypass the kernel stack ? > yep dpdk is userspace networkign and it gets its performace boost form that > so the data is "trasnproted" by doding a direct mmap of the virtio ring > buffers > between the DPDK pool mode driver and the qemu process. > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Sep 8, 2023 at 5:02?AM Satish Patel > > > > > wrote: > > > > > > > > > > > I did test OVS-DPDK and it helps offload the packet process on > compute > > > > > > nodes, But what about VMs it will still use a tap interface to > attach from > > > > > > compute to vm and bottleneck will be in vm. I strongly believe > that we have > > > > > > to run DPDK based guest to pass through the kernel stack. > > > > > > > > > > > > I love to hear from other people if I am missing something here. > > > > > > > > > > > > On Thu, Sep 7, 2023 at 5:27?PM Ha Noi > wrote: > > > > > > > > > > > > > Oh. I heard from someone on the reddit said that Ovs-dpdk is > > > > > > > transparent with user? > > > > > > > > > > > > > > So It?s not correct? > > > > > > > > > > > > > > On Thu, 7 Sep 2023 at 22:13 Satish Patel > wrote: > > > > > > > > > > > > > > > Because DPDK required DPDK support inside guest VM. It's not > > > > > > > > suitable for general purpose workload. You need your guest > VM network to > > > > > > > > support DPDK to get 100% throughput. > > > > > > > > > > > > > > > > On Thu, Sep 7, 2023 at 8:06?AM Ha Noi > wrote: > > > > > > > > > > > > > > > > > Hi Satish, > > > > > > > > > > > > > > > > > > Why dont you use DPDK? > > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > On Thu, 7 Sep 2023 at 19:03 Satish Patel < > satish.txt at gmail.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > I totally agreed with Sean on all his points but trust > me, I have > > > > > > > > > > tried everything possible to tune OS, Network stack, > multi-queue, NUMA, CPU > > > > > > > > > > pinning and name it.. but I didn't get any significant > improvement. You may > > > > > > > > > > gain 2 to 5% gain with all those tweek. I am running the > entire workload on > > > > > > > > > > sriov and life is happy except no LACP bonding. > > > > > > > > > > > > > > > > > > > > I am very interesting is this project > > > > > > > > > > > https://docs.openvswitch.org/en/latest/intro/install/afxdp/ > > > > > > > > > > > > > > > > > > > > On Thu, Sep 7, 2023 at 6:07?AM Ha Noi < > hanoi952022 at gmail.com> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Dear Smoney, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 7, 2023 at 12:41?AM > wrote: > > > > > > > > > > > > > > > > > > > > > > > On Wed, 2023-09-06 at 11:43 -0400, Satish Patel > wrote: > > > > > > > > > > > > > Damn! We have noticed the same issue around 40k to > 55k PPS. > > > > > > > > > > > > Trust me > > > > > > > > > > > > > nothing is wrong in your config. This is just a > limitation of > > > > > > > > > > > > the software > > > > > > > > > > > > > stack and kernel itself. > > > > > > > > > > > > its partly determined by your cpu frequency. > > > > > > > > > > > > kernel ovs of yesteryear could handel about 1mpps > total on a ~4GHZ > > > > > > > > > > > > cpu. with per port troughpuyt being lower dependin > on what > > > > > > > > > > > > qos/firewall > > > > > > > > > > > > rules that were apllied. 
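Before changing anything it may also help to see where the drops are actually counted on the compute node; a quick sketch (the vhu interface name is a placeholder taken from the XML above):

$ ovs-appctl dpctl/show -s
$ ovs-vsctl get interface vhu3766ee8a-86 statistics

A growing "lost" counter in the datapath output points at the ovs-vswitchd upcall path being overloaded, while drops that only show up in the per-interface statistics suggest the limit is on the qemu/guest side.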
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > My CPU frequency is 3Ghz and using CPU Intel Gold 2nd > generation. > > > > > > > > > > > I think the problem is tuning in the compute node > inside. But I cannot find > > > > > > > > > > > any guide or best practices for it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > moving form iptables firewall to ovs firewall can > help to some > > > > > > > > > > > > degree > > > > > > > > > > > > but your partly trading connection setup time for > statead state > > > > > > > > > > > > troughput > > > > > > > > > > > > with the overhead of the connection tracker in ovs. > > > > > > > > > > > > > > > > > > > > > > > > using stateless security groups can help > > > > > > > > > > > > > > > > > > > > > > > > we also recently fixed a regression cause by changes > in newer > > > > > > > > > > > > versions of ovs. > > > > > > > > > > > > this was notable in goign form rhel 8 to rhel 9 > where litrally it > > > > > > > > > > > > reduced > > > > > > > > > > > > small packet performce to 1/10th and jumboframes to > about 1/2 > > > > > > > > > > > > on master we have a config option that will set the > default qos > > > > > > > > > > > > on a port to linux-noop > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/ovs.py#L106-L125 > > > > > > > > > > > > > > > > > > > > > > > > the backports are propsoed upstream > > > > > > > > > > > > > https://review.opendev.org/q/Id9ef7074634a0f23d67a4401fa8fca363b51bb43 > > > > > > > > > > > > and we have backported this downstream to adress > that performance > > > > > > > > > > > > regression. > > > > > > > > > > > > the upstram backport is semi stalled just ebcasue we > wanted to > > > > > > > > > > > > disucss if we shoudl make ti opt in > > > > > > > > > > > > by default upstream while backporting but it might > be helpful for > > > > > > > > > > > > you if this is related to yoru current > > > > > > > > > > > > issues. > > > > > > > > > > > > > > > > > > > > > > > > 40-55 kpps is kind of low for kernel ovs but if you > have a low > > > > > > > > > > > > clockrate cpu, hybrid_plug + incorrect qos > > > > > > > > > > > > then i could see you hitting such a bottelneck. > > > > > > > > > > > > > > > > > > > > > > > > one workaround by the way without the os-vif > workaround > > > > > > > > > > > > backported is to set > > > > > > > > > > > > /proc/sys/net/core/default_qdisc to not apply any > qos or a low > > > > > > > > > > > > overhead qos type > > > > > > > > > > > > i.e. sudo sysctl -w net.core.default_qdisc=pfifo_fast > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > that may or may not help but i would ensure that > your are not > > > > > > > > > > > > usign somting like fqdel or cake > > > > > > > > > > > > for net.core.default_qdisc and if you are try > changing it to > > > > > > > > > > > > pfifo_fast and see if that helps. > > > > > > > > > > > > > > > > > > > > > > > > there isnet much you can do about the cpu clock rate > but ^ is > > > > > > > > > > > > somethign you can try for free > > > > > > > > > > > > note it wont actully take effect on an exsitng vm if > you jsut > > > > > > > > > > > > change the default but you can use > > > > > > > > > > > > tc to also chagne the qdisk for testing. hard > rebooting the vm > > > > > > > > > > > > shoudl also make the default take effect. 
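For that testing step, a rough sketch of switching the qdisc on a running instance without rebooting it (the tap name is a placeholder; on kernel OVS it is the tap device backing the VM port on the compute node):

$ tc qdisc show dev tap3766ee8a-86
$ sudo tc qdisc replace dev tap3766ee8a-86 root pfifo_fast

New ports plugged after the sysctl change should pick up pfifo_fast as the default on their own.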
> > > > > > > > > > > > > > > > > > > > > > > > the only other advice i can give assuming kernel ovs > is the only > > > > > > > > > > > > option you have is > > > > > > > > > > > > > > > > > > > > > > > > to look at > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rx_queue_size > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.tx_queue_size > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.openstack.org/nova/latest/configuration/extra-specs.html#hw:vif_multiqueue_enabled > > > > > > > > > > > > > > > > > > > > > > > > if the bottelneck is actully in qemu or the guest > kernel rather > > > > > > > > > > > > then ovs adjusting the rx/tx queue size and > > > > > > > > > > > > using multi queue can help. it will have no effect > if ovs is the > > > > > > > > > > > > bottel neck. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have set this option to 1024, and enable multiqueue > as well. But > > > > > > > > > > > it did not help. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 6, 2023 at 9:21?AM Ha Noi < > hanoi952022 at gmail.com> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Satish, > > > > > > > > > > > > > > > > > > > > > > > > > > > > Actually, our customer get this issue when the > tx/rx above > > > > > > > > > > > > only 40k pps. > > > > > > > > > > > > > > So what is the threshold of this throughput for > OvS? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks and regards > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, 6 Sep 2023 at 20:19 Satish Patel < > > > > > > > > > > > > satish.txt at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This is normal because OVS or LinuxBridge wire > up VMs using > > > > > > > > > > > > TAP interface > > > > > > > > > > > > > > > which runs on kernel space and that drives > higher interrupt > > > > > > > > > > > > and that makes > > > > > > > > > > > > > > > the kernel so busy working on handling > packets. Standard > > > > > > > > > > > > OVS/LinuxBridge > > > > > > > > > > > > > > > are not meant for higher PPS. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you want to handle higher PPS then look for > DPDK or > > > > > > > > > > > > SRIOV deployment. > > > > > > > > > > > > > > > ( We are running everything in SRIOV because > of high PPS > > > > > > > > > > > > requirement) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Sep 5, 2023 at 11:11?AM Ha Noi < > > > > > > > > > > > > hanoi952022 at gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm using Openstack Train and Openvswitch > for ML2 driver > > > > > > > > > > > > and GRE for > > > > > > > > > > > > > > > > tunnel type. I tested our network > performance between two > > > > > > > > > > > > VMs and suffer > > > > > > > > > > > > > > > > packet loss as below. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > VM1: IP: 10.20.1.206 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > VM2: IP: 10.20.1.154 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > VM3: IP: 10.20.1.72 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Using iperf3 to testing performance between > VM1 and VM2. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Run iperf3 client and server on both VMs. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On VM2: iperf3 -t 10000 -b 130M -l 442 -P 6 > -u -c > > > > > > > > > > > > 10.20.1.206 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On VM1: iperf3 -t 10000 -b 130M -l 442 -P 6 > -u -c > > > > > > > > > > > > 10.20.1.154 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Using VM3 ping into VM1, then the packet is > lost and the > > > > > > > > > > > > latency is > > > > > > > > > > > > > > > > quite high. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ping -i 0.1 10.20.1.206 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > PING 10.20.1.206 (10.20.1.206) 56(84) bytes > of data. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=1 > ttl=64 time=7.70 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=2 > ttl=64 time=6.90 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=3 > ttl=64 time=7.71 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=4 > ttl=64 time=7.98 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=6 > ttl=64 time=8.58 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=7 > ttl=64 time=8.34 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=8 > ttl=64 time=8.09 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=10 > ttl=64 time=4.57 > > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=11 > ttl=64 time=8.74 > > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=12 > ttl=64 time=9.37 > > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=14 > ttl=64 time=9.59 > > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=15 > ttl=64 time=7.97 > > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=16 > ttl=64 time=8.72 > > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 64 bytes from 10.20.1.206: icmp_seq=17 > ttl=64 time=9.23 > > > > > > > > > > > > ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ^C > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --- 10.20.1.206 ping statistics --- > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 34 packets transmitted, 28 received, > 17.6471% packet > > > > > > > > > > > > loss, 
time 3328ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > rtt min/avg/max/mdev = > 1.396/6.266/9.590/2.805 ms > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Does any one get this issue ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please help me. Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Sep 11 03:03:23 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 10 Sep 2023 20:03:23 -0700 Subject: [ptl][tc][winstackers] Final call for Winstackers PTL and maintainers In-Reply-To: References: <188ea6f2a53.ee65931886747.6600003948918544663@ghanshyammann.com> Message-ID: <18a8231992c.b2b8601b117458.1449700563057999548@ghanshyammann.com> ---- On Fri, 08 Sep 2023 05:19:13 -0700 Takashi Kajinami wrote --- > Let me bump this old thread because we still need some follow-up about the retirement of os-win. > I noticed that some projects have not yet deprecated the implementations dependent on os-win.I submitted a few patches to make these implementations deprecated so that we can smoothly remove thesein the future.?https://review.opendev.org/c/openstack/cinder/+/894237?https://review.opendev.org/c/openstack/glance/+/894236?https://review.opendev.org/c/openstack/ceilometer/+/894296It'd be nice if the relevant teams can review these. > > My remaining question is whether we should mark all implementations for Windows support, which are not directlydependent on os-win[1]. Technically we can go through individual projects and add warning logs and release notesabout the deprecation. However I'm not too sure if that's worth the effort. If we can agree that we remove supportfor running OpenStack on Windows Operating System at a specific release, then I tend to leave the ones independentfrom os-win, unless it has any impact on user-facing items like config options[1]. > I'd like to hear any thoughts about this plan, as well as any target release to remove Windows "host" support globallyfrom OpenStack (maybe after 2024.1 release ?). Thanks for marking a few of the Windows support things as deprecated. This is the right direction for at least where it depends on os-win. I have started completing the os-win retirement and deps[1]. But we need to add a deprecation warning in one cycle and then remove it in a later one (like you are doing in the mentioned changes). We did the same in the Nova Hyper-V driver, which was marked deprecated in the 2023.1 cycles, and I am proposing it to be removed in the next cycle, 2024.1[2]. For the Windows feature other than os-win dependencies, it is up to the projects, and if they can still support and test those without 3rd party CI, then it is okay to keep it. This applies to any other distro-specific features also where they might be supported by a few projects but not all. But they should go through the deprecation phase warning even they are not tested so that users get the notification. 
[1] https://review.opendev.org/q/+topic:retire-winstackers [2] https://review.opendev.org/c/openstack/nova/+/894466 -gmann > > [1] Some examples?https://github.com/openstack/ceilometer/blob/d31d4ed3574a5d19fe4b09ab2c227dba64da170a/ceilometer/cmd/polling.py#L95-L96?https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L624-L625 > [2] event_log option in oslo.log is one good example https://review.opendev.org/c/openstack/oslo.log/+/894235 > On Sat, Jun 24, 2023 at 7:50?AM Ghanshyam Mann gmann at ghanshyammann.com> wrote: > As there is no volunteer to maintain this project, I have proposed the retirement > > - https://review.opendev.org/c/openstack/governance/+/886880 > > -gmann > > ?---- On Thu, 13 Apr 2023 07:54:12 -0700? James Page? wrote --- > ?> Hi All > ?> > ?> As announced by Lucian last November (see [0]) Cloudbase Solutions are no longer in a position to maintain support for running OpenStack on Windows and have also ceased operation of their 3rd party CI for the windows support across a number of OpenStack projects. > ?> This situation has resulted in the Winstackers project becoming PTL-less for the 2023.2 cycle with no volunteers responding to the TC's call to fill this role and take this feature in OpenStack forward (see [1]). > ?> This is the final call for any maintainers to step forward if this feature is important to them in OpenStack. > ?> The last user survey in 2022 indicated that 2% of respondents were running on Hyper-V so this might be important enough to warrant a commitment from someone operating OpenStack on Windows to maintain these features going forward. > ?> Here is a reminder from Lucian's original email on the full list of projects which are impacted in some way:?* nova hyper-v driver - in-tree plus out-of-tree compute-hyperv driver* os-win - common Windows library for Openstack* neutron hyperv ml2 plugin and agent* ovs on Windows and neutron ovs agent support* cinder drivers - SMB and Windows iSCSI* os-brick Windows connectors - iSCSI, FC, SMB, RBD* ceilometer Windows poller* manila Windows driver* glance Windows support* freerdp gateway > ?> The lack of 3rd party CI for testing all of this really needs to be addressed as well. > ?> If no maintainers are forthcoming between now and the next PTG in June the TC will need to officially retire the project and start the process of removing support for Windows across the various projects that support this operating system in some way - either directly or through the use of os-win. > ?> For clarity this call refers to the use of the Hyper-V virtualisation driver and associated Windows server components to provide WIndows based OpenStack Hypervisors and does not relate to the ability to run Windows images as guests on OpenStack. > ?> Regards > ?> James > ?> [0]?https://lists.openstack.org/pipermail/openstack-discuss/2022-November/031044.html[1]?https://lists.openstack.org/pipermail/openstack-discuss/2023-March/032888.html > > From nguyenhuukhoinw at gmail.com Mon Sep 11 03:33:45 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Mon, 11 Sep 2023 10:33:45 +0700 Subject: [Skyline] error username or password incorrect In-Reply-To: References: Message-ID: Hello. I met the same problem with docker, I related to default project. 
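One small check that may be relevant to the default project angle (user and project names below are placeholders): the 401 tends to appear when the project the user lands in after login has no role assignment for that user, so comparing the two is a quick test:

$ openstack user show demo -c default_project_id -f value
$ openstack role assignment list --user demo --names

If the project Skyline switches to is not in that list, adding the member role (or the legacy _member_ role) on it is usually enough.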
Nguyen Huu Khoi On Sat, Sep 9, 2023 at 4:33?AM Satish Patel wrote: > I found my issue and reported here > > https://bugs.launchpad.net/skyline-apiserver/+bug/2034976 > > On Fri, Sep 8, 2023 at 5:17?PM Satish Patel wrote: > >> I found similar bug here >> https://bugs.launchpad.net/skyline-apiserver/+bug/1942284 >> >> On Fri, Sep 8, 2023 at 5:11?PM Satish Patel wrote: >> >>> This is my fill configuration of skyline.yaml, my problem is I can get >>> into the interface with the admin account and everything works. But when I >>> login as normal user account and try to do anything it throwing error >>> >>> { >>> "message": "You don't have access to get instances.", >>> "status": 401 >>> } >>> >>> Find attached screenshot. >>> >>> >>> # cat /etc/skyline/skyline.yaml >>> default: >>> access_token_expire: 3600 >>> access_token_renew: 1800 >>> cors_allow_origins: [] >>> #database_url: sqlite:////tmp/skyline.db >>> database_url: mysql://skyline:myskylinedb123 at localhost:3306/skyline >>> debug: true >>> log_dir: /var/log >>> log_file: skyline.log >>> prometheus_basic_auth_password: '' >>> prometheus_basic_auth_user: '' >>> prometheus_enable_basic_auth: false >>> prometheus_endpoint: http://localhost:9091 >>> secret_key: aCtmgbcUqYUy_HNVg5BDXCaeJgJQzHJXwqbXr0Nmb2o >>> session_name: session >>> ssl_enabled: true >>> openstack: >>> base_domains: >>> - heat_user_domain >>> default_region: RegionOne >>> enforce_new_defaults: true >>> extension_mapping: >>> floating-ip-port-forwarding: neutron_port_forwarding >>> fwaas_v2: neutron_firewall >>> qos: neutron_qos >>> vpnaas: neutron_vpn >>> interface_type: public >>> keystone_url: http://10.30.50.10:5000/v3/ >>> nginx_prefix: /api/openstack >>> reclaim_instance_interval: 604800 >>> service_mapping: >>> baremetal: ironic >>> compute: nova >>> container: zun >>> container-infra: magnum >>> database: trove >>> identity: keystone >>> image: glance >>> key-manager: barbican >>> load-balancer: octavia >>> network: neutron >>> object-store: swift >>> orchestration: heat >>> placement: placement >>> sharev2: manilav2 >>> volumev3: cinder >>> sso_enabled: false >>> sso_protocols: >>> - openid >>> sso_region: RegionOne >>> system_admin_roles: >>> - admin >>> - system_admin >>> system_project: service >>> system_project_domain: Default >>> system_reader_roles: >>> - system_reader >>> system_user_domain: Default >>> system_user_name: skyline >>> system_user_password: 'skyline123' >>> setting: >>> base_settings: >>> - flavor_families >>> - gpu_models >>> - usb_models >>> flavor_families: >>> - architecture: x86_architecture >>> categories: >>> - name: general_purpose >>> properties: [] >>> - name: compute_optimized >>> properties: [] >>> - name: memory_optimized >>> properties: [] >>> - name: high_clock_speed >>> properties: [] >>> - architecture: heterogeneous_computing >>> categories: >>> - name: compute_optimized_type_with_gpu >>> properties: [] >>> - name: visualization_compute_optimized_type_with_gpu >>> properties: [] >>> gpu_models: >>> - nvidia_t4 >>> usb_models: >>> - usb_c >>> >>> On Fri, Sep 8, 2023 at 4:44?PM Satish Patel >>> wrote: >>> >>>> Hi, >>>> >>>> I have shared all the information in this bug report >>>> https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 >>>> >>>> On Tue, Aug 29, 2023 at 5:09?AM Nguy?n H?u Kh?i < >>>> nguyenhuukhoinw at gmail.com> wrote: >>>> >>>>> Hello, >>>>> >>>>> Can u share me OS and your file configure? 
>>>>> >>>>> >>>>> Nguyen Huu Khoi >>>>> >>>>> >>>>> On Mon, Aug 28, 2023 at 9:37?PM Satish Patel >>>>> wrote: >>>>> >>>>>> Zed stable. >>>>>> >>>>>> On Mon, Aug 28, 2023 at 10:26?AM Nguy?n H?u Kh?i < >>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>> >>>>>>> Which kolla version you deployed. >>>>>>> >>>>>>> >>>>>>> On Mon, Aug 28, 2023, 9:22 PM Satish Patel >>>>>>> wrote: >>>>>>> >>>>>>>> No kidding, that is the doc I am following line by line and no >>>>>>>> error at all during installation. Problems start when you try to get into >>>>>>>> the UI interface and it throws username/password errors. Could you please >>>>>>>> share your skyline.yml config file (ofc hide your password :) ) >>>>>>>> >>>>>>>> ~S >>>>>>>> >>>>>>>> On Mon, Aug 28, 2023 at 9:47?AM Nguy?n H?u Kh?i < >>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>> >>>>>>>>> No. >>>>>>>>> I set it seperately. >>>>>>>>> >>>>>>>>> >>>>>>>>> https://docs.openstack.org/skyline-console/latest/install/docker-install-ubuntu.html >>>>>>>>> >>>>>>>>> On Mon, Aug 28, 2023, 8:41 PM Satish Patel >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Can you tell me how you set it with kolla-ansible? >>>>>>>>>> >>>>>>>>>> Did you set this in globals.yml and just playbook? Trying to >>>>>>>>>> understand what are the differences between my method and yours :) >>>>>>>>>> enable_skyline: yes >>>>>>>>>> >>>>>>>>>> On Mon, Aug 28, 2023 at 9:15?AM Nguy?n H?u Kh?i < >>>>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> It is ok with docker. I use it with my exist cloud by kolla >>>>>>>>>>> ansible. >>>>>>>>>>> >>>>>>>>>>> On Mon, Aug 28, 2023, 8:09 PM Satish Patel >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> This is very odd, I am following line to line doc from skyline >>>>>>>>>>>> official page and its docker container but still getting the same error on >>>>>>>>>>>> multiple machines. Same time the developer says it's working for them. How >>>>>>>>>>>> to get help and move forward from here? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Aug 25, 2023 at 12:53?AM Nguy?n H?u Kh?i < >>>>>>>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> I got same problem but with mariadb. >>>>>>>>>>>>> Nguyen Huu Khoi >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Aug 8, 2023 at 12:17?PM Satish Patel < >>>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Update: >>>>>>>>>>>>>> >>>>>>>>>>>>>> After switching DB from sqlite to mysql DB it works. 
Now >>>>>>>>>>>>>> admin account works but when I login with _member_ users or normal account >>>>>>>>>>>>>> and trying to create instance then pop up windows throwing error: >>>>>>>>>>>>>> >>>>>>>>>>>>>> { >>>>>>>>>>>>>> "message": "You don't have access to get instances.", >>>>>>>>>>>>>> "status": 401 >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Aug 7, 2023 at 11:51?PM Satish Patel < >>>>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Skyline Team, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I found similar issue in BUG Report but no solution yet >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://bugs.launchpad.net/skyline-apiserver/+bug/2025755 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Mon, Aug 7, 2023 at 7:25?PM Satish Patel < >>>>>>>>>>>>>>> satish.txt at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Folks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Try to install skyline UI to replace horizon using doc: >>>>>>>>>>>>>>>> https://docs.openstack.org/skyline-apiserver/latest/install/docker-install-ubuntu.html >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Everything went well and I got a login page on >>>>>>>>>>>>>>>> http://x.x.x.x:9999 also it pulled Region/Domains. When I >>>>>>>>>>>>>>>> am trying to login with my account, I get an error: Username or Password is >>>>>>>>>>>>>>>> incorrect. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I am using sqlite DB for skyline as per documents. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> No errors in logs command >>>>>>>>>>>>>>>> $ docker logs skyline >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> When I use Chrome Developer Tools then it was indicating an >>>>>>>>>>>>>>>> error in these URLs. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/profile >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://openstack.example.com:9999/api/openstack/skyline/api/v1/policies >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 401 Unauthorized ( {"detail":"no such table: >>>>>>>>>>>>>>>> revoked_token"} ) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ykarel at redhat.com Mon Sep 11 07:19:46 2023 From: ykarel at redhat.com (Yatin Karel) Date: Mon, 11 Sep 2023 12:49:46 +0530 Subject: [neutron] Bug Deputy Report September 4 - September 10 Message-ID: Hi Team, Below is the bug deputy report for last week:- High ------ - https://bugs.launchpad.net/neutron/+bug/2033887 [OVN][Trunk] The cold migration process is broken since patch 882581 Fix Proposed: https://review.opendev.org/q/topic:bug%252F2018289 Assigned to Rodolfo - https://bugs.launchpad.net/neutron/+bug/2033932 Add support for OVN MAC_Binding aging Fix Proposed: https://review.opendev.org/c/openstack/neutron/+/893575 Assigned to Terry - https://bugs.launchpad.net/neutron/+bug/2034522 Fake members operating_status ONLINE Fix Proposed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/893839 Assigned to Fernando Medium ---------- - https://bugs.launchpad.net/neutron/+bug/2034589 [FT][OVN] "ovsdb_connection.stop()" failing during the test cleanup process Unassigned Undecided -------------- - https://bugs.launchpad.net/neutron/+bug/2034761 test discovery fail with latest neutron-lib Marked as Invalid as it was due to version mismatch of neutron-lib and neutron - https://bugs.launchpad.net/neutron/+bug/2034684 UEFI (edk2/ovmf) network boot with OVN fail because no DHCP release reply Looks like issue only against Core OVN, asked reporter for it Thanks and Regards Yatin Karel -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Sep 11 08:53:33 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Mon, 11 Sep 2023 09:53:33 +0100 Subject: [kolla-ansible][ovn] availability zone for OVN In-Reply-To: References: Message-ID: On Sun, 2023-09-10 at 14:51 +0100, Lucas Alvares Gomes wrote: > On Sun, Sep 10, 2023 at 4:13?AM Satish Patel wrote: > > > > Folks, > > > > I am trying to set an availability zone for OVN because skyline UI has a mandatory requirement to have an > > availability zone otherwise you are not allowed to create a network from skyline GUI. one thing that skiline and admin s to a lesser degree need to be aware of is that nova AZ, Neutron AZ and cinder AZ are generally not related to each other. you as an admin can align them but a enduser should not assume that when creating a vm with a netwok and a voluem that the name of the AZ for all 3 should be the same. AZs are also optional in all 3 servics that support them. nova invents a fake AZ called "nova" and puts all host that dont otherwise have an az in it but its considerd a bug to explictly request the nova az. we have debated makeing that a hard error in future api version although currently we do know that some people do reueqst the defautl nova az even though we document that you should not. https://docs.openstack.org/nova/latest/admin/availability-zones.html """ The use of the default availability zone name in requests can be very error-prone. Since the user can see the list of availability zones, they have no way to know whether the default availability zone name (currently nova) is provided because a host belongs to an aggregate whose AZ metadata key is set to nova, or because there is at least one host not belonging to any aggregate. Consequently, it is highly recommended for users to never ever ask for booting an instance by specifying an explicit AZ named nova and for operators to never set the AZ metadata for an aggregate to nova. 
This can result is some problems due to the fact that the instance AZ information is explicitly attached to nova which could break further move operations when either the host is moved to another aggregate or when the user would like to migrate the instance. """ i raise this because i would consider it a bug form a nova point of view if skyline forced you to specyify an az when booting a vm. i would also consider it a bug in skyline if it forced the same on neutron or cinder since they are an optional feature. even if its common to configure them the skyline ui should not force you to select one. > > > > for OVN based deployment only option to set AZ is in ovn-cms-options > > > > # ovs-vsctl set open_vswitch . external_ids:ovn-cms-options="enable-chassis-as-gw,availability-zones=az-0" > > > > Question: > > > > 1. How do I configure in kolla-ansible to override or set ovn-cms-options on only gateway chassis? > > 2. How does AZ work in OVN because OVN is anyway distributed as per documents, What if I just give foobar AZ name to > > meet Skyline requirement? Is it going to break anything? > > Answering 2. > > There are two types of availability zones in Neutron: Routers and Networks. > > For Routers, ML2/OVN will schedule the gateway router ports onto the > nodes belonging to the availability zones provided by the > --availability-zone-hint parameter, for example: > > $ openstack router create --availability-zone-hint az-0 > --availability-zone-hint az-1 router-0 > > For Networks. Since DHCP is distributed in ML2/OVN as you pointed out > we do not care about scheduling DHCP agents (like ML2/OVS). But there > are few cases for Network AZ in ML2/OVN that are related to external > ports [0], these are ports that live on a different host than the > instance to address use cases such as SR-IOV and Baremetal with > ML2/OVN. > > You can read more ML2/OVN AZs here: > https://docs.openstack.org/neutron/latest/admin/ovn/availability_zones.html > > [0] https://docs.openstack.org/neutron/latest/admin/ovn/external_ports.html > > Cheers, > Lucas > From smooney at redhat.com Mon Sep 11 08:59:27 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Mon, 11 Sep 2023 09:59:27 +0100 Subject: [ptl][tc][winstackers] Final call for Winstackers PTL and maintainers In-Reply-To: <18a8231992c.b2b8601b117458.1449700563057999548@ghanshyammann.com> References: <188ea6f2a53.ee65931886747.6600003948918544663@ghanshyammann.com> <18a8231992c.b2b8601b117458.1449700563057999548@ghanshyammann.com> Message-ID: <582161586dfebca1b43576a837244a1f0846e0b7.camel@redhat.com> On Sun, 2023-09-10 at 20:03 -0700, Ghanshyam Mann wrote: > ?---- On Fri, 08 Sep 2023 05:19:13 -0700? Takashi Kajinami? wrote --- > ?> Let me bump this old thread because we still need some follow-up about the retirement of os-win. > ?> I noticed that some projects have not yet deprecated the implementations dependent on os-win.I submitted a few > patches to make these implementations deprecated so that we can smoothly remove thesein the > future.?https://review.opendev.org/c/openstack/cinder/+/894237? > https://review.opendev.org/c/openstack/glance/+/894236?https://review.opendev.org/c/openstack/ceilometer/+/894296It'd? > be nice if the relevant teams can review these. > ?> > ?> My remaining question is whether we should mark all implementations for Windows support, which are not > directlydependent on os-win[1]. Technically we can go through individual projects and add warning logs and release > notesabout the deprecation. 
However I'm not too sure if that's worth the effort. If we can agree that we remove > supportfor running OpenStack on Windows Operating System at a specific release, then I tend to leave the ones > independentfrom os-win, unless it has any impact on user-facing items like config options[1]. > ?> I'd like to hear any thoughts about this plan, as well as any target release to remove Windows "host" support > globallyfrom OpenStack (maybe after 2024.1 release ?). > > Thanks for marking a few of the Windows support things as deprecated. This is the right direction for at least where > it > depends on os-win. I have started completing the os-win retirement and deps[1]. But we need to add a deprecation > warning in one cycle and then remove it in a later one (like you are doing in the mentioned changes). We did the > same in the Nova Hyper-V driver, which was marked deprecated in the 2023.1 cycles, and I am proposing it to be > removed in the next cycle, 2024.1[2]. you bet me too it there are already two other nova cores (myself and one other) that also planned to do this after confirming with the wider team at the next ptg so this is highly likely to proceed early in the 2024.1 cycle. my goal in this regard would be to land the removal of both the hyperv and vmware driver before milestone one and perhaps even before the ptg if there is no object to it in our irc team meeting. i was waiting for RC-1 to be cut and the dust to settle before brign this up to discuss but it seams at least 3 fo the nova core team feel this is thr correct direction to take now. > > For the Windows feature other than os-win dependencies, it is up to the projects, and if they can still > support and test those without 3rd party CI, then it is okay to keep it. This applies to any other distro-specific > features also where they might be supported by a few projects but not all. But they should go through the > deprecation phase warning even they are not tested so that users get the notification. > > [1] https://review.opendev.org/q/+topic:retire-winstackers > [2] https://review.opendev.org/c/openstack/nova/+/894466 > > -gmann > > ?> > ?> [1] Some > examples?https://github.com/openstack/ceilometer/blob/d31d4ed3574a5d19fe4b09ab2c227dba64da170a/ceilometer/cmd/polling. > py#L95-L96?https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L624-L625 > ?> [2] event_log option in oslo.log is one good example https://review.opendev.org/c/openstack/oslo.log/+/894235 > ?> On Sat, Jun 24, 2023 at 7:50?AM Ghanshyam Mann gmann at ghanshyammann.com> wrote: > ?> As there is no volunteer to maintain this project, I have proposed the retirement > ?> > ?> - https://review.opendev.org/c/openstack/governance/+/886880 > ?> > ?> -gmann > ?> > ?> ?---- On Thu, 13 Apr 2023 07:54:12 -0700? James Page? wrote --- > ?> ?> Hi All > ?> ?> > ?> ?> As announced by Lucian last November (see [0]) Cloudbase Solutions are no longer in a position to maintain > support for running OpenStack on Windows and have also ceased operation of their 3rd party CI for the windows support > across a number of OpenStack projects. > ?> ?> This situation has resulted in the Winstackers project becoming PTL-less for the 2023.2 cycle with no volunteers > responding to the TC's call to fill this role and take this feature in OpenStack forward (see [1]). > ?> ?> This is the final call for any maintainers to step forward if this feature is important to them in OpenStack. 
> ?> ?> The last user survey in 2022 indicated that 2% of respondents were running on Hyper-V so this might be important > enough to warrant a commitment from someone operating OpenStack on Windows to maintain these features going forward. > ?> ?> Here is a reminder from Lucian's original email on the full list of projects which are impacted in some way:?* > nova hyper-v driver - in-tree plus out-of-tree compute-hyperv driver* os-win - common Windows library for Openstack* > neutron hyperv ml2 plugin and agent* ovs on Windows and neutron ovs agent support* cinder drivers - SMB and Windows > iSCSI* os-brick Windows connectors - iSCSI, FC, SMB, RBD* ceilometer Windows poller* manila Windows driver* glance > Windows support* freerdp gateway > ?> ?> The lack of 3rd party CI for testing all of this really needs to be addressed as well. > ?> ?> If no maintainers are forthcoming between now and the next PTG in June the TC will need to officially retire the > project and start the process of removing support for Windows across the various projects that support this operating > system in some way - either directly or through the use of os-win. > ?> ?> For clarity this call refers to the use of the Hyper-V virtualisation driver and associated Windows server > components to provide WIndows based OpenStack Hypervisors and does not relate to the ability to run Windows images as > guests on OpenStack. > ?> ?> Regards > ?> ?> James > ?> ?> > [0]?https://lists.openstack.org/pipermail/openstack-discuss/2022-November/031044.html[1]?https://lists.openstack.org/p > ipermail/openstack-discuss/2023-March/032888.html > ?> > ?> > From sylvain.bauza at gmail.com Mon Sep 11 09:35:03 2023 From: sylvain.bauza at gmail.com (Sylvain Bauza) Date: Mon, 11 Sep 2023 11:35:03 +0200 Subject: [ptl][tc][winstackers] Final call for Winstackers PTL and maintainers In-Reply-To: <582161586dfebca1b43576a837244a1f0846e0b7.camel@redhat.com> References: <188ea6f2a53.ee65931886747.6600003948918544663@ghanshyammann.com> <18a8231992c.b2b8601b117458.1449700563057999548@ghanshyammann.com> <582161586dfebca1b43576a837244a1f0846e0b7.camel@redhat.com> Message-ID: Le lun. 11 sept. 2023 ? 11:03, a ?crit : > On Sun, 2023-09-10 at 20:03 -0700, Ghanshyam Mann wrote: > > ---- On Fri, 08 Sep 2023 05:19:13 -0700 Takashi Kajinami wrote --- > > > Let me bump this old thread because we still need some follow-up > about the retirement of os-win. > > > I noticed that some projects have not yet deprecated the > implementations dependent on os-win.I submitted a few > > patches to make these implementations deprecated so that we can smoothly > remove thesein the > > future. https://review.opendev.org/c/openstack/cinder/+/894237 > > https://review.opendev.org/c/openstack/glance/+/894236 > https://review.opendev.org/c/openstack/ceilometer/+/894296It'd > > be nice if the relevant teams can review these. > > > > > > My remaining question is whether we should mark all implementations > for Windows support, which are not > > directlydependent on os-win[1]. Technically we can go through individual > projects and add warning logs and release > > notesabout the deprecation. However I'm not too sure if that's worth the > effort. If we can agree that we remove > > supportfor running OpenStack on Windows Operating System at a specific > release, then I tend to leave the ones > > independentfrom os-win, unless it has any impact on user-facing items > like config options[1]. 
> > > I'd like to hear any thoughts about this plan, as well as any target > release to remove Windows "host" support > > globallyfrom OpenStack (maybe after 2024.1 release ?). > > > > Thanks for marking a few of the Windows support things as deprecated. > This is the right direction for at least where > > it > > depends on os-win. I have started completing the os-win retirement and > deps[1]. But we need to add a deprecation > > warning in one cycle and then remove it in a later one (like you are > doing in the mentioned changes). We did the > > same in the Nova Hyper-V driver, which was marked deprecated in the > 2023.1 cycles, and I am proposing it to be > > removed in the next cycle, 2024.1[2]. > you bet me too it > there are already two other nova cores (myself and one other) that also > planned to do this after confirming with the > wider team at the next ptg so this is highly likely to proceed early in > the 2024.1 cycle. > my goal in this regard would be to land the removal of both the hyperv and > vmware driver before milestone one > and perhaps even before the ptg if there is no object to it in our irc > team meeting. > > i was waiting for RC-1 to be cut and the dust to settle before brign this > up to discuss but it seams at least 3 fo the > nova core team feel this is thr correct direction to take now. > What you just said. IIRC, we kinda agreed on the PTG to try to avoid as much as possible any deprecations during the 2023.2 Bobcat release, which is a non-SLURP release, as it would be skipped by operators fast-jumping to 2024.1, unless someone would forward-port the deprecation note to Caracal, hence putting the burden on someone's shoulder. Reinstanting my personal take then, which is, as a Nova PTL, I'm not 100% happy with taking my burden. Please, please, let's wait for 4 days and nothing or nooone will then get hurt :-) > > > For the Windows feature other than os-win dependencies, it is up to the > projects, and if they can still > > support and test those without 3rd party CI, then it is okay to keep it. > This applies to any other distro-specific > > features also where they might be supported by a few projects but not > all. But they should go through the > > deprecation phase warning even they are not tested so that users get the > notification. > > > > [1] https://review.opendev.org/q/+topic:retire-winstackers > > [2] https://review.opendev.org/c/openstack/nova/+/894466 > > > > -gmann > > > > > > > > [1] Some > > examples > https://github.com/openstack/ceilometer/blob/d31d4ed3574a5d19fe4b09ab2c227dba64da170a/ceilometer/cmd/polling > . > > py#L95-L96 > https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L624-L625 > > > [2] event_log option in oslo.log is one good example > https://review.opendev.org/c/openstack/oslo.log/+/894235 > > > On Sat, Jun 24, 2023 at 7:50?AM Ghanshyam Mann > gmann at ghanshyammann.com> wrote: > > > As there is no volunteer to maintain this project, I have proposed > the retirement > > > > > > - https://review.opendev.org/c/openstack/governance/+/886880 > > > > > > -gmann > > > > > > ---- On Thu, 13 Apr 2023 07:54:12 -0700 James Page wrote --- > > > > Hi All > > > > > > > > As announced by Lucian last November (see [0]) Cloudbase Solutions > are no longer in a position to maintain > > support for running OpenStack on Windows and have also ceased operation > of their 3rd party CI for the windows support > > across a number of OpenStack projects. 
> > > > This situation has resulted in the Winstackers project becoming > PTL-less for the 2023.2 cycle with no volunteers > > responding to the TC's call to fill this role and take this feature in > OpenStack forward (see [1]). > > > > This is the final call for any maintainers to step forward if this > feature is important to them in OpenStack. > > > > The last user survey in 2022 indicated that 2% of respondents were > running on Hyper-V so this might be important > > enough to warrant a commitment from someone operating OpenStack on > Windows to maintain these features going forward. > > > > Here is a reminder from Lucian's original email on the full list > of projects which are impacted in some way: * > > nova hyper-v driver - in-tree plus out-of-tree compute-hyperv driver* > os-win - common Windows library for Openstack* > > neutron hyperv ml2 plugin and agent* ovs on Windows and neutron ovs > agent support* cinder drivers - SMB and Windows > > iSCSI* os-brick Windows connectors - iSCSI, FC, SMB, RBD* ceilometer > Windows poller* manila Windows driver* glance > > Windows support* freerdp gateway > > > > The lack of 3rd party CI for testing all of this really needs to > be addressed as well. > > > > If no maintainers are forthcoming between now and the next PTG in > June the TC will need to officially retire the > > project and start the process of removing support for Windows across the > various projects that support this operating > > system in some way - either directly or through the use of os-win. > > > > For clarity this call refers to the use of the Hyper-V > virtualisation driver and associated Windows server > > components to provide WIndows based OpenStack Hypervisors and does not > relate to the ability to run Windows images as > > guests on OpenStack. > > > > Regards > > > > James > > > > > > [0] > https://lists.openstack.org/pipermail/openstack-discuss/2022-November/031044.html[1] > https://lists.openstack.org/p > > ipermail/openstack-discuss/2023-March/032888.html > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkajinam at redhat.com Mon Sep 11 09:52:43 2023 From: tkajinam at redhat.com (Takashi Kajinami) Date: Mon, 11 Sep 2023 18:52:43 +0900 Subject: [ptl][tc][winstackers] Final call for Winstackers PTL and maintainers In-Reply-To: References: <188ea6f2a53.ee65931886747.6600003948918544663@ghanshyammann.com> <18a8231992c.b2b8601b117458.1449700563057999548@ghanshyammann.com> <582161586dfebca1b43576a837244a1f0846e0b7.camel@redhat.com> Message-ID: > > > For the Windows feature other than os-win dependencies, it is up to the projects, and if they can still > > > support and test those without 3rd party CI, then it is okay to keep it. This applies to any other distro-specific > > > features also where they might be supported by a few projects but not all. But they should go through the > > > deprecation phase warning even they are not tested so that users get the notification. > > > I'm worried about "might be supported by a few projects" approach here, because that would eventually block deprecating/removing Windows support from oslo. We now have some features like eventlog[1] or some logics to support Windows[2]. Should we keep these until all projects declare deprecation and removal of Windows support ? Based on my past experiences, some of our projects have been inactive for a while and haven't been aligned with this kind of global changes being made. 
So I feel like this would eventually mean we can't remove these implementations from oslo really. I'd rather prefer seeing common agreement across all projects, and set the expected timeline so that we can drop unmaintained codes. [1] https://review.opendev.org/c/openstack/oslo.log/+/894235 [2] https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/processutils.py#L46-L53 > > my goal in this regard would be to land the removal of both the hyperv and vmware driver before milestone one > > and perhaps even before the ptg if there is no object to it in our irc team meeting. (this is off topic) I'm wondering if removal of vmware drivers means that we should consider deprecating and removing vmware drivers in glance and cinder, as these drivers are meant to be used with vmware virt drivers. > What you just said. IIRC, we kinda agreed on the PTG to try to avoid as much as possible any deprecations during > the 2023.2 Bobcat release, which is a non-SLURP release, as it would be skipped by operators fast-jumping to 2024.1, > unless someone would forward-port the deprecation note to Caracal, hence putting the burden on someone's shoulder. My understanding was that we can deprecate features at 2023.2, as long as we don't remove these at 2024.1, (removal should be done at 2024.2 or later), though I agree that deprecating features at 2023.2 are not much useful because the removal timeline can't be changed even if we deprecate features "early". On Mon, Sep 11, 2023 at 6:35?PM Sylvain Bauza wrote: > > > Le lun. 11 sept. 2023 ? 11:03, a ?crit : > >> On Sun, 2023-09-10 at 20:03 -0700, Ghanshyam Mann wrote: >> > ---- On Fri, 08 Sep 2023 05:19:13 -0700 Takashi Kajinami wrote --- >> > > Let me bump this old thread because we still need some follow-up >> about the retirement of os-win. >> > > I noticed that some projects have not yet deprecated the >> implementations dependent on os-win.I submitted a few >> > patches to make these implementations deprecated so that we can >> smoothly remove thesein the >> > future. https://review.opendev.org/c/openstack/cinder/+/894237 >> > https://review.opendev.org/c/openstack/glance/+/894236 >> https://review.opendev.org/c/openstack/ceilometer/+/894296It'd >> > be nice if the relevant teams can review these. >> > > >> > > My remaining question is whether we should mark all implementations >> for Windows support, which are not >> > directlydependent on os-win[1]. Technically we can go through >> individual projects and add warning logs and release >> > notesabout the deprecation. However I'm not too sure if that's worth >> the effort. If we can agree that we remove >> > supportfor running OpenStack on Windows Operating System at a specific >> release, then I tend to leave the ones >> > independentfrom os-win, unless it has any impact on user-facing items >> like config options[1]. >> > > I'd like to hear any thoughts about this plan, as well as any target >> release to remove Windows "host" support >> > globallyfrom OpenStack (maybe after 2024.1 release ?). >> > >> > Thanks for marking a few of the Windows support things as deprecated. >> This is the right direction for at least where >> > it >> > depends on os-win. I have started completing the os-win retirement and >> deps[1]. But we need to add a deprecation >> > warning in one cycle and then remove it in a later one (like you are >> doing in the mentioned changes). 
We did the >> > same in the Nova Hyper-V driver, which was marked deprecated in the >> 2023.1 cycles, and I am proposing it to be >> > removed in the next cycle, 2024.1[2]. >> you bet me too it >> there are already two other nova cores (myself and one other) that also >> planned to do this after confirming with the >> wider team at the next ptg so this is highly likely to proceed early in >> the 2024.1 cycle. >> my goal in this regard would be to land the removal of both the hyperv >> and vmware driver before milestone one >> and perhaps even before the ptg if there is no object to it in our irc >> team meeting. >> >> i was waiting for RC-1 to be cut and the dust to settle before brign this >> up to discuss but it seams at least 3 fo the >> nova core team feel this is thr correct direction to take now. >> > > > What you just said. IIRC, we kinda agreed on the PTG to try to avoid as > much as possible any deprecations during the 2023.2 Bobcat release, which > is a non-SLURP release, as it would be skipped by operators fast-jumping to > 2024.1, unless someone would forward-port the deprecation note to Caracal, > hence putting the burden on someone's shoulder. > > Reinstanting my personal take then, which is, as a Nova PTL, I'm not 100% > happy with taking my burden. > Please, please, let's wait for 4 days and nothing or nooone will then get > hurt :-) > > > >> > For the Windows feature other than os-win dependencies, it is up to the >> projects, and if they can still >> > support and test those without 3rd party CI, then it is okay to keep >> it. This applies to any other distro-specific >> > features also where they might be supported by a few projects but not >> all. But they should go through the >> > deprecation phase warning even they are not tested so that users get >> the notification. >> > >> > [1] https://review.opendev.org/q/+topic:retire-winstackers >> > [2] https://review.opendev.org/c/openstack/nova/+/894466 >> > >> > -gmann >> > >> > > >> > > [1] Some >> > examples >> https://github.com/openstack/ceilometer/blob/d31d4ed3574a5d19fe4b09ab2c227dba64da170a/ceilometer/cmd/polling >> . >> > py#L95-L96 >> https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L624-L625 >> > > [2] event_log option in oslo.log is one good example >> https://review.opendev.org/c/openstack/oslo.log/+/894235 >> > > On Sat, Jun 24, 2023 at 7:50?AM Ghanshyam Mann >> gmann at ghanshyammann.com> wrote: >> > > As there is no volunteer to maintain this project, I have proposed >> the retirement >> > > >> > > - https://review.opendev.org/c/openstack/governance/+/886880 >> > > >> > > -gmann >> > > >> > > ---- On Thu, 13 Apr 2023 07:54:12 -0700 James Page wrote --- >> > > > Hi All >> > > > >> > > > As announced by Lucian last November (see [0]) Cloudbase >> Solutions are no longer in a position to maintain >> > support for running OpenStack on Windows and have also ceased operation >> of their 3rd party CI for the windows support >> > across a number of OpenStack projects. >> > > > This situation has resulted in the Winstackers project becoming >> PTL-less for the 2023.2 cycle with no volunteers >> > responding to the TC's call to fill this role and take this feature in >> OpenStack forward (see [1]). >> > > > This is the final call for any maintainers to step forward if >> this feature is important to them in OpenStack. 
>> > > > The last user survey in 2022 indicated that 2% of respondents >> were running on Hyper-V so this might be important >> > enough to warrant a commitment from someone operating OpenStack on >> Windows to maintain these features going forward. >> > > > Here is a reminder from Lucian's original email on the full list >> of projects which are impacted in some way: * >> > nova hyper-v driver - in-tree plus out-of-tree compute-hyperv driver* >> os-win - common Windows library for Openstack* >> > neutron hyperv ml2 plugin and agent* ovs on Windows and neutron ovs >> agent support* cinder drivers - SMB and Windows >> > iSCSI* os-brick Windows connectors - iSCSI, FC, SMB, RBD* ceilometer >> Windows poller* manila Windows driver* glance >> > Windows support* freerdp gateway >> > > > The lack of 3rd party CI for testing all of this really needs to >> be addressed as well. >> > > > If no maintainers are forthcoming between now and the next PTG in >> June the TC will need to officially retire the >> > project and start the process of removing support for Windows across >> the various projects that support this operating >> > system in some way - either directly or through the use of os-win. >> > > > For clarity this call refers to the use of the Hyper-V >> virtualisation driver and associated Windows server >> > components to provide WIndows based OpenStack Hypervisors and does not >> relate to the ability to run Windows images as >> > guests on OpenStack. >> > > > Regards >> > > > James >> > > > >> > [0] >> https://lists.openstack.org/pipermail/openstack-discuss/2022-November/031044.html[1] >> https://lists.openstack.org/p >> > ipermail/openstack-discuss/2023-March/032888.html >> > > >> > > >> > >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Sep 11 16:58:09 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 11 Sep 2023 09:58:09 -0700 Subject: [ptl][tc][winstackers] Final call for Winstackers PTL and maintainers In-Reply-To: References: <188ea6f2a53.ee65931886747.6600003948918544663@ghanshyammann.com> <18a8231992c.b2b8601b117458.1449700563057999548@ghanshyammann.com> <582161586dfebca1b43576a837244a1f0846e0b7.camel@redhat.com> Message-ID: <18a852ddb68.c31854af213429.4828409396795574479@ghanshyammann.com> ---- On Mon, 11 Sep 2023 02:35:03 -0700 Sylvain Bauza wrote --- > > > Le?lun. 11 sept. 2023 ??11:03, smooney at redhat.com> a ?crit?: > On Sun, 2023-09-10 at 20:03 -0700, Ghanshyam Mann wrote: > > ?---- On Fri, 08 Sep 2023 05:19:13 -0700? Takashi Kajinami? wrote --- > > ?> Let me bump this old thread because we still need some follow-up about the retirement of os-win. > > ?> I noticed that some projects have not yet deprecated the implementations dependent on os-win.I submitted a few > > patches to make these implementations deprecated so that we can smoothly remove thesein the > > future.?https://review.opendev.org/c/openstack/cinder/+/894237? > > https://review.opendev.org/c/openstack/glance/+/894236?https://review.opendev.org/c/openstack/ceilometer/+/894296It'd? > > be nice if the relevant teams can review these. > > ?> > > ?> My remaining question is whether we should mark all implementations for Windows support, which are not > > directlydependent on os-win[1]. Technically we can go through individual projects and add warning logs and release > > notesabout the deprecation. However I'm not too sure if that's worth the effort. 
If we can agree that we remove > > supportfor running OpenStack on Windows Operating System at a specific release, then I tend to leave the ones > > independentfrom os-win, unless it has any impact on user-facing items like config options[1]. > > ?> I'd like to hear any thoughts about this plan, as well as any target release to remove Windows "host" support > > globallyfrom OpenStack (maybe after 2024.1 release ?). > > > > Thanks for marking a few of the Windows support things as deprecated. This is the right direction for at least where > > it > > depends on os-win. I have started completing the os-win retirement and deps[1]. But we need to add a deprecation > > warning in one cycle and then remove it in a later one (like you are doing in the mentioned changes). We did the > > same in the Nova Hyper-V driver, which was marked deprecated in the 2023.1 cycles, and I am proposing it to be > > removed in the next cycle, 2024.1[2]. > you bet me too it > there are already two other nova cores (myself and one other) that also planned to do this after confirming with the > wider team at the next ptg so this is highly likely to proceed early in the 2024.1 cycle. > my goal in this regard would be to land the removal of both the hyperv and vmware driver before milestone one > and perhaps even before the ptg if there is no object to it in our irc team meeting. > > i was waiting for RC-1 to be cut and the dust to settle before brign this up to discuss but it seams at least 3 fo the > nova core team feel this is thr correct direction to take now. > > > What you just said. IIRC, we kinda agreed on the PTG to try to avoid as much as possible any deprecations during the 2023.2 Bobcat release, which is a non-SLURP release, as it would be skipped by operators fast-jumping to 2024.1, unless someone would forward-port the deprecation note to Caracal, hence putting the burden on someone's shoulder. > Reinstanting my personal take then, which is, as a Nova PTL, I'm not 100% happy with taking my burden.Please, please, let's wait for 4 days and nothing or nooone will then get hurt :-)? Yes, there is no plan to remove the things in 2023.2 even that is why I waited to remove the os-win content etc. I have kept hyper-V removal commit as -W and to wait for the 2023.2 release and to discuss it in PTG first. -gmann > > > > For the Windows feature other than os-win dependencies, it is up to the projects, and if they can still > > support and test those without 3rd party CI, then it is okay to keep it. This applies to any other distro-specific > > features also where they might be supported by a few projects but not all. But they should go through the > > deprecation phase warning even they are not tested so that users get the notification. > > > > [1] https://review.opendev.org/q/+topic:retire-winstackers > > [2] https://review.opendev.org/c/openstack/nova/+/894466 > > > > -gmann > > > > ?> > > ?> [1] Some > > examples?https://github.com/openstack/ceilometer/blob/d31d4ed3574a5d19fe4b09ab2c227dba64da170a/ceilometer/cmd/polling. 
> > py#L95-L96?https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L624-L625 > > ?> [2] event_log option in oslo.log is one good example https://review.opendev.org/c/openstack/oslo.log/+/894235 > > ?> On Sat, Jun 24, 2023 at 7:50?AM Ghanshyam Mann gmann at ghanshyammann.com> wrote: > > ?> As there is no volunteer to maintain this project, I have proposed the retirement > > ?> > > ?> - https://review.opendev.org/c/openstack/governance/+/886880 > > ?> > > ?> -gmann > > ?> > > ?> ?---- On Thu, 13 Apr 2023 07:54:12 -0700? James Page? wrote --- > > ?> ?> Hi All > > ?> ?> > > ?> ?> As announced by Lucian last November (see [0]) Cloudbase Solutions are no longer in a position to maintain > > support for running OpenStack on Windows and have also ceased operation of their 3rd party CI for the windows support > > across a number of OpenStack projects. > > ?> ?> This situation has resulted in the Winstackers project becoming PTL-less for the 2023.2 cycle with no volunteers > > responding to the TC's call to fill this role and take this feature in OpenStack forward (see [1]). > > ?> ?> This is the final call for any maintainers to step forward if this feature is important to them in OpenStack. > > ?> ?> The last user survey in 2022 indicated that 2% of respondents were running on Hyper-V so this might be important > > enough to warrant a commitment from someone operating OpenStack on Windows to maintain these features going forward. > > ?> ?> Here is a reminder from Lucian's original email on the full list of projects which are impacted in some way:?* > > nova hyper-v driver - in-tree plus out-of-tree compute-hyperv driver* os-win - common Windows library for Openstack* > > neutron hyperv ml2 plugin and agent* ovs on Windows and neutron ovs agent support* cinder drivers - SMB and Windows > > iSCSI* os-brick Windows connectors - iSCSI, FC, SMB, RBD* ceilometer Windows poller* manila Windows driver* glance > > Windows support* freerdp gateway > > ?> ?> The lack of 3rd party CI for testing all of this really needs to be addressed as well. > > ?> ?> If no maintainers are forthcoming between now and the next PTG in June the TC will need to officially retire the > > project and start the process of removing support for Windows across the various projects that support this operating > > system in some way - either directly or through the use of os-win. > > ?> ?> For clarity this call refers to the use of the Hyper-V virtualisation driver and associated Windows server > > components to provide WIndows based OpenStack Hypervisors and does not relate to the ability to run Windows images as > > guests on OpenStack. 
> > ?> ?> Regards > > ?> ?> James > > ?> ?> > > [0]?https://lists.openstack.org/pipermail/openstack-discuss/2022-November/031044.html[1]?https://lists.openstack.org/p > > ipermail/openstack-discuss/2023-March/032888.html > > ?> > > ?> > > > > > From gmann at ghanshyammann.com Mon Sep 11 17:03:36 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 11 Sep 2023 10:03:36 -0700 Subject: [ptl][tc][winstackers] Final call for Winstackers PTL and maintainers In-Reply-To: References: <188ea6f2a53.ee65931886747.6600003948918544663@ghanshyammann.com> <18a8231992c.b2b8601b117458.1449700563057999548@ghanshyammann.com> <582161586dfebca1b43576a837244a1f0846e0b7.camel@redhat.com> Message-ID: <18a8532d84e.d2fc1ddc213774.3752711536494387942@ghanshyammann.com> ---- On Mon, 11 Sep 2023 02:52:43 -0700 Takashi Kajinami wrote --- > > > > > For the Windows feature other than os-win dependencies, it is up to the projects, and if they can still > > > > support and test those without 3rd party CI, then it is okay to keep it. This applies to any other distro-specific > > > > features also where they might be supported by a few projects but not all. But they should go through the > > > > deprecation phase warning even they are not tested so that users get the notification. > > > > > I'm worried about "might be supported by a few projects" approach here, because that would eventuallyblock deprecating/removing Windows support from oslo. We now have some features like eventlog[1] orsome logics to support Windows[2]. Should we keep these until all projects declare deprecation and removalof Windows support ? Based on my past experiences, some of our projects have been inactive for a whileand haven't been aligned with this kind of global changes being made. So I feel like this would eventuallymean we can't remove these implementations from oslo really. Honestlysaying , removing it from oslo at the end is the best approach. We can see if any project not active or even do not know about their windows support then we can discuss with them otherwise let's project deprecate and remove the things first and then from lib. > I'd rather prefer seeing common agreement across all projects, and set the expected timeline so that we candrop unmaintained codes. > > [1] https://review.opendev.org/c/openstack/oslo.log/+/894235[2] https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/processutils.py#L46-L53 > > > my goal in this regard would be to land the removal of both the hyperv and vmware driver before milestone one > > > and perhaps even before the ptg if there is no object to it in our irc team meeting.(this is off topic)I'm wondering if removal of vmware drivers means that we should consider deprecating and removing vmware driversin glance and cinder, as these drivers are meant to be used with vmware virt drivers. > > What you just said. IIRC, we kinda agreed on the PTG to try to avoid as much as possible any deprecations during> the 2023.2 Bobcat release, which is a non-SLURP release, as it would be skipped by operators fast-jumping to 2024.1,> unless someone would forward-port the deprecation note to Caracal, hence putting the burden on someone's shoulder.My understanding was that we can deprecate features at 2023.2, as long as we don't remove these at 2024.1, > (removal should be done at 2024.2 or later), though I agree that deprecating features at 2023.2 are not muchuseful because the removal timeline can't be changed even if we deprecate features "early". 
Yes, basic idea is to have deprecation warning for at least one SLURP release. If any project deprecating it in 2023.2 then they need to wait till 2024.1 release for removal. For example, Nova deprecated Hyper-V driver in 2023.1 (SLURP release) and should be good to remove it in next SLURP release which will be next cycle 2024.1 -gmann > > On Mon, Sep 11, 2023 at 6:35?PM Sylvain Bauza sylvain.bauza at gmail.com> wrote: > > > Le?lun. 11 sept. 2023 ??11:03, smooney at redhat.com> a ?crit?: > On Sun, 2023-09-10 at 20:03 -0700, Ghanshyam Mann wrote: > > ?---- On Fri, 08 Sep 2023 05:19:13 -0700? Takashi Kajinami? wrote --- > > ?> Let me bump this old thread because we still need some follow-up about the retirement of os-win. > > ?> I noticed that some projects have not yet deprecated the implementations dependent on os-win.I submitted a few > > patches to make these implementations deprecated so that we can smoothly remove thesein the > > future.?https://review.opendev.org/c/openstack/cinder/+/894237? > > https://review.opendev.org/c/openstack/glance/+/894236?https://review.opendev.org/c/openstack/ceilometer/+/894296It'd? > > be nice if the relevant teams can review these. > > ?> > > ?> My remaining question is whether we should mark all implementations for Windows support, which are not > > directlydependent on os-win[1]. Technically we can go through individual projects and add warning logs and release > > notesabout the deprecation. However I'm not too sure if that's worth the effort. If we can agree that we remove > > supportfor running OpenStack on Windows Operating System at a specific release, then I tend to leave the ones > > independentfrom os-win, unless it has any impact on user-facing items like config options[1]. > > ?> I'd like to hear any thoughts about this plan, as well as any target release to remove Windows "host" support > > globallyfrom OpenStack (maybe after 2024.1 release ?). > > > > Thanks for marking a few of the Windows support things as deprecated. This is the right direction for at least where > > it > > depends on os-win. I have started completing the os-win retirement and deps[1]. But we need to add a deprecation > > warning in one cycle and then remove it in a later one (like you are doing in the mentioned changes). We did the > > same in the Nova Hyper-V driver, which was marked deprecated in the 2023.1 cycles, and I am proposing it to be > > removed in the next cycle, 2024.1[2]. > you bet me too it > there are already two other nova cores (myself and one other) that also planned to do this after confirming with the > wider team at the next ptg so this is highly likely to proceed early in the 2024.1 cycle. > my goal in this regard would be to land the removal of both the hyperv and vmware driver before milestone one > and perhaps even before the ptg if there is no object to it in our irc team meeting. > > i was waiting for RC-1 to be cut and the dust to settle before brign this up to discuss but it seams at least 3 fo the > nova core team feel this is thr correct direction to take now. > > > What you just said. IIRC, we kinda agreed on the PTG to try to avoid as much as possible any deprecations during the 2023.2 Bobcat release, which is a non-SLURP release, as it would be skipped by operators fast-jumping to 2024.1, unless someone would forward-port the deprecation note to Caracal, hence putting the burden on someone's shoulder. 
> Reinstanting my personal take then, which is, as a Nova PTL, I'm not 100% happy with taking my burden.Please, please, let's wait for 4 days and nothing or nooone will then get hurt :-)? > > > > For the Windows feature other than os-win dependencies, it is up to the projects, and if they can still > > support and test those without 3rd party CI, then it is okay to keep it. This applies to any other distro-specific > > features also where they might be supported by a few projects but not all. But they should go through the > > deprecation phase warning even they are not tested so that users get the notification. > > > > [1] https://review.opendev.org/q/+topic:retire-winstackers > > [2] https://review.opendev.org/c/openstack/nova/+/894466 > > > > -gmann > > > > ?> > > ?> [1] Some > > examples?https://github.com/openstack/ceilometer/blob/d31d4ed3574a5d19fe4b09ab2c227dba64da170a/ceilometer/cmd/polling. > > py#L95-L96?https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L624-L625 > > ?> [2] event_log option in oslo.log is one good example https://review.opendev.org/c/openstack/oslo.log/+/894235 > > ?> On Sat, Jun 24, 2023 at 7:50?AM Ghanshyam Mann gmann at ghanshyammann.com> wrote: > > ?> As there is no volunteer to maintain this project, I have proposed the retirement > > ?> > > ?> - https://review.opendev.org/c/openstack/governance/+/886880 > > ?> > > ?> -gmann > > ?> > > ?> ?---- On Thu, 13 Apr 2023 07:54:12 -0700? James Page? wrote --- > > ?> ?> Hi All > > ?> ?> > > ?> ?> As announced by Lucian last November (see [0]) Cloudbase Solutions are no longer in a position to maintain > > support for running OpenStack on Windows and have also ceased operation of their 3rd party CI for the windows support > > across a number of OpenStack projects. > > ?> ?> This situation has resulted in the Winstackers project becoming PTL-less for the 2023.2 cycle with no volunteers > > responding to the TC's call to fill this role and take this feature in OpenStack forward (see [1]). > > ?> ?> This is the final call for any maintainers to step forward if this feature is important to them in OpenStack. > > ?> ?> The last user survey in 2022 indicated that 2% of respondents were running on Hyper-V so this might be important > > enough to warrant a commitment from someone operating OpenStack on Windows to maintain these features going forward. > > ?> ?> Here is a reminder from Lucian's original email on the full list of projects which are impacted in some way:?* > > nova hyper-v driver - in-tree plus out-of-tree compute-hyperv driver* os-win - common Windows library for Openstack* > > neutron hyperv ml2 plugin and agent* ovs on Windows and neutron ovs agent support* cinder drivers - SMB and Windows > > iSCSI* os-brick Windows connectors - iSCSI, FC, SMB, RBD* ceilometer Windows poller* manila Windows driver* glance > > Windows support* freerdp gateway > > ?> ?> The lack of 3rd party CI for testing all of this really needs to be addressed as well. > > ?> ?> If no maintainers are forthcoming between now and the next PTG in June the TC will need to officially retire the > > project and start the process of removing support for Windows across the various projects that support this operating > > system in some way - either directly or through the use of os-win. 
> > ?> ?> For clarity this call refers to the use of the Hyper-V virtualisation driver and associated Windows server > > components to provide WIndows based OpenStack Hypervisors and does not relate to the ability to run Windows images as > > guests on OpenStack. > > ?> ?> Regards > > ?> ?> James > > ?> ?> > > [0]?https://lists.openstack.org/pipermail/openstack-discuss/2022-November/031044.html[1]?https://lists.openstack.org/p > > ipermail/openstack-discuss/2023-March/032888.html > > ?> > > ?> > > > > > From fkr at osb-alliance.com Mon Sep 11 19:45:18 2023 From: fkr at osb-alliance.com (Felix Kronlage-Dammers) Date: Mon, 11 Sep 2023 21:45:18 +0200 Subject: [publiccloud-sig] Reminder - next meeting September 13th - 0700 UTC Message-ID: Hi everyone, this Wednesday the next meeting of the public cloud sig is going to happen. As announced a few weeks ago, we want to try out to have a videocall format every other meeting of the public cloud SIG, so this time we will meet in a video call. Part of the meeting will be a lightning talk. For the first lightning talk Artem (gtema@) will talk on his work on ?OpenAPI support for OpenStack and making a new faster CLI on Rust?. (There is a detailled mail about this work that Artem sent to this list End of August[1]). Thanks a lot Artem for giving the first lightning talk in the new format. We will meet at 0700 UTC here: https://conf.scs.koeln:8443/OIF-public-cloud-sig In parallel we will be on irc on #openstack-operators as well of course. Am very much looking forward to wednesday morning! :) felix [1] -- Felix Kronlage-Dammers Product Owner IaaS & Operations Sovereign Cloud Stack Sovereign Cloud Stack ? standardized, built and operated by many Ein Projekt der Open Source Business Alliance - Bundesverband f?r digitale Souver?nit?t e.V. Tel.: +49-30-206539-205 | Matrix: @fkronlage:matrix.org | fkr at osb-alliance.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From roger.riverac at gmail.com Mon Sep 11 20:07:40 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Mon, 11 Sep 2023 16:07:40 -0400 Subject: [openstack-ansible] Dedicated gateway hosts not working with OVN In-Reply-To: References: <20230208143243.dth426y74jyfacxh@yuggoth.org> Message-ID: Hello everyone, The openstack-ansible deployment was all good. Turns out it was a network configuration problem for the external network. Aside from the below issues which have been addressed, everything runs as expected now. -OVN deployments fail when the provider bridge is defined for gateway hosts and not for compute nodes. -Add split(',') python list to supported_provider_types on templates/horizon_local_settings.py.j2 Thank you everyone for all the help. This cleared out a lot of doubts. Best regards. On Fri, Sep 8, 2023 at 11:28?AM James Denton wrote: > Thanks, Roger. Super helpful. > > If you?re attempting to launch the VM on that network it will fail, since > that network is only plumbed to the net nodes as an 'external' network. For > the VM, you will want to create a non-provider (tenant) network, which > would likely be geneve, and then create a neutron router that connects to > the external network and the tenant network. Your VM traffic would then > traverse that router from compute->net node and out over the geneve overlay. > > Keep us posted. 
> > James > > Get Outlook for iOS > ------------------------------ > *From:* Roger Rivera > *Sent:* Friday, September 8, 2023 8:43:19 AM > *To:* James Denton > *Cc:* Satish Patel ; Dmitriy Rabotyagov < > noonedeadpunk at gmail.com>; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-ansible] Dedicated gateway hosts not working > with OVN > > CAUTION: This message originated externally, please use caution when > clicking on links or opening attachments! > > Hello James, > > I appreciate the prompt response. Please see the output for openstack > network show , pasted at > https://paste.opendev.org/show/bIhYhu6fDWoMaIiyaRMJ/ > > Thanks you > > On Fri, Sep 8, 2023 at 9:31?AM James Denton > wrote: > > Hi Roger, > > That output looks as I would expect, thank you. > > Can you please provide the output for ?openstack network show? for the > network being attached to the VM? > > Thanks, > James > > Get Outlook for iOS > ------------------------------ > *From:* Roger Rivera > *Sent:* Friday, September 8, 2023 8:11:06 AM > *To:* Satish Patel > *Cc:* James Denton ; Dmitriy Rabotyagov < > noonedeadpunk at gmail.com>; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-ansible] Dedicated gateway hosts not working > with OVN > > CAUTION: This message originated externally, please use caution when > clicking on links or opening attachments! > > Hello Satish, > > I appreciate your feedback and any help will be greatly appreciated. > Please find the requested outputs pasted here: > https://paste.opendev.org/show/bHWvGMUYW35sU43zUxem/ > > I've included outputs for one compute and one network/gateway node. > > As a recap, among other nodes, the environment includes: > > -2x compute - 1x NIC ens1 with IPv4 (geneve) - no bridges > -2x network/gateway nodes - 2x NICs - ens1 with IPv4 (geneve), ens2 as > external net interface, br-vlan connected to ens2 bridge. > > Let me know if you need further information. Much appreciated. > > Thank you. > > > > -- > *Roger Rivera* > -- *Roger Rivera* -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Mon Sep 11 20:16:02 2023 From: amy at demarco.com (Amy Marrich) Date: Mon, 11 Sep 2023 15:16:02 -0500 Subject: [Diversity] Diversity and Inclusion WG Meeting reminder Message-ID: This is a reminder that the Diversity and Inclusion WG will be meeting tomorrow at 14:00 UTC in the #openinfra-diversity channel on OFTC. We hope members of all OpenInfra projects join us as we look at the progress on responses to the Foundation-wide diversity survey. Thanks, Amy (spotz) 0 - https://etherpad.opendev.org/p/diversity-wg-agenda From jamesleong123098 at gmail.com Tue Sep 12 02:02:06 2023 From: jamesleong123098 at gmail.com (James Leong) Date: Mon, 11 Sep 2023 21:02:06 -0500 Subject: [kolla-ansible][horizon][policy] Message-ID: Hi all, I am currently having a yoga version openstack. I noticed that user from a domain are able to view other domain leases if they are having admin role. Is there any possible way to change anything in the policy file? I have tried to add rule:owner but it didn't work out the way I wanted. Any recommendations would be appreciated. Best, James -------------- next part -------------- An HTML attachment was scrubbed... URL: From ygk.kmr at gmail.com Tue Sep 12 03:55:32 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Tue, 12 Sep 2023 09:25:32 +0530 Subject: Nova conductor errors Message-ID: Hi All, We have a Yoga OSA setup. 
We have been observing the following errors in the nova conductor service intermittently. After a restart of the service, they seem to disappear. The logs are as follows: --- nova-conductor[3199409]: ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 24] Too many open files (retrying in 0 seconds): OSError: [Errno 24] Too many open files --- Is it a known issue with yoga ? Please let me know. Thanks Y.G -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Tue Sep 12 07:34:14 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Tue, 12 Sep 2023 09:34:14 +0200 Subject: Nova conductor errors In-Reply-To: References: Message-ID: Hey, We are aware of this issue happening on compute nodes, when some instances have a lot of volumes attached. For this purpose we define following variable: nova_compute_init_overrides: Service: LimitNOFILE: 4096 Same thing happening on nova-conductor makes me think of this bug related to heartbeat_in_pthread change[1][2]. IIRC, it was also able to leak file descriptors, that you can see as "Too many open files" error. Though bugfixes were backported to Yoga from what I see. But you can try adding "nova_oslomsg_heartbeat_in_pthread: False" to your user_variables and re-run os-nova-install.yml playbook. [1] https://bugs.launchpad.net/openstack-ansible/+bug/1961603 [2] https://bugs.launchpad.net/oslo.messaging/+bug/1949964 ??, 12 ????. 2023??. ? 05:58, Gk Gk : > > Hi All, > > We have a Yoga OSA setup. We have been observing the following errors in the nova conductor service intermittently. After a restart of the service, they seem to disappear. The logs are as follows: > > --- > nova-conductor[3199409]: ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 24] Too many open files (retrying in 0 seconds): OSError: [Errno 24] Too many open files > --- > > Is it a known issue with yoga ? Please let me know. > > > Thanks > Y.G From noonedeadpunk at gmail.com Tue Sep 12 07:45:48 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Tue, 12 Sep 2023 09:45:48 +0200 Subject: [openstack-ansible] Meeting on 12.09.2023 is cancelled Message-ID: Hi everyone, Due to the absence of multiple core members of the group during this week, it has been decided to cancel our regular community meeting on 12.09.2023. From smooney at redhat.com Tue Sep 12 09:09:31 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Tue, 12 Sep 2023 10:09:31 +0100 Subject: Nova conductor errors In-Reply-To: References: Message-ID: <3ccbca93cb9e0178af2c5ea054fea43800ff79e5.camel@redhat.com> On Tue, 2023-09-12 at 09:34 +0200, Dmitriy Rabotyagov wrote: > Hey, > > We are aware of this issue happening on compute nodes, when some > instances have a lot of volumes attached. For this purpose we define > following variable: > > nova_compute_init_overrides: > ? Service: > ??? LimitNOFILE: 4096 > > Same thing happening on nova-conductor makes me think of this bug > related to heartbeat_in_pthread change[1][2]. the heat beat in pthread is only intented to be used in the api and it activly breaks the nova-compute agent. i suspect its also not a good idea to have enabled for the conducotr or scheduler as i expect that to behave like the compute agent. > IIRC, it was also able > to leak file descriptors, that you can see as "Too many open files" > error. > Though bugfixes were backported to Yoga from what I see. 
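As a rough sanity check (a sketch only, assuming pgrep and procfs are available wherever the conductor process runs, e.g. inside its container), you can watch whether the descriptor count for nova-conductor keeps climbing towards its limit:

    ls /proc/$(pgrep -of nova-conductor)/fd | wc -l

If that number grows steadily while the service is otherwise idle, a descriptor leak is the likely cause of the "Too many open files" errors.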
But you can > try adding "nova_oslomsg_heartbeat_in_pthread: False" to your > user_variables and re-run os-nova-install.yml playbook. if osa is enabling the heartbeat_in_pthread on the conductor or anything other the the api and metadata its a bug in osa > > [1] https://bugs.launchpad.net/openstack-ansible/+bug/1961603 > [2] https://bugs.launchpad.net/oslo.messaging/+bug/1949964 > > > ??, 12 ????. 2023??. ? 05:58, Gk Gk : > > > > Hi All, > > > > We have a Yoga OSA setup. We have been observing the following errors in the nova conductor service intermittently. > > After a restart of the service, they seem to disappear. The logs are as follows: > > > > --- > > nova-conductor[3199409]:? ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 24] Too many open > > files (retrying in 0 seconds): OSError: [Errno 24] Too many open files > > --- > > > > Is it a known issue with yoga ? Please let me know. > > > > > > Thanks > > Y.G > From ygk.kmr at gmail.com Tue Sep 12 09:29:26 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Tue, 12 Sep 2023 14:59:26 +0530 Subject: Nova conductor errors In-Reply-To: <3ccbca93cb9e0178af2c5ea054fea43800ff79e5.camel@redhat.com> References: <3ccbca93cb9e0178af2c5ea054fea43800ff79e5.camel@redhat.com> Message-ID: So, for the moment, how to proceed as a workaround ? On Tue, Sep 12, 2023 at 2:39?PM wrote: > On Tue, 2023-09-12 at 09:34 +0200, Dmitriy Rabotyagov wrote: > > Hey, > > > > We are aware of this issue happening on compute nodes, when some > > instances have a lot of volumes attached. For this purpose we define > > following variable: > > > > nova_compute_init_overrides: > > Service: > > LimitNOFILE: 4096 > > > > Same thing happening on nova-conductor makes me think of this bug > > related to heartbeat_in_pthread change[1][2]. > the heat beat in pthread is only intented to be used in the api and it > activly > breaks the nova-compute agent. i suspect its also not a good idea to have > enabled > for the conducotr or scheduler as i expect that to behave like the compute > agent. > > IIRC, it was also able > > to leak file descriptors, that you can see as "Too many open files" > > error. > > Though bugfixes were backported to Yoga from what I see. But you can > > try adding "nova_oslomsg_heartbeat_in_pthread: False" to your > > user_variables and re-run os-nova-install.yml playbook. > if osa is enabling the heartbeat_in_pthread on the conductor or anything > other the the api and metadata its a bug in osa > > > > [1] https://bugs.launchpad.net/openstack-ansible/+bug/1961603 > > [2] https://bugs.launchpad.net/oslo.messaging/+bug/1949964 > > > > > > ??, 12 ????. 2023??. ? 05:58, Gk Gk : > > > > > > Hi All, > > > > > > We have a Yoga OSA setup. We have been observing the following errors > in the nova conductor service intermittently. > > > After a restart of the service, they seem to disappear. The logs are > as follows: > > > > > > --- > > > nova-conductor[3199409]: ERROR oslo.messaging._drivers.impl_rabbit > [-] Connection failed: [Errno 24] Too many open > > > files (retrying in 0 seconds): OSError: [Errno 24] Too many open files > > > --- > > > > > > Is it a known issue with yoga ? Please let me know. > > > > > > > > > Thanks > > > Y.G > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
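For reference, a minimal sketch of how both settings could look in the deployment's user_variables.yml (typically /etc/openstack_deploy/user_variables.yml; the variable names below are the ones used by the openstack-ansible nova role, and the values are examples only):

    nova_oslomsg_heartbeat_in_pthread: False
    nova_compute_init_overrides:
      Service:
        LimitNOFILE: 4096

followed by re-running the os-nova-install.yml playbook so that the rendered config and systemd units are regenerated.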
URL: From noonedeadpunk at gmail.com Tue Sep 12 09:50:32 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Tue, 12 Sep 2023 11:50:32 +0200 Subject: Nova conductor errors In-Reply-To: References: <3ccbca93cb9e0178af2c5ea054fea43800ff79e5.camel@redhat.com> Message-ID: As I said, add "nova_oslomsg_heartbeat_in_pthread: False" to user_variables and re-run os-nova-install.yml On Tue, Sep 12, 2023, 11:29 Gk Gk wrote: > So, for the moment, how to proceed as a workaround ? > > On Tue, Sep 12, 2023 at 2:39?PM wrote: > >> On Tue, 2023-09-12 at 09:34 +0200, Dmitriy Rabotyagov wrote: >> > Hey, >> > >> > We are aware of this issue happening on compute nodes, when some >> > instances have a lot of volumes attached. For this purpose we define >> > following variable: >> > >> > nova_compute_init_overrides: >> > Service: >> > LimitNOFILE: 4096 >> > >> > Same thing happening on nova-conductor makes me think of this bug >> > related to heartbeat_in_pthread change[1][2]. >> the heat beat in pthread is only intented to be used in the api and it >> activly >> breaks the nova-compute agent. i suspect its also not a good idea to have >> enabled >> for the conducotr or scheduler as i expect that to behave like the >> compute agent. >> > IIRC, it was also able >> > to leak file descriptors, that you can see as "Too many open files" >> > error. >> > Though bugfixes were backported to Yoga from what I see. But you can >> > try adding "nova_oslomsg_heartbeat_in_pthread: False" to your >> > user_variables and re-run os-nova-install.yml playbook. >> if osa is enabling the heartbeat_in_pthread on the conductor or anything >> other the the api and metadata its a bug in osa >> > >> > [1] https://bugs.launchpad.net/openstack-ansible/+bug/1961603 >> > [2] https://bugs.launchpad.net/oslo.messaging/+bug/1949964 >> > >> > >> > ??, 12 ????. 2023??. ? 05:58, Gk Gk : >> > > >> > > Hi All, >> > > >> > > We have a Yoga OSA setup. We have been observing the following errors >> in the nova conductor service intermittently. >> > > After a restart of the service, they seem to disappear. The logs are >> as follows: >> > > >> > > --- >> > > nova-conductor[3199409]: ERROR oslo.messaging._drivers.impl_rabbit >> [-] Connection failed: [Errno 24] Too many open >> > > files (retrying in 0 seconds): OSError: [Errno 24] Too many open files >> > > --- >> > > >> > > Is it a known issue with yoga ? Please let me know. >> > > >> > > >> > > Thanks >> > > Y.G >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bartosz at stackhpc.com Tue Sep 12 10:13:37 2023 From: bartosz at stackhpc.com (Bartosz Bezak) Date: Tue, 12 Sep 2023 12:13:37 +0200 Subject: [Kolla] Transition Xena to EOL Message-ID: <5DF3C268-06C2-40B3-8BC9-850844C1F0AE@stackhpc.com> Hello, As aggreed on IRC meeting, Kolla Xena deliverables (Kolla, Kolla-Ansible, Kayobe, Ansible-Collection-Kolla) are going EOL. [1] Yoga, Zed and Antelope are our current stable releases. Xena was in Extended Maintenance for a some time, Kolla community does not have resources to support it actively. All changes for Kolla project deliverables in Xena branches have been abandoned. [1] https://review.opendev.org/c/openstack/releases/+/894630 Best regards, Bartosz Bezak -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knikolla at bu.edu Tue Sep 12 11:08:09 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Tue, 12 Sep 2023 11:08:09 +0000 Subject: [tc] Technical Committee next weekly meeting Today, September 12, 2023 Message-ID: <4A316283-E216-420E-BD4B-ECDF65E51761@bu.edu> Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held today on Tuesday, September 12 2023 at 1800 UTC on #openstack-tc on OFTC IRC. Please find the agenda below: ? Roll call ? Follow up on past action items ? No action items ? Gate health check ? OpenStack TC Charter Updates for OIF Foundation Simplification ? As discussed in the previous TC+Board syncup, the OpenInfra Foundation bylaws contain a lot of OpenStack specific language. ? Kristi, Gmann, and Brian have been meeting with Allison from the board to study the necessary changes to move that language to the charter. ? Bylaws changes and what all to go in the TC charter have been shared in an email. TC needs to provide the feedback before Sept 19. ? OpenStack Elections ? #link https://governance.openstack.org/election/ ? #link https://review.opendev.org/c/openstack/election/+/893810 ? Feedback from election officials regarding template for extra ACs. ? Open Discussion and Reviews ? Register for the PTG ? #link https://openinfra.dev/ptg/ ? #link https://review.opendev.org/q/projects:openstack/governance+is:open Thank you, Kristi Nikolla From ygk.kmr at gmail.com Tue Sep 12 12:00:19 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Tue, 12 Sep 2023 17:30:19 +0530 Subject: Nova conductor errors In-Reply-To: References: <3ccbca93cb9e0178af2c5ea054fea43800ff79e5.camel@redhat.com> Message-ID: Does this statement "nova_oslomsg_heartbeat_in_pthread: False" go into nova.conf of the nova-api container ? On Tue, Sep 12, 2023 at 3:20?PM Dmitriy Rabotyagov wrote: > As I said, add "nova_oslomsg_heartbeat_in_pthread: False" to > user_variables and re-run os-nova-install.yml > > On Tue, Sep 12, 2023, 11:29 Gk Gk wrote: > >> So, for the moment, how to proceed as a workaround ? >> >> On Tue, Sep 12, 2023 at 2:39?PM wrote: >> >>> On Tue, 2023-09-12 at 09:34 +0200, Dmitriy Rabotyagov wrote: >>> > Hey, >>> > >>> > We are aware of this issue happening on compute nodes, when some >>> > instances have a lot of volumes attached. For this purpose we define >>> > following variable: >>> > >>> > nova_compute_init_overrides: >>> > Service: >>> > LimitNOFILE: 4096 >>> > >>> > Same thing happening on nova-conductor makes me think of this bug >>> > related to heartbeat_in_pthread change[1][2]. >>> the heat beat in pthread is only intented to be used in the api and it >>> activly >>> breaks the nova-compute agent. i suspect its also not a good idea to >>> have enabled >>> for the conducotr or scheduler as i expect that to behave like the >>> compute agent. >>> > IIRC, it was also able >>> > to leak file descriptors, that you can see as "Too many open files" >>> > error. >>> > Though bugfixes were backported to Yoga from what I see. But you can >>> > try adding "nova_oslomsg_heartbeat_in_pthread: False" to your >>> > user_variables and re-run os-nova-install.yml playbook. >>> if osa is enabling the heartbeat_in_pthread on the conductor or anything >>> other the the api and metadata its a bug in osa >>> > >>> > [1] https://bugs.launchpad.net/openstack-ansible/+bug/1961603 >>> > [2] https://bugs.launchpad.net/oslo.messaging/+bug/1949964 >>> > >>> > >>> > ??, 12 ????. 2023??. ? 05:58, Gk Gk : >>> > > >>> > > Hi All, >>> > > >>> > > We have a Yoga OSA setup. 
We have been observing the following >>> errors in the nova conductor service intermittently. >>> > > After a restart of the service, they seem to disappear. The logs are >>> as follows: >>> > > >>> > > --- >>> > > nova-conductor[3199409]: ERROR oslo.messaging._drivers.impl_rabbit >>> [-] Connection failed: [Errno 24] Too many open >>> > > files (retrying in 0 seconds): OSError: [Errno 24] Too many open >>> files >>> > > --- >>> > > >>> > > Is it a known issue with yoga ? Please let me know. >>> > > >>> > > >>> > > Thanks >>> > > Y.G >>> > >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon at csail.mit.edu Tue Sep 12 12:18:43 2023 From: jon at csail.mit.edu (Jonathan Proulx) Date: Tue, 12 Sep 2023 08:18:43 -0400 Subject: [kolla-ansible] haproxy tls key location Message-ID: <20230912121843.fp7q6g6iun2dqswp@csail.mit.edu> Hi All, Reading through https://docs.openstack.org/kolla-ansible/latest/admin/tls.html and global.yml / passwords.yml in my deploy I see configuration for certificates but not where to set the key (though there is a key location configuration for backend tls in globals.yml). Unsurprisingly when I put the certs where they are expected and enable TLS the haproxy containers fail because they don't have a key. What am I missing here? Thanks, -Jon -- Jonathan Proulx (he/him) Sr. Technical Architect The Infrastructure Group MIT CSAIL From fungi at yuggoth.org Tue Sep 12 12:37:53 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 12 Sep 2023 12:37:53 +0000 Subject: [kolla-ansible][horizon][policy][security-sig] Domain admins In-Reply-To: References: Message-ID: <20230912123753.v4hfcet2yi54lvjm@yuggoth.org> On 2023-09-11 21:02:06 -0500 (-0500), James Leong wrote: > I am currently having a yoga version openstack. I noticed that > user from a domain are able to view other domain leases if they > are having admin role. Is there any possible way to change > anything in the policy file? I have tried to add rule:owner but it > didn't work out the way I wanted. Any recommendations would be > appreciated. What specifically were you trying to accomplish by granting admin access to a domain user? While Keystone (the identity management service) does have a concept of domain and project administrators separate from system administrators, not all services in OpenStack have implemented consistent support for this differentiation. There is a community-wide goal[*] in progress to bring more consistency to the RBAC implementation across services, but until that is completed there are services where, for historical reasons, the "admin" role means full service administrator access even if it's associated with a project[**]. We could probably do a better job of putting up warnings about this in obvious, discoverable locations since even I had a hard time just now tracking down any clear statement about the present state of these risks. [*] https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html [**] https://launchpad.net/bugs/1933269 -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
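For services that have already adopted the new policy defaults, the switch is generally made through oslo.policy options in each service's configuration file. A sketch of what that looks like is below; as explained above, support is not yet uniform across services, so check each project's release notes before enabling these options:

[oslo_policy]
# Both options are provided by oslo.policy; enabling them turns on the new
# role- and scope-aware default policies where the service supports them.
enforce_new_defaults = True
enforce_scope = True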
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From emccormick at cirrusseven.com Tue Sep 12 13:47:14 2023 From: emccormick at cirrusseven.com (Erik McCormick) Date: Tue, 12 Sep 2023 09:47:14 -0400 Subject: [kolla-ansible] haproxy tls key location In-Reply-To: <20230912121843.fp7q6g6iun2dqswp@csail.mit.edu> References: <20230912121843.fp7q6g6iun2dqswp@csail.mit.edu> Message-ID: On Tue, Sep 12, 2023 at 8:20?AM Jonathan Proulx wrote: > Hi All, > > Reading through > https://docs.openstack.org/kolla-ansible/latest/admin/tls.html and > global.yml / passwords.yml in my deploy I see configuration for > certificates but not where to set the key (though there is a key location > configuration for backend tls in globals.yml). > > Unsurprisingly when I put the certs where they are expected and enable > TLS the haproxy containers fail because they don't have a key. > > What am I missing here? > > HAProxy likes to put everything in one file. concatenate your key onto the end of your certificate chain. -Erik > > Thanks, > -Jon > > -- > Jonathan Proulx (he/him) > Sr. Technical Architect > The Infrastructure Group > MIT CSAIL > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at gmail.com Tue Sep 12 13:49:22 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Tue, 12 Sep 2023 15:49:22 +0200 Subject: Nova conductor errors In-Reply-To: References: <3ccbca93cb9e0178af2c5ea054fea43800ff79e5.camel@redhat.com> Message-ID: It goes to /etc/openstack_deploy/user_variables.yml On Tue, Sep 12, 2023, 14:00 Gk Gk wrote: > Does this statement "nova_oslomsg_heartbeat_in_pthread: False" go into > nova.conf of the nova-api container ? > > On Tue, Sep 12, 2023 at 3:20?PM Dmitriy Rabotyagov < > noonedeadpunk at gmail.com> wrote: > >> As I said, add "nova_oslomsg_heartbeat_in_pthread: False" to >> user_variables and re-run os-nova-install.yml >> >> On Tue, Sep 12, 2023, 11:29 Gk Gk wrote: >> >>> So, for the moment, how to proceed as a workaround ? >>> >>> On Tue, Sep 12, 2023 at 2:39?PM wrote: >>> >>>> On Tue, 2023-09-12 at 09:34 +0200, Dmitriy Rabotyagov wrote: >>>> > Hey, >>>> > >>>> > We are aware of this issue happening on compute nodes, when some >>>> > instances have a lot of volumes attached. For this purpose we define >>>> > following variable: >>>> > >>>> > nova_compute_init_overrides: >>>> > Service: >>>> > LimitNOFILE: 4096 >>>> > >>>> > Same thing happening on nova-conductor makes me think of this bug >>>> > related to heartbeat_in_pthread change[1][2]. >>>> the heat beat in pthread is only intented to be used in the api and it >>>> activly >>>> breaks the nova-compute agent. i suspect its also not a good idea to >>>> have enabled >>>> for the conducotr or scheduler as i expect that to behave like the >>>> compute agent. >>>> > IIRC, it was also able >>>> > to leak file descriptors, that you can see as "Too many open files" >>>> > error. >>>> > Though bugfixes were backported to Yoga from what I see. But you can >>>> > try adding "nova_oslomsg_heartbeat_in_pthread: False" to your >>>> > user_variables and re-run os-nova-install.yml playbook. >>>> if osa is enabling the heartbeat_in_pthread on the conductor or anything >>>> other the the api and metadata its a bug in osa >>>> > >>>> > [1] https://bugs.launchpad.net/openstack-ansible/+bug/1961603 >>>> > [2] https://bugs.launchpad.net/oslo.messaging/+bug/1949964 >>>> > >>>> > >>>> > ??, 12 ????. 2023??. ? 
05:58, Gk Gk : >>>> > > >>>> > > Hi All, >>>> > > >>>> > > We have a Yoga OSA setup. We have been observing the following >>>> errors in the nova conductor service intermittently. >>>> > > After a restart of the service, they seem to disappear. The logs >>>> are as follows: >>>> > > >>>> > > --- >>>> > > nova-conductor[3199409]: ERROR oslo.messaging._drivers.impl_rabbit >>>> [-] Connection failed: [Errno 24] Too many open >>>> > > files (retrying in 0 seconds): OSError: [Errno 24] Too many open >>>> files >>>> > > --- >>>> > > >>>> > > Is it a known issue with yoga ? Please let me know. >>>> > > >>>> > > >>>> > > Thanks >>>> > > Y.G >>>> > >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From Danny.Webb at thehutgroup.com Tue Sep 12 13:52:30 2023 From: Danny.Webb at thehutgroup.com (Danny Webb) Date: Tue, 12 Sep 2023 13:52:30 +0000 Subject: [kolla-ansible] haproxy tls key location In-Reply-To: <20230912121843.fp7q6g6iun2dqswp@csail.mit.edu> References: <20230912121843.fp7q6g6iun2dqswp@csail.mit.edu> Message-ID: You create a single file with both certificate and key in it. ________________________________ From: Jonathan Proulx Sent: 12 September 2023 13:18 To: openstack-discuss Subject: [kolla-ansible] haproxy tls key location CAUTION: This email originates from outside THG Hi All, Reading through https://docs.openstack.org/kolla-ansible/latest/admin/tls.html and global.yml / passwords.yml in my deploy I see configuration for certificates but not where to set the key (though there is a key location configuration for backend tls in globals.yml). Unsurprisingly when I put the certs where they are expected and enable TLS the haproxy containers fail because they don't have a key. What am I missing here? Thanks, -Jon -- Jonathan Proulx (he/him) Sr. Technical Architect The Infrastructure Group MIT CSAIL -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon at csail.mit.edu Tue Sep 12 13:58:06 2023 From: jon at csail.mit.edu (Jonathan Proulx) Date: Tue, 12 Sep 2023 09:58:06 -0400 Subject: [kolla-ansible] haproxy tls key location In-Reply-To: References: <20230912121843.fp7q6g6iun2dqswp@csail.mit.edu> Message-ID: <20230912135806.dxfpkf5hign6ocl4@csail.mit.edu> On Tue, Sep 12, 2023 at 09:47:14AM -0400, Erik McCormick wrote: :HAProxy likes to put everything in one file. concatenate your key onto the :end of your certificate chain. Ah yes the many flavors of "cert"...thanks again Erik. -Jon From nguyenhuukhoinw at gmail.com Tue Sep 12 12:19:00 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Tue, 12 Sep 2023 19:19:00 +0700 Subject: [openstack][skyline] Some works with skyline Message-ID: Hello guys. I am working on Skyline. I can integrate my monitor tab on it. I will try to do something about it. If it works properly, I can contribute. It is a very nice project. I hope it will grow and grow in the future. Thanks Skyline Team. [image: image.png] Nguyen Huu Khoi -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 248643 bytes Desc: not available URL: From satish.txt at gmail.com Tue Sep 12 23:52:54 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 12 Sep 2023 19:52:54 -0400 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: Message-ID: Damn! How did you interstate monitoring? I would like to give it a try. 
Agreed, skyline is really good addition to openstack cloud. Sent from my iPhone > On Sep 12, 2023, at 11:38 AM, Nguy?n H?u Kh?i wrote: > > ? > Hello guys. > I am working on Skyline. > > I can integrate my monitor tab on it. > > I will try to do something about it. If it works properly, I can contribute. > > It is a very nice project. I hope it will grow and grow in the future. > > Thanks Skyline Team. > > > > > > Nguyen Huu Khoi From nguyenhuukhoinw at gmail.com Wed Sep 13 00:08:19 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Wed, 13 Sep 2023 07:08:19 +0700 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: Message-ID: Nguyen Huu Khoi Hello. I just read skyline source code and followed it about data source. This is https://github.com/zhangjianweibj/prometheus-libvirt-exporter. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Wed Sep 13 07:21:18 2023 From: zigo at debian.org (Thomas Goirand) Date: Wed, 13 Sep 2023 09:21:18 +0200 Subject: [all] SQLAlchemy 2.x support in Bobcat Message-ID: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Hi, As you may know, and to my great frustration, I'm not the maintainer of SQLAlchemy in Debian, even though OpenStack is the biggest consumer of it. The current maintainer insists that he wants to upload SQLA 2.x in Unstable, potentially breaking all of OpenStack. At the present moment, if I understand correctly, we're not there yet, and Bobcat doesn't have such a support. It would be ok for me, *IF* there are patches available on master, that I could backport to Bobcat and maintain in the debian/patches folder of each project. However, the biggest current annoyance, is that I have no idea where we are at. Are we close to such a support? Is there a list of patches to apply on top of Bobcat that is maintained somewhere? Please enlighten me... :) Cheers, Thomas Goirand (zigo) From katonalala at gmail.com Wed Sep 13 11:57:49 2023 From: katonalala at gmail.com (Lajos Katona) Date: Wed, 13 Sep 2023 13:57:49 +0200 Subject: [all] SQLAlchemy 2.x support in Bobcat In-Reply-To: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Message-ID: Hi, Neutron projects are tested in Bobcat with sqlalchemy 2 (actually with the main branch), so I would not expect big issues (?) (bagpipe was the last project with serious failures and it is green in the last week) My understanding was that Openstackwise we bump sqlalchemy with starting Caracal, see: https://review.opendev.org/c/openstack/requirements/+/879743 BUT: based on some discussions ie: https://meetings.opendev.org/irclogs/%23openstack-nova/%23openstack-nova.2023-09-12.log.html#t2023-09-12T16:37:08 I have doubts that it will happen. Lajos Thomas Goirand ezt ?rta (id?pont: 2023. szept. 13., Sze, 10:22): > Hi, > > As you may know, and to my great frustration, I'm not the maintainer of > SQLAlchemy in Debian, even though OpenStack is the biggest consumer of > it. The current maintainer insists that he wants to upload SQLA 2.x in > Unstable, potentially breaking all of OpenStack. > > At the present moment, if I understand correctly, we're not there yet, > and Bobcat doesn't have such a support. It would be ok for me, *IF* > there are patches available on master, that I could backport to Bobcat > and maintain in the debian/patches folder of each project. However, the > biggest current annoyance, is that I have no idea where we are at. 
Are > we close to such a support? Is there a list of patches to apply on top > of Bobcat that is maintained somewhere? > > Please enlighten me... :) > > Cheers, > > Thomas Goirand (zigo) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephenfin at redhat.com Wed Sep 13 13:39:47 2023 From: stephenfin at redhat.com (Stephen Finucane) Date: Wed, 13 Sep 2023 14:39:47 +0100 Subject: [all] SQLAlchemy 2.x support in Bobcat In-Reply-To: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Message-ID: On Wed, 2023-09-13 at 09:21 +0200, Thomas Goirand wrote: > Hi, > > As you may know, and to my great frustration, I'm not the maintainer of > SQLAlchemy in Debian, even though OpenStack is the biggest consumer of > it. The current maintainer insists that he wants to upload SQLA 2.x in > Unstable, potentially breaking all of OpenStack. > > At the present moment, if I understand correctly, we're not there yet, > and Bobcat doesn't have such a support. It would be ok for me, *IF* > there are patches available on master, that I could backport to Bobcat > and maintain in the debian/patches folder of each project. However, the > biggest current annoyance, is that I have no idea where we are at. Are > we close to such a support? Is there a list of patches to apply on top > of Bobcat that is maintained somewhere? > > Please enlighten me... :) I think you figured this out on IRC this morning, but the vast majority (though not all) of the patches are available at the sqlalchemy-20 topic in Gerrit [1]. I've been working on this for almost 2 years now and have most of the core projects well on their way but not everything is complete, as you'll tell from that list. I have a canary patch [2] that I've been using to spot missing services. I plan to pick up the Manila work again early in C, but could do with help removing the use of autocommit in Heat and the weird test failures I'm seeing in Cinder [3]. We also need reviews of the Masakri series (Is that project dead? I can't tell). Once those are addressed, I _think_ we might be done but who knows what else we'll find... Cheers, Stephen [1] https://review.opendev.org/q/topic:sqlalchemy-20+is:open [2] https://review.opendev.org/c/openstack/requirements/+/879743 [3] https://review.opendev.org/q/topic:sqlalchemy-20+is:open+project:openstack/cinder > > Cheers, > > Thomas Goirand (zigo) > From tobias.urdin at binero.com Wed Sep 13 14:11:59 2023 From: tobias.urdin at binero.com (Tobias Urdin) Date: Wed, 13 Sep 2023 14:11:59 +0000 Subject: [all] SQLAlchemy 2.x support in Bobcat In-Reply-To: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Message-ID: <16B5B611-5558-40C0-98CB-389E4E81C424@binero.com> Hello, I did Gnocchi recently and it?s available in the new 4.6.0 release. I don?t envy you if you have to backport every single patch, those are hundred of patches layered on-top of each other and then probably some standalone fixes here and there across the entire ecosystem. I hope you will fight the maintainer because that doesn?t sound fun :( Best regards Tobias > On 13 Sep 2023, at 09:21, Thomas Goirand wrote: > > Hi, > > As you may know, and to my great frustration, I'm not the maintainer of SQLAlchemy in Debian, even though OpenStack is the biggest consumer of it. The current maintainer insists that he wants to upload SQLA 2.x in Unstable, potentially breaking all of OpenStack. 
> > At the present moment, if I understand correctly, we're not there yet, and Bobcat doesn't have such a support. It would be ok for me, *IF* there are patches available on master, that I could backport to Bobcat and maintain in the debian/patches folder of each project. However, the biggest current annoyance, is that I have no idea where we are at. Are we close to such a support? Is there a list of patches to apply on top of Bobcat that is maintained somewhere? > > Please enlighten me... :) > > Cheers, > > Thomas Goirand (zigo) > From jobernar at redhat.com Wed Sep 13 14:15:07 2023 From: jobernar at redhat.com (Jon Bernard) Date: Wed, 13 Sep 2023 14:15:07 +0000 Subject: Cinder Bug Report 2023-09-13 Message-ID: Hello Argonauts, Cinder Bug Meeting Etherpad Undecided - Fail to migrate, resize, evacuate VM with volume state backing-up - Status: New - Dell PowerMax Live Migration Fails Without a Pool Name - Status: New - Cinder Backups Fail with cinder.exception.ServiceNotFound with multiple availibility zones - Status: New - Dell PowerFlex (Scaleio) connector doesn't handle volume disconnection/unmapping properly - Status: New Thanks, -- Jon From nguyenhuukhoinw at gmail.com Wed Sep 13 16:09:14 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Wed, 13 Sep 2023 23:09:14 +0700 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: <7E26F394-AA89-4A3A-98F5-ED84951E50F3@gmail.com> Message-ID: Hello. The case is here, https://github.com/openstack/skyline-console/blob/master/src/pages/compute/containers/Instance/Detail/index.jsx but you need to understand how skyline construct page. This project uses nodejs, I use the chart.js library to code. it will need some knowledge about api and javascript. Let's read first. Nguyen Huu Khoi On Wed, Sep 13, 2023 at 7:29?PM Satish Patel wrote: > Sure that is ok. Can you give me a small hit about what you actually > did? I can see you created a new tab 'monitor' but behind that did you > create a new css theme etc to pull metrics from prometheus? > > All I want to do is just to see if it's worth it because we don't have > prometheus but we have different data sources which I can integrate. Thanks > for the work. very appreciated. > > On Wed, Sep 13, 2023 at 7:27?AM Nguy?n H?u Kh?i > wrote: > >> Hello. Bc i am just working. And I am not dev so I still learn. If it is >> ok then i will do. >> >> On Wed, Sep 13, 2023, 6:17 PM Satish Patel wrote: >> >>> That is great!! >>> >>> I?m not a developer but will try to read the code but it would be good >>> if you just give me file name or clue about where to look. >>> >>> If you have patch or diff would be great. Why don?t you contribute this >>> feature and submit a patch >>> >>> Sent from my iPhone >>> >>> On Sep 12, 2023, at 8:08 PM, Nguy?n H?u Kh?i >>> wrote: >>> >>> ? >>> >>> Nguyen Huu Khoi >>> Hello. I just read skyline source code and followed it about data >>> source. This is >>> https://github.com/zhangjianweibj/prometheus-libvirt-exporter. >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed Sep 13 16:21:28 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 13 Sep 2023 12:21:28 -0400 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: <7E26F394-AA89-4A3A-98F5-ED84951E50F3@gmail.com> Message-ID: Thank you! Then what is the deal here? This thread says just add prometheus config and magic will happen. 
https://bugs.launchpad.net/skyline-apiserver/+bug/2002902 On Wed, Sep 13, 2023 at 12:09?PM Nguy?n H?u Kh?i wrote: > Hello. > The case is here, > > https://github.com/openstack/skyline-console/blob/master/src/pages/compute/containers/Instance/Detail/index.jsx > > but you need to understand how skyline construct page. This project uses > nodejs, I use the chart.js library to code. it will need some knowledge > about api and javascript. > > Let's read first. > > > Nguyen Huu Khoi > > > On Wed, Sep 13, 2023 at 7:29?PM Satish Patel wrote: > >> Sure that is ok. Can you give me a small hit about what you actually >> did? I can see you created a new tab 'monitor' but behind that did you >> create a new css theme etc to pull metrics from prometheus? >> >> All I want to do is just to see if it's worth it because we don't have >> prometheus but we have different data sources which I can integrate. Thanks >> for the work. very appreciated. >> >> On Wed, Sep 13, 2023 at 7:27?AM Nguy?n H?u Kh?i < >> nguyenhuukhoinw at gmail.com> wrote: >> >>> Hello. Bc i am just working. And I am not dev so I still learn. If it is >>> ok then i will do. >>> >>> On Wed, Sep 13, 2023, 6:17 PM Satish Patel wrote: >>> >>>> That is great!! >>>> >>>> I?m not a developer but will try to read the code but it would be good >>>> if you just give me file name or clue about where to look. >>>> >>>> If you have patch or diff would be great. Why don?t you contribute this >>>> feature and submit a patch >>>> >>>> Sent from my iPhone >>>> >>>> On Sep 12, 2023, at 8:08 PM, Nguy?n H?u Kh?i >>>> wrote: >>>> >>>> ? >>>> >>>> Nguyen Huu Khoi >>>> Hello. I just read skyline source code and followed it about data >>>> source. This is >>>> https://github.com/zhangjianweibj/prometheus-libvirt-exporter. >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nguyenhuukhoinw at gmail.com Wed Sep 13 17:00:20 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Thu, 14 Sep 2023 00:00:20 +0700 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: <7E26F394-AA89-4A3A-98F5-ED84951E50F3@gmail.com> Message-ID: Hello. It talks about node monitor. Which I am doing about instance monitoring. You can check on Administrator Page >>> Monitor Center. Nguyen Huu Khoi On Wed, Sep 13, 2023 at 11:21?PM Satish Patel wrote: > Thank you! > > Then what is the deal here? This thread says just add prometheus config > and magic will happen. > > https://bugs.launchpad.net/skyline-apiserver/+bug/2002902 > > On Wed, Sep 13, 2023 at 12:09?PM Nguy?n H?u Kh?i < > nguyenhuukhoinw at gmail.com> wrote: > >> Hello. >> The case is here, >> >> https://github.com/openstack/skyline-console/blob/master/src/pages/compute/containers/Instance/Detail/index.jsx >> >> but you need to understand how skyline construct page. This project uses >> nodejs, I use the chart.js library to code. it will need some knowledge >> about api and javascript. >> >> Let's read first. >> >> >> Nguyen Huu Khoi >> >> >> On Wed, Sep 13, 2023 at 7:29?PM Satish Patel >> wrote: >> >>> Sure that is ok. Can you give me a small hit about what you actually >>> did? I can see you created a new tab 'monitor' but behind that did you >>> create a new css theme etc to pull metrics from prometheus? >>> >>> All I want to do is just to see if it's worth it because we don't have >>> prometheus but we have different data sources which I can integrate. Thanks >>> for the work. very appreciated. 
>>> >>> On Wed, Sep 13, 2023 at 7:27?AM Nguy?n H?u Kh?i < >>> nguyenhuukhoinw at gmail.com> wrote: >>> >>>> Hello. Bc i am just working. And I am not dev so I still learn. If it >>>> is ok then i will do. >>>> >>>> On Wed, Sep 13, 2023, 6:17 PM Satish Patel >>>> wrote: >>>> >>>>> That is great!! >>>>> >>>>> I?m not a developer but will try to read the code but it would be good >>>>> if you just give me file name or clue about where to look. >>>>> >>>>> If you have patch or diff would be great. Why don?t you contribute >>>>> this feature and submit a patch >>>>> >>>>> Sent from my iPhone >>>>> >>>>> On Sep 12, 2023, at 8:08 PM, Nguy?n H?u Kh?i < >>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>> >>>>> ? >>>>> >>>>> Nguyen Huu Khoi >>>>> Hello. I just read skyline source code and followed it about data >>>>> source. This is >>>>> https://github.com/zhangjianweibj/prometheus-libvirt-exporter. >>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed Sep 13 18:22:47 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 13 Sep 2023 14:22:47 -0400 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: <7E26F394-AA89-4A3A-98F5-ED84951E50F3@gmail.com> Message-ID: Hi, Assuming instance monitoring also calls prometheus api or query to get specific matrices for instance. On Wed, Sep 13, 2023 at 1:00?PM Nguy?n H?u Kh?i wrote: > Hello. > It talks about node monitor. Which I am doing about instance monitoring. > You can check on Administrator Page >>> Monitor Center. > Nguyen Huu Khoi > > > On Wed, Sep 13, 2023 at 11:21?PM Satish Patel > wrote: > >> Thank you! >> >> Then what is the deal here? This thread says just add prometheus config >> and magic will happen. >> >> https://bugs.launchpad.net/skyline-apiserver/+bug/2002902 >> >> On Wed, Sep 13, 2023 at 12:09?PM Nguy?n H?u Kh?i < >> nguyenhuukhoinw at gmail.com> wrote: >> >>> Hello. >>> The case is here, >>> >>> https://github.com/openstack/skyline-console/blob/master/src/pages/compute/containers/Instance/Detail/index.jsx >>> >>> but you need to understand how skyline construct page. This project uses >>> nodejs, I use the chart.js library to code. it will need some knowledge >>> about api and javascript. >>> >>> Let's read first. >>> >>> >>> Nguyen Huu Khoi >>> >>> >>> On Wed, Sep 13, 2023 at 7:29?PM Satish Patel >>> wrote: >>> >>>> Sure that is ok. Can you give me a small hit about what you actually >>>> did? I can see you created a new tab 'monitor' but behind that did you >>>> create a new css theme etc to pull metrics from prometheus? >>>> >>>> All I want to do is just to see if it's worth it because we don't have >>>> prometheus but we have different data sources which I can integrate. Thanks >>>> for the work. very appreciated. >>>> >>>> On Wed, Sep 13, 2023 at 7:27?AM Nguy?n H?u Kh?i < >>>> nguyenhuukhoinw at gmail.com> wrote: >>>> >>>>> Hello. Bc i am just working. And I am not dev so I still learn. If it >>>>> is ok then i will do. >>>>> >>>>> On Wed, Sep 13, 2023, 6:17 PM Satish Patel >>>>> wrote: >>>>> >>>>>> That is great!! >>>>>> >>>>>> I?m not a developer but will try to read the code but it would be >>>>>> good if you just give me file name or clue about where to look. >>>>>> >>>>>> If you have patch or diff would be great. Why don?t you contribute >>>>>> this feature and submit a patch >>>>>> >>>>>> Sent from my iPhone >>>>>> >>>>>> On Sep 12, 2023, at 8:08 PM, Nguy?n H?u Kh?i < >>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>> >>>>>> ? 
>>>>>> >>>>>> Nguyen Huu Khoi >>>>>> Hello. I just read skyline source code and followed it about data >>>>>> source. This is >>>>>> https://github.com/zhangjianweibj/prometheus-libvirt-exporter. >>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed Sep 13 19:08:07 2023 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 13 Sep 2023 15:08:07 -0400 Subject: [kolla-ansible][ovn] availability zone for OVN In-Reply-To: References: Message-ID: Hi Sean, Thanks for your input. I have an open bug report here to skyline. It would be good if you add your point there for more visibility https://bugs.launchpad.net/skyline-apiserver/+bug/2035012 On Mon, Sep 11, 2023 at 4:53?AM wrote: > On Sun, 2023-09-10 at 14:51 +0100, Lucas Alvares Gomes wrote: > > On Sun, Sep 10, 2023 at 4:13?AM Satish Patel > wrote: > > > > > > Folks, > > > > > > I am trying to set an availability zone for OVN because skyline UI has > a mandatory requirement to have an > > > availability zone otherwise you are not allowed to create a network > from skyline GUI. > one thing that skiline and admin s to a lesser degree need to be aware of > is that nova AZ, Neutron AZ and cinder AZ > are generally not related to each other. you as an admin can align them > but a enduser should not assume that > when creating a vm with a netwok and a voluem that the name of the AZ for > all 3 should be the same. > AZs are also optional in all 3 servics that support them. nova invents a > fake AZ called "nova" and puts all host > that dont otherwise have an az in it but its considerd a bug to explictly > request the nova az. > we have debated makeing that a hard error in future api version although > currently we do know that some people do > reueqst the defautl nova az even though we document that you should not. > https://docs.openstack.org/nova/latest/admin/availability-zones.html > """ > The use of the default availability zone name in requests can be very > error-prone. Since the user can see the list of > availability zones, they have no way to know whether the default > availability zone name (currently nova) is provided > because a host belongs to an aggregate whose AZ metadata key is set to > nova, or because there is at least one host not > belonging to any aggregate. Consequently, it is highly recommended for > users to never ever ask for booting an instance > by specifying an explicit AZ named nova and for operators to never set the > AZ metadata for an aggregate to nova. This > can result is some problems due to the fact that the instance AZ > information is explicitly attached to nova which could > break further move operations when either the host is moved to another > aggregate or when the user would like to migrate > the instance. > """ > > i raise this because i would consider it a bug form a nova point of view > if skyline forced you to specyify > an az when booting a vm. i would also consider it a bug in skyline if it > forced the same on neutron or cinder > since they are an optional feature. even if its common to configure them > the skyline ui should not force you to select > one. > > > > > > > > for OVN based deployment only option to set AZ is in ovn-cms-options > > > > > > # ovs-vsctl set open_vswitch . > external_ids:ovn-cms-options="enable-chassis-as-gw,availability-zones=az-0" > > > > > > Question: > > > > > > 1. How do I configure in kolla-ansible to override or set > ovn-cms-options on only gateway chassis? > > > 2. 
How does AZ work in OVN because OVN is anyway distributed as per > documents, What if I just give foobar AZ name to > > > meet Skyline requirement? Is it going to break anything? > > > > Answering 2. > > > > There are two types of availability zones in Neutron: Routers and > Networks. > > > > For Routers, ML2/OVN will schedule the gateway router ports onto the > > nodes belonging to the availability zones provided by the > > --availability-zone-hint parameter, for example: > > > > $ openstack router create --availability-zone-hint az-0 > > --availability-zone-hint az-1 router-0 > > > > For Networks. Since DHCP is distributed in ML2/OVN as you pointed out > > we do not care about scheduling DHCP agents (like ML2/OVS). But there > > are few cases for Network AZ in ML2/OVN that are related to external > > ports [0], these are ports that live on a different host than the > > instance to address use cases such as SR-IOV and Baremetal with > > ML2/OVN. > > > > You can read more ML2/OVN AZs here: > > > https://docs.openstack.org/neutron/latest/admin/ovn/availability_zones.html > > > > [0] > https://docs.openstack.org/neutron/latest/admin/ovn/external_ports.html > > > > Cheers, > > Lucas > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Thu Sep 14 07:48:33 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Thu, 14 Sep 2023 13:18:33 +0530 Subject: [cinder][PTG] Cinder 2024.1 (Caracal) PTG Planning Message-ID: Hello All, The 2024.1 (Caracal) virtual PTG is approaching and will be held between 23-27 October, 2023. I've created a planning etherpad[1] and a PTG etherpad[2] to gather topics for the PTG. Note that you only need to add topics in the planning etherpad[1] and those will be arranged in the PTG etherpad[2] later. Date: Tuesday (24th October) to Friday (27th October) 2023 Time: 1300-1700 UTC Etherpad: https://etherpad.opendev.org/p/caracal-ptg-cinder-planning Please add your topics timely so they can be arranged as per your preferences, see you at the PTG! [1] https://etherpad.opendev.org/p/caracal-ptg-cinder-planning [2] https://etherpad.opendev.org/p/caracal-ptg-cinder Thanks Rajat Dhasmana -------------- next part -------------- An HTML attachment was scrubbed... URL: From nguyenhuukhoinw at gmail.com Thu Sep 14 09:07:59 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Thu, 14 Sep 2023 16:07:59 +0700 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: <7E26F394-AA89-4A3A-98F5-ED84951E50F3@gmail.com> Message-ID: Hello. What do you mean? Nguyen Huu Khoi On Thu, Sep 14, 2023 at 1:22?AM Satish Patel wrote: > Hi, > > Assuming instance monitoring also calls prometheus api or query to get > specific matrices for instance. > > On Wed, Sep 13, 2023 at 1:00?PM Nguy?n H?u Kh?i > wrote: > >> Hello. >> It talks about node monitor. Which I am doing about instance monitoring. >> You can check on Administrator Page >>> Monitor Center. >> Nguyen Huu Khoi >> >> >> On Wed, Sep 13, 2023 at 11:21?PM Satish Patel >> wrote: >> >>> Thank you! >>> >>> Then what is the deal here? This thread says just add prometheus config >>> and magic will happen. >>> >>> https://bugs.launchpad.net/skyline-apiserver/+bug/2002902 >>> >>> On Wed, Sep 13, 2023 at 12:09?PM Nguy?n H?u Kh?i < >>> nguyenhuukhoinw at gmail.com> wrote: >>> >>>> Hello. 
>>>> The case is here, >>>> >>>> https://github.com/openstack/skyline-console/blob/master/src/pages/compute/containers/Instance/Detail/index.jsx >>>> >>>> but you need to understand how skyline construct page. This project >>>> uses nodejs, I use the chart.js library to code. it will need some >>>> knowledge about api and javascript. >>>> >>>> Let's read first. >>>> >>>> >>>> Nguyen Huu Khoi >>>> >>>> >>>> On Wed, Sep 13, 2023 at 7:29?PM Satish Patel >>>> wrote: >>>> >>>>> Sure that is ok. Can you give me a small hit about what you actually >>>>> did? I can see you created a new tab 'monitor' but behind that did you >>>>> create a new css theme etc to pull metrics from prometheus? >>>>> >>>>> All I want to do is just to see if it's worth it because we don't have >>>>> prometheus but we have different data sources which I can integrate. Thanks >>>>> for the work. very appreciated. >>>>> >>>>> On Wed, Sep 13, 2023 at 7:27?AM Nguy?n H?u Kh?i < >>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>> >>>>>> Hello. Bc i am just working. And I am not dev so I still learn. If it >>>>>> is ok then i will do. >>>>>> >>>>>> On Wed, Sep 13, 2023, 6:17 PM Satish Patel >>>>>> wrote: >>>>>> >>>>>>> That is great!! >>>>>>> >>>>>>> I?m not a developer but will try to read the code but it would be >>>>>>> good if you just give me file name or clue about where to look. >>>>>>> >>>>>>> If you have patch or diff would be great. Why don?t you contribute >>>>>>> this feature and submit a patch >>>>>>> >>>>>>> Sent from my iPhone >>>>>>> >>>>>>> On Sep 12, 2023, at 8:08 PM, Nguy?n H?u Kh?i < >>>>>>> nguyenhuukhoinw at gmail.com> wrote: >>>>>>> >>>>>>> ? >>>>>>> >>>>>>> Nguyen Huu Khoi >>>>>>> Hello. I just read skyline source code and followed it about data >>>>>>> source. This is >>>>>>> https://github.com/zhangjianweibj/prometheus-libvirt-exporter. >>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Thu Sep 14 10:33:44 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Thu, 14 Sep 2023 16:03:44 +0530 Subject: [cinder] festival of feature reviews on 15th September Message-ID: Hello Argonauts, We will be having our monthly festival of reviews tomorrow i.e. 15th September (Friday) from 1400-1600 UTC. Following are some additional details: Date: 15th September, 2023 Time: 1400-1600 UTC Meeting link: https://meet.google.com/jqg-eigw-rku etherpad: https://etherpad.opendev.org/p/cinder-festival-of-reviews We are planning to prioritize patches for RC2[1]. [1] https://etherpad.opendev.org/p/cinder-bobcat-rc-patches Thanks Rajat Dhasmana -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at jangutter.com Thu Sep 14 10:54:19 2023 From: openstack at jangutter.com (Jan Gutter) Date: Thu, 14 Sep 2023 11:54:19 +0100 Subject: [kolla-ansible][zun] Intention to drop support for Zun for the 2023.1 and path forward Message-ID: Hi folks, In yesterday's Kolla IRC meeting we discussed the path forward for Zun [1] and came to the conclusion that it might be better for all involved to drop support for Zun for the 2023.1 release until it can be updated to support the necessary dependencies. More specific details at the end of the message. With this in mind, the plan is to start this process in a week from now, unless a better solution is presented. There is still some time for discussion, please don't hesitate! 
There are two paths forward after this: If Zun can be updated to support the dependencies during the maintenance timeframe of 2023.1, then support would be restored. If no support is available at the time Caracal releases, however, it will be considered for removal in kolla and kolla-ansible. What this means: Kolla-ansible operators currently running Zun should hold off migration to 2023.1. Any development help would of course be appreciated. Please coordinate with the Zun project to help with development. Details on the dependency problems: * Currently, Zun has an external dependency on Docker 20.x, and uses a feature that has been removed in later versions of Docker. * This Docker feature also relies on an older version of etcd (3.3) * The mitigation [2] is to pin to the old versions of Docker and etcd, but this is not possible with Debian Bookworm hosts. Why this is a problem in Kolla-ansible (and Kolla): * This can't easily be solved by vendoring in an older version of etcd: the migration path it introduces to operators would incur a significant amount of risk. * Host level involvement is also particularly painful, the host version of Docker needs to connect to the etcd service running in the containers. * Even though the containers can be kept buildable and the configuration and inventory settings could be maintained, a working setup is not testable in CI. Thanks very much for everyone's patience, contributions and time! [1]: https://meetings.opendev.org/meetings/kolla/2023/kolla.2023-09-13-13.00.log.html [2]: https://bugs.launchpad.net/zun/+bug/2007142 From tkajinam at redhat.com Thu Sep 14 11:19:58 2023 From: tkajinam at redhat.com (Takashi Kajinami) Date: Thu, 14 Sep 2023 20:19:58 +0900 Subject: [puppet] Transitioning train, ussuri and victoria to EOL In-Reply-To: References: Message-ID: Because I have not heard any objections, I've pushed the changes to EOL these old branches. train: https://review.opendev.org/c/openstack/releases/+/893909 ussuri: https://review.opendev.org/c/openstack/releases/+/893995 victoria: https://review.opendev.org/c/openstack/releases/+/894000 In case you have any concern, then please leave comments in these reviews. On Wed, Sep 6, 2023 at 9:38?PM Takashi Kajinami wrote: > Hello, > > > As we agreed in the PTG discussions at Vancouver, we'll EOL the old stable > branches > of Puppet OpenStack projects. As the first step I'll start transitioning > stable/train, ussuri > and victoria to EOL early next week. > > Please let me know in case anyone has any concerns about it. > > Thank you, > Takashi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bajaykushal009 at gmail.com Thu Sep 14 03:59:03 2023 From: bajaykushal009 at gmail.com (Ajay Kushal) Date: Thu, 14 Sep 2023 09:29:03 +0530 Subject: Octavia - open source scalable load balancer for use in large OpenStack deployments. Message-ID: Hello this is Ajay, final year engineering student from karnataka,India. I have come across this project and feels really interesting and very important for part my course project. Can you please help me in executing this complete project successfully or sharing a demo video of project could be more clear. Please help me in this regards.. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rafaa.haji3 at gmail.com Thu Sep 14 09:38:16 2023 From: rafaa.haji3 at gmail.com (Rafa) Date: Thu, 14 Sep 2023 11:38:16 +0200 Subject: [nova] Live migration of a RAM intensive instance failed Message-ID: Hello, I have a volume-backed instance with 16 vCPU and 64GB RAM. The instance uses 60GB RAM out of 64GB (used: 22GB; buff/cache 38GB). When I do a live migration of this instance, it fails without any timeouts. It copies almost all the RAM (within 150 - 250 seconds) to the target compute host without any problems according to the logs. Then the instance is paused to copy the rest of the RAM. Everything seems to be working correctly up to this point, but then the instance resumes and the following error message appears: Live Migration failure: operation failed: migration out job: unexpectedly failed: libvirt.libvirtError: operation failed: migration out job: unexpectedly failed Unfortunately, this error message does not say much. It doesn't look like it's due to any timeouts or short downtimes, but I still tested different (higher) values for the following configurations. Unfortunately without success. - live_migration_completion_timeout - live_migration_timeout_action: abort / force_complete (pause) - live_migration_downtime - live_migration_downtime_steps - live_migration_downtime_delay - live_migration_permit_auto_converge: True / False All other instances on the same source and destination hosts can be live migrated without any issues. This instance can also be successfully live migrated after a restart, as it is probably not yet heavily loaded. After a few hours, however, the live migration no longer works. Any ideas what the problem could be? Logs: - nova-compute.log from source compute host: https://paste.openstack.org/show/bJMFxnPKQBEVaPakud61/ - i found this Traceback using journalctrl: https://paste.openstack.org/show/bIS4GFAd2RJ5fHVN9I8d/ - there was also an error in /var/log/libvirt/qemu/: https://paste.openstack.org/show/bImT89IelDcXXBPSgTCO/ Enviroment: - Libvirt: 8.0.0 - QEMU: 4.2.1 - Nova: 25.1.1 - OpenStack: Yoga - Compute operating system: Ubuntu 20.04 From berndbausch at gmail.com Thu Sep 14 13:52:22 2023 From: berndbausch at gmail.com (berndbausch at gmail.com) Date: Thu, 14 Sep 2023 22:52:22 +0900 Subject: Octavia - open source scalable load balancer for use in large OpenStack deployments. In-Reply-To: References: Message-ID: <016f01d9e712$b05cf550$1116dff0$@gmail.com> I would start with the official documentation at https://docs.openstack.org/octavia/latest/. The Youtube Openinfra Foundation channel has a number of videos from various OpenStack summits; searching for @openinfrafoundation Octavia in Youtube will yield a few very useful presentations, or perhaps simply search for openstack Octavia. From: Ajay Kushal Sent: Thursday, September 14, 2023 12:59 PM To: openstack-discuss at lists.openstack.org Subject: Octavia - open source scalable load balancer for use in large OpenStack deployments. Hello this is Ajay, final year engineering student from karnataka,India. I have come across this project and feels really interesting and very important for part my course project. Can you please help me in executing this complete project successfully or sharing a demo video of project could be more clear. Please help me in this regards.. -------------- next part -------------- An HTML attachment was scrubbed... 
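To give a feel for the workflow those resources demonstrate, a basic HTTP load balancer is typically built with the Octavia OpenStack client roughly as follows. This is only a sketch: the subnet names and member address are placeholders, and it assumes python-octaviaclient and a working Octavia deployment (for example with the amphora driver) are already in place.

# create the load balancer on a VIP subnet and wait for it to become ACTIVE
openstack loadbalancer create --name lb1 --vip-subnet-id public-subnet
# add an HTTP listener on port 80
openstack loadbalancer listener create --name listener1 --protocol HTTP --protocol-port 80 lb1
# create a round-robin pool behind the listener
openstack loadbalancer pool create --name pool1 --lb-algorithm ROUND_ROBIN --listener listener1 --protocol HTTP
# register a backend member serving on port 80
openstack loadbalancer member create --subnet-id private-subnet --address 192.0.2.16 --protocol-port 80 pool1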
URL: From eblock at nde.ag Thu Sep 14 14:18:21 2023 From: eblock at nde.ag (Eugen Block) Date: Thu, 14 Sep 2023 14:18:21 +0000 Subject: [nova] Live migration of a RAM intensive instance failed In-Reply-To: Message-ID: <20230914141821.Horde.BgGlwPjPCXWJRhoLxRG_P7H@webmail.nde.ag> Hi, could you share logs from the target compute node as well? Zitat von Rafa : > Hello, > I have a volume-backed instance with 16 vCPU and 64GB RAM. The instance > uses 60GB RAM out of 64GB (used: 22GB; buff/cache 38GB). When I do a > live migration of this instance, it fails without any timeouts. > It copies almost all the RAM (within 150 - 250 seconds) to the target > compute host without any problems according to the logs. Then the > instance is paused to copy the rest of the RAM. Everything seems to be > working correctly up to this point, but then the instance resumes and > the following error message appears: > > Live Migration failure: operation failed: migration out job: > unexpectedly failed: libvirt.libvirtError: operation failed: > migration out job: unexpectedly failed > > Unfortunately, this error message does not say much. It doesn't look > like it's due to any timeouts or short downtimes, but I still tested > different (higher) values for the following configurations. > Unfortunately without success. > - live_migration_completion_timeout > - live_migration_timeout_action: abort / force_complete (pause) > - live_migration_downtime > - live_migration_downtime_steps > - live_migration_downtime_delay > - live_migration_permit_auto_converge: True / False > > All other instances on the same source and destination hosts can be > live migrated without any issues. This instance can also be > successfully live migrated after a restart, as it is probably not yet > heavily loaded. After a few hours, however, the live migration no > longer works. > > Any ideas what the problem could be? > > Logs: > - nova-compute.log from source compute host: > https://paste.openstack.org/show/bJMFxnPKQBEVaPakud61/ > - i found this Traceback using journalctrl: > https://paste.openstack.org/show/bIS4GFAd2RJ5fHVN9I8d/ > - there was also an error in /var/log/libvirt/qemu/: > https://paste.openstack.org/show/bImT89IelDcXXBPSgTCO/ > > Enviroment: > - Libvirt: 8.0.0 > - QEMU: 4.2.1 > - Nova: 25.1.1 > - OpenStack: Yoga > - Compute operating system: Ubuntu 20.04 From roger.riverac at gmail.com Thu Sep 14 15:37:18 2023 From: roger.riverac at gmail.com (Roger Rivera) Date: Thu, 14 Sep 2023 11:37:18 -0400 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: Message-ID: Hi Nguyen This looks great! Congratulations Nguyen. Adding monitoring is an excellent contribution. Hopefully we start to see some traction on Skyline as I do also agree it is a phenomenal project that adds value to the OpenStack ecosystem. Would be awesome to see it integrate monitoring, metering, and maybe even billing for public-cloud use cases. Thanks again, Roger On Tue, Sep 12, 2023 at 11:45?AM Nguy?n H?u Kh?i wrote: > Hello guys. > I am working on Skyline. > > I can integrate my monitor tab on it. > > I will try to do something about it. If it works properly, I can > contribute. > > It is a very nice project. I hope it will grow and grow in the future. > > Thanks Skyline Team. > > > [image: image.png] > > Nguyen Huu Khoi > -- *Roger Rivera* -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 248643 bytes Desc: not available URL: From elod.illes at est.tech Thu Sep 14 16:01:42 2023 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Thu, 14 Sep 2023 16:01:42 +0000 Subject: [adjutant][barbican][blazar][ec2-api][freezer][masakari][mistral][monasca][murano][sahara][solum][telemetry][vitrage][zun] broken gate on master branch Message-ID: Hi $teams_in_subject, I've generated test patches to see gate health (this time for cycle-with-rc deliverables, that have not merged any patches recently) and the patches show [1] that you have broken CI jobs on your gate. Could the teams please check what the issue is / issues are and fix as soon as possible? Thanks in advance! [1] https://review.opendev.org/q/topic:release-health-check-cwr-bobcat+is:open Thanks, El?d Ill?s irc: elodilles @ #openstack-release -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Thu Sep 14 16:04:41 2023 From: hongbin034 at gmail.com (Hongbin Lu) Date: Fri, 15 Sep 2023 00:04:41 +0800 Subject: [kolla-ansible][zun] Intention to drop support for Zun for the 2023.1 and path forward In-Reply-To: References: Message-ID: Hi, I am from Zun team. The goal is to fix it in "C" release. I will keep the status updated in this ticket: https://bugs.launchpad.net/zun/+bug/2007142 . Sorry for all the inconvenience. Best regards, Hongbin On Thu, Sep 14, 2023 at 6:56?PM Jan Gutter wrote: > Hi folks, > > In yesterday's Kolla IRC meeting we discussed the path forward for Zun > [1] and came to the conclusion that it might be better for all > involved to drop support for Zun for the 2023.1 release until it can > be updated to support the necessary dependencies. More specific > details at the end of the message. > > With this in mind, the plan is to start this process in a week from > now, unless a better solution is presented. There is still some time > for discussion, please don't hesitate! > > There are two paths forward after this: > > If Zun can be updated to support the dependencies during the > maintenance timeframe of 2023.1, then support would be restored. > > If no support is available at the time Caracal releases, however, it > will be considered for removal in kolla and kolla-ansible. > > What this means: > > Kolla-ansible operators currently running Zun should hold off > migration to 2023.1. Any development help would of course be > appreciated. Please coordinate with the Zun project to help with > development. > > Details on the dependency problems: > > * Currently, Zun has an external dependency on Docker 20.x, and uses a > feature that has been removed in later versions of Docker. > * This Docker feature also relies on an older version of etcd (3.3) > * The mitigation [2] is to pin to the old versions of Docker and etcd, > but this is not possible with Debian Bookworm hosts. > > Why this is a problem in Kolla-ansible (and Kolla): > > * This can't easily be solved by vendoring in an older version of > etcd: the migration path it introduces to operators would incur a > significant amount of risk. > * Host level involvement is also particularly painful, the host > version of Docker needs to connect to the etcd service running in the > containers. > * Even though the containers can be kept buildable and the > configuration and inventory settings could be maintained, a working > setup is not testable in CI. > > Thanks very much for everyone's patience, contributions and time! 
> > [1]: > https://meetings.opendev.org/meetings/kolla/2023/kolla.2023-09-13-13.00.log.html > [2]: https://bugs.launchpad.net/zun/+bug/2007142 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Thu Sep 14 16:52:20 2023 From: pierre at stackhpc.com (Pierre Riteau) Date: Thu, 14 Sep 2023 18:52:20 +0200 Subject: [adjutant][barbican][blazar][ec2-api][freezer][masakari][mistral][monasca][murano][sahara][solum][telemetry][vitrage][zun] broken gate on master branch In-Reply-To: References: Message-ID: Thanks El?d for letting us know. blazar-dashboard is failing due to lower-constraints testing, I am removing it like it was done in other projects. On Thu, 14 Sept 2023 at 18:06, El?d Ill?s wrote: > Hi $teams_in_subject, > > I've generated test patches to see gate health (this time for > cycle-with-rc deliverables, that have not merged any patches recently) > and the patches show [1] that you have broken CI jobs on > your gate. Could the teams please check what the issue is / issues are > and fix as soon as possible? Thanks in advance! > > [1] > https://review.opendev.org/q/topic:release-health-check-cwr-bobcat+is:open > > Thanks, > > El?d Ill?s > irc: elodilles @ #openstack-release > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaa.haji3 at gmail.com Thu Sep 14 17:07:55 2023 From: rafaa.haji3 at gmail.com (Rafa) Date: Thu, 14 Sep 2023 19:07:55 +0200 Subject: [nova] Live migration of a RAM intensive instance failed In-Reply-To: References: Message-ID: > Hi, could you share logs from the target compute node as well? yes, here: https://paste.openstack.org/show/blpJE8krA1N6PVaLTVF0/ From nguyenhuukhoinw at gmail.com Fri Sep 15 01:49:40 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Fri, 15 Sep 2023 08:49:40 +0700 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: Message-ID: I will try to add after it works properly. because I need to convert units. If more people use Skyline and other projects, Openstack will become more great. I will contribute because I use open source. Nguyen Huu Khoi -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri Sep 15 02:25:30 2023 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 14 Sep 2023 22:25:30 -0400 Subject: [kolla-ansible]change prometheus data volume path Message-ID: Folks, I have deployed prometheus with koll-ansible and now I want to change the path of prometheus volume to store matrices on bigger partitions. How do I change the volume path of prometheus to store matrix data on separate filesystem? This is what I did in globals.yml but it doesn't work. Is the following variable correct? prometheus_server_default_volumes: "/mnt/prometheus" -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ilya_p at hotmail.com Fri Sep 15 06:52:47 2023 From: ilya_p at hotmail.com (Ilya Popov) Date: Fri, 15 Sep 2023 06:52:47 +0000 Subject: [cinder][kolla-ansible] cinder-volume active/active usage In-Reply-To: <20230828110627.Horde.TUdKYUVeuHSgHLuTSq4azz-@webmail.nde.ag> References: <20230828110627.Horde.TUdKYUVeuHSgHLuTSq4azz-@webmail.nde.ag> Message-ID: Hello, there is additional issue, then using active/active cinder volume setup: https://bugs.launchpad.net/cinder/+bug/1927186 as we discussed in cinder IRC - it looks like architectural problem in cinder (data sync between active/active cinder-volume and cinder-scheduler) Regards, Ilya ________________________________ From: Eugen Block Sent: Monday, August 28, 2023 11:06 AM To: openstack-discuss at lists.openstack.org Subject: Re: [cinder][kolla-ansible] cinder-volume active/active usage Hi, I'm not familiar with kolla-ansible but since nobody else replied to this thread I just wanted to bump it. A year ago Gorka commented on my thread [3] about active/active setup, but we don't use any of the deployment tools, we have our own deployment method. I also don't really have answers for you, I don't think it makes a big difference which DLM tool you use, maybe the one you're more familiar with. I tested cinder-volume a/a with zookeeper and seemed to work well in my lab setup, I don't have an actual deployment to speak from experience. Regards, Eugen [3] https://lists.openstack.org/pipermail/openstack-discuss/2022-September/030554.html Zitat von ??? : > Hello, > > I saw that kolla-ansible had allow cinder coordination backend to be > configured. And it can be set to "redis" or "etcd" in > /etc/kolla/gloabls.yml [1]. Based on my current understanding, this > option enables cinder-volume active/active. > > 1. I would like to ask, between Redis and Etcd, which one is more > recommended or what are their pros and cons? > 2. And I have saw the spec of cinder-volume a/a[2]. Is it sufficient > to configure [coordination]backend_url and [DEFAULT]cluster to use > Cinder volume active/active, or are there any other considerations? > Could any reference for Cinder volume active/active configuration and > usage documentation be recommended ? > > I would appreciate for any help. > Best wishes. > > Han Guangyu > > [1] > https://github.com/openstack/kolla-ansible/blob/master/etc/kolla/globals.yml#L539 > [2] > https://specs.openstack.org/openstack/cinder-specs/specs/ocata/ha-aa-job-distribution.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Fri Sep 15 08:03:03 2023 From: pierre at stackhpc.com (Pierre Riteau) Date: Fri, 15 Sep 2023 10:03:03 +0200 Subject: [kolla-ansible]change prometheus data volume path In-Reply-To: References: Message-ID: Hello Satish, The prometheus_server_default_volumes variable is defined as: prometheus_server_default_volumes: - "{{ node_config_directory }}/prometheus-server/:{{ container_config_directory }}/:ro" - "/etc/localtime:/etc/localtime:ro" - "{{ '/etc/timezone:/etc/timezone:ro' if ansible_facts.os_family == 'Debian' else '' }}" - "prometheus_v2:/var/lib/prometheus" - "kolla_logs:/var/log/kolla/" As you can see, it configures a lot more than just how Prometheus data is stored. By default it uses the prometheus_v2 Docker volume. 
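A quick way to confirm where that named volume currently keeps its data on a host (a sketch assuming the default Docker data root) is:

docker volume inspect prometheus_v2
# the "Mountpoint" field of the JSON output is the current on-disk path,
# typically /var/lib/docker/volumes/prometheus_v2/_data
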
If you want to store your data in /mnt/prometheus, you can use: prometheus_server_default_volumes: - "{{ node_config_directory }}/prometheus-server/:{{ container_config_directory }}/:ro" - "/etc/localtime:/etc/localtime:ro" - "{{ '/etc/timezone:/etc/timezone:ro' if ansible_facts.os_family == 'Debian' else '' }}" - "/mnt/prometheus:/var/lib/prometheus" - "kolla_logs:/var/log/kolla/" Note that it won't move the data for you. If you want to keep existing data, stop prometheus_server containers and copy the data from /var/lib/docker/volumes/prometheus_v2/_data into /mnt/prometheus before applying this configuration. Best regards, Pierre Riteau (priteau) On Fri, 15 Sept 2023 at 04:29, Satish Patel wrote: > Folks, > > I have deployed prometheus with koll-ansible and now I want to change the > path of prometheus volume to store matrices on bigger partitions. How do I > change the volume path of prometheus to store matrix data on > separate filesystem? > > This is what I did in globals.yml but it doesn't work. Is the following > variable correct? > > prometheus_server_default_volumes: "/mnt/prometheus" > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Danny.Webb at thehutgroup.com Fri Sep 15 08:09:24 2023 From: Danny.Webb at thehutgroup.com (Danny Webb) Date: Fri, 15 Sep 2023 08:09:24 +0000 Subject: [kolla-ansible]change prometheus data volume path In-Reply-To: References: Message-ID: https://github.com/openstack/kolla-ansible/blob/428acfe97fdfbc89fd4dea577160192647e6f6b5/ansible/roles/prometheus/defaults/main.yml#L226 You'd have to completely redefine the default volumes list. The other option is to do your mount into the /var/lib/docker/volumes/prometheus_v2/data directory ________________________________ From: Satish Patel Sent: 15 September 2023 03:25 To: OpenStack Discuss Subject: [kolla-ansible]change prometheus data volume path CAUTION: This email originates from outside THG ________________________________ Folks, I have deployed prometheus with koll-ansible and now I want to change the path of prometheus volume to store matrices on bigger partitions. How do I change the volume path of prometheus to store matrix data on separate filesystem? This is what I did in globals.yml but it doesn't work. Is the following variable correct? prometheus_server_default_volumes: "/mnt/prometheus" -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfb29.cam at gmail.com Fri Sep 15 10:43:42 2023 From: pfb29.cam at gmail.com (Paul Browne) Date: Fri, 15 Sep 2023 11:43:42 +0100 Subject: [manila] - Enforcing CephFS pool data placement via backend config. options deprecated? Message-ID: Hello to the list, In OpenStack releases prior to Wallaby, in CephFS native driver Manila backend config there existed a configuration directive cephfs_volume_path_prefix ; # DEPRECATED: The prefix of the cephfs volume path. (string value) > # This option is deprecated for removal since Wallaby. > # Its value may be silently ignored in the future. > # Reason: This option is not used starting with the Nautilus release > # of Ceph. > #cephfs_volume_path_prefix = /volumes This volume path prefix was usable (and very useful) to set different prefixes in different backends, pointing to different pools in the same backend Ceph cluster e.g. 
pools backed by storage devices of different characteristics/technology or Ceph CRUSH rule, etc The pool selection would be done in CephFS via the use of file layout fattributes , file layout inheritance ensuring that data created in sub-directories ends up in the correct Ceph pool. e.g. two backends using the same Ceph cluster but two different path prefixes hitting different Ceph pools [cephfsnative1] > driver_handles_share_servers = False > share_backend_name = CEPHFS1 > share_driver = manila.share.drivers.cephfs.driver.CephFSDriver > cephfs_conf_path = /etc/ceph/ceph.conf > cephfs_auth_id = manila > cephfs_cluster_name = ceph > cephfs_volume_path_prefix = /volumes-staging > cephfs_volume_mode = 777 > cephfs_enable_snapshots = False > [cephfsnative1_ec] > driver_handles_share_servers = False > share_backend_name = CEPHFS1_EC > share_driver = manila.share.drivers.cephfs.driver.CephFSDriver > cephfs_conf_path = /etc/ceph/ceph.conf > cephfs_auth_id = manila > cephfs_cluster_name = ceph > cephfs_volume_path_prefix = /volumes-ec-staging > cephfs_volume_mode = 777 > cephfs_enable_snapshots = False However, since Wallaby this config directive looks to have been deprecated and the default of using a path prefix of only /volumes is possible.Trying to use any other prefix in backend configs is ignored. Would anyone on the list know why this option was deprecated in Manila code, or was this forced on Manila by upstream Ceph as of Nautilus? Is there a way to get back to an equivalent functionality? Currently using only a default path of /volumes means we have lost all flexibility in defining Manila CephFS share data placement using the native CephFS driver. Possibly using share group types+share groups and some pre-created paths in the root CephFS could get to something like equivalency? But these paths would need to correspond to the share group UUID, which will only be known after the share group has been created. So not all that flexible a path, since it requires an interaction between users to communicate the share group ID and Ceph admins to set the correct file layout policy. Putting the path prefix in the backend type removed all of that in a nicely transparent way. Having just prototyped this, it will work for setting a desired file layout on a pre-defined share group UUID path in the root CephFS, though it's not really ideal or sustainable to be able to do this for dynamically created share groups by users or automation... Thanks in advance for any advice, Paul Browne -------------- next part -------------- An HTML attachment was scrubbed... URL: From pfb29 at cam.ac.uk Fri Sep 15 10:54:12 2023 From: pfb29 at cam.ac.uk (Paul Browne) Date: Fri, 15 Sep 2023 10:54:12 +0000 Subject: [manila] - Enforcing CephFS pool data placement via backend config. options deprecated? Message-ID: Hello to the list, In OpenStack releases prior to Wallaby, in CephFS native driver Manila backend config there existed a configuration directive cephfs_volume_path_prefix ; # DEPRECATED: The prefix of the cephfs volume path. (string value) # This option is deprecated for removal since Wallaby. # Its value may be silently ignored in the future. # Reason: This option is not used starting with the Nautilus release # of Ceph. #cephfs_volume_path_prefix = /volumes This volume path prefix was usable (and very useful) to set different prefixes in different backends, pointing to different pools in the same backend Ceph cluster e.g. 
pools backed by storage devices of different characteristics/technology or Ceph CRUSH rule, etc The pool selection would be done in CephFS via the use of file layout fattributes, file layout inheritance ensuring that data created in prefix child sub-directories ends up in the correct Ceph pool. e.g. two backends using the same Ceph cluster but two different path prefixes hitting different Ceph pools [cephfsnative1] driver_handles_share_servers = False share_backend_name = CEPHFS1 share_driver = manila.share.drivers.cephfs.driver.CephFSDriver cephfs_conf_path = /etc/ceph/ceph.conf cephfs_auth_id = manila cephfs_cluster_name = ceph cephfs_volume_path_prefix = /volumes-staging cephfs_volume_mode = 777 cephfs_enable_snapshots = False [cephfsnative1_ec] driver_handles_share_servers = False share_backend_name = CEPHFS1_EC share_driver = manila.share.drivers.cephfs.driver.CephFSDriver cephfs_conf_path = /etc/ceph/ceph.conf cephfs_auth_id = manila cephfs_cluster_name = ceph cephfs_volume_path_prefix = /volumes-ec-staging cephfs_volume_mode = 777 cephfs_enable_snapshots = False However, since Wallaby this config directive looks to have been deprecated and the default of using a path prefix of only /volumes is possible. Trying to use any other prefix in backend configs is ignored. Would anyone on the list know why this option was deprecated in Manila code, or was this forced on Manila by upstream Ceph as of Nautilus? Is there a way to get back to an equivalent functionality? Currently using only a default path of /volumes means we have lost all flexibility in defining Manila CephFS share data placement using the native CephFS driver. Possibly using share group types+share groups and some pre-created paths in the root CephFS could get to something like equivalency? But these paths would need to correspond to the share group UUID, which will only be known after the share group has been created. So not all that flexible a path, since it requires an interaction between users/tenant to communicate the share group ID and Ceph admins to set the correct file layout policy. Putting the path prefix in the backend+share-type removed all of that in a nicely transparent way. Having just prototyped this, it will work for setting a desired file layout on a pre-defined share group UUID path in the root CephFS, though it's not really ideal or sustainable to be able to do this for dynamically created share groups by users or automation, which could exist in large numbers... Thanks in advance for any advice, Paul Browne ******************* Paul Browne Research Computing Platforms University Information Services Roger Needham Building JJ Thompson Avenue University of Cambridge Cambridge United Kingdom E-Mail: pfb29 at cam.ac.uk Tel: 0044-1223-746548 ******************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Fri Sep 15 11:07:03 2023 From: zigo at debian.org (Thomas Goirand) Date: Fri, 15 Sep 2023 13:07:03 +0200 Subject: [all] oslo.db 14.0.0 breaks at least 4 projects (was: SQLAlchemy 2.x support in Bobcat) In-Reply-To: References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Message-ID: On 9/13/23 15:39, Stephen Finucane wrote: > On Wed, 2023-09-13 at 09:21 +0200, Thomas Goirand wrote: >> Hi, >> >> As you may know, and to my great frustration, I'm not the maintainer of >> SQLAlchemy in Debian, even though OpenStack is the biggest consumer of >> it. 
The current maintainer insists that he wants to upload SQLA 2.x in >> Unstable, potentially breaking all of OpenStack. >> >> At the present moment, if I understand correctly, we're not there yet, >> and Bobcat doesn't have such a support. It would be ok for me, *IF* >> there are patches available on master, that I could backport to Bobcat >> and maintain in the debian/patches folder of each project. However, the >> biggest current annoyance, is that I have no idea where we are at. Are >> we close to such a support? Is there a list of patches to apply on top >> of Bobcat that is maintained somewhere? >> >> Please enlighten me... :) > > I think you figured this out on IRC this morning, but the vast majority (though > not all) of the patches are available at the sqlalchemy-20 topic in Gerrit [1]. > I've been working on this for almost 2 years now and have most of the core > projects well on their way but not everything is complete, as you'll tell from > that list. I have a canary patch [2] that I've been using to spot missing > services. I plan to pick up the Manila work again early in C, but could do with > help removing the use of autocommit in Heat and the weird test failures I'm > seeing in Cinder [3]. We also need reviews of the Masakri series (Is that > project dead? I can't tell). Once those are addressed, I _think_ we might be > done but who knows what else we'll find... > > Cheers, > Stephen > > [1] https://review.opendev.org/q/topic:sqlalchemy-20+is:open > [2] https://review.opendev.org/c/openstack/requirements/+/879743 > [3] https://review.opendev.org/q/topic:sqlalchemy-20+is:open+project:openstack/cinder Thanks for your work, really! And thanks for the details above. Now, not sure how much this is related, but I've seen that the new oslo.db 14.0.0 breaks: - freezer-api - trove - cloudkitty - watcher Is there any plan to fix oslo.db, or the above projects? Maybe revert some commits in oslo.db? Can someone explain to me what's going on for this as well? Cheers, Thomas Goirand (zigo) From stephenfin at redhat.com Fri Sep 15 11:26:05 2023 From: stephenfin at redhat.com (Stephen Finucane) Date: Fri, 15 Sep 2023 12:26:05 +0100 Subject: [all] oslo.db 14.0.0 breaks at least 4 projects (was: SQLAlchemy 2.x support in Bobcat) In-Reply-To: References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Message-ID: On Fri, 2023-09-15 at 13:07 +0200, Thomas Goirand wrote: > On 9/13/23 15:39, Stephen Finucane wrote: > > On Wed, 2023-09-13 at 09:21 +0200, Thomas Goirand wrote: > > > Hi, > > > > > > As you may know, and to my great frustration, I'm not the maintainer of > > > SQLAlchemy in Debian, even though OpenStack is the biggest consumer of > > > it. The current maintainer insists that he wants to upload SQLA 2.x in > > > Unstable, potentially breaking all of OpenStack. > > > > > > At the present moment, if I understand correctly, we're not there yet, > > > and Bobcat doesn't have such a support. It would be ok for me, *IF* > > > there are patches available on master, that I could backport to Bobcat > > > and maintain in the debian/patches folder of each project. However, the > > > biggest current annoyance, is that I have no idea where we are at. Are > > > we close to such a support? Is there a list of patches to apply on top > > > of Bobcat that is maintained somewhere? > > > > > > Please enlighten me... :) > > > > I think you figured this out on IRC this morning, but the vast majority (though > > not all) of the patches are available at the sqlalchemy-20 topic in Gerrit [1]. 
> > I've been working on this for almost 2 years now and have most of the core > > projects well on their way but not everything is complete, as you'll tell from > > that list. I have a canary patch [2] that I've been using to spot missing > > services. I plan to pick up the Manila work again early in C, but could do with > > help removing the use of autocommit in Heat and the weird test failures I'm > > seeing in Cinder [3]. We also need reviews of the Masakri series (Is that > > project dead? I can't tell). Once those are addressed, I _think_ we might be > > done but who knows what else we'll find... > > > > Cheers, > > Stephen > > > > [1] https://review.opendev.org/q/topic:sqlalchemy-20+is:open > > [2] https://review.opendev.org/c/openstack/requirements/+/879743 > > [3] https://review.opendev.org/q/topic:sqlalchemy-20+is:open+project:openstack/cinder > > Thanks for your work, really! > And thanks for the details above. > > Now, not sure how much this is related, but I've seen that the new > oslo.db 14.0.0 breaks: > - freezer-api > - trove > - cloudkitty > - watcher > > Is there any plan to fix oslo.db, or the above projects? Maybe revert > some commits in oslo.db? Can someone explain to me what's going on for > this as well? oslo.db has intentionally *not* been bumped to >= 13.0.0 in upper-constraints because it introduces many breaking changes that are not compatible with SQLAlchemy 2.x. Conversely, oslo.db < 13.0.0 is simply not compatible with SQLAlchemy >= 2.x. As such, we can't bump oslo.db until the various projects have been fixed which is the same as what we're seeing with SQLAlchemy 2.x. Fortunately projects that adopt to I have been pushing patches to various projects that add a "tips" job for testing main/master branches of SQLAlchemy, Alembic, and oslo.db, e.g. [1][2][3][4]. The Neutron folks were well ahead of the curve as usual and also have their own (which is where I got the idea from). The projects you mention above could do with an equivalent job and my guess is that the process of getting there will highlight quite a bit of work that they need to do. They need to start at that asap (and tbh really should have started at it a long time ago as they've had over 2 years of a warning [5]). Cheers, Stephen [1] https://review.opendev.org/c/openstack/barbican/+/888308 [2] https://review.opendev.org/c/openstack/placement/+/886229 [3] https://review.opendev.org/c/openstack/cinder/+/886152 [4] https://review.opendev.org/c/openstack/glance/+/889066 [5] https://lists.openstack.org/pipermail/openstack-discuss/2021-August/024122.html > > Cheers, > > Thomas Goirand (zigo) > From pierre at stackhpc.com Fri Sep 15 11:50:22 2023 From: pierre at stackhpc.com (Pierre Riteau) Date: Fri, 15 Sep 2023 13:50:22 +0200 Subject: [all] oslo.db 14.0.0 breaks at least 4 projects (was: SQLAlchemy 2.x support in Bobcat) In-Reply-To: References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Message-ID: On Fri, 15 Sept 2023 at 13:29, Stephen Finucane wrote: > On Fri, 2023-09-15 at 13:07 +0200, Thomas Goirand wrote: > > On 9/13/23 15:39, Stephen Finucane wrote: > > > On Wed, 2023-09-13 at 09:21 +0200, Thomas Goirand wrote: > > > > Hi, > > > > > > > > As you may know, and to my great frustration, I'm not the maintainer > of > > > > SQLAlchemy in Debian, even though OpenStack is the biggest consumer > of > > > > it. The current maintainer insists that he wants to upload SQLA 2.x > in > > > > Unstable, potentially breaking all of OpenStack. 
> > > > > > > > At the present moment, if I understand correctly, we're not there > yet, > > > > and Bobcat doesn't have such a support. It would be ok for me, *IF* > > > > there are patches available on master, that I could backport to > Bobcat > > > > and maintain in the debian/patches folder of each project. However, > the > > > > biggest current annoyance, is that I have no idea where we are at. > Are > > > > we close to such a support? Is there a list of patches to apply on > top > > > > of Bobcat that is maintained somewhere? > > > > > > > > Please enlighten me... :) > > > > > > I think you figured this out on IRC this morning, but the vast > majority (though > > > not all) of the patches are available at the sqlalchemy-20 topic in > Gerrit [1]. > > > I've been working on this for almost 2 years now and have most of the > core > > > projects well on their way but not everything is complete, as you'll > tell from > > > that list. I have a canary patch [2] that I've been using to spot > missing > > > services. I plan to pick up the Manila work again early in C, but > could do with > > > help removing the use of autocommit in Heat and the weird test > failures I'm > > > seeing in Cinder [3]. We also need reviews of the Masakri series (Is > that > > > project dead? I can't tell). Once those are addressed, I _think_ we > might be > > > done but who knows what else we'll find... > > > > > > Cheers, > > > Stephen > > > > > > [1] https://review.opendev.org/q/topic:sqlalchemy-20+is:open > > > [2] https://review.opendev.org/c/openstack/requirements/+/879743 > > > [3] > https://review.opendev.org/q/topic:sqlalchemy-20+is:open+project:openstack/cinder > > > > Thanks for your work, really! > > And thanks for the details above. > > > > Now, not sure how much this is related, but I've seen that the new > > oslo.db 14.0.0 breaks: > > - freezer-api > > - trove > > - cloudkitty > > - watcher > > > > Is there any plan to fix oslo.db, or the above projects? Maybe revert > > some commits in oslo.db? Can someone explain to me what's going on for > > this as well? > > oslo.db has intentionally *not* been bumped to >= 13.0.0 in > upper-constraints > because it introduces many breaking changes that are not compatible with > SQLAlchemy 2.x. Conversely, oslo.db < 13.0.0 is simply not compatible with > SQLAlchemy >= 2.x. As such, we can't bump oslo.db until the various > projects > have been fixed which is the same as what we're seeing with SQLAlchemy 2.x. > > Fortunately projects that adopt to I have been pushing patches to various > projects that add a "tips" job for testing main/master branches of > SQLAlchemy, > Alembic, and oslo.db, e.g. [1][2][3][4]. The Neutron folks were well ahead > of > the curve as usual and also have their own (which is where I got the idea > from). > The projects you mention above could do with an equivalent job and my > guess is > that the process of getting there will highlight quite a bit of work that > they > need to do. They need to start at that asap (and tbh really should have > started > at it a long time ago as they've had over 2 years of a warning [5]). 
> > Cheers, > Stephen > > [1] https://review.opendev.org/c/openstack/barbican/+/888308 > [2] https://review.opendev.org/c/openstack/placement/+/886229 > [3] https://review.opendev.org/c/openstack/cinder/+/886152 > [4] https://review.opendev.org/c/openstack/glance/+/889066 > [5] > https://lists.openstack.org/pipermail/openstack-discuss/2021-August/024122.html > > > > > Cheers, > > > > Thomas Goirand (zigo) > > > Thank you for highlighting cloudkitty. I added it as a discussion item for our meeting next week. Cheers, Pierre -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Fri Sep 15 11:56:32 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Fri, 15 Sep 2023 12:56:32 +0100 Subject: [nova] Live migration of a RAM intensive instance failed In-Reply-To: References: Message-ID: <32b2c8c454331d6fff141ea9bed56bce567eccdb.camel@redhat.com> On Thu, 2023-09-14 at 19:07 +0200, Rafa wrote: > > Hi, could you share logs from the target compute node as well? > > yes, here: https://paste.openstack.org/show/blpJE8krA1N6PVaLTVF0/ > if the vm is under heavy memory load then its advisable ot use post-copy live migration. in general live migration is not intened to be used with a vm under load as there is no gurenettee that it will ever complete. post-copy live migration can signifcantly increae the probablity that a vm under load will live migrate in a reasonabel amount of time. https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_post_copy auto converge can also help but tis less important https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_auto_converge From zigo at debian.org Fri Sep 15 12:11:56 2023 From: zigo at debian.org (Thomas Goirand) Date: Fri, 15 Sep 2023 14:11:56 +0200 Subject: [all] oslo.db 14.0.0 breaks at least 4 projects (was: SQLAlchemy 2.x support in Bobcat) In-Reply-To: References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Message-ID: <0e3a7928-26f4-1767-5305-42a86882e1e2@debian.org> On 9/15/23 13:26, Stephen Finucane wrote: > On Fri, 2023-09-15 at 13:07 +0200, Thomas Goirand wrote: >> On 9/13/23 15:39, Stephen Finucane wrote: >>> On Wed, 2023-09-13 at 09:21 +0200, Thomas Goirand wrote: >>>> Hi, >>>> >>>> As you may know, and to my great frustration, I'm not the maintainer of >>>> SQLAlchemy in Debian, even though OpenStack is the biggest consumer of >>>> it. The current maintainer insists that he wants to upload SQLA 2.x in >>>> Unstable, potentially breaking all of OpenStack. >>>> >>>> At the present moment, if I understand correctly, we're not there yet, >>>> and Bobcat doesn't have such a support. It would be ok for me, *IF* >>>> there are patches available on master, that I could backport to Bobcat >>>> and maintain in the debian/patches folder of each project. However, the >>>> biggest current annoyance, is that I have no idea where we are at. Are >>>> we close to such a support? Is there a list of patches to apply on top >>>> of Bobcat that is maintained somewhere? >>>> >>>> Please enlighten me... :) >>> >>> I think you figured this out on IRC this morning, but the vast majority (though >>> not all) of the patches are available at the sqlalchemy-20 topic in Gerrit [1]. >>> I've been working on this for almost 2 years now and have most of the core >>> projects well on their way but not everything is complete, as you'll tell from >>> that list. I have a canary patch [2] that I've been using to spot missing >>> services. 
I plan to pick up the Manila work again early in C, but could do with >>> help removing the use of autocommit in Heat and the weird test failures I'm >>> seeing in Cinder [3]. We also need reviews of the Masakri series (Is that >>> project dead? I can't tell). Once those are addressed, I _think_ we might be >>> done but who knows what else we'll find... >>> >>> Cheers, >>> Stephen >>> >>> [1] https://review.opendev.org/q/topic:sqlalchemy-20+is:open >>> [2] https://review.opendev.org/c/openstack/requirements/+/879743 >>> [3] https://review.opendev.org/q/topic:sqlalchemy-20+is:open+project:openstack/cinder >> >> Thanks for your work, really! >> And thanks for the details above. >> >> Now, not sure how much this is related, but I've seen that the new >> oslo.db 14.0.0 breaks: >> - freezer-api >> - trove >> - cloudkitty >> - watcher >> >> Is there any plan to fix oslo.db, or the above projects? Maybe revert >> some commits in oslo.db? Can someone explain to me what's going on for >> this as well? > > oslo.db has intentionally *not* been bumped to >= 13.0.0 in upper-constraints > because it introduces many breaking changes that are not compatible with > SQLAlchemy 2.x. Conversely, oslo.db < 13.0.0 is simply not compatible with > SQLAlchemy >= 2.x. As such, we can't bump oslo.db until the various projects > have been fixed which is the same as what we're seeing with SQLAlchemy 2.x. > > Fortunately projects that adopt to I have been pushing patches to various > projects that add a "tips" job for testing main/master branches of SQLAlchemy, > Alembic, and oslo.db, e.g. [1][2][3][4]. The Neutron folks were well ahead of > the curve as usual and also have their own (which is where I got the idea from). > The projects you mention above could do with an equivalent job and my guess is > that the process of getting there will highlight quite a bit of work that they > need to do. They need to start at that asap (and tbh really should have started > at it a long time ago as they've had over 2 years of a warning [5]). > > Cheers, > Stephen > > [1] https://review.opendev.org/c/openstack/barbican/+/888308 > [2] https://review.opendev.org/c/openstack/placement/+/886229 > [3] https://review.opendev.org/c/openstack/cinder/+/886152 > [4] https://review.opendev.org/c/openstack/glance/+/889066 > [5] https://lists.openstack.org/pipermail/openstack-discuss/2021-August/024122.html FYI, I also fell into the trap of castellan 4.2.0. I'm writing this here for others *not* to do the same mistake as me: do not upgrade, or Cinder will fail its unit tests, and only 4.1.0 is the upper bound. Switching back to 4.1.0 made Cinder build correctly for me. The good thing with castellan is, I forgot to upload it to Experimental, so to the contrary of oslo.db, it's okish for me ... :) I hope that helps others, Cheers, Thomas Goirand (zigo) From nguyenhuukhoinw at gmail.com Thu Sep 14 22:37:43 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Fri, 15 Sep 2023 05:37:43 +0700 Subject: [openstack][skyline] Some works with skyline In-Reply-To: References: Message-ID: Hello. I will try to add after it works properly. because I need to convert units. If more people use Skyline and other projects, Openstack will become more great. I will contribute because I use open source. Nguyen Huu Khoi On Thu, Sep 14, 2023 at 10:37?PM Roger Rivera wrote: > Hi Nguyen > > This looks great! Congratulations Nguyen. Adding monitoring is an > excellent contribution. 
Hopefully we start to see some traction on Skyline > as I do also agree it is a phenomenal project that adds value to the > OpenStack ecosystem. Would be awesome to see it integrate monitoring, > metering, and maybe even billing for public-cloud use cases. > > Thanks again, > > Roger > > On Tue, Sep 12, 2023 at 11:45?AM Nguy?n H?u Kh?i < > nguyenhuukhoinw at gmail.com> wrote: > >> Hello guys. >> I am working on Skyline. >> >> I can integrate my monitor tab on it. >> >> I will try to do something about it. If it works properly, I can >> contribute. >> >> It is a very nice project. I hope it will grow and grow in the future. >> >> Thanks Skyline Team. >> >> >> [image: image.png] >> >> Nguyen Huu Khoi >> > > > -- > *Roger Rivera* > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 248643 bytes Desc: not available URL: From ralonsoh at redhat.com Fri Sep 15 13:01:55 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 15 Sep 2023 15:01:55 +0200 Subject: [neutron] Neutron drivers meeting cancelled Message-ID: Hello Neutrinos: Due to the lack of agenda [1], today's meeting is cancelled. Have a nice weekend. [1]https://wiki.openstack.org/wiki/Meetings/NeutronDrivers -------------- next part -------------- An HTML attachment was scrubbed... URL: From elod.illes at est.tech Fri Sep 15 16:05:32 2023 From: elod.illes at est.tech (=?iso-8859-1?Q?El=F5d_Ill=E9s?=) Date: Fri, 15 Sep 2023 16:05:32 +0000 Subject: [release] Release countdown for week R-2, Sep 18-22 Message-ID: Development Focus ----------------- At this point we should have release candidates (RC1 or recent intermediary release) for all the 2023.2 Bobcat deliverables. Teams should be working on any release-critical bugs that would require another RC or intermediary release before the final release. Warning ------- This is a special note to teams about issues regarding oslo.db and castellan deliverables. Please be aware that new releases are about to be sent out. For details, see the mail thread: https://lists.openstack.org/pipermail/openstack-discuss/2023-September/035100.html Actions ------- Early in the week, the release team will be proposing stable/2023.2 branch creation for all deliverables that have not branched yet, using the latest available 2023.2 Bobcat release as the branch point. If your team is ready to go for creating that branch, please let us know by leaving a +1 on these patches. If you would like to wait for another release before branching, you can -1 the patch and update it later in the week with the new release you would like to use. By the end of the week the release team will merge those patches though, unless an exception is granted. Once stable/2023.2 branches are created, if a release-critical bug is detected, you will need to fix the issue in the master branch first, then backport the fix to the stable/2023.2 branch before releasing out of the stable/2023.2 branch. After all of the cycle-with-rc projects have branched we will branch devstack, grenade, and the requirements repos. This will effectively open them up for 2024.1 Caracal development, though the focus should still be on finishing up 2023.2 Bobcat until the final release. For projects with translations, watch for any translation patches coming through and merge them quickly. A new release should be produced so that translations are included in the final 2023.2 Bobcat release. 
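One way to spot pending translation imports, assuming the long-standing proposal-bot workflow (the exact bot account and topic can differ per project, so treat this as a sketch), is to look at the history of the repository's locale/ directories:

git log --oneline --author="OpenStack Proposal Bot"
# translation imports are normally proposed by this Gerrit/Git account
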
Finally, now is a good time to finalize release notes. In particular, consider adding any relevant "prelude" content. Release notes are targeted for the downstream consumers of your project, so it would be great to include any useful information for those that are going to pick up and use or deploy the 2023.2 Bobcat version of your project. Upcoming Deadlines & Dates -------------------------- Final RC deadline: September 28th, 2023 (R-1 week) Final 2023.2 Bobcat release: October 4th, 2023 2024.1 Caracal Virtual PTG - October 23-27, 2023 -------------- next part -------------- An HTML attachment was scrubbed... URL: From nguyenhuukhoinw at gmail.com Sun Sep 17 07:16:04 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Sun, 17 Sep 2023 14:16:04 +0700 Subject: [openstack][skyline]Menu route problem. Message-ID: Hello guys. I am working on Skyline and I followed https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-12-Menu-introduction.md https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-13-Route-introduction.md I add layouts/menu.jsx ##### { path: '/konitor', name: t('Konitor'), key: /konitor', icon: , children: [ { path: '/konitor/konitork', name: t('Konitork'), key: 'konitork', level: 1, } ], } I put my component in pages/konitor/containers/Konitork/index.jsx ####### import React, { Component } from 'react'; import { inject, observer } from 'mobx-react'; export class Konitork extends Component { render() { return (
<div>nhk oi</div>
); } } export default observer(Konitork); Here is my routes/index.js ##### import BaseLayout from '../../../layouts/Basic'; import Konitork from '../containers/Konitork'; import Instance from '../containers/Instance'; const PATH = '/konitor'; export default [ { path: PATH, component: BaseLayout, routes: [ { path: `${PATH}/konitork`, component: Konitork, exact: true }, ], }, ]; When click on Menu. Konitor>>Konitork. It return 404. although I have x.x.x.x:9999/konitor/konitork. Pls correct me. Thank you much. Nguyen Huu Khoi -------------- next part -------------- An HTML attachment was scrubbed... URL: From damian at bulira.pl Sun Sep 17 16:52:12 2023 From: damian at bulira.pl (Damian Bulira) Date: Sun, 17 Sep 2023 18:52:12 +0200 Subject: [barbican][cinder] Business Software License in new Vault Message-ID: Hi Guys, Recently Hashicorp changed their product licensing from MPL to BSL. Did any of you carry out research on the impact of this change in regard to using Vault as a backend in Barbican and/or Cinder for both private and public clouds? Any thoughts about that? Cheers, Damian -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaa.haji3 at gmail.com Mon Sep 18 09:29:14 2023 From: rafaa.haji3 at gmail.com (Rafa) Date: Mon, 18 Sep 2023 11:29:14 +0200 Subject: [nova] Live migration of a RAM intensive instance failed In-Reply-To: References: Message-ID: As I can read and understand from the source compute logs, the memory is copied over successfully and there is no migration timeout. But after the instance is paused there is something wrong happening. I first thought it could be the short migration downtime (default=500ms), that's why I increased the "live_migration_downtime" to higher values (max was 300000ms; just for testing:) ) and nothing changed. And the error message doesn't say much either. I don't really want to use post-copy as it can lead to data loss. Auto Converge doesn't seem to help either. Am Fr., 15. Sept. 2023 um 13:56 Uhr schrieb : > > On Thu, 2023-09-14 at 19:07 +0200, Rafa wrote: > > > Hi, could you share logs from the target compute node as well? > > > > yes, here: https://paste.openstack.org/show/blpJE8krA1N6PVaLTVF0/ > > > > if the vm is under heavy memory load then its advisable ot use post-copy live migration. > > in general live migration is not intened to be used with a vm under load > as there is no gurenettee that it will ever complete. post-copy live migration > can signifcantly increae the probablity that a vm under load will live migrate > in a reasonabel amount of time. > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_post_copy > > auto converge can also help but tis less important > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_auto_converge > From thierry at openstack.org Mon Sep 18 10:04:40 2023 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 18 Sep 2023 12:04:40 +0200 Subject: [largescale-sig] Next meeting: September 20, 8utc Message-ID: <38ff495a-68e5-2f00-fa7f-8e643a555e9d@openstack.org> Hi everyone, The Large Scale SIG will be meeting this Wednesday in #openstack-operators on OFTC IRC, at 8UTC, our APAC+EU-friendly time. I will be chairing. 
You can double-check how that UTC time translates locally at: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20230920T08 Feel free to add topics to the agenda: https://etherpad.opendev.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez From smooney at redhat.com Mon Sep 18 10:06:56 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Mon, 18 Sep 2023 11:06:56 +0100 Subject: [barbican][cinder] Business Software License in new Vault In-Reply-To: References: Message-ID: On Sun, 2023-09-17 at 18:52 +0200, Damian Bulira wrote: > Hi Guys, > > Recently Hashicorp changed their product licensing from MPL to BSL. Did any > of you carry out research on the impact of this change in regard to using > Vault as a backend in Barbican and/or Cinder for both private and public > clouds? Any thoughts about that? I'm not that familiar with Vault or Barbican, but unless we are importing code from Vault it should have no impact on the licensing of the Barbican code base. I believe we actually use https://github.com/openstack/castellan as an indirection layer in any OpenStack project that talks to Vault. If the BSL, which is not generally accepted as an open source licence, is incompatible with Apache-2.0, we would have to drop Vault support if we were now calling any BSL code. Assuming we are not calling any BSL CLIs or BSL client libraries, we should be unaffected by the change, although it may have implications for deployers both new and existing. Looking at it, it looks like the driver is written in terms of Vault's HTTP API: https://github.com/openstack/castellan/blob/master/castellan/key_manager/vault_key_manager.py As a result castellan should be insulated from this change, and projects like nova that only interact via castellan should be fine. Barbican appears to be using castellan at first glance too: https://github.com/openstack/barbican/blob/c8e3dc14e6225f1d400131434e8afec0aa410ae7/barbican/plugin/vault_secret_store.py#L65 So I think, from a code licensing point of view, we are OK. That does not mean we should necessarily endorse the use of Vault going forward, but I honestly don't know enough about the politics or details of the BSL change to really comment on that. If it's not already a capability of Barbican, now might be a good time to investigate support for secret migration between secret backends... > > Cheers, > Damian From michel.jouvin at ijclab.in2p3.fr Mon Sep 18 10:13:04 2023 From: michel.jouvin at ijclab.in2p3.fr (Michel Jouvin) Date: Mon, 18 Sep 2023 12:13:04 +0200 Subject: [nova] Live migration of a RAM intensive instance failed In-Reply-To: References: Message-ID: Hi, post-copy looks to me as a very attractive approach for these heavy-loaded VMs, but I didn't understand that there is an inherent risk of data loss (except if there is an implementation bug)... Are you sure? Michel Le 18/09/2023 à 11:29, Rafa a écrit : > As I can read and understand from the source compute logs, > the memory is copied over successfully and there is no migration timeout. > But after the instance is paused there is something wrong happening. > I first thought it could be the short migration downtime (default=500ms), > that's why I increased the "live_migration_downtime" to higher values > (max was 300000ms; just for testing:) ) and nothing changed. > > And the error message doesn't say much either. > > I don't really want to use post-copy as it can lead to data loss. > > Auto Converge doesn't seem to help either. > > > > Am Fr., 15. Sept.
2023 um 13:56 Uhr schrieb : >> On Thu, 2023-09-14 at 19:07 +0200, Rafa wrote: >>>> Hi, could you share logs from the target compute node as well? >>> yes, here: https://paste.openstack.org/show/blpJE8krA1N6PVaLTVF0/ >>> >> if the vm is under heavy memory load then its advisable ot use post-copy live migration. >> >> in general live migration is not intened to be used with a vm under load >> as there is no gurenettee that it will ever complete. post-copy live migration >> can signifcantly increae the probablity that a vm under load will live migrate >> in a reasonabel amount of time. >> >> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_post_copy >> >> auto converge can also help but tis less important >> >> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_auto_converge >> From smooney at redhat.com Mon Sep 18 10:42:41 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Mon, 18 Sep 2023 11:42:41 +0100 Subject: [nova] Live migration of a RAM intensive instance failed In-Reply-To: References: Message-ID: <8607bf6d3c3ac588885aaa1e608634432d7b4941.camel@redhat.com> On Mon, 2023-09-18 at 12:13 +0200, Michel Jouvin wrote: > Hi, > > post-copy looks to meas a very attractive approach for these > heavy-loaded VMs but I didn't understand that there is an inherent risk > of data-loss (except if there is an implementation bug)... Are you sure? A simplified view of how post-copy works: it initially does a copy of the VM memory, then, if the VM is not loaded, it just syncs and completes the migration as normal; if it detects a substantial delta in the memory since the initial copy happened, it enters post-copy mode. In post-copy mode the dirty pages are marked as dirty on the dest and the VM resumes from the dest; in the background qemu continues to copy the dirty pages from the source to the dest, and if the guest ever tries to read from a dirty page it gets retrieved on demand from the source. All writes in post-copy mode are made to the dest. The possibility for data loss comes from 2 sources: 1.) if the qemu process on the source crashes or is terminated by, say, an OOM event on the source host, then any uncopied memory is lost. 2.) if one of your top-of-rack switches explodes and you have a network partition, or the connection over which the data is being copied is broken, then the migration will fail. So in both cases an external event causes the source VM to be unreachable from the dest. That means the running VM on the dest can't access the required info. This can in some cases cause data loss. Without post-copy, scenario 2 would not cause data loss and would have just caused the migration to be aborted; the VM would have continued to run on the source node. Scenario 1 would have caused the data loss regardless of doing a migration, so that is kind of irrelevant, i.e. if the VM gets killed as a result of an OOM event then any uncommitted disk writes or any data in memory is going to be lost even if you are not live migrating it. You just have to decide if you feel comfortable with the possibility of the VM crashing if there is a network partition. This should be a very, very rare event (or you have bigger problems in your datacenter than slow migrations), but it's why we don't enable post-copy by default upstream.
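For operators who do decide to opt in, a minimal sketch of the relevant nova.conf settings on the compute nodes, using only the two options linked above, would be:

[libvirt]
# both options default to False
live_migration_permit_post_copy = True
live_migration_permit_auto_converge = True

As discussed in this thread, enabling post-copy trades the "migration may never converge under load" problem for a small risk window if the source and destination lose connectivity mid-migration.
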
we do enable it by default in our downstream product for what its worth because we beilve the risk is minimal and when you are doing an emergency drainging fo a host due to hardware failures defaultign to a config that is likely to succeed is generally more desirable. so tl;dr 2 is the reason for the data-loss comment. > > Michel > > Le 18/09/2023 ? 11:29, Rafa a ?crit?: > > As I can read and understand from the source compute logs, > > the memory is copied over successfully and there is no migration timeout. > > But after the instance is paused there is something wrong happening. > > I first thought it could be the short migration downtime (default=500ms), > > that's why I increased the "live_migration_downtime" to higher values > > (max was 300000ms; just for testing:) ) and nothing changed. > > > > And the error message doesn't say much either. > > > > I don't really want to use post-copy as it can lead to data loss. > > > > Auto Converge doesn't seem to help either. > > > > > > > > Am Fr., 15. Sept. 2023 um 13:56 Uhr schrieb : > > > On Thu, 2023-09-14 at 19:07 +0200, Rafa wrote: > > > > > Hi, could you share logs from the target compute node as well? > > > > yes, here: https://paste.openstack.org/show/blpJE8krA1N6PVaLTVF0/ > > > > > > > if the vm is under heavy memory load then its advisable ot use post-copy live migration. > > > > > > in general live migration is not intened to be used with a vm under load > > > as there is no gurenettee that it will ever complete. post-copy live migration > > > can signifcantly increae the probablity that a vm under load will live migrate > > > in a reasonabel amount of time. > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_post_copy > > > > > > auto converge can also help but tis less important > > > > > > https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_auto_converge > > > > From rafaa.haji3 at gmail.com Mon Sep 18 10:49:39 2023 From: rafaa.haji3 at gmail.com (Rafa) Date: Mon, 18 Sep 2023 12:49:39 +0200 Subject: [nova] Live migration of a RAM intensive instance failed In-Reply-To: References: Message-ID: I didn't mean to imply that post-copy always results in data loss. But this is possible if the network connection between source and destination host is disconnected during the post-copy operation. Am Mo., 18. Sept. 2023 um 12:14 Uhr schrieb Michel Jouvin : > > Hi, > > post-copy looks to meas a very attractive approach for these > heavy-loaded VMs but I didn't understand that there is an inherent risk > of data-loss (except if there is an implementation bug)... Are you sure? > > Michel > > Le 18/09/2023 ? 11:29, Rafa a ?crit : > > As I can read and understand from the source compute logs, > > the memory is copied over successfully and there is no migration timeout. > > But after the instance is paused there is something wrong happening. > > I first thought it could be the short migration downtime (default=500ms), > > that's why I increased the "live_migration_downtime" to higher values > > (max was 300000ms; just for testing:) ) and nothing changed. > > > > And the error message doesn't say much either. > > > > I don't really want to use post-copy as it can lead to data loss. > > > > Auto Converge doesn't seem to help either. > > > > > > > > Am Fr., 15. Sept. 2023 um 13:56 Uhr schrieb : > >> On Thu, 2023-09-14 at 19:07 +0200, Rafa wrote: > >>>> Hi, could you share logs from the target compute node as well? 
> >>> yes, here: https://paste.openstack.org/show/blpJE8krA1N6PVaLTVF0/ > >>> > >> if the vm is under heavy memory load then its advisable ot use post-copy live migration. > >> > >> in general live migration is not intened to be used with a vm under load > >> as there is no gurenettee that it will ever complete. post-copy live migration > >> can signifcantly increae the probablity that a vm under load will live migrate > >> in a reasonabel amount of time. > >> > >> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_post_copy > >> > >> auto converge can also help but tis less important > >> > >> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_auto_converge > >> > From tobias.urdin at binero.com Mon Sep 18 11:42:24 2023 From: tobias.urdin at binero.com (Tobias Urdin) Date: Mon, 18 Sep 2023 11:42:24 +0000 Subject: [barbican][cinder] Business Software License in new Vault In-Reply-To: References: Message-ID: <67AA20B7-B63A-4F81-A24D-B891392A0F9D@binero.com> The code part is not a issue, I think this question is mostly directed towards operators using Vault as the backend (backend storage with an API essentially) for Barbican. I?m also very interested in this topic, my idea was to email their licensing department and simply ask unless somebody here has an answer already. Best regards Tobias > On 18 Sep 2023, at 12:06, smooney at redhat.com wrote: > > On Sun, 2023-09-17 at 18:52 +0200, Damian Bulira wrote: >> Hi Guys, >> >> Recently Hashicorp changed their product licensing from MPL to BSL. Did any >> of you carry out research on the impact of this change in regard to using >> Vault as a backend in Barbican and/or Cinder for both private and public >> clouds? Any thoughts about that? > > im not that familiar with vault or barbican but unless we are importing code form > vault it should nova no impact on the licensing of the barbican code base. > > i belive we actully use https://github.com/openstack/castellan as an indirection layer > in any openstack project that talks to vault. > > if the BSL which is not generally accpted as a opensouce lisnce is incompatble with apache2 > we woudl have to drop vault support if we were now calling any bsl code. > > assumign we are using non CLIs or non bsl clinent libs we shoudl be unaffected by the chagne > however it may have implicatoins for deployers both new and existing. > > looking at it looks like its written in terms of vaults http api. > https://github.com/openstack/castellan/blob/master/castellan/key_manager/vault_key_manager.py > as a result castellan should be insulated form this change and proejcts like nova that only interact > via castallan should be fine. barbincan appears to be using castellan at first glance too > https://github.com/openstack/barbican/blob/c8e3dc14e6225f1d400131434e8afec0aa410ae7/barbican/plugin/vault_secret_store.py#L65 > > so i think form a code licening point of view we are ok. > that does not mean we hould nessisarly endorce the use of vault going forward but i honestly dont > know enough about the politic or details of the bsl change to really comment on that. > > if its not already a cpabality of barbican now might be a good time to investiage support for secrete migration between > secrete backends... 
> > >> >> Cheers, >> Damian > > From msgm68 at gmail.com Mon Sep 18 12:27:16 2023 From: msgm68 at gmail.com (Mostafa Salari) Date: Mon, 18 Sep 2023 15:57:16 +0330 Subject: [Ceilometer] neither gnocchi nor file publisher receive any data or metric Message-ID: Hi. how can i find a simple, correct, ceilometer configuration for "yoga" version of openstack? 1) I followed this instruction guide step by step! 2) I installed devstack with enabled ceilometer (and gnocchi as backend) but after both tries mentioned above, i did not see any metric. I also changed the publisher from gnocchi:// into file:// but the result did not change. cat /etc/ceilometer/ceilometer.conf: [DEFAULT] transport_url = rabbit://stackrabbit:bsvbr5pzLpN0szN84v3m at 127.0.0.1:5672/ use_syslog = True [oslo_messaging_notifications] topics = notifications driver = messagingv2 [coordination] backend_url = redis://localhost:6379 [notification] workers = 4 [cache] backend_argument = url:redis://localhost:6379 backend_argument = distributed_lock:True backend_argument = db:0 backend_argument = redis_expiration_time:600 backend = dogpile.cache.redis enabled = True [service_credentials] auth_url = http://127.0.0.1/identity region_name = RegionOne password = bsvbr5pzLpN0szN84v3m username = ceilometer project_name = service project_domain_id = default user_domain_id = default auth_type = password [keystone_authtoken] memcached_servers = localhost:11211 cafile = /opt/stack/data/ca-bundle.pem project_domain_name = Default project_name = service user_domain_name = Default password = bsvbr5pzLpN0szN84v3m username = ceilometer auth_url = http://127.0.0.1/identity interface = public auth_type = password _________________________________________ cat /etc/ceilometer/pipeline.yaml: --- sources: - name: meter_source meters: - "*" sinks: - meter_sink sinks: - name: meter_sink publishers: - file:///tmp/ceilofile?json # - gnocchi://?archive_policy=low&filter_project=service ____________________________ cat /etc/ceilometer/polling.yaml --- sources: - name: all_pollsters interval: 300 meters: - "*" _____________________________ root at os-aio:/# cat /etc/gnocchi/gnocchi.conf [DEFAULT] debug = True coordination_url = redis://localhost:6379 [indexer] url = mysql+pymysql:// root:bsvbr5pzLpN0szN84v3m at 127.0.0.1/gnocchi?charset=utf8 [storage] redis_url = redis://localhost:6379 driver = redis [metricd] metric_processing_delay = 5 [api] auth_mode = keystone [keystone_authtoken] memcached_servers = localhost:11211 cafile = /opt/stack/data/ca-bundle.pem project_domain_name = Default project_name = service user_domain_name = Default password = bsvbr5pzLpN0szN84v3m username = gnocchi auth_url = http://127.0.0.1/identity interface = public auth_type = password -------------- next part -------------- An HTML attachment was scrubbed... URL: From ykarel at redhat.com Mon Sep 18 13:32:45 2023 From: ykarel at redhat.com (Yatin Karel) Date: Mon, 18 Sep 2023 19:02:45 +0530 Subject: [neutron] CI meeting on 18.09.2023 cancelled Message-ID: Hello Neutron Team !! I will be out tomorrow for a public holiday, so cancelling the CI meeting. Sorry for the short notice. See You at the next meeting on 25th of September. Thanks and Regards Yatin Karel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fungi at yuggoth.org Mon Sep 18 13:55:02 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 18 Sep 2023 13:55:02 +0000 Subject: [barbican][cinder][tc] Business Software License in new Vault In-Reply-To: References: Message-ID: <20230918135502.rxoe4nmaye6ndddl@yuggoth.org> On 2023-09-17 18:52:12 +0200 (+0200), Damian Bulira wrote: > Recently Hashicorp changed their product licensing from MPL to > BSL. Did any of you carry out research on the impact of this > change in regard to using Vault as a backend in Barbican and/or > Cinder for both private and public clouds? Any thoughts about > that? I brought it up last month in the #openstack-tc IRC channel, and there was some brief discussion but nothing concrete: https://meetings.opendev.org/irclogs/%23openstack-tc/%23openstack-tc.2023-08-10.log.html If there are strong concerns, we do have an (infrequently-used) mailing list for legal/licensing topics as well: https://lists.openstack.org/cgi-bin/mailman/listinfo/legal-discuss -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From mnasiadka at gmail.com Mon Sep 18 14:34:03 2023 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Mon, 18 Sep 2023 16:34:03 +0200 Subject: [kolla] [ptg] Kolla 2024.1 (Caracal) PTG planning Message-ID: <7CCDA5E3-5F2A-4EE2-8B1B-9BFE9E143223@gmail.com> Hello All, The 2024.1 (Caracal) virtual PTG is approaching and will be held between 23-27 October, 2023. I?ve created a usual etherpad [1] where you can post your topic proposals and we?ll discuss them For dates - we have been usually meeting on two/three days: There were requests on the Bobcat PTG to make these time slots more Europe friendly (e.g. earlier in the day and not finish at 8pm) and host a slot that is Americas/Asia friendly (if needed) - so I associated each day with a Doodle Poll - let?s try to work out best times for most of us. 1. Kolla/Kolla-Ansible topics day 1 - 4 hours Doodle: https://doodle.com/meeting/participate/id/dyXZN9nb 2. Kolla/Kolla-Ansible topics day 2 - 4 hours (including Kolla Operator Hour) Doodle: https://doodle.com/meeting/participate/id/aO8WxqLb 3. Kayobe topics day - 2 hours Doodle: https://doodle.com/meeting/participate/id/e5LmZExd NOTE: If there are participants interested from other timezones that proposed times in the Doodle poll would not fit - please reply to this email and we?ll work out a dedicated slot. Please add your topics timely so they can be arranged as per your preferences, see you at the PTG! [1] https://etherpad.opendev.org/p/kolla-caracal-ptg Thanks, Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: From ykarel at redhat.com Mon Sep 18 14:44:11 2023 From: ykarel at redhat.com (Yatin Karel) Date: Mon, 18 Sep 2023 20:14:11 +0530 Subject: [neutron] CI meeting on 19.09.2023 cancelled In-Reply-To: References: Message-ID: Correction: Meeting Cancelled for tomorrow 19.09.2023 On Mon, Sep 18, 2023 at 7:02?PM Yatin Karel wrote: > Hello Neutron Team !! > > I will be out tomorrow for a public holiday, so cancelling the CI meeting. > Sorry for the short notice. > > See You at the next meeting on 25th of September. > > Thanks and Regards > Yatin Karel > -- Thanks and Regards Yatin Karel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From geguileo at redhat.com Mon Sep 18 14:49:33 2023 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 18 Sep 2023 16:49:33 +0200 Subject: [cinder] Map Availability Zone to Volume Type In-Reply-To: References: Message-ID: <20230918144933.qy73s2jqpt52akff@localhost> On 01/09, Jahns, Jay wrote: > Hi, > > We have a multi-AZ environment where we launch instances with bootable volumes. When we specify AZ, we need to have a specific volume type in that AZ selected to launch to. Right now, the __DEFAULT__ volume type is used. > > Normally, that would work, since there is 1 backend in each AZ; however, we need to be able to use the key/value RESKEY:availability_zones to pin a volume type to an AZ, so we can use specific key/values in creating the volume. > > It is 2 separate backends in the environment. > > I see this bug: > https://bugs.launchpad.net/cinder/+bug/1999706 > > And this change: > https://review.opendev.org/c/openstack/cinder/+/868539 > > Is there anything we can do to help expedite adding this functionality? It seems that it was supposed to already be in. > > Thanks, > > Jay Hi Jay, I believe this is a valid and reasonable feature, unfortunately the patch you are referencing has been abandoned, and as I have mentioned in the review, I think the solution requires some discussion. Not sure if this requires a proper spec or if it's enough to just have a discussion in a PTG. For example, I think the default volume type should be given priority over other volume types if it can meet the AZ request (which the patch doesn't do). I would say that this requires someone to champion the effort leading the discussion and implementing the code, unit tests and possibly tempest tests. Regards, Gorka. From ces.eduardo98 at gmail.com Mon Sep 18 16:28:32 2023 From: ces.eduardo98 at gmail.com (Carlos Silva) Date: Mon, 18 Sep 2023 13:28:32 -0300 Subject: [manila] Manila Bugsquash Sep 19th - Sep 21st Message-ID: Hello, Zorillas! As agreed in the previous weekly meeting, our bugsquash will run from September 19th to September 21st. The kick off call will be held on Tuesday, Sep 19th at 14:30 UTC using meetpad [1]. We will use the Manila weekly meeting on Thursday at 15 UTC as a checkpoint. If you plan to work on a bug that is not on the list [2], feel free to add it. Looking forward to seeing you there! [1] https://meetpad.opendev.org/manilabobcatbugsquash [2] https://etherpad.opendev.org/p/6VZ5-sbYiSy0hygRZZ6c Regards, carloss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Mon Sep 18 17:58:31 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Mon, 18 Sep 2023 10:58:31 -0700 Subject: [ironic] Mass retirement of old branches due to config errs In-Reply-To: References: Message-ID: I have taken action on this announced branch retirement. A summary of merge requests created (which will retire the branch on merge) and manual actions taken below. Once these changes are merged, all zuul-config-errors for Ironic will be resolved. 
ironic-prometheus-exporter - https://review.opendev.org/c/openstack/releases/+/895702 EOL ironic-prometheus-exporter stable/train - bugfix/2.1 branch manually retired; tip commit tagged as bugfix-2.1-eol -- git tag -s -a bugfix-2.1-eol -m "EOL bugfix/2.1" 98e48dad7680ba4593d80fb225a6b5d2254f783a -- git push gerrit bugfix-2.1-eol -- git push -d gerrit bugfix/2.1 networking-generic-switch - https://review.opendev.org/c/openstack/releases/+/895705 Fix/EOL networking-generic-switch pike -- This was strange; the stable/pike branch existed but had no deliverables yaml file. It appears stable/ocata for this was in a similar state at one point, and fixed by adding the file with EOL ifnormation, so I pushed a change which I hope will lead to the pike branch being retired+tagged as normal in the same manner. - https://review.opendev.org/c/openstack/releases/+/895707 EOL networking-generic-switch stable/train - https://review.opendev.org/c/openstack/releases/+/895709 EOL networking-generic-switch stable/ussuri - https://review.opendev.org/c/openstack/releases/+/895711 EOL networking-generic-switch stable/victoria - https://review.opendev.org/c/openstack/releases/+/895712 EOL networking-generic-switch stable/wallaby - https://review.opendev.org/c/openstack/releases/+/895713 EOL networking-generic-switch stable/xena python-ironicclient - https://review.opendev.org/c/openstack/releases/+/895714 EOL python-ironicclient stable/train - https://review.opendev.org/c/openstack/releases/+/895717 EOL python-ironicclient stable/ussuri - https://review.opendev.org/c/openstack/releases/+/895719 EOL python-ironicclient stable/victoria - https://review.opendev.org/c/openstack/releases/+/895720 EOL python-ironicclient stable/wallaby - https://review.opendev.org/c/openstack/releases/+/895721 EOL python-ironicclient stable/xena Thanks, Jay Faulkner Ironic PTL On Wed, Aug 23, 2023 at 1:21?PM Jay Faulkner wrote: > Hi all, > > Ironic is one of the major offenders remaining in the work to cleanup Zuul > config errors. To resolve these issues, it's my plan to retire any impacted > branches unless a contributor volunteers to clean them up. I will give > folks until at least September 1, 2023, to object before I begin to take > action. > > These branches have been broken for at least one year, and have gotten no > patches or releases in that time. It's clear they have little need to > continue existing. > > Please look at https://zuul.opendev.org/t/openstack/config-errors for the > full list of potentially impacted branches; I will list here as best I can: > > openstack/ironic-prometheus-exporter > - bugfix/2.1 > - stable/train > > openstack/networking-generic-switch > - stable/pike > - stable/train > - stable/ussuri > - stable/victoria > - stable/wallaby > - stable/xena > > openstack/python-ironicclient > - stable/train > - stable/ussuri > - stable/victoria > - stable/wallaby > - stable/xena > - Note: stable/yoga is also in the config errors list, but is considered > maintained and cannot be retired at this time. I will work to remove > failing jobs and get this zuul config fixed manuallt. > > > > As an additional note, the following branches were retired months ago, but > due to a bug were recreated. They have been fully deleted (again): > > openstack/ironic-python-agent > - bugfix/8.0 > > openstack/ironic > - bugfix/18.0 > > -------------- next part -------------- An HTML attachment was scrubbed... 
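For anyone who needs to repeat the manual retirement above for another branch that is not managed through openstack/releases, the sequence generalizes roughly as follows. This is only a sketch: the repository name, branch and tip SHA below are placeholders, not values from the message, and stable/* branches should instead go through openstack/releases patches like the ones listed above.

  git clone https://opendev.org/openstack/<repo> && cd <repo>
  TIP=$(git rev-parse origin/bugfix/X.Y)                 # record the branch tip commit
  git tag -s -a bugfix-X.Y-eol -m "EOL bugfix/X.Y" "$TIP"
  git push gerrit bugfix-X.Y-eol                         # publish the EOL tag
  git push -d gerrit bugfix/X.Y                          # then delete the branch itself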
URL: From mtomaska at redhat.com Mon Sep 18 18:17:01 2023 From: mtomaska at redhat.com (Miro Tomaska) Date: Mon, 18 Sep 2023 14:17:01 -0400 Subject: [neutron] Bug Deputy Report September 11 - 17 Message-ID: Hello All, I was the neutron bug deputy last week (Sept 11-17). Here is what is new: High: -https://bugs.launchpad.net/neutron/+bug/2035325 FDB entries grows indefinitely Assigned: Luis Tomas - https://bugs.launchpad.net/neutron/+bug/2035382 "TestMonitorDaemon._run_monitor" failing radomly, initial message not written Assigned: Miro Tomaska - https://bugs.launchpad.net/neutron/+bug/2035578 [stable branches] devstack-tobiko-neutron job Fails with InvocationError('could not find executable python', None) Unassigned - https://bugs.launchpad.net/neutron/+bug/2036118 VM fails to contact metadata during live-migration Assigned: Jakub Libosvar Medium: - https://bugs.launchpad.net/neutron/+bug/2035168 Remaining db migrations for unmaintained Nuage plugin Unassigned - https://bugs.launchpad.net/neutron/+bug/2035281 [ML2/OVN] DGP/Floating IP issue - no flows for chassis gateway port Assigned: Roberto Bartzen Acosta Undecided: - https://bugs.launchpad.net/neutron/+bug/2035230 Port on creation is returned without the fixed_ips field populated - https://bugs.launchpad.net/neutron/+bug/2035573 Insertion of a duplicated ProviderResourceAssociation entry while creating a HA router Asked for more info - https://bugs.launchpad.net/neutron/+bug/2035332 [OVN] VLAN networks for North / South Traffic Broken Best Regards, -- Miro Tomaska Senior Software Engineer -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Mon Sep 18 22:36:34 2023 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Mon, 18 Sep 2023 18:36:34 -0400 Subject: [manila] - Enforcing CephFS pool data placement via backend config. options deprecated? In-Reply-To: References: Message-ID: Hi Paul, My responses are inline. On Fri, Sep 15, 2023 at 6:44?AM Paul Browne wrote: > > Hello to the list, > > In OpenStack releases prior to Wallaby, in CephFS native driver Manila backend config there existed a configuration directive cephfs_volume_path_prefix ; > >> # DEPRECATED: The prefix of the cephfs volume path. (string value) >> # This option is deprecated for removal since Wallaby. >> # Its value may be silently ignored in the future. >> # Reason: This option is not used starting with the Nautilus release >> # of Ceph. >> #cephfs_volume_path_prefix = /volumes > > > This volume path prefix was usable (and very useful) to set different prefixes in different backends, pointing to different pools in the same backend Ceph cluster e.g. pools backed by storage devices of different characteristics/technology or Ceph CRUSH rule, etc > > The pool selection would be done in CephFS via the use of file layout fattributes, file layout inheritance ensuring that data created in sub-directories ends up in the correct Ceph pool. > > e.g. 
two backends using the same Ceph cluster but two different path prefixes hitting different Ceph pools > >> [cephfsnative1] >> driver_handles_share_servers = False >> share_backend_name = CEPHFS1 >> share_driver = manila.share.drivers.cephfs.driver.CephFSDriver >> cephfs_conf_path = /etc/ceph/ceph.conf >> cephfs_auth_id = manila >> cephfs_cluster_name = ceph >> cephfs_volume_path_prefix = /volumes-staging >> cephfs_volume_mode = 777 >> cephfs_enable_snapshots = False > > >> >> [cephfsnative1_ec] >> driver_handles_share_servers = False >> share_backend_name = CEPHFS1_EC >> share_driver = manila.share.drivers.cephfs.driver.CephFSDriver >> cephfs_conf_path = /etc/ceph/ceph.conf >> cephfs_auth_id = manila >> cephfs_cluster_name = ceph >> cephfs_volume_path_prefix = /volumes-ec-staging >> cephfs_volume_mode = 777 >> cephfs_enable_snapshots = False > > > However, since Wallaby this config directive looks to have been deprecated and the default of using a path prefix of only /volumes is possible.Trying to use any other prefix in backend configs is ignored. > > Would anyone on the list know why this option was deprecated in Manila code, or was this forced on Manila by upstream Ceph as of Nautilus? > > Is there a way to get back to an equivalent functionality? > > Currently using only a default path of /volumes means we have lost all flexibility in defining Manila CephFS share data placement using the native CephFS driver. > > Possibly using share group types+share groups and some pre-created paths in the root CephFS could get to something like equivalency? > > But these paths would need to correspond to the share group UUID, which will only be known after the share group has been created. > > So not all that flexible a path, since it requires an interaction between users to communicate the share group ID and Ceph admins to set the correct file layout policy. Putting the path prefix in the backend type removed all of that in a nicely transparent way. > > Having just prototyped this, it will work for setting a desired file layout on a pre-defined share group UUID path in the root CephFS, though it's not really ideal or sustainable to be able to do this for dynamically created share groups by users or automation... The CephFS driver started using Ceph's "mgr/volumes" API to create and delete CephFS subvolumes (manila shares) in the wallaby release [1]. The manila driver configuration option that you point out was removed as part of this change. Prior to this change, the driver used a "ceph_volume_client" python interface. This interface is gone since the Ceph Pacific release. Functionally, we expected nothing significant to change during this transition, but we lost some customizability like the option that you point to. Now "/volumes" is hard coded in the subvolume paths [2]. Ceph Pacific was the first Ceph release where having multiple CephFS fileystems in a cluster was fully supported. I'm wondering if using multiple file systems would allow you to retain the customizability you're seeking. The difference in segregation would not be apparent in the export path; but each CephFS filesystem would have its own dedicated data/metadata pools, and separate MDS daemons on the cluster. So you'll be able to achieve the provisioning separation that you're seeking, and more customizability of OSD/pool characteristics. The CephFS driver in manila can only work with one CephFS filesystem at a time though (configuration option "cephfs_filesystem_name") [3]. 
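Concretely, a pair of such backend stanzas in manila.conf might look like the sketch below; the filesystem names, auth IDs and backend names are invented placeholders for illustration, not values from this thread:

  # in [DEFAULT]
  enabled_share_backends = cephfsnative_ssd,cephfsnative_hdd

  [cephfsnative_ssd]
  driver_handles_share_servers = False
  share_backend_name = CEPHFS_SSD
  share_driver = manila.share.drivers.cephfs.driver.CephFSDriver
  cephfs_conf_path = /etc/ceph/ceph.conf
  cephfs_auth_id = manila-ssd
  cephfs_filesystem_name = cephfs_ssd

  [cephfsnative_hdd]
  driver_handles_share_servers = False
  share_backend_name = CEPHFS_HDD
  share_driver = manila.share.drivers.cephfs.driver.CephFSDriver
  cephfs_conf_path = /etc/ceph/ceph.conf
  cephfs_auth_id = manila-hdd
  cephfs_filesystem_name = cephfs_hdd

A share type per backend (for example matching on share_backend_name) would then steer new shares to the filesystem, and therefore the data/metadata pools, you want.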
So, just like you're doing currently, you can define multiple CephFS backends, each with its own "cephfs_filesystem_name". (For added security, you can manipulate "cephfs_auth_id" and have a dedicated driver client user for each backend.) I've copied Patrick and Venky from the CephFS community here. If there are other options besides this, or if you have other questions, they might be able to help. Thanks, Goutham [1] https://review.opendev.org/c/openstack/manila/+/779619 [2] https://docs.ceph.com/en/latest/cephfs/fs-volumes/#fs-subvolumes [3] https://opendev.org/openstack/manila/src/branch/stable/wallaby/manila/share/drivers/cephfs/driver.py#L138-L140 > > Thanks in advance for any advice, > Paul Browne From nguyenhuukhoinw at gmail.com Tue Sep 19 06:10:18 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Tue, 19 Sep 2023 13:10:18 +0700 Subject: [openstack][skyline]Menu route problem. In-Reply-To: References: Message-ID: Hello guys. I update, When I do similarly things with user-menu. It worked. I just have problem when Add item on Console menu. Nguyen Huu Khoi On Sun, Sep 17, 2023 at 2:16?PM Nguy?n H?u Kh?i wrote: > Hello guys. > > I am working on Skyline and I followed > > > https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-12-Menu-introduction.md > > > https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-13-Route-introduction.md > > I add layouts/menu.jsx > ##### > { > path: '/konitor', > name: t('Konitor'), > key: /konitor', > icon: , > children: [ > { > path: '/konitor/konitork', > name: t('Konitork'), > key: 'konitork', > level: 1, > } > ], > } > > I put my component in > > pages/konitor/containers/Konitork/index.jsx > > ####### > import React, { Component } from 'react'; > import { inject, observer } from 'mobx-react'; > > > export class Konitork extends Component { > render() { > return ( >
nhk oi
> ); > } > } > export default observer(Konitork); > > Here is my routes/index.js > > ##### > > import BaseLayout from '../../../layouts/Basic'; > import Konitork from '../containers/Konitork'; > import Instance from '../containers/Instance'; > const PATH = '/konitor'; > export default [ > { > path: PATH, > component: BaseLayout, > routes: [ > { path: `${PATH}/konitork`, component: Konitork, exact: true }, > ], > }, > ]; > > When click on Menu. Konitor>>Konitork. It return 404. although I have > x.x.x.x:9999/konitor/konitork. > > Pls correct me. Thank you much. > > Nguyen Huu Khoi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fmogollon at vicomtech.org Tue Sep 19 08:24:11 2023 From: fmogollon at vicomtech.org (Felipe Mogollon) Date: Tue, 19 Sep 2023 10:24:11 +0200 Subject: Advice on Openstack installer Message-ID: Hello to everyone, I have some experience in installing Openstack but I am having some weird problems lately. I have installed Openstack Victoria previously using packstack and everything was working fine. Some days ago I tried to deploy a more updated version of Openstack and previous configuration that I was using with packstack is not working. I have tried to deploy a new Openstack with packstack and devstack and none of them were working. I would like to ask you if you kown a running installer because I have run out of ideas about what to do. Regards Felipe -- [image: Vicomtech] Industry Division Juan Felipe Mogoll?n Rodr?guez Researcher Digital Media fmogollon at vicomtech.org +(34) 943 30 92 30 The information contained in this electronic message is intended only for the personal and confidential use of the recipients. If you have received this e-mail by mistake, please, notify us and delete it. Avoid printing this message if it is not strictly necessary. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nguyenhuukhoinw at gmail.com Tue Sep 19 08:50:44 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Tue, 19 Sep 2023 15:50:44 +0700 Subject: [openstack][skyline]Menu route problem. In-Reply-To: References: Message-ID: Hello. I resolve my problem by add some things related with new menu to https://github.com/openstack/skyline-console/blob/master/src/pages/basic/routes/index.js I think we need update docs, Nguyen Huu Khoi On Tue, Sep 19, 2023 at 1:10?PM Nguy?n H?u Kh?i wrote: > Hello guys. > I update, > When I do similarly things with user-menu. It worked. I just have problem > when Add item on Console menu. > Nguyen Huu Khoi > > > On Sun, Sep 17, 2023 at 2:16?PM Nguy?n H?u Kh?i > wrote: > >> Hello guys. >> >> I am working on Skyline and I followed >> >> >> https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-12-Menu-introduction.md >> >> >> https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-13-Route-introduction.md >> >> I add layouts/menu.jsx >> ##### >> { >> path: '/konitor', >> name: t('Konitor'), >> key: /konitor', >> icon: , >> children: [ >> { >> path: '/konitor/konitork', >> name: t('Konitork'), >> key: 'konitork', >> level: 1, >> } >> ], >> } >> >> I put my component in >> >> pages/konitor/containers/Konitork/index.jsx >> >> ####### >> import React, { Component } from 'react'; >> import { inject, observer } from 'mobx-react'; >> >> >> export class Konitork extends Component { >> render() { >> return ( >>
nhk oi
>> ); >> } >> } >> export default observer(Konitork); >> >> Here is my routes/index.js >> >> ##### >> >> import BaseLayout from '../../../layouts/Basic'; >> import Konitork from '../containers/Konitork'; >> import Instance from '../containers/Instance'; >> const PATH = '/konitor'; >> export default [ >> { >> path: PATH, >> component: BaseLayout, >> routes: [ >> { path: `${PATH}/konitork`, component: Konitork, exact: true }, >> ], >> }, >> ]; >> >> When click on Menu. Konitor>>Konitork. It return 404. although I have >> x.x.x.x:9999/konitor/konitork. >> >> Pls correct me. Thank you much. >> >> Nguyen Huu Khoi >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrunge at matthias-runge.de Tue Sep 19 09:56:57 2023 From: mrunge at matthias-runge.de (Matthias Runge) Date: Tue, 19 Sep 2023 11:56:57 +0200 Subject: [Ceilometer] neither gnocchi nor file publisher receive any data or metric In-Reply-To: References: Message-ID: <1dd5d5f2-1275-9120-05ee-a22cd22c6e04@matthias-runge.de> On 18/09/2023 14:27, Mostafa Salari wrote: > Hi. how can i find a simple, correct, ceilometer configuration for > "yoga" version of openstack? > 1) I followed this instruction guide > ?step by step! > 2) I installed devstack with enabled ceilometer (and gnocchi as backend) > but after both tries mentioned above, i did not see any metric. > I also changed the publisher from gnocchi:// into file:// but the result > did not change. Did you launch a vm, create an image or a volume? Ceilometer listens on the rabbit bus and polls service apis, but in order to show something, there needs to exist some workload. If you are asking for help, it's always beneficial to also provide logs. Matthias > > cat /etc/ceilometer/ceilometer.conf: > [DEFAULT] > transport_url = > rabbit://stackrabbit:bsvbr5pzLpN0szN84v3m at 127.0.0.1:5672/ > > use_syslog = True > > [oslo_messaging_notifications] > topics = notifications > driver = messagingv2 > > [coordination] > backend_url = redis://localhost:6379 > > [notification] > workers = 4 > > [cache] > backend_argument = url:redis://localhost:6379 > backend_argument = distributed_lock:True > backend_argument = db:0 > backend_argument = redis_expiration_time:600 > backend = dogpile.cache.redis > enabled = True > > [service_credentials] > auth_url = http://127.0.0.1/identity > region_name = RegionOne > password = bsvbr5pzLpN0szN84v3m > username = ceilometer > project_name = service > project_domain_id = default > user_domain_id = default > auth_type = password > > [keystone_authtoken] > memcached_servers = localhost:11211 > cafile = /opt/stack/data/ca-bundle.pem > project_domain_name = Default > project_name = service > user_domain_name = Default > password = bsvbr5pzLpN0szN84v3m > username = ceilometer > auth_url = http://127.0.0.1/identity > interface = public > auth_type = password > > > _________________________________________ > cat /etc/ceilometer/pipeline.yaml: > --- > sources: > ? ? - name: meter_source > ? ? ? meters: > ? ? ? ? ? - "*" > ? ? ? sinks: > ? ? ? ? ? - meter_sink > sinks: > ? ? - name: meter_sink > ? ? ? publishers: > ? ? ? ? ? - file:///tmp/ceilofile?json > # ? ? ? ? ?- gnocchi://?archive_policy=low&filter_project=service > > ____________________________ > cat /etc/ceilometer/polling.yaml > --- > sources: > ? ? - name: all_pollsters > ? ? ? interval: 300 > ? ? ? meters: > ? ? ? ? 
- "*" > > _____________________________ > root at os-aio:/# cat /etc/gnocchi/gnocchi.conf > > [DEFAULT] > debug = True > coordination_url = redis://localhost:6379 > > [indexer] > url = > mysql+pymysql://root:bsvbr5pzLpN0szN84v3m at 127.0.0.1/gnocchi?charset=utf8 > > > [storage] > redis_url = redis://localhost:6379 > driver = redis > > [metricd] > metric_processing_delay = 5 > > [api] > auth_mode = keystone > > [keystone_authtoken] > memcached_servers = localhost:11211 > cafile = /opt/stack/data/ca-bundle.pem > project_domain_name = Default > project_name = service > user_domain_name = Default > password = bsvbr5pzLpN0szN84v3m > username = gnocchi > auth_url = http://127.0.0.1/identity > interface = public > auth_type = password > > -- Matthias Runge From berndbausch at gmail.com Tue Sep 19 10:14:51 2023 From: berndbausch at gmail.com (berndbausch at gmail.com) Date: Tue, 19 Sep 2023 19:14:51 +0900 Subject: Advice on Openstack installer In-Reply-To: References: Message-ID: <01c301d9eae2$21b04950$6510dbf0$@gmail.com> Your problem description is extremely vague. Which versions did you try, what were the symptoms, what did your configuration files look like? The most recent version of OpenStack that I installed with Devstack was Zed. It was successful. Have you tried Kolla-Ansible? Bernd. From: Felipe Mogollon Sent: Tuesday, September 19, 2023 5:24 PM To: openstack-discuss Subject: Advice on Openstack installer Hello to everyone, I have some experience in installing Openstack but I am having some weird problems lately. I have installed Openstack Victoria previously using packstack and everything was working fine. Some days ago I tried to deploy a more updated version of Openstack and previous configuration that I was using with packstack is not working. I have tried to deploy a new Openstack with packstack and devstack and none of them were working. I would like to ask you if you kown a running installer because I have run out of ideas about what to do. Regards Felipe -- Industry Division Juan Felipe Mogoll?n Rodr?guez Researcher Digital Media fmogollon at vicomtech.org +(34) 943 30 92 30 The information contained in this electronic message is intended only for the personal and confidential use of the recipients. If you have received this e-mail by mistake, please, notify us and delete it. Avoid printing this message if it is not strictly necessary. -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Tue Sep 19 11:01:23 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 19 Sep 2023 07:01:23 -0400 Subject: [openstack][skyline]Menu route problem. In-Reply-To: References: Message-ID: You should open feature request or improvement bug report here and assigned it to yourself https://bugs.launchpad.net/skyline-apiserver Sometimes people lose thread in mailing lists. On Tue, Sep 19, 2023 at 4:53?AM Nguy?n H?u Kh?i wrote: > Hello. > I resolve my problem by add some things related with new menu to > > > https://github.com/openstack/skyline-console/blob/master/src/pages/basic/routes/index.js > > I think we need update docs, > > Nguyen Huu Khoi > > > On Tue, Sep 19, 2023 at 1:10?PM Nguy?n H?u Kh?i > wrote: > >> Hello guys. >> I update, >> When I do similarly things with user-menu. It worked. I just have problem >> when Add item on Console menu. >> Nguyen Huu Khoi >> >> >> On Sun, Sep 17, 2023 at 2:16?PM Nguy?n H?u Kh?i < >> nguyenhuukhoinw at gmail.com> wrote: >> >>> Hello guys. 
>>> >>> I am working on Skyline and I followed >>> >>> >>> https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-12-Menu-introduction.md >>> >>> >>> https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-13-Route-introduction.md >>> >>> I add layouts/menu.jsx >>> ##### >>> { >>> path: '/konitor', >>> name: t('Konitor'), >>> key: /konitor', >>> icon: , >>> children: [ >>> { >>> path: '/konitor/konitork', >>> name: t('Konitork'), >>> key: 'konitork', >>> level: 1, >>> } >>> ], >>> } >>> >>> I put my component in >>> >>> pages/konitor/containers/Konitork/index.jsx >>> >>> ####### >>> import React, { Component } from 'react'; >>> import { inject, observer } from 'mobx-react'; >>> >>> >>> export class Konitork extends Component { >>> render() { >>> return ( >>>
nhk oi
>>> ); >>> } >>> } >>> export default observer(Konitork); >>> >>> Here is my routes/index.js >>> >>> ##### >>> >>> import BaseLayout from '../../../layouts/Basic'; >>> import Konitork from '../containers/Konitork'; >>> import Instance from '../containers/Instance'; >>> const PATH = '/konitor'; >>> export default [ >>> { >>> path: PATH, >>> component: BaseLayout, >>> routes: [ >>> { path: `${PATH}/konitork`, component: Konitork, exact: true }, >>> ], >>> }, >>> ]; >>> >>> When click on Menu. Konitor>>Konitork. It return 404. although I have >>> x.x.x.x:9999/konitor/konitork. >>> >>> Pls correct me. Thank you much. >>> >>> Nguyen Huu Khoi >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nguyenhuukhoinw at gmail.com Tue Sep 19 11:20:42 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Tue, 19 Sep 2023 18:20:42 +0700 Subject: [openstack][skyline]Menu route problem. In-Reply-To: References: Message-ID: Hi, I will, I think it is just a question. So I wont put it on bugs.launchpad. Thank you. Nguyen Huu Khoi On Tue, Sep 19, 2023 at 6:01?PM Satish Patel wrote: > You should open feature request or improvement bug report here and > assigned it to yourself https://bugs.launchpad.net/skyline-apiserver > > Sometimes people lose thread in mailing lists. > > On Tue, Sep 19, 2023 at 4:53?AM Nguy?n H?u Kh?i > wrote: > >> Hello. >> I resolve my problem by add some things related with new menu to >> >> >> https://github.com/openstack/skyline-console/blob/master/src/pages/basic/routes/index.js >> >> I think we need update docs, >> >> Nguyen Huu Khoi >> >> >> On Tue, Sep 19, 2023 at 1:10?PM Nguy?n H?u Kh?i < >> nguyenhuukhoinw at gmail.com> wrote: >> >>> Hello guys. >>> I update, >>> When I do similarly things with user-menu. It worked. I just have >>> problem when Add item on Console menu. >>> Nguyen Huu Khoi >>> >>> >>> On Sun, Sep 17, 2023 at 2:16?PM Nguy?n H?u Kh?i < >>> nguyenhuukhoinw at gmail.com> wrote: >>> >>>> Hello guys. >>>> >>>> I am working on Skyline and I followed >>>> >>>> >>>> https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-12-Menu-introduction.md >>>> >>>> >>>> https://github.com/openstack/skyline-console/blob/master/docs/en/develop/3-13-Route-introduction.md >>>> >>>> I add layouts/menu.jsx >>>> ##### >>>> { >>>> path: '/konitor', >>>> name: t('Konitor'), >>>> key: /konitor', >>>> icon: , >>>> children: [ >>>> { >>>> path: '/konitor/konitork', >>>> name: t('Konitork'), >>>> key: 'konitork', >>>> level: 1, >>>> } >>>> ], >>>> } >>>> >>>> I put my component in >>>> >>>> pages/konitor/containers/Konitork/index.jsx >>>> >>>> ####### >>>> import React, { Component } from 'react'; >>>> import { inject, observer } from 'mobx-react'; >>>> >>>> >>>> export class Konitork extends Component { >>>> render() { >>>> return ( >>>>
nhk oi
>>>> ); >>>> } >>>> } >>>> export default observer(Konitork); >>>> >>>> Here is my routes/index.js >>>> >>>> ##### >>>> >>>> import BaseLayout from '../../../layouts/Basic'; >>>> import Konitork from '../containers/Konitork'; >>>> import Instance from '../containers/Instance'; >>>> const PATH = '/konitor'; >>>> export default [ >>>> { >>>> path: PATH, >>>> component: BaseLayout, >>>> routes: [ >>>> { path: `${PATH}/konitork`, component: Konitork, exact: true }, >>>> ], >>>> }, >>>> ]; >>>> >>>> When click on Menu. Konitor>>Konitork. It return 404. although I have >>>> x.x.x.x:9999/konitor/konitork. >>>> >>>> Pls correct me. Thank you much. >>>> >>>> Nguyen Huu Khoi >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From hanguangyu2 at gmail.com Tue Sep 19 11:32:09 2023 From: hanguangyu2 at gmail.com (=?UTF-8?B?6Z+p5YWJ5a6H?=) Date: Tue, 19 Sep 2023 11:32:09 +0000 Subject: Advice on Openstack installer In-Reply-To: <01c301d9eae2$21b04950$6510dbf0$@gmail.com> References: <01c301d9eae2$21b04950$6510dbf0$@gmail.com> Message-ID: +1 Maybe you can try kolla-ansible[1]? I have always been able to deploy an environment with it very easily. [1] https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html ?2023?9?19??? 11:15??? > Your problem description is extremely vague. Which versions did you try, > what were the symptoms, what did your configuration files look like? > > > > The most recent version of OpenStack that I installed with Devstack was > Zed. It was successful. Have you tried Kolla-Ansible? > > > > Bernd. > > > > *From:* Felipe Mogollon > *Sent:* Tuesday, September 19, 2023 5:24 PM > *To:* openstack-discuss > *Subject:* Advice on Openstack installer > > > > Hello to everyone, > > > > I have some experience in installing Openstack but I am having some weird > problems lately. > > > > I have installed Openstack Victoria previously using packstack and > everything was working fine. > > > > Some days ago I tried to deploy a more updated version of Openstack and > previous configuration that I was using with packstack is not working. I > have tried to deploy a new Openstack with packstack and devstack and none > of them were working. > > > > I would like to ask you if you kown a running installer because I have run > out of ideas about what to do. > > > > Regards > > > > Felipe > > > -- > > [image: Vicomtech] > > Industry Division > > *Juan Felipe Mogoll?n Rodr?guez* > Researcher > Digital Media > fmogollon at vicomtech.org > +(34) 943 30 92 30 > > The information contained in this electronic message is intended only for > the personal and confidential use of the recipients. If you have received > this e-mail by mistake, please, notify us and delete it. > Avoid printing this message if it is not strictly necessary. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xek at redhat.com Tue Sep 19 11:42:32 2023 From: xek at redhat.com (Grzegorz Grasza) Date: Tue, 19 Sep 2023 13:42:32 +0200 Subject: [barbican] PTO - Canceling meetings Message-ID: Hi all, I'm canceling today's barbican meeting, since I'm still on unscheduled PTO. I also won't be able to attend next week's meeting (September 26). So see y'all next month. In the meantime, if something needs urgent attention, please contact dmendiza or alee. 
/ Greg From noonedeadpunk at gmail.com Tue Sep 19 11:44:46 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Tue, 19 Sep 2023 13:44:46 +0200 Subject: Advice on Openstack installer In-Reply-To: References: <01c301d9eae2$21b04950$6510dbf0$@gmail.com> Message-ID: Then I can also suggest using OpenStack-Ansible :p I would start from all-in-one deployment: https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html On Tue, Sep 19, 2023, 13:37 ??? wrote: > +1 Maybe you can try kolla-ansible[1]? > > I have always been able to deploy an environment with it very easily. > > [1] https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html > > ?2023?9?19??? 11:15??? > >> Your problem description is extremely vague. Which versions did you try, >> what were the symptoms, what did your configuration files look like? >> >> >> >> The most recent version of OpenStack that I installed with Devstack was >> Zed. It was successful. Have you tried Kolla-Ansible? >> >> >> >> Bernd. >> >> >> >> *From:* Felipe Mogollon >> *Sent:* Tuesday, September 19, 2023 5:24 PM >> *To:* openstack-discuss >> *Subject:* Advice on Openstack installer >> >> >> >> Hello to everyone, >> >> >> >> I have some experience in installing Openstack but I am having some weird >> problems lately. >> >> >> >> I have installed Openstack Victoria previously using packstack and >> everything was working fine. >> >> >> >> Some days ago I tried to deploy a more updated version of Openstack and >> previous configuration that I was using with packstack is not working. I >> have tried to deploy a new Openstack with packstack and devstack and none >> of them were working. >> >> >> >> I would like to ask you if you kown a running installer because I have >> run out of ideas about what to do. >> >> >> >> Regards >> >> >> >> Felipe >> >> >> -- >> >> [image: Vicomtech] >> >> Industry Division >> >> *Juan Felipe Mogoll?n Rodr?guez* >> Researcher >> Digital Media >> fmogollon at vicomtech.org >> +(34) 943 30 92 30 >> >> The information contained in this electronic message is intended only for >> the personal and confidential use of the recipients. If you have received >> this e-mail by mistake, please, notify us and delete it. >> Avoid printing this message if it is not strictly necessary. >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pi3.14 at tuta.io Tue Sep 19 12:03:22 2023 From: pi3.14 at tuta.io (pi3.14 at tuta.io) Date: Tue, 19 Sep 2023 14:03:22 +0200 (CEST) Subject: Virtual hardware Watchdog for Windows VMs Message-ID: Hello, Is it possible to use virtual hardware watchdog in Windows VMs? I am emulating the "i6300esb" with QEMU in Linux instances and? everything works as expected. But when creating Windows? instances, this device does not show up in the device manager. - Has anyone tried/tested this before?? - Is this supported? - Are additional drivers needed? - I found the following old bug. Has anything changed since then or are there still no suitable drivers? https://bugzilla.redhat.com/show_bug.cgi?id=610063 From ygk.kmr at gmail.com Tue Sep 19 12:08:42 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Tue, 19 Sep 2023 17:38:42 +0530 Subject: Enquiry Message-ID: Hi All, Is there any discussion going on among the developers or architects of this community to change the coding language from Python to Go in the near future by any chance ? Or, are there any such suggestions from the community so far, given the better speed of Go over Python in execution times ? 
Thanks Y.G, -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Tue Sep 19 12:12:29 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 19 Sep 2023 08:12:29 -0400 Subject: Advice on Openstack installer In-Reply-To: References: Message-ID: <0745CD6C-DCD6-45A8-90FA-6F0CA3575602@gmail.com> An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Tue Sep 19 12:15:01 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 19 Sep 2023 08:15:01 -0400 Subject: Enquiry In-Reply-To: References: Message-ID: <2F630F65-E3AE-4BA6-94A5-10E483B72AB0@gmail.com> Oh really, first time I heard this. Is it that easy to just convert python to go ? Agreed on speed though we have switch many applications from py to go just because of multi threading requirement and speeeeeed Sent from my iPhone > On Sep 19, 2023, at 8:11 AM, Gk Gk wrote: > > ? > Hi All, > > Is there any discussion going on among the developers or architects of this community to change the coding language from Python to Go in the near future by any chance ? Or, are there any such suggestions from the community so far, given the better speed of Go over Python in execution times ? > > > Thanks > Y.G, From fungi at yuggoth.org Tue Sep 19 12:31:34 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 19 Sep 2023 12:31:34 +0000 Subject: Enquiry In-Reply-To: References: Message-ID: <20230919123133.snrntkjhpjgu6i5l@yuggoth.org> On 2023-09-19 17:38:42 +0530 (+0530), Gk Gk wrote: > Is there any discussion going on among the developers or architects of this > community to change the coding language from Python to Go in the near > future by any chance ? Or, are there any such suggestions from the > community so far, given the better speed of Go over Python in execution > times ? Not really, no. Completely rewriting software in a different language is a massive undertaking, and when you consider it's taken 13 years to build the software to its current state that's a lot to rewrite from scratch. There was, at one point, an attempt to replace a performance-critical subservice of Swift with something written in Go (called Hummingbird), and while that plan seemed to hold promise the work on integrating it eventually dried up. Perhaps someone from that team can speak more to the reasons it didn't take off (pun intended). Keep in mind that most of OpenStack's own software is often not in performance-critical paths, it frequently calls out to other programs in userspace that are already written in "faster" languages, so the relative slowness of the Python parts of that system doesn't necessarily create much additional overhead anyway, therefore the benefits of rewriting them might not always be as great as you anticipate. It comes down to a cost vs. benefit ratio, where the cost of rewriting is very high and the benefits aren't, or can't clearly be proven as, good enough to outweigh those costs. Python (well, the popular CPython interpreter anyway) is itself getting faster too, it's just not been a high priority for their community until the last few years, but more recently they've invested a lot of effort in making efficiency and performance improvements. Take a look at the speed improvements which came out of the Faster CPython effort in 3.10 and 3.11 as well as the 3.12 release candidates for example, or the free-threading/no-GIL work for PEP 703. 
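A quick, unscientific way to see that trend locally is to run the same micro-benchmark under whichever interpreters you happen to have installed, for example:

  python3.8 -m timeit -s 'data = list(range(10000))' 'sorted(x * x for x in data)'
  python3.11 -m timeit -s 'data = list(range(10000))' 'sorted(x * x for x in data)'

Comparing the reported per-loop times gives a rough feel for the gap between versions without rewriting anything.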
Why rewrite everything into a faster language when you could instead just make the language it's already written in faster instead? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dtantsur at protonmail.com Tue Sep 19 12:32:41 2023 From: dtantsur at protonmail.com (Dmitry Tantsur) Date: Tue, 19 Sep 2023 12:32:41 +0000 Subject: Enquiry In-Reply-To: References: Message-ID: Hi, On 9/19/23 14:08, Gk Gk wrote: > Hi All, > > Is there any discussion going on among the developers or architects of > this community to change the coding language from Python to Go in the > near future by any chance ? Or, are there any such suggestions from the > community so far,? given the better speed of Go over Python in execution > times ? If we're after speed, we need to look at Rust instead (and let the flame war begin!) On a serious note, I don't believe any increment in speed will realistically justify the many years (!) of work to replace the existing code base exactly. And many projects won't see much increment of speed at all - namely, all projects that are mostly bound by external communications. Take Ironic. It may take well over 10 seconds to issue a power on request via IPMI. Any win we can make in the surrounding glue code is simply negligible compared to that. I know that Swift was looking at Go because of their specific I/O operation requirements, but that's probably it. At the same time, OpenStack simply don't have enough people to make that happen. Nor can we expect the participating companies to spend their money on that. Dmitry P.S. Just today, I could not compile a medium-sized Go project because my laptop has "only" 12 GiB of RAM, so the OOM killer kicks in. Talk about efficiency and performance... > > > Thanks > Y.G, From namanjain at fynd-external.com Mon Sep 18 12:47:30 2023 From: namanjain at fynd-external.com (Naman Jain) Date: Mon, 18 Sep 2023 18:17:30 +0530 Subject: Regarding issues in openstack magnum deployment Message-ID: HI I am creating an openstack cluster where i have deployed, i am followinf victoria release of installation https://docs.openstack.org/victoria/install/ keystone placement glance nova neutron i have created an vm instance inside my openstack cluster which is working fine for both inbound and outbound connectivity now i want to deploy the kubernetes service for which i have installed heat and magnum when i am creating the kubernetes service using magnum it is getting stuck in kube_master resource to create for an hour and then gets failed attaching the SS for reference what could be the reason for this i am expecting the kubernetes creation process by magnum in openstack -- This email and any attachments may contain confidential, proprietary, or legally privileged information and are for the intended recipient only. Any other use of emails, such as unauthorized access, disclosure, copying, distribution, or reliance on any contents by anyone else, is prohibited and unlawful. If you are not the intended recipient, please notify the sender immediately and delete the email -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Screenshot 2023-09-18 at 4.41.53 PM.png Type: image/png Size: 1158493 bytes Desc: not available URL: From smooney at redhat.com Tue Sep 19 13:11:32 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Tue, 19 Sep 2023 14:11:32 +0100 Subject: Virtual hardware Watchdog for Windows VMs In-Reply-To: References: Message-ID: On Tue, 2023-09-19 at 14:03 +0200, pi3.14 at tuta.io wrote: > Hello, > > Is it possible to use virtual hardware watchdog in Windows VMs? > I am emulating the "i6300esb" with QEMU in Linux instances and? > everything works as expected. But when creating Windows? > instances, this device does not show up in the device manager. so upstream does not generally provide any support for workload runing on openstakc inclduing the os. the watchdog device model is not currently expsoted as a tunable via a flavor or image property so configurign it is not currently supprot by nova. it is hardcoded to?i6300esb so if your windows guest does not have the correct dirver then it wont work. https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L2458 it looks like libvirt does supprot other models https://libvirt.org/formatdomain.html#watchdog-devices and 'itco'is included by default with q35 machine type Since 9.1.0 but its still not clear if windows would even supprot that if we were to make this configurable you could try setting the machine type to q35 see if that works for you """ Having multiple watchdogs is usually not something very common, but be aware that this might happen, for example, when an implicit watchdog device is added as part of another device. For example the iTCO watchdog being part of the ich9 southbridge, which is used with the q35 machine type. Since 9.1.0 """ im not sure if the watchdog action would affect the implict iTCO watchdog or not the default is 'reset' - default, forcefully reset the guest > > - Has anyone tried/tested this before?? this is not test upstream or downstream we have no windows guest testing upstream and this is not an offically supported fature in redhats downstream although it not offically unsupproted meaning you can use it if it works for your use case but we do not test it with any sepecic os. > - Is this supported? offically we support exposing a watchdog device to a guest but that is where our supprot ends. upstream nova does not provide support for the integration fo a watchdong in any given os. > - Are additional drivers needed? this is likely the issue you are encountering windows is proably misssing the i6300esb driver > - I found the following old bug. Has anything changed since then or > are there still no suitable drivers? > https://bugzilla.redhat.com/show_bug.cgi?id=610063 > quickly googleing i dont see refence to one so i woudl assume tha tis still the case From satish.txt at gmail.com Tue Sep 19 13:18:50 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 19 Sep 2023 09:18:50 -0400 Subject: Regarding issues in openstack magnum deployment In-Reply-To: References: Message-ID: You should get on to vm itself and check logs. I am sure it will tell you lots of stories about why it failed. I have found only specific fedora images work with specific versions of the kube version. There are some compatibility lists somewhere if you google. After a couple of hits and trying some images I found only specific images that work with my Xena deployment. 
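As a rough starting point for that digging, something like the following usually narrows it down; the cluster name, stack ID and master IP below are placeholders, and exact systemd unit names differ between image types:

  openstack coe cluster show <cluster-name>                    # check status and status_reason
  openstack stack list --nested | grep -i failed               # find the failing Heat (sub)stack
  openstack stack failures list <failed-stack-id>              # list the failing resources
  ssh core@<master-floating-ip>                                # Fedora CoreOS images default to the 'core' user
  sudo journalctl -u heat-container-agent --no-pager | tail    # bootstrap agent output on the node

If the master node never reaches the Heat/metadata endpoints at all, the real error usually shows up in those node-side logs rather than in the Magnum API.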
On Tue, Sep 19, 2023 at 9:13?AM Naman Jain wrote: > HI > > > I am creating an openstack cluster where i have deployed, i am followinf > victoria release of installation > https://docs.openstack.org/victoria/install/ keystone placement glance > nova neutron > > i have created an vm instance inside my openstack cluster which is working > fine for both inbound and outbound connectivity > > now i want to deploy the kubernetes service for which i have installed > heat and magnum when i am creating the kubernetes service using magnum it > is getting stuck in kube_master resource to create for an hour and then > gets failed attaching the SS for reference what could be the reason for this > > i am expecting the kubernetes creation process by magnum in openstack > > This email and any attachments may contain confidential, proprietary, or > legally privileged information and are for the intended recipient only. Any > other use of emails, such as unauthorized access, disclosure, copying, > distribution, or reliance on any contents by anyone else, is prohibited and > unlawful. If you are not the intended recipient, please notify the sender > immediately and delete the email > -------------- next part -------------- An HTML attachment was scrubbed... URL: From damian at bulira.pl Tue Sep 19 13:19:44 2023 From: damian at bulira.pl (Damian Bulira) Date: Tue, 19 Sep 2023 15:19:44 +0200 Subject: [barbican][cinder] Business Software License in new Vault In-Reply-To: References: Message-ID: Thanks Sean for the input regarding the code base. I would highly appreciate it if anyone has any input regarding Vault as a backend and BSL from the cloud provider perspective. Cheers, Damian pon., 18 wrz 2023 o 12:07 napisa?(a): > On Sun, 2023-09-17 at 18:52 +0200, Damian Bulira wrote: > > Hi Guys, > > > > Recently Hashicorp changed their product licensing from MPL to BSL. Did > any > > of you carry out research on the impact of this change in regard to using > > Vault as a backend in Barbican and/or Cinder for both private and public > > clouds? Any thoughts about that? > > im not that familiar with vault or barbican but unless we are importing > code form > vault it should nova no impact on the licensing of the barbican code base. > > i belive we actully use https://github.com/openstack/castellan as an > indirection layer > in any openstack project that talks to vault. > > if the BSL which is not generally accpted as a opensouce lisnce is > incompatble with apache2 > we woudl have to drop vault support if we were now calling any bsl code. > > assumign we are using non CLIs or non bsl clinent libs we shoudl be > unaffected by the chagne > however it may have implicatoins for deployers both new and existing. > > looking at it looks like its written in terms of vaults http api. > > https://github.com/openstack/castellan/blob/master/castellan/key_manager/vault_key_manager.py > as a result castellan should be insulated form this change and proejcts > like nova that only interact > via castallan should be fine. barbincan appears to be using castellan at > first glance too > > https://github.com/openstack/barbican/blob/c8e3dc14e6225f1d400131434e8afec0aa410ae7/barbican/plugin/vault_secret_store.py#L65 > > so i think form a code licening point of view we are ok. > that does not mean we hould nessisarly endorce the use of vault going > forward but i honestly dont > know enough about the politic or details of the bsl change to really > comment on that. 
> > if its not already a cpabality of barbican now might be a good time to > investiage support for secrete migration between > secrete backends... > > > > > > Cheers, > > Damian > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Tue Sep 19 14:46:16 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Tue, 19 Sep 2023 07:46:16 -0700 Subject: [tc] Technical Committee next weekly meeting Today, September 19, 2023 Message-ID: Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held today on Tuesday, September 12 2023 at 1800 UTC on #openstack-tc on OFTC IRC. Please find the agenda below: - Roll call - Follow up on past action items - No action items last meeting - Gate health check - Open Discussion and Reviews - Register for the PTG - #link https://openinfra.dev/ptg/ - #link https://review.opendev.org/q/projects:openstack/governance+is:open Thank you, Jay Faulkner TC Vice-Chair -------------- next part -------------- An HTML attachment was scrubbed... URL: From kieske at osism.tech Tue Sep 19 15:55:07 2023 From: kieske at osism.tech (Sven Kieske) Date: Tue, 19 Sep 2023 17:55:07 +0200 Subject: Enquiry In-Reply-To: <20230919123133.snrntkjhpjgu6i5l@yuggoth.org> References: <20230919123133.snrntkjhpjgu6i5l@yuggoth.org> Message-ID: Am Dienstag, dem 19.09.2023 um 12:31 +0000 schrieb Jeremy Stanley: > Not really, no. Completely rewriting software in a different > language is a massive undertaking, and when you consider it's taken > 13 years to build the software to its current state that's a lot to > rewrite from scratch. agreed > > Keep in mind that most of OpenStack's own software is often not in > performance-critical paths, it frequently calls out to other > programs in userspace that are already written in "faster" > languages, so the relative slowness of the Python parts of that > system doesn't necessarily create much additional overhead anyway, > therefore the benefits of rewriting them might not always be as > great as you anticipate. It comes down to a cost vs. benefit ratio, > where the cost of rewriting is very high and the benefits aren't, or > can't clearly be proven as, good enough to outweigh those costs. > > Python (well, the popular CPython interpreter anyway) is itself > getting faster too, it's just not been a high priority for their > community until the last few years, but more recently they've > invested a lot of effort in making efficiency and performance > improvements. Take a look at the speed improvements which came out > of the Faster CPython effort in 3.10 and 3.11 as well as the 3.12 > release candidates for example, or the free-threading/no-GIL work > for PEP 703. Why rewrite everything into a faster language when you > could instead just make the language it's already written in faster > instead? well, if you have ever run any moderate sized openstack cluster you must disagree here. of course many openstack components *are* performance sensitive, especially if you run k8s on top of it which can hammer the api servers with thousands requests per second. I don't even want to go into arguments about general efficiency of computing, preservation of resources and "green it" [0]. a good rewrite would possibly use rust instead of golang (there, I said it), because it's easier to integrate with python[1] as it has a better FFI story than golang, from what I read online. This way this could be done in an incremental way. 
This would imho be the only feasible approach, which could also yield new developers interested in openstack/cloud computing. e.g. Artem rewrote the current openstack cli client in rust and while the current cli in python is really slow for larger clouds, it is fast in rust[2]. So there is some work being done, although it's very very early days. the results of the poc where shown at the last public cloud sig meeting, which was quite interesting. But I do agree, as said above, that it's hard to do, and there seems sadly to be not much buy-in from the community at large just yet. [0]: https://thenewstack.io/which-programming-languages-use-the-least-electricity/ [1]: https://pyo3.rs [2]: https://lists.openstack.org/pipermail/openstack-discuss/2023-August/034934.html kind regards -- Sven Kieske Senior Cloud Engineer Mail: kieske at osism.tech Web: https://osism.tech OSISM GmbH Teckstra?e 62 / 70190 Stuttgart / Deutschland Gesch?ftsf?hrer: Christian Berendt Unternehmenssitz: Stuttgart Amtsgericht: Stuttgart, HRB 756139 From fungi at yuggoth.org Tue Sep 19 16:13:48 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 19 Sep 2023 16:13:48 +0000 Subject: Enquiry In-Reply-To: References: <20230919123133.snrntkjhpjgu6i5l@yuggoth.org> Message-ID: <20230919161348.sestcd3q4hc2wjhv@yuggoth.org> On 2023-09-19 17:55:07 +0200 (+0200), Sven Kieske wrote: [...] > e.g. Artem rewrote the current openstack cli client in rust and > while the current cli in python is really slow for larger clouds, > it is fast in rust[2]. So there is some work being done, although > it's very very early days. > > the results of the poc where shown at the last public cloud sig > meeting, which was quite interesting. [...] It's worth keeping in mind that any rewrite, even into the same language, is likely to yield performance improvements. When redoing an application from scratch you have the opportunity to make different architectural/design decisions, free yourself from cruft and historical baggage, and so on. It's hard to say "OpenStackCLI rewritten in Rust is faster because Rust is faster than Python" unless the design is architecturally the same (which is also almost impossible to achieve between different languages anyway). Most of the current CLI's startup slowness is because of the mechanism it relies on for finding plugins, and even just redoing that with a more performant alternative would make a huge difference. Regardless, that's a cool feat, and like I said focusing on bits in performance-critical paths which can take advantage of specific language features makes more sense than blindly rewriting everything we've got in another language. But also Python is getting better. It would be cool to try comparing DevStack/Tempest job durations and resource consumption between 3.8 and 3.11 for example (or 3.12 once we can). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jay at gr-oss.io Tue Sep 19 19:02:00 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Tue, 19 Sep 2023 12:02:00 -0700 Subject: [elections] Deadline to vote approaching! Message-ID: Hey all, There is a little over 24 hours remaining in the election for OpenStack Technical Committee and for OpenStack-Helm PTL. Eligible voters who have opted-in to election email should have a ballot in their inbox from civs at cornell dot edu. Ballots are documented to close Sep 20, 2023 23:45 UTC. Thanks for participating! 
-- Jay Faulkner TC Vice-Chair -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Tue Sep 19 21:45:19 2023 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 19 Sep 2023 16:45:19 -0500 Subject: [ptls][URGENT] Bobcat Release: DPUs, GPUs, and FGPAs, oh my! Message-ID: Hello folks! So, as the OpenInfra Foundation has begun putting together the release marking materials for the Bobcat release, we noticed a trend focused on enabling hardware and increasing utilization. Ironic in particular had an excellent cycle highlight about DPUs[1]. We wanted to quickly reach out and see if there were any cycle highlights that projects maybe haven't published in a review, but are related to this topic. Please respond ASAP as we are moving fast on finalizing the press release. Also, if there is anything related coming in Caracal, we might be able to call that out as well, though I know it's quite early to know what might be landing in 2024.1. If there are other cycle highlights that should get called out as well, definitely push the patches and add me as a reviewer! -Kendall Nelson [1] https://releases.openstack.org/bobcat/highlights.html#ironic -------------- next part -------------- An HTML attachment was scrubbed... URL: From ygk.kmr at gmail.com Wed Sep 20 07:10:08 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Wed, 20 Sep 2023 12:40:08 +0530 Subject: Enquiry In-Reply-To: <2F630F65-E3AE-4BA6-94A5-10E483B72AB0@gmail.com> References: <2F630F65-E3AE-4BA6-94A5-10E483B72AB0@gmail.com> Message-ID: I am just asking the community if there are any such plans for the future. I am not saying that it is decided on . On Tue, Sep 19, 2023 at 5:45?PM Satish Patel wrote: > Oh really, first time I heard this. Is it that easy to just convert python > to go ? > > Agreed on speed though we have switch many applications from py to go just > because of multi threading requirement and speeeeeed > > Sent from my iPhone > > > On Sep 19, 2023, at 8:11 AM, Gk Gk wrote: > > > > ? > > Hi All, > > > > Is there any discussion going on among the developers or architects of > this community to change the coding language from Python to Go in the near > future by any chance ? Or, are there any such suggestions from the > community so far, given the better speed of Go over Python in execution > times ? > > > > > > Thanks > > Y.G, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kamil.madac at gmail.com Wed Sep 20 07:51:21 2023 From: kamil.madac at gmail.com (Kamil Madac) Date: Wed, 20 Sep 2023 09:51:21 +0200 Subject: [neutron] Static ip address of subnet dhcp server Message-ID: Hello OpenStack community, I'd like to ask whether it's possible to set a static fixed IP address for a DHCP server within a subnet, especially when using the OVN driver. While I've come across a blueprint that addresses this topic, I haven't been able to find any pertinent details in the documentation. Our primary use case is to maintain IP addresses during the migration of VMs between two OpenStack clouds. Any guidance or suggestions would be greatly appreciated. Thank you! -- Kamil Madac -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thierry at openstack.org Wed Sep 20 08:17:28 2023 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 20 Sep 2023 10:17:28 +0200 Subject: [largescale-sig] Next meeting: September 20, 8utc In-Reply-To: <38ff495a-68e5-2f00-fa7f-8e643a555e9d@openstack.org> References: <38ff495a-68e5-2f00-fa7f-8e643a555e9d@openstack.org> Message-ID: <1dcf1d16-7894-4a98-932e-bf3f580381ce@openstack.org> Here is the summary of our SIG meeting today. Attendance was limited so we mostly discussed last-minute details for our next OpenInfra Live episode, a deep dive into NIPA Cloud deployment tomorrow. You can read the detailed meeting logs at: https://meetings.opendev.org/meetings/large_scale_sig/2023/large_scale_sig.2023-09-20-08.00.html Our next IRC meeting will be Oct 4, 15:00UTC on #openstack-operators on OFTC. Regards, -- -- Thierry Carrez (ttx) From ralonsoh at redhat.com Wed Sep 20 08:22:27 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Wed, 20 Sep 2023 10:22:27 +0200 Subject: [neutron] Static ip address of subnet dhcp server In-Reply-To: References: Message-ID: Hello Kamil: I'm not sure what you are asking for, sorry. In ML2/OVN, the DHCP server is local (in the same compute node where the VM is spawned) and has no IP address. The local OVS instance will capture the DHCP requests and reply to them. If you are asking for a VM port with a static IP address, this is something you can currently do creating a port and setting the fixed IP address [1]. Can you share the blueprint you are referring to? How do you plan to migrate VMs to a different cloud? Regards. [1]https://paste.opendev.org/show/bl87sEpUnZttIMxN9fyc/ On Wed, Sep 20, 2023 at 9:52?AM Kamil Madac wrote: > Hello OpenStack community, > > I'd like to ask whether it's possible to set a static fixed IP address for > a DHCP server within a subnet, especially when using the OVN driver. While > I've come across a blueprint that addresses this topic, I haven't been able > to find any pertinent details in the documentation. > > Our primary use case is to maintain IP addresses during the migration of > VMs between two OpenStack clouds. > > Any guidance or suggestions would be greatly appreciated. > > Thank you! > > -- > Kamil Madac > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Wed Sep 20 09:29:31 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Wed, 20 Sep 2023 11:29:31 +0200 Subject: [ptls][URGENT] Bobcat Release: DPUs, GPUs, and FGPAs, oh my! In-Reply-To: References: Message-ID: Hello Kendall: I can confirm that hardware offload (datapath offload in ML2/OVS and ML2/OVN) is currently used in several deployments and it's being tested internally in my company. However there are no upstream specific highlights about this topic right now. In the next cycle (or cycles) our plan is to create an external CI testing both mechanism drivers with HW NICs. Regards. On Tue, Sep 19, 2023 at 11:46?PM Kendall Nelson wrote: > Hello folks! > > So, as the OpenInfra Foundation has begun putting together the release > marking materials for the Bobcat release, we noticed a trend focused on > enabling hardware and increasing utilization. Ironic in particular had an > excellent cycle highlight about DPUs[1]. We wanted to quickly reach out and > see if there were any cycle highlights that projects maybe haven't > published in a review, but are related to this topic. Please respond ASAP > as we are moving fast on finalizing the press release. 
> > Also, if there is anything related coming in Caracal, we might be able to > call that out as well, though I know it's quite early to know what might be > landing in 2024.1. > > If there are other cycle highlights that should get called out as well, > definitely push the patches and add me as a reviewer! > > -Kendall Nelson > > [1] https://releases.openstack.org/bobcat/highlights.html#ironic > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kamil.madac at gmail.com Wed Sep 20 09:35:58 2023 From: kamil.madac at gmail.com (Kamil Madac) Date: Wed, 20 Sep 2023 11:35:58 +0200 Subject: [neutron] Static ip address of subnet dhcp server In-Reply-To: References: Message-ID: Hi Rodolfo, Thanks for the response, and sorry for the confusion which can be caused by my lack of knowledge about how ovn is implemented in openstack. I have seen following blueprint (not for OVN obviosly): https://wiki.openstack.org/wiki/Neutron/enable-to-set-dhcp-port-attributes and somehow thought that dhcp has an ip address assigned also in OVN. When I checked it once more I found out that device_id of the port has prefix ovnmeta-, which means that it is not dhcp but metadata agent or proxy: $ openstack port show -f table -c id -c fixed_ips -c device_id 4f70acdf-59e9-4252-9f0c-c71eb68baa9a +-----------+---------------------------------------------------------------------------+ | Field | Value | +-----------+---------------------------------------------------------------------------+ | device_id | ovnmeta-61d55c5b-adeb-440e-a029-796e783fdc02 | | fixed_ips | ip_address='172.16.0.1', subnet_id='4632d9e5-6918-415e-b752-9f828e57404c' | | id | 4f70acdf-59e9-4252-9f0c-c71eb68baa9a | +-----------+---------------------------------------------------------------------------+ Is it possible to set another ip address of this port during creation of the subnet, or afterwards? Thanks On Wed, Sep 20, 2023 at 10:22?AM Rodolfo Alonso Hernandez < ralonsoh at redhat.com> wrote: > Hello Kamil: > > I'm not sure what you are asking for, sorry. In ML2/OVN, the DHCP server > is local (in the same compute node where the VM is spawned) and has no IP > address. The local OVS instance will capture the DHCP requests and reply to > them. > > If you are asking for a VM port with a static IP address, this is > something you can currently do creating a port and setting the fixed IP > address [1]. > > Can you share the blueprint you are referring to? How do you plan to > migrate VMs to a different cloud? > > Regards. > > [1]https://paste.opendev.org/show/bl87sEpUnZttIMxN9fyc/ > > On Wed, Sep 20, 2023 at 9:52?AM Kamil Madac wrote: > >> Hello OpenStack community, >> >> I'd like to ask whether it's possible to set a static fixed IP address >> for a DHCP server within a subnet, especially when using the OVN driver. >> While I've come across a blueprint that addresses this topic, I haven't >> been able to find any pertinent details in the documentation. >> >> Our primary use case is to maintain IP addresses during the migration of >> VMs between two OpenStack clouds. >> >> Any guidance or suggestions would be greatly appreciated. >> >> Thank you! >> >> -- >> Kamil Madac >> > -- Kamil Madac -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Sep 20 11:15:49 2023 From: smooney at redhat.com (smooney at redhat.com) Date: Wed, 20 Sep 2023 12:15:49 +0100 Subject: [ptls][URGENT] Bobcat Release: DPUs, GPUs, and FGPAs, oh my! 
In-Reply-To: References: Message-ID: <782d35ca7b1fa4693ff096ab5b5d4042a8bc01d1.camel@redhat.com> On Wed, 2023-09-20 at 11:29 +0200, Rodolfo Alonso Hernandez wrote: > Hello Kendall: > > I can confirm that hardware offload (datapath offload in ML2/OVS and > ML2/OVN) is currently used in several deployments and it's being tested > internally in my company. However there are no upstream specific highlights > about this topic right now. In the next cycle (or cycles) our plan is to > create an external CI testing both mechanism drivers with HW NICs. Nova also added support for DPUs in Yoga, see https://specs.openstack.org/openstack/nova-specs/specs/yoga/implemented/integration-with-off-path-network-backends.html, but there have been significant changes in this area recently. There may be some new enabling of Intel/Napatech DPUs/smartnics for DPDK-based hardware offloads in both neutron/nova next cycle. In Bobcat, https://specs.openstack.org/openstack/nova-specs/specs/2023.2/approved/support-napatech-linkvirtualization-smartnic.html was proposed but not completed due to technical reasons. It made some progress, but realistically there is more work to be done in OVS before that is suitable to merge upstream. As it stands the current approach is rather vendor specific, but they are working to make a more vendor-neutral approach work, so I expect that to continue in Caracal or D. While this is relevant to this mail thread, I don't think it should be included in any marketing material. You could note that nova/neutron both already support off-path DPUs, but I would likely omit the Napatech spec since that is still in flux. "Off-path" means that the control plane of the DPU is running on the DPU, not on the Linux host where nova/neutron run; in the case of the previous integration, OVN is the control plane, it runs on the DPU, and it is integrated with ML2/OVN in neutron and with nova's PCI tracker. In the context of the ironic enhancement, ironic would be used to provision both the host server OS and the Linux/firmware image (with OVN) onto the DPU. That would then be configured by external tooling (e.g. Ansible) to be integrated with the neutron ML2/OVN backend and nova. One lowlight is that a lot of the work around the nova <=> cyborg interaction has more or less stopped since the pandemic, and I don't really see that changing anytime soon. The ironic enhancement kind of puts ironic and cyborg in competition for the management of the DPU lifecycle, since that was intended to be managed via cyborg, not ironic. It is the reason that cyborg has a REST API to manage the programming of a DPU OS image and firmware. From a nova/neutron point of view it does not impact us, as we just assume the lifecycle of the DPU is externally managed, and whether that is ironic or cyborg does not directly impact our usage of the DPU. > > Regards. > > On Tue, Sep 19, 2023 at 11:46 PM Kendall Nelson > wrote: > > > Hello folks! > > > > So, as the OpenInfra Foundation has begun putting together the release > > marking materials for the Bobcat release, we noticed a trend focused on > > enabling hardware and increasing utilization. Ironic in particular had an > > excellent cycle highlight about DPUs[1]. We wanted to quickly reach out and > > see if there were any cycle highlights that projects maybe haven't > > published in a review, but are related to this topic. Please respond ASAP > > as we are moving fast on finalizing the press release. 
> > > > Also, if there is anything related coming in Caracal, we might be able to > > call that out as well, though I know it's quite early to know what might be > > landing in 2024.1. > > > > If there are other cycle highlights that should get called out as well, > > definitely push the patches and add me as a reviewer! > > > > -Kendall Nelson > > > > [1] https://releases.openstack.org/bobcat/highlights.html#ironic > > From vrook at wikimedia.org Wed Sep 20 11:21:56 2023 From: vrook at wikimedia.org (Vivian Rook) Date: Wed, 20 Sep 2023 07:21:56 -0400 Subject: [kolla] change default domain Message-ID: I'm looking for a setting to change the default domain name from "default" to "kolla" Does such an option exist in globals.yml? Thank you! -- *Vivian Rook (They/Them)* Site Reliability Engineer Wikimedia Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From kamil.madac at gmail.com Wed Sep 20 12:09:03 2023 From: kamil.madac at gmail.com (Kamil Madac) Date: Wed, 20 Sep 2023 14:09:03 +0200 Subject: [neutron] Static ip address of subnet dhcp server In-Reply-To: References: Message-ID: OK, I found a solution, so I will post it here if anyone has the same issue. after creation of new subnet, we need to set new ip address and unset old one on ovnmeta port: openstack port set --fixed-ip subnet=,ip-address=172.16.2.3 openstack port unset --fixed-ip subnet=,ip-address=172.16.2.1 Then when VM is created on compute node, metadata agent sets new IP in ovnmeta namespace. If namespace already exists, restart of ovn metadata agent updates the IP in ovnmeta namespace. On Wed, Sep 20, 2023 at 11:35?AM Kamil Madac wrote: > Hi Rodolfo, > > Thanks for the response, and sorry for the confusion which can be caused > by my lack of knowledge about how ovn is implemented in openstack. I have > seen following blueprint (not for OVN obviosly): > > https://wiki.openstack.org/wiki/Neutron/enable-to-set-dhcp-port-attributes > > and somehow thought that dhcp has an ip address assigned also in OVN. > > When I checked it once more I found out that device_id of the port has > prefix ovnmeta-, which means that it is not dhcp but metadata agent or > proxy: > > $ openstack port show -f table -c id -c fixed_ips -c device_id > 4f70acdf-59e9-4252-9f0c-c71eb68baa9a > > +-----------+---------------------------------------------------------------------------+ > | Field | Value > | > > +-----------+---------------------------------------------------------------------------+ > | device_id | ovnmeta-61d55c5b-adeb-440e-a029-796e783fdc02 > | > | fixed_ips | ip_address='172.16.0.1', > subnet_id='4632d9e5-6918-415e-b752-9f828e57404c' | > | id | 4f70acdf-59e9-4252-9f0c-c71eb68baa9a > | > > +-----------+---------------------------------------------------------------------------+ > > Is it possible to set another ip address of this port during creation of > the subnet, or afterwards? > > Thanks > > On Wed, Sep 20, 2023 at 10:22?AM Rodolfo Alonso Hernandez < > ralonsoh at redhat.com> wrote: > >> Hello Kamil: >> >> I'm not sure what you are asking for, sorry. In ML2/OVN, the DHCP server >> is local (in the same compute node where the VM is spawned) and has no IP >> address. The local OVS instance will capture the DHCP requests and reply to >> them. >> >> If you are asking for a VM port with a static IP address, this is >> something you can currently do creating a port and setting the fixed IP >> address [1]. >> >> Can you share the blueprint you are referring to? 
How do you plan to >> migrate VMs to a different cloud? >> >> Regards. >> >> [1]https://paste.opendev.org/show/bl87sEpUnZttIMxN9fyc/ >> >> On Wed, Sep 20, 2023 at 9:52?AM Kamil Madac >> wrote: >> >>> Hello OpenStack community, >>> >>> I'd like to ask whether it's possible to set a static fixed IP address >>> for a DHCP server within a subnet, especially when using the OVN driver. >>> While I've come across a blueprint that addresses this topic, I haven't >>> been able to find any pertinent details in the documentation. >>> >>> Our primary use case is to maintain IP addresses during the migration of >>> VMs between two OpenStack clouds. >>> >>> Any guidance or suggestions would be greatly appreciated. >>> >>> Thank you! >>> >>> -- >>> Kamil Madac >>> >> > > -- > Kamil Madac > -- Kamil Madac -------------- next part -------------- An HTML attachment was scrubbed... URL: From maximilian.stinsky at german-edge-cloud.io Wed Sep 20 10:00:55 2023 From: maximilian.stinsky at german-edge-cloud.io (Maximilian Stinsky) Date: Wed, 20 Sep 2023 10:00:55 +0000 Subject: [octavia] state of taskflow jobboards Message-ID: Greetings, I would be interested in the state of the taskflow jobboard feature in octavia. [1] states it still is an experimental feature, but in the config section [2] it is not marked as such. If I look into the code I see that the feature is implemented since the U release. So my question would be is the taskflow jobboard feature still in an experimental state or is it considered production ready? If its still experimental is there any ETA when its expected to be production ready? Thanks in advance Maximilian [1] https://docs.openstack.org/octavia/latest/install/install-amphorav2.html [2] https://docs.openstack.org/octavia/latest/configuration/configref.html#task_flow.jobboard_enabled -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 7944 bytes Desc: not available URL: From juliaashleykreger at gmail.com Wed Sep 20 13:39:31 2023 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 20 Sep 2023 06:39:31 -0700 Subject: [ptls][URGENT] Bobcat Release: DPUs, GPUs, and FGPAs, oh my! In-Reply-To: References: Message-ID: Realistically the work Ironic has completed this cycle is to enable an operator to execute steps against a node and a DPU as part of the same deployment operation. This work is unrelated to past work in ironic to support "smart nics" (the original marketing name used for DPUs), which was more about the deploy time networking interactions. Unfortunately, we're not entirely there as most of this was the underlying substrate and workflow capabilities. We expect this to begin to be extended to aspects such as firmware management of the DPU, and we already have reports of operators using ironic separately to deploy the OS on some DPU models. The issue is before now, they were entirely disjointed. We expect this to improve over the next few years, but is also going to take standardization across the DPU vendor market to really enable. The OPI[0] community is working on trying to drive standardization, but many of the features and requirements being discussed will likely force revision in the silicon, in other words it might be a few years before we see the standardization we're trying to drive there reach a usable state. That being said, we anticipate continued work in Caracal. 
[0]: https://opiproject.org/ On Tue, Sep 19, 2023 at 2:49?PM Kendall Nelson wrote: > Hello folks! > > So, as the OpenInfra Foundation has begun putting together the release > marking materials for the Bobcat release, we noticed a trend focused on > enabling hardware and increasing utilization. Ironic in particular had an > excellent cycle highlight about DPUs[1]. We wanted to quickly reach out and > see if there were any cycle highlights that projects maybe haven't > published in a review, but are related to this topic. Please respond ASAP > as we are moving fast on finalizing the press release. > > Also, if there is anything related coming in Caracal, we might be able to > call that out as well, though I know it's quite early to know what might be > landing in 2024.1. > > If there are other cycle highlights that should get called out as well, > definitely push the patches and add me as a reviewer! > > -Kendall Nelson > > [1] https://releases.openstack.org/bobcat/highlights.html#ironic > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwilde at redhat.com Wed Sep 20 14:26:26 2023 From: dwilde at redhat.com (Dave Wilde) Date: Wed, 20 Sep 2023 09:26:26 -0500 Subject: [keystone] Meeting cancelled this week References: Message-ID: <14506418-c080-4e80-92f5-84eead2be4bf@Spark> Hi folks, Due to some unforeseen circumstances I need to cancel the Keystone meeting this week.??Please reach out on IRC or via the mailing list if you need anything. Thanks! /Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From jobernar at redhat.com Wed Sep 20 14:38:38 2023 From: jobernar at redhat.com (Jon Bernard) Date: Wed, 20 Sep 2023 14:38:38 +0000 Subject: Cinder Bug Report 2023-09-20 Message-ID: Hello Argonauts, Cinder Bug Meeting Etherpad Cinder / Undecided - [Pure Storage] Missing replication pod can cause driver failure on restart - Status: In Progress - NetApp NFS driver for Cinder may over-provision pools in an active-active configuration - Status: New OS-Brick / Undecided - hostnqn file not created automatically - Status: In Progress - NVMe-oF cannot connect - Status: In Progress - NVMe-oF cannot connect on newer CentOS 9 stream - Status: In Progress Thanks, -- Jon From kristin at openinfra.dev Wed Sep 20 15:25:32 2023 From: kristin at openinfra.dev (Kristin Barrientos) Date: Wed, 20 Sep 2023 10:25:32 -0500 Subject: OpenInfra Live - Sept. 21 at 9 a.m. CT / 1400 UTC Message-ID: <77A8AE78-8902-4689-82A7-3966B2B7B74F@openinfra.dev> Hi everyone, This week?s OpenInfra Live episode is brought to you by the OpenStack Large Scale SIG. Episode: Large Scale Ops Deep Dive: NIPA Cloud In the "Large Scale Ops Deep Dive" series, a panel of OpenStack operators invites special guests to talk about their deployment and discuss their operations. For this episode, our guests will be Dr. Abhisak Chulya and Charnsilp Chinprasert from NIPA Cloud, Thailand's largest OpenStack public cloud provider. Speakers: Dr. Abhisak Chulya, Charnsilp Chinprasert, Arnaud Morin, Felix Huettner, Thierry Carrez, Kristin Barrientos Date and time: Sept. 21 at 9 a.m. CT / 1400 UTC You can watch us live on: YouTube: https://www.youtube.com/watch?v=ir9v_UK_MIA LinkedIn: https://www.linkedin.com/events/7085718563113603072/comments/ WeChat: recording will be posted on OpenStack WeChat after the live stream Have an idea for a future episode? Share it now at ideas.openinfra.live. 
Thanks, Kristin Barrientos Marketing Coordinator OpenInfra Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From kristin at openinfra.dev Wed Sep 20 15:28:03 2023 From: kristin at openinfra.dev (Kristin Barrientos) Date: Wed, 20 Sep 2023 10:28:03 -0500 Subject: [ptls][Bobcat] OpenInfra Live: OpenStack Bobcat Message-ID: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> Hi everyone, As we get closer to the OpenStack release, I wanted to reach out to see if any PTL?s were interested in providing their Bobcat cycle highlights in an OpenInfra Live[1] episode on Thursday, October 5 at 1400 UTC. Ideally, we would get 4-6 projects represented. Previous examples of OpenStack release episodes can be found here[2]and here[3]. Please let me know if you?re interested by Monday, September 25 and I can provide the next steps. If you would like to provide a project update but that time doesn?t work for you, please share a recording with me and I can get it added to the project navigator. Thanks, Kristin Barrientos Marketing Coordinator OpenInfra Foundation [1] openinfra.dev/live [2] https://www.youtube.com/watch?v=MSbB3L9_MeY [3] https://www.youtube.com/watch?v=YdLTUTyJ1eU Thanks, Kristin Barrientos Marketing Coordinator OpenInfra Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbauza at redhat.com Wed Sep 20 16:01:03 2023 From: sbauza at redhat.com (Sylvain Bauza) Date: Wed, 20 Sep 2023 18:01:03 +0200 Subject: [ptls][Bobcat] OpenInfra Live: OpenStack Bobcat In-Reply-To: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> References: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> Message-ID: Le mer. 20 sept. 2023 ? 17:36, Kristin Barrientos a ?crit : > Hi everyone, > > As we get closer to the OpenStack release, I wanted to reach out to see if > any PTL?s were interested in providing their Bobcat cycle highlights in an > OpenInfra Live[1] episode on Thursday, October 5 at 1400 UTC. Ideally, we > would get 4-6 projects represented. Previous examples of OpenStack release > episodes can be found here[2]and here[3]. > > Please let me know if you?re interested by Monday, September 25 and I can > provide the next steps. If you would like to provide a project update but > that time doesn?t work for you, please share a recording with me and I can > get it added to the project navigator. > > Like before, you can count on me for Nova. -Sylvain > Thanks, > > Kristin Barrientos > Marketing Coordinator > OpenInfra Foundation > > [1] openinfra.dev/live > [2] https://www.youtube.com/watch?v=MSbB3L9_MeY > [3] https://www.youtube.com/watch?v=YdLTUTyJ1eU > Thanks, > > Kristin Barrientos > Marketing Coordinator > OpenInfra Foundation > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ces.eduardo98 at gmail.com Wed Sep 20 16:15:12 2023 From: ces.eduardo98 at gmail.com (Carlos Silva) Date: Wed, 20 Sep 2023 13:15:12 -0300 Subject: [ptls][Bobcat] OpenInfra Live: OpenStack Bobcat In-Reply-To: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> References: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> Message-ID: Hello! I'm happy to share the Manila cycle highlights. Thanks, carloss Em qua., 20 de set. 
de 2023 ?s 12:34, Kristin Barrientos < kristin at openinfra.dev> escreveu: > Hi everyone, > > As we get closer to the OpenStack release, I wanted to reach out to see if > any PTL?s were interested in providing their Bobcat cycle highlights in an > OpenInfra Live[1] episode on Thursday, October 5 at 1400 UTC. Ideally, we > would get 4-6 projects represented. Previous examples of OpenStack release > episodes can be found here[2]and here[3]. > > Please let me know if you?re interested by Monday, September 25 and I can > provide the next steps. If you would like to provide a project update but > that time doesn?t work for you, please share a recording with me and I can > get it added to the project navigator. > > Thanks, > > Kristin Barrientos > Marketing Coordinator > OpenInfra Foundation > > [1] openinfra.dev/live > [2] https://www.youtube.com/watch?v=MSbB3L9_MeY > [3] https://www.youtube.com/watch?v=YdLTUTyJ1eU > Thanks, > > Kristin Barrientos > Marketing Coordinator > OpenInfra Foundation > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Wed Sep 20 18:03:52 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Wed, 20 Sep 2023 11:03:52 -0700 Subject: [ptls][Bobcat] OpenInfra Live: OpenStack Bobcat In-Reply-To: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> References: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> Message-ID: Always happy to talk about Ironic! - Jay Faulkner Ironic PTL On Wed, Sep 20, 2023 at 8:36?AM Kristin Barrientos wrote: > Hi everyone, > > As we get closer to the OpenStack release, I wanted to reach out to see if > any PTL?s were interested in providing their Bobcat cycle highlights in an > OpenInfra Live[1] episode on Thursday, October 5 at 1400 UTC. Ideally, we > would get 4-6 projects represented. Previous examples of OpenStack release > episodes can be found here[2]and here[3]. > > Please let me know if you?re interested by Monday, September 25 and I can > provide the next steps. If you would like to provide a project update but > that time doesn?t work for you, please share a recording with me and I can > get it added to the project navigator. > > Thanks, > > Kristin Barrientos > Marketing Coordinator > OpenInfra Foundation > > [1] openinfra.dev/live > [2] https://www.youtube.com/watch?v=MSbB3L9_MeY > [3] https://www.youtube.com/watch?v=YdLTUTyJ1eU > Thanks, > > Kristin Barrientos > Marketing Coordinator > OpenInfra Foundation > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdhasman at redhat.com Wed Sep 20 22:35:55 2023 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Thu, 21 Sep 2023 04:05:55 +0530 Subject: [ptls][Bobcat] OpenInfra Live: OpenStack Bobcat In-Reply-To: References: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> Message-ID: I can volunteer for Cinder like last time. - Rajat Dhasmana On Thu, Sep 21, 2023 at 12:06?AM Jay Faulkner wrote: > Always happy to talk about Ironic! > > - > Jay Faulkner > Ironic PTL > > On Wed, Sep 20, 2023 at 8:36?AM Kristin Barrientos > wrote: > >> Hi everyone, >> >> As we get closer to the OpenStack release, I wanted to reach out to see >> if any PTL?s were interested in providing their Bobcat cycle highlights in >> an OpenInfra Live[1] episode on Thursday, October 5 at 1400 UTC. Ideally, >> we would get 4-6 projects represented. Previous examples of OpenStack >> release episodes can be found here[2]and here[3]. 
>> >> Please let me know if you?re interested by Monday, September 25 and I can >> provide the next steps. If you would like to provide a project update but >> that time doesn?t work for you, please share a recording with me and I can >> get it added to the project navigator. >> >> Thanks, >> >> Kristin Barrientos >> Marketing Coordinator >> OpenInfra Foundation >> >> [1] openinfra.dev/live >> [2] https://www.youtube.com/watch?v=MSbB3L9_MeY >> [3] https://www.youtube.com/watch?v=YdLTUTyJ1eU >> Thanks, >> >> Kristin Barrientos >> Marketing Coordinator >> OpenInfra Foundation >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony at bakeyournoodle.com Wed Sep 20 23:51:47 2023 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 21 Sep 2023 09:51:47 +1000 Subject: [all][elections][tc] Technical Committee Election Results Message-ID: Please join me in congratulating the 4 newly elected members of the Technical Committe (TC). Dan Smith (dansmith) Jens Harbott (frickler) Ghanshyam Mann (gmann) Jay Faulkner (JayF) Full results: https://civs1.civs.us/cgi-bin/results.pl?id=E_36ebf2eba022aded Election process details and results are also available here: https://governance.openstack.org/election/ Thank you to all of the candidates, having a good group of candidates helps engage the community in our democratic process. Thank you to all who voted and who encouraged others to vote. We need to ensure your voice is heard. Thank you for another great round. Yours Tony. From tony at bakeyournoodle.com Wed Sep 20 23:53:25 2023 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 21 Sep 2023 09:53:25 +1000 Subject: [all][elections][ptl] Project Team Lead Election Conclusion and Results Message-ID: Thank you to the electorate, to all those who voted and to all candidates who put their name forward for Project Team Lead (PTL) in this election. A healthy, open process breeds trust in our decision making capability thank you to all those who make this process possible. Now for the results of the PTL election process, please join me in extending congratulations to the following PTLs: * Adjutant : Dale Smith * Barbican : Grzegorz Grasza * Blazar : Pierre Riteau * Cinder : Rajat Dhasmana * Cloudkitty : Rafael Weingartner * Designate : Michael Johnson * Freezer : ge cong * Glance : Pranali Deore * Heat : Takashi Kajinami * Horizon : Vishal Manchanda * Ironic : Jay Faulkner * Keystone : Dave Wilde * Kolla : Michal Nasiadka * Kuryr : Roman Dobosz * Magnum : Jake Yip * Manila : Carlos Silva * Masakari : sam sue * Murano : Rong Zhu * Neutron : Brian Haley * Nova : Sylvain Bauza * Octavia : Gregory Thiemonge * OpenStack Charms : Felipe Reyes * OpenStack Helm : Vladimir Kozhukalov * OpenStackAnsible : Dmitriy Rabotyagov * OpenStackSDK : Artem Goncharov * Puppet OpenStack : Takashi Kajinami * Quality Assurance : Martin Kopec * Solum : Rong Zhu * Storlets : Takashi Kajinami * Swift : Tim Burke * Tacker : Yasufumi Ogawa * Telemetry : Erno Kuvaja * Vitrage : Dmitriy Rabotyagov * Watcher : chen ke * Zaqar : Hao Wang * Zun : Hongbin Lu Elections: * OpenStack_Helm: https://civs1.civs.us/cgi-bin/results.pl?id=E_3cf498bb10adc5b8 Election process details and results are also available here: https://governance.openstack.org/election/ Thank you to all involved in the PTL election process, Yours Tony. 
From songwenping at inspur.com Thu Sep 21 06:41:51 2023 From: songwenping at inspur.com (Alex Song (宋文平)) Date: Thu, 21 Sep 2023 06:41:51 +0000 Subject: Re: [all][elections][ptl] Project Team Lead Election Conclusion and Results In-Reply-To: References: Message-ID: Hi Tony, I have submitted my PTL candidacy for the Cyborg project, but the patch below did not get merged; please accept my nomination. Cyborg: https://review.opendev.org/c/openstack/election/+/893300 Thanks, Alex -----Original Message----- From: Tony Breeds [mailto:tony at bakeyournoodle.com] Sent: 21 September 2023 7:53 To: OpenStack Discuss ; openstack-announce at lists.openstack.org Subject: [all][elections][ptl] Project Team Lead Election Conclusion and Results Thank you to the electorate, to all those who voted and to all candidates who put their name forward for Project Team Lead (PTL) in this election. A healthy, open process breeds trust in our decision making capability thank you to all those who make this process possible. 
A healthy, open process breeds trust in our decision making capability thank you to all those who make this process possible. Now for the results of the PTL election process, please join me in extending congratulations to the following PTLs: * Adjutant : Dale Smith * Barbican : Grzegorz Grasza * Blazar : Pierre Riteau * Cinder : Rajat Dhasmana * Cloudkitty : Rafael Weingartner * Designate : Michael Johnson * Freezer : ge cong * Glance : Pranali Deore * Heat : Takashi Kajinami * Horizon : Vishal Manchanda * Ironic : Jay Faulkner * Keystone : Dave Wilde * Kolla : Michal Nasiadka * Kuryr : Roman Dobosz * Magnum : Jake Yip * Manila : Carlos Silva * Masakari : sam sue * Murano : Rong Zhu * Neutron : Brian Haley * Nova : Sylvain Bauza * Octavia : Gregory Thiemonge * OpenStack Charms : Felipe Reyes * OpenStack Helm : Vladimir Kozhukalov * OpenStackAnsible : Dmitriy Rabotyagov * OpenStackSDK : Artem Goncharov * Puppet OpenStack : Takashi Kajinami * Quality Assurance : Martin Kopec * Solum : Rong Zhu * Storlets : Takashi Kajinami * Swift : Tim Burke * Tacker : Yasufumi Ogawa * Telemetry : Erno Kuvaja * Vitrage : Dmitriy Rabotyagov * Watcher : chen ke * Zaqar : Hao Wang * Zun : Hongbin Lu Elections: * OpenStack_Helm: https://civs1.civs.us/cgi-bin/results.pl?id=E_3cf498bb10adc5b8 Election process details and results are also available here: https://governance.openstack.org/election/ Thank you to all involved in the PTL election process, Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3774 bytes Desc: not available URL: From noonedeadpunk at gmail.com Thu Sep 21 06:51:48 2023 From: noonedeadpunk at gmail.com (Dmitriy Rabotyagov) Date: Thu, 21 Sep 2023 08:51:48 +0200 Subject: [all][elections][ptl] Project Team Lead Election Conclusion and Results In-Reply-To: References: Message-ID: Hi, Your candidacy has been submitted after the nomination period is over, which is the reason why it wasn't merged. During upcoming TC meetings there would be a discussion about leaderless projects, and I assume that you will be appointed by TC to this role as a volunteer to lead the project. So don't worry much that Cyborg not in the list for now, but kindly push nominations during required timeline in the future:) On Thu, Sep 21, 2023, 08:46 Alex Song (???) wrote: > Hi Tony, > > I have voted for the PTL election for Cyborg projects. The commits > donnot merged as below, please accept my election. > Cyborg: https://review.opendev.org/c/openstack/election/+/893300 > > Thanks, > Alex > > -----????----- > ???: Tony Breeds [mailto:tony at bakeyournoodle.com] > ????: 2023?9?21? 7:53 > ???: OpenStack Discuss ; > openstack-announce at lists.openstack.org > ??: [all][elections][ptl] Project Team Lead Election Conclusion and Results > > Thank you to the electorate, to all those who voted and to all candidates > who put their name forward for Project Team Lead (PTL) in this election. A > healthy, open process breeds trust in our decision making capability thank > you to all those who make this process possible. 
> > Now for the results of the PTL election process, please join me in > extending congratulations to the following PTLs: > > * Adjutant : Dale Smith > * Barbican : Grzegorz Grasza > * Blazar : Pierre Riteau > * Cinder : Rajat Dhasmana > * Cloudkitty : Rafael Weingartner > * Designate : Michael Johnson > * Freezer : ge cong > * Glance : Pranali Deore > * Heat : Takashi Kajinami > * Horizon : Vishal Manchanda > * Ironic : Jay Faulkner > * Keystone : Dave Wilde > * Kolla : Michal Nasiadka > * Kuryr : Roman Dobosz > * Magnum : Jake Yip > * Manila : Carlos Silva > * Masakari : sam sue > * Murano : Rong Zhu > * Neutron : Brian Haley > * Nova : Sylvain Bauza > * Octavia : Gregory Thiemonge > * OpenStack Charms : Felipe Reyes > * OpenStack Helm : Vladimir Kozhukalov > * OpenStackAnsible : Dmitriy Rabotyagov > * OpenStackSDK : Artem Goncharov > * Puppet OpenStack : Takashi Kajinami > * Quality Assurance : Martin Kopec > * Solum : Rong Zhu > * Storlets : Takashi Kajinami > * Swift : Tim Burke > * Tacker : Yasufumi Ogawa > * Telemetry : Erno Kuvaja > * Vitrage : Dmitriy Rabotyagov > * Watcher : chen ke > * Zaqar : Hao Wang > * Zun : Hongbin Lu > > Elections: > * OpenStack_Helm: > https://civs1.civs.us/cgi-bin/results.pl?id=E_3cf498bb10adc5b8 > > Election process details and results are also available here: > https://governance.openstack.org/election/ > > Thank you to all involved in the PTL election process, > > Yours Tony. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From songwenping at inspur.com Thu Sep 21 07:18:02 2023 From: songwenping at inspur.com (=?utf-8?B?QWxleCBTb25nICjlrovmloflubMp?=) Date: Thu, 21 Sep 2023 07:18:02 +0000 Subject: =?utf-8?B?562U5aSNOiBbYWxsXVtlbGVjdGlvbnNdW3B0bF0gUHJvamVjdCBUZWFtIExl?= =?utf-8?Q?ad_Election_Conclusion_and_Results?= In-Reply-To: References: <179ce21e3a29c29915df31f5c724c0ce21-9-23gmail.com@g.corp-email.com> Message-ID: Thanks, Dmitriy ???: Dmitriy Rabotyagov [mailto:noonedeadpunk at gmail.com] ????: 2023?9?21? 14:52 ???: Alex Song (???) ??: Tony Breeds ; openstack-discuss ??: Re: [all][elections][ptl] Project Team Lead Election Conclusion and Results Hi, Your candidacy has been submitted after the nomination period is over, which is the reason why it wasn't merged. During upcoming TC meetings there would be a discussion about leaderless projects, and I assume that you will be appointed by TC to this role as a volunteer to lead the project. So don't worry much that Cyborg not in the list for now, but kindly push nominations during required timeline in the future:) On Thu, Sep 21, 2023, 08:46 Alex Song (???) > wrote: Hi Tony, I have voted for the PTL election for Cyborg projects. The commits donnot merged as below, please accept my election. Cyborg: https://review.opendev.org/c/openstack/election/+/893300 Thanks, Alex -----????----- ???: Tony Breeds [mailto:tony at bakeyournoodle.com ] ????: 2023?9?21? 7:53 ???: OpenStack Discuss >; openstack-announce at lists.openstack.org ??: [all][elections][ptl] Project Team Lead Election Conclusion and Results Thank you to the electorate, to all those who voted and to all candidates who put their name forward for Project Team Lead (PTL) in this election. A healthy, open process breeds trust in our decision making capability thank you to all those who make this process possible. 
Now for the results of the PTL election process, please join me in extending congratulations to the following PTLs: * Adjutant : Dale Smith * Barbican : Grzegorz Grasza * Blazar : Pierre Riteau * Cinder : Rajat Dhasmana * Cloudkitty : Rafael Weingartner * Designate : Michael Johnson * Freezer : ge cong * Glance : Pranali Deore * Heat : Takashi Kajinami * Horizon : Vishal Manchanda * Ironic : Jay Faulkner * Keystone : Dave Wilde * Kolla : Michal Nasiadka * Kuryr : Roman Dobosz * Magnum : Jake Yip * Manila : Carlos Silva * Masakari : sam sue * Murano : Rong Zhu * Neutron : Brian Haley * Nova : Sylvain Bauza * Octavia : Gregory Thiemonge * OpenStack Charms : Felipe Reyes * OpenStack Helm : Vladimir Kozhukalov * OpenStackAnsible : Dmitriy Rabotyagov * OpenStackSDK : Artem Goncharov * Puppet OpenStack : Takashi Kajinami * Quality Assurance : Martin Kopec * Solum : Rong Zhu * Storlets : Takashi Kajinami * Swift : Tim Burke * Tacker : Yasufumi Ogawa * Telemetry : Erno Kuvaja * Vitrage : Dmitriy Rabotyagov * Watcher : chen ke * Zaqar : Hao Wang * Zun : Hongbin Lu Elections: * OpenStack_Helm: https://civs1.civs.us/cgi-bin/results.pl?id=E_3cf498bb10adc5b8 Election process details and results are also available here: https://governance.openstack.org/election/ Thank you to all involved in the PTL election process, Yours Tony. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3774 bytes Desc: not available URL: From dtantsur at protonmail.com Thu Sep 21 07:32:58 2023 From: dtantsur at protonmail.com (Dmitry Tantsur) Date: Thu, 21 Sep 2023 07:32:58 +0000 Subject: [all] [diversity] Call for Outreachy mentors & projects Message-ID: <65394ff1-e718-6f0a-cc86-6134437da68d@protonmail.com> Hi teams, Your humble Outreachy coordinators would like to invite you to submit projects for the winter 23/24 cohort! Outreachy provides internships to people subject to systemic bias and impacted by underrepresentation in the technical industry where they are living. See https://www.outreachy.org/ for more details. The deadline for submission is Sep 29th, but please don't wait for the last moment since we need time to review your submissions and suggest potential improvements. Follow https://www.outreachy.org/communities/cfp/openstack/ or reach out to Mahati or myself if you have any questions. NOTE: this is not a call for interns yet. We will notify you separately. Thank you and good luck! Dmitry OpenStack community coordinator in Outreachy P.S. The Metal3 community, adjacent to Ironic, is also looking for projects and mentors: https://www.outreachy.org/communities/cfp/metal3/ Reach out to me if you're interested. From ralonsoh at redhat.com Thu Sep 21 07:54:56 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Thu, 21 Sep 2023 09:54:56 +0200 Subject: [ptls][Bobcat] OpenInfra Live: OpenStack Bobcat In-Reply-To: References: <306DB1B3-7890-48D3-8E34-B7D003B6E30D@openinfra.dev> Message-ID: Same for Neutron, I'll be glad to provide last cycle highlights. On Thu, Sep 21, 2023 at 12:37?AM Rajat Dhasmana wrote: > I can volunteer for Cinder like last time. > > - > Rajat Dhasmana > > On Thu, Sep 21, 2023 at 12:06?AM Jay Faulkner wrote: > >> Always happy to talk about Ironic! 
>> >> - >> Jay Faulkner >> Ironic PTL >> >> On Wed, Sep 20, 2023 at 8:36 AM Kristin Barrientos >> wrote: >> >>> Hi everyone, >>> >>> As we get closer to the OpenStack release, I wanted to reach out to see >>> if any PTL's were interested in providing their Bobcat cycle highlights in >>> an OpenInfra Live[1] episode on Thursday, October 5 at 1400 UTC. Ideally, >>> we would get 4-6 projects represented. Previous examples of OpenStack >>> release episodes can be found here[2] and here[3]. >>> >>> Please let me know if you're interested by Monday, September 25 and I can >>> provide the next steps. If you would like to provide a project update but >>> that time doesn't work for you, please share a recording with me and I can >>> get it added to the project navigator. >>> >>> Thanks, >>> >>> Kristin Barrientos >>> Marketing Coordinator >>> OpenInfra Foundation >>> >>> [1] openinfra.dev/live >>> [2] https://www.youtube.com/watch?v=MSbB3L9_MeY >>> [3] https://www.youtube.com/watch?v=YdLTUTyJ1eU >>> Thanks, >>> >>> Kristin Barrientos >>> Marketing Coordinator >>> OpenInfra Foundation >>> >>> >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From derekokeeffe85 at yahoo.ie Thu Sep 21 08:53:29 2023 From: derekokeeffe85 at yahoo.ie (Derek O keeffe) Date: Thu, 21 Sep 2023 08:53:29 +0000 (UTC) Subject: Problem with Barbican References: <1503834122.6866945.1695286409426.ref@mail.yahoo.com> Message-ID: <1503834122.6866945.1695286409426@mail.yahoo.com> Hi all, Hopefully someone may be able to give me some info/advice on this one. I've been asking in the IRC chat but unfortunately we couldn't get to the bottom of it, and I'm hoping someone here who may not be in the chat might be able to help :) So I have an OpenStack AIO that I'm trying to test Barbican on with a Thales Luna HSM. I get through the instructions to the point where I successfully create the HMAC and MKEK keys. After this I try the following commands: openstack secret store --name mysecret1 --payload testPayload openstack volume create --size 1 --type LUKS 'encrypted volume' Both of those fail and the logs from the Barbican container are here: Paste #bU1E0joDdf4eWiV8wjlj (LodgeIt!) I have made sure that "library_path: /opt/barbican/libs/libCryptoki2.so" in barbican.conf actually exists, so I have no idea really where to look after this. I can't see any info as to where Barbican is trying to find the crypto plugin, or maybe do I need to install something else manually? Anyway I hope someone might have some useful info or advice to help and thanks all for reading. Regards, Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From gthiemonge at redhat.com Thu Sep 21 11:28:23 2023 From: gthiemonge at redhat.com (Gregory Thiemonge) Date: Thu, 21 Sep 2023 13:28:23 +0200 Subject: [octavia] state of taskflow jobboards In-Reply-To: References: Message-ID: Hi, This is a very interesting question; I think we need to reevaluate the status of this feature. When we introduced jobboard, a few bugs were reported and fixed, but IIRC we haven't had any new jobboard-related bugs for a long time (12/18 months maybe?) I know that some operators have been using it; we would like to get their feedback before marking it as production-ready. 
I will try to reach them in the next few weeks (or maybe during the PTG), Greg On Wed, Sep 20, 2023 at 3:39?PM Maximilian Stinsky < maximilian.stinsky at german-edge-cloud.io> wrote: > Greetings, > > > I would be interested in the state of the taskflow jobboard feature in > octavia. > > > [1] states it still is an experimental feature, but in the config > section [2] it is not marked as such. > > If I look into the code I see that the feature is implemented since the U > release. > > > So my question would be is the taskflow jobboard feature still in an > experimental state or is it considered production ready? > > If its still experimental is there any ETA when its expected to be > production ready? > > > Thanks in advance > > Maximilian > > > [1] > https://docs.openstack.org/octavia/latest/install/install-amphorav2.html > > [2] > https://docs.openstack.org/octavia/latest/configuration/configref.html#task_flow.jobboard_enabled > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Thu Sep 21 14:04:50 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Thu, 21 Sep 2023 14:04:50 +0000 Subject: R: [all][elections][tc] Technical Committee Election Results In-Reply-To: References: Message-ID: Congratulations! Thank you to all the candidates and everyone who voted. Kristi -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephenfin at redhat.com Thu Sep 21 14:40:31 2023 From: stephenfin at redhat.com (Stephen Finucane) Date: Thu, 21 Sep 2023 15:40:31 +0100 Subject: [all] oslo.db 14.0.0 breaks at least 4 projects (was: SQLAlchemy 2.x support in Bobcat) In-Reply-To: <0e3a7928-26f4-1767-5305-42a86882e1e2@debian.org> References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> <0e3a7928-26f4-1767-5305-42a86882e1e2@debian.org> Message-ID: <4144bfc73611ec5601a1dd11113521c5b7c80d22.camel@redhat.com> On Fri, 2023-09-15 at 14:11 +0200, Thomas Goirand wrote: > On 9/15/23 13:26, Stephen Finucane wrote: > > On Fri, 2023-09-15 at 13:07 +0200, Thomas Goirand wrote: > > > On 9/13/23 15:39, Stephen Finucane wrote: > > > > On Wed, 2023-09-13 at 09:21 +0200, Thomas Goirand wrote: > > > > > Hi, > > > > > > > > > > As you may know, and to my great frustration, I'm not the maintainer of > > > > > SQLAlchemy in Debian, even though OpenStack is the biggest consumer of > > > > > it. The current maintainer insists that he wants to upload SQLA 2.x in > > > > > Unstable, potentially breaking all of OpenStack. > > > > > > > > > > At the present moment, if I understand correctly, we're not there yet, > > > > > and Bobcat doesn't have such a support. It would be ok for me, *IF* > > > > > there are patches available on master, that I could backport to Bobcat > > > > > and maintain in the debian/patches folder of each project. However, the > > > > > biggest current annoyance, is that I have no idea where we are at. Are > > > > > we close to such a support? Is there a list of patches to apply on top > > > > > of Bobcat that is maintained somewhere? > > > > > > > > > > Please enlighten me... :) > > > > > > > > I think you figured this out on IRC this morning, but the vast majority (though > > > > not all) of the patches are available at the sqlalchemy-20 topic in Gerrit [1]. > > > > I've been working on this for almost 2 years now and have most of the core > > > > projects well on their way but not everything is complete, as you'll tell from > > > > that list. 
I have a canary patch [2] that I've been using to spot missing > > > > services. I plan to pick up the Manila work again early in C, but could do with > > > > help removing the use of autocommit in Heat and the weird test failures I'm > > > > seeing in Cinder [3]. We also need reviews of the Masakri series (Is that > > > > project dead? I can't tell). Once those are addressed, I _think_ we might be > > > > done but who knows what else we'll find... > > > > > > > > Cheers, > > > > Stephen > > > > > > > > [1] https://review.opendev.org/q/topic:sqlalchemy-20+is:open > > > > [2] https://review.opendev.org/c/openstack/requirements/+/879743 > > > > [3] https://review.opendev.org/q/topic:sqlalchemy-20+is:open+project:openstack/cinder > > > > > > Thanks for your work, really! > > > And thanks for the details above. > > > > > > Now, not sure how much this is related, but I've seen that the new > > > oslo.db 14.0.0 breaks: > > > - freezer-api > > > - trove > > > - cloudkitty > > > - watcher > > > > > > Is there any plan to fix oslo.db, or the above projects? Maybe revert > > > some commits in oslo.db? Can someone explain to me what's going on for > > > this as well? > > > > oslo.db has intentionally *not* been bumped to >= 13.0.0 in upper-constraints > > because it introduces many breaking changes that are not compatible with > > SQLAlchemy 2.x. Conversely, oslo.db < 13.0.0 is simply not compatible with > > SQLAlchemy >= 2.x. As such, we can't bump oslo.db until the various projects > > have been fixed which is the same as what we're seeing with SQLAlchemy 2.x. > > > > Fortunately projects that adopt to I have been pushing patches to various > > projects that add a "tips" job for testing main/master branches of SQLAlchemy, > > Alembic, and oslo.db, e.g. [1][2][3][4]. The Neutron folks were well ahead of > > the curve as usual and also have their own (which is where I got the idea from). > > The projects you mention above could do with an equivalent job and my guess is > > that the process of getting there will highlight quite a bit of work that they > > need to do. They need to start at that asap (and tbh really should have started > > at it a long time ago as they've had over 2 years of a warning [5]). > > > > Cheers, > > Stephen > > > > [1] https://review.opendev.org/c/openstack/barbican/+/888308 > > [2] https://review.opendev.org/c/openstack/placement/+/886229 > > [3] https://review.opendev.org/c/openstack/cinder/+/886152 > > [4] https://review.opendev.org/c/openstack/glance/+/889066 > > [5] https://lists.openstack.org/pipermail/openstack-discuss/2021-August/024122.html > > FYI, I also fell into the trap of castellan 4.2.0. I'm writing this here > for others *not* to do the same mistake as me: do not upgrade, or Cinder > will fail its unit tests, and only 4.1.0 is the upper bound. Switching > back to 4.1.0 made Cinder build correctly for me. That version bump should have been a major version bump, but it's a pity we had to revert them: the fixes look to have been pretty simple. https://review.opendev.org/c/openstack/cinder/+/896121 https://review.opendev.org/c/openstack/nova/+/896100 Stephen > > The good thing with castellan is, I forgot to upload it to Experimental, > so to the contrary of oslo.db, it's okish for me ... 
:) > > I hope that helps others, > Cheers, > > Thomas Goirand (zigo) > > From hberaud at redhat.com Thu Sep 21 14:44:06 2023 From: hberaud at redhat.com (Herve Beraud) Date: Thu, 21 Sep 2023 16:44:06 +0200 Subject: [release][requirements][TC][oslo] how to manage divergence between runtime and doc? Message-ID: We currently face a big issue, one which could have huge impacts on the whole of OpenStack and on our customers. By customers I mean all the users outside the upstream community: operators, distro maintainers, IT vendors, etc. Our problem is that, for a given series, Bobcat in our case, there is divergence between the versions that we announce as supported to our customers and the versions really supported in our runtime. Let me describe the problem. The oslo.db versions supported within Bobcat's runtime [1] don't reflect the reality of the versions really generated during Bobcat [2]. In Bobcat's upper-constraints, oslo.db 12.3.2 [1] is the supported version. This version corresponds in reality to the last version generated during 2023.1/antelope [3]. All the versions of oslo.db generated during Bobcat are for now ignored by our runtime. However, all these generated versions are listed in our technical documentation as supported by Bobcat. In fact, the problem is that these oslo.db versions are all stuck in their upper-constraints upgrade, because some cross-jobs failed and so the upper-constraints update can't be made. These cross-jobs are owned by different services (heat, manila, masakari, etc). We update our technical documentation each time we produce a new version of a deliverable, so before upgrading the upper-constraints. This is why the listed versions diverge from the versions really supported at runtime. We also face a similar issue with Castellan, but for the sake of clarity I'll focus on oslo.db's case during the rest of this thread. 
This kind of government requirement often arrives. It can be requested for a vendor who wants to be allowed to sell to a government, or to be allowed to respect some specific IT laws in a given country. This last point can completely undermine the quality of the work carried out upstream within the Openstack community. So, now, we have to find the root causes of this problem. In the current case, we would think that the root cause lives in the complexity of oslo.db migration, yet this is not the case. Even if this migration represents a major change in Openstack, it has been announced two year ago [8] - the equivalent of 4 series -, leaving a lot of time for every team to adopt the latest versions of oslo.db and sqlalchemy 2.x. Stephen Finucane and Mike Bayer have spent a lot of time on this topic. Stephen even contributed well beyond the oslo world, by proposing several patches to migrate services [9]. Unfortunately a lot of these patches remain yet unmerged and unreviewed [10], which has led us to this situation. This migration is therefore by no means the root cause of this problem. The root cause of this problem lurks in the volume of maintenance of services. Indeed the main cause of this issue is that some services are not able to follow the cadence, and therefore they slow down libraries' evolutions and maintenance. Hence, their requirements cross job reflect this fact [11]. This lack of activity is often due to the lack of maintainers. Fortunately Bobcat has been rescued by Stephen's recent fixes [12][13]. Stephen's elegant solution allowed us to solve failing cross jobs [14] and hence, allowed us to resync our technical documentation and our runtime. However, we can't ignore that the lack of maintainers is a growing trend within Openstack. As evidenced by the constant decrease in the number of contributors from series to series [15][16][17][18]. This phenomenon therefore risks becoming more and more amplified. So, we must provide a lasting response. A response more based on team process than on isolated human resources. A first solution could be to modify our workflow a little. We could update our technical documentation by triggering a job with the upper-constraints update rather than with a new release patch. Hence, the documentation and the runtime will be well aligned. However, we should notice that not all deliverables are listed in upper-constraints, hence this is a partial solution that won't work for our services. A second solution would be to monitor teams activity by monitoring the upper-constraints updates with failing cross-job. That would be a new task for the requirements team. The goal of this monitoring would be to inform the TC that some deliverables are not active enough. This monitoring would be to analyze, at defined milestones, which upper-constraints update remains blocked for a while, and then look at the cross-job failing to see if it is due to a lack of activity from the service side. For example by analyzing if patches, like those proposed by Stephen on services, remain unmerged. Then the TC would be informed. It would be a kind of signal addressed to the TC. Then the TC would be free to make a decision (abandoning this deliverable, removing cross-job, put-your-idea-here). The requirements team already provides such great job and expertise. Without them we wouldn't have solved the oslo.db and castellan case in time. However, I think we lack of aTC involvement a little bit earlier in the series to avoid fire fighter moments. 
The monitoring would officialize problems with deliverables sooners in the life cycle and would trigger a TC involvement. Here is the opportunity for us to act to better anticipate the growing phenomenon of lack of maintainers. Here is the opportunity for us to better anticipate our available human resources. Here is the opportunity for us to better handle this kind of incident in the future. Thus, we could integrate substantive actions in terms of human resources management into the life cycle of Openstack. It is time to manage this pain point, because in the long term, if nothing is done now, this problem will repeat itself again and again. Concerning the feasibility of this solution, the release team already created some similar monitoring. This monitoring is made during each series at specific milestones. The requirements team could trigger its monitoring at specific milestones targets, not too close to the series deadline. Hence we would be able to anticipate decisions. The requirements team could inspire from the release management process [19] to create their own monitoring. We already own almost the things we need to create a new process dedicated to this monitoring. Hence, this solution is feasible. The usefulness of this solution is obvious. Indeed, thus the TC would have better governance monitoring. A monitoring not based on people elected as TC members but based on process and so transmissible from a college to another. Therefore, three teams would then work together on the topic of decreasing activity inside teams. >From a global point of view, this will allow Openstack to more efficiently keep pace with the resources available from series to series. I would now like to special thank Stephen for his investment throughout these two years dedicated to the oslo.db migration. I would especially like to congratulate Stephen for the quality of the work carried out. Stephen helped us to solve the problem in an elegant manner. Without his expertise, delivering Bobcat would have been really painful. However, we should not forget that Stephen remains a human resource of Openstack and we should not forget that his expertise could go away from Openstack one day or one other. Solving this type of problem cannot only rest on the shoulders of one person. Let's take collective initiatives now and put in place safeguards. Thanks for your reading and thanks to all the people who helped with this topic and that I have not cited here. I think other solutions surely coexist and I'll be happy to discuss this topic with you. 
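To make the monitoring idea a bit more concrete, here is a very rough sketch of the kind of query it could start from. It is purely illustrative (the age threshold and the exact search terms are assumptions on my side); it only uses the public Gerrit REST API on review.opendev.org to list upper-constraints updates that sit open with a failing Verified vote:

# list requirements patches stuck open with a failing Verified vote for 2+ weeks
$ curl -s "https://review.opendev.org/changes/?q=project:openstack/requirements+status:open+label:Verified-1+age:2w" \
    | tail -n +2 \
    | python3 -c 'import json, sys
for change in json.load(sys.stdin):
    print(change["_number"], change["subject"])'
# tail -n +2 strips the )]}' anti-XSSI prefix Gerrit puts before the JSON

Running something like this at each milestone and sending the resulting list to the TC would already provide the early signal described above, without requiring any new tooling on the Gerrit side.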
[1] https://opendev.org/openstack/requirements/src/branch/master/upper-constraints.txt#L482 [2] https://releases.openstack.org/bobcat/index.html#bobcat-oslo-db [3] https://opendev.org/openstack/releases/src/branch/master/deliverables/antelope/oslo.db.yaml#L22 [4] https://review.opendev.org/c/openstack/requirements/+/873390 [5] https://review.opendev.org/c/openstack/requirements/+/878130 [6] https://opendev.org/openstack/oslo.log/compare/5.1.0...5.2.0 [7] https://lists.openstack.org/pipermail/openstack-discuss/2023-September/035100.html [8] https://lists.openstack.org/pipermail/openstack-discuss/2021-August/024122.html [9] https://review.opendev.org/q/topic:sqlalchemy-20 [10] https://review.opendev.org/q/topic:sqlalchemy-20+status:open [11] https://review.opendev.org/c/openstack/requirements/+/887261 [12] https://opendev.org/openstack/oslo.db/commit/115c3247b486c713176139422647144108101ca3 [13] https://opendev.org/openstack/oslo.db/commit/4ee79141e601482fcde02f0cecfb561ecb79e1b6 [14] https://review.opendev.org/c/openstack/requirements/+/896053 [15] https://www.openstack.org/software/ussuri [16] https://www.openstack.org/software/victoria [17] https://www.openstack.org/software/xena [18] https://www.openstack.org/software/antelope/ [19] https://releases.openstack.org/reference/process.html#between-milestone-2-and-milestone-3 -- Herv? Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephenfin at redhat.com Thu Sep 21 14:50:17 2023 From: stephenfin at redhat.com (Stephen Finucane) Date: Thu, 21 Sep 2023 15:50:17 +0100 Subject: [all] oslo.db 14.0.0 breaks at least 4 projects (was: SQLAlchemy 2.x support in Bobcat) In-Reply-To: References: <007fb398-0662-e1fe-7d0d-ebef6f54cbea@debian.org> Message-ID: <887f68643e62ef89530d96926ab452018325481f.camel@redhat.com> On Fri, 2023-09-15 at 12:26 +0100, Stephen Finucane wrote: > On Fri, 2023-09-15 at 13:07 +0200, Thomas Goirand wrote: > > On 9/13/23 15:39, Stephen Finucane wrote: > > > On Wed, 2023-09-13 at 09:21 +0200, Thomas Goirand wrote: > > > > Hi, > > > > > > > > As you may know, and to my great frustration, I'm not the maintainer of > > > > SQLAlchemy in Debian, even though OpenStack is the biggest consumer of > > > > it. The current maintainer insists that he wants to upload SQLA 2.x in > > > > Unstable, potentially breaking all of OpenStack. > > > > > > > > At the present moment, if I understand correctly, we're not there yet, > > > > and Bobcat doesn't have such a support. It would be ok for me, *IF* > > > > there are patches available on master, that I could backport to Bobcat > > > > and maintain in the debian/patches folder of each project. However, the > > > > biggest current annoyance, is that I have no idea where we are at. Are > > > > we close to such a support? Is there a list of patches to apply on top > > > > of Bobcat that is maintained somewhere? > > > > > > > > Please enlighten me... :) > > > > > > I think you figured this out on IRC this morning, but the vast majority (though > > > not all) of the patches are available at the sqlalchemy-20 topic in Gerrit [1]. > > > I've been working on this for almost 2 years now and have most of the core > > > projects well on their way but not everything is complete, as you'll tell from > > > that list. I have a canary patch [2] that I've been using to spot missing > > > services. 
I plan to pick up the Manila work again early in C, but could do with > > > help removing the use of autocommit in Heat and the weird test failures I'm > > > seeing in Cinder [3]. We also need reviews of the Masakri series (Is that > > > project dead? I can't tell). Once those are addressed, I _think_ we might be > > > done but who knows what else we'll find... > > > > > > Cheers, > > > Stephen > > > > > > [1] https://review.opendev.org/q/topic:sqlalchemy-20+is:open > > > [2] https://review.opendev.org/c/openstack/requirements/+/879743 > > > [3] https://review.opendev.org/q/topic:sqlalchemy-20+is:open+project:openstack/cinder > > > > Thanks for your work, really! > > And thanks for the details above. > > > > Now, not sure how much this is related, but I've seen that the new > > oslo.db 14.0.0 breaks: > > - freezer-api > > - trove > > - cloudkitty > > - watcher > > > > Is there any plan to fix oslo.db, or the above projects? Maybe revert > > some commits in oslo.db? Can someone explain to me what's going on for > > this as well? > > oslo.db has intentionally *not* been bumped to >= 13.0.0 in upper-constraints > because it introduces many breaking changes that are not compatible with > SQLAlchemy 2.x. Conversely, oslo.db < 13.0.0 is simply not compatible with > SQLAlchemy >= 2.x. As such, we can't bump oslo.db until the various projects > have been fixed which is the same as what we're seeing with SQLAlchemy 2.x. > > Fortunately projects that adopt to I have been pushing patches to various > projects that add a "tips" job for testing main/master branches of SQLAlchemy, > Alembic, and oslo.db, e.g. [1][2][3][4]. The Neutron folks were well ahead of > the curve as usual and also have their own (which is where I got the idea from). > The projects you mention above could do with an equivalent job and my guess is > that the process of getting there will highlight quite a bit of work that they > need to do. They need to start at that asap (and tbh really should have started > at it a long time ago as they've had over 2 years of a warning [5]). Just closing this out, we reverted some of the removals of oslo.db functionality that and cut a new 14.1.0 release. The 15.0.0 release will come out early in C and re-remove these features. I'd ask that the outstanding projects that are still facing issues with oslo.db merge the patches to address these issues asap. Heat: https://review.opendev.org/q/topic:sqlalchemy-20+project:openstack/heat (note: There are intermittent failures due to random test order causing duplicate keys. Will be fixed shortly.) Aodh: https://review.opendev.org/q/topic:sqlalchemy-20+project:openstack/aodh (note: Gate is broken. Have proposed a temporary fix [1]) Masakari: https://review.opendev.org/q/topic:sqlalchemy-20+project:openstack/masakari Cheers, Stephen [1] https://review.opendev.org/c/openstack/aodh/+/896125 > > Cheers, > Stephen > > [1] https://review.opendev.org/c/openstack/barbican/+/888308 > [2] https://review.opendev.org/c/openstack/placement/+/886229 > [3] https://review.opendev.org/c/openstack/cinder/+/886152 > [4] https://review.opendev.org/c/openstack/glance/+/889066 > [5] https://lists.openstack.org/pipermail/openstack-discuss/2021-August/024122.html > > > > > Cheers, > > > > Thomas Goirand (zigo) > > > From kkloppenborg at resetdata.com.au Thu Sep 21 15:19:51 2023 From: kkloppenborg at resetdata.com.au (Karl Kloppenborg) Date: Thu, 21 Sep 2023 15:19:51 +0000 Subject: Cyborg nova reports mdev-capable resource is not available Message-ID: Hi Cyborg Team! 
Karl from Helm Team. When creating a VM with the correct flavor, the mdev gets created by cyborg agent and I can see it in the nodedev-list --cap mdev. However Nova then fails with: nova.virt.libvirt.driver [- - default default] Searching for available mdevs... _get_existing_mdevs_not_assigned /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py :8357 2023-09-21 14:34:47.808 1901814 INFO nova.virt.libvirt.driver [ - - default default] Available mdevs at: set(). 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ - - default default] No available mdevs where found. Creating an new one... _allocate_mdevs /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driv er.py:8496 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ - - default default] Attempting to create new mdev... _create_new_mediated_device /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py:8385 2023-09-21 14:34:48.455 1901814 INFO nova.virt.libvirt.driver [ - - default default] Failed to create mdev. No free space found among the following devices: ['pci_0000_4b_03_1', ? ]. 2023-09-21 14:34:48.456 1901814 ERROR nova.compute.manager [ - - default default] [instance: 2026e2a2-b17a-43ab-adcb-62a907f58b51] Instance failed to spawn: nova.exception.ComputeResourcesUnavailable: Insufficient compute resources: mdev-capable resource is not available. Once this happened, ARQ removes the mdev and cleans up. I?ve got Cyborg 2023.2 running and have a device profile like so: karl at Karls-Air ~ % openstack accelerator device profile show e2b07e11-fe69-4f33-83fc-0f9e38adb7ae +-------------+---------------------------------------------------------------------------+ | Field | Value | +-------------+---------------------------------------------------------------------------+ | created_at | 2023-09-21 13:30:05+00:00 | | updated_at | None | | uuid | e2b07e11-fe69-4f33-83fc-0f9e38adb7ae | | name | VGPU_A40-Q48 | | groups | [{'resources:VGPU': '1', 'trait:CUSTOM_NVIDIA_2235_A40_48Q': 'required'}] | | description | None | +-------------+---------------------------------------------------------------------------+ karl at Karls-Air ~ % I can see the allocation candidate: karl at Karls-Air ~ % openstack allocation candidate list --resource VGPU=1 | grep A40 | 41 | VGPU=1 | 229bf15f-5689-3d2c-b37b-5c8439ea6a71 | VGPU=0/1 | OWNER_CYBORG,CUSTOM_NVIDIA_2235_A40_48Q | karl at Karls-Air ~ % Am I missing something critical here? Because I cannot seem to figure this out? have I got a PCI address wrong, or something? Any help from the Cyborg or Nova teams would be fantastic. Thanks, Karl. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbauza at redhat.com Thu Sep 21 15:48:48 2023 From: sbauza at redhat.com (Sylvain Bauza) Date: Thu, 21 Sep 2023 17:48:48 +0200 Subject: Cyborg nova reports mdev-capable resource is not available In-Reply-To: References: Message-ID: Le jeu. 21 sept. 2023 ? 17:27, Karl Kloppenborg < kkloppenborg at resetdata.com.au> a ?crit : > Hi Cyborg Team! > > Karl from Helm Team. > > > > When creating a VM with the correct flavor, the mdev gets created by > cyborg agent and I can see it in the nodedev-list --cap mdev. > > However Nova then fails with: > > nova.virt.libvirt.driver [- - default default] Searching for > available mdevs... 
_get_existing_mdevs_not_assigned > /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py > > :8357 > > 2023-09-21 14:34:47.808 1901814 INFO nova.virt.libvirt.driver [ - > - default default] Available mdevs at: set(). > > 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ > - - default default] No available mdevs where found. Creating an new one... > _allocate_mdevs > /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driv > > er.py:8496 > > 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ > - - default default] Attempting to create new mdev... > _create_new_mediated_device > /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py:8385 > > 2023-09-21 14:34:48.455 1901814 INFO nova.virt.libvirt.driver [ - > - default default] Failed to create mdev. No free space found among the > following devices: ['pci_0000_4b_03_1', ? ]. > > 2023-09-21 14:34:48.456 1901814 ERROR nova.compute.manager [ - - > default default] [instance: 2026e2a2-b17a-43ab-adcb-62a907f58b51] Instance > failed to spawn: nova.exception.ComputeResourcesUnavailable: Insufficient > compute resources: mdev-capable resource is not available. > > > I don't exactly remember how Cyborg passes the devices to nova/libvirt but this exception is because none of the available GPUs have either existing mdevs or capability for creating mdevs. You should first check sysfs to double-check the state of our GPU devices in order to understand how much of vGPU capacity you still have. -Sylvain Once this happened, ARQ removes the mdev and cleans up. > > > > I?ve got Cyborg 2023.2 running and have a device profile like so: > > karl at Karls-Air ~ % openstack accelerator device profile show > e2b07e11-fe69-4f33-83fc-0f9e38adb7ae > > > +-------------+---------------------------------------------------------------------------+ > > | Field | > Value | > > > +-------------+---------------------------------------------------------------------------+ > > | created_at | 2023-09-21 > 13:30:05+00:00 | > > | updated_at | > None | > > | uuid | > e2b07e11-fe69-4f33-83fc-0f9e38adb7ae | > > | name | > VGPU_A40-Q48 | > > | groups | [{'resources:VGPU': '1', > 'trait:CUSTOM_NVIDIA_2235_A40_48Q': 'required'}] | > > | description | > None | > > > +-------------+---------------------------------------------------------------------------+ > > karl at Karls-Air ~ % > > > > I can see the allocation candidate: > > karl at Karls-Air ~ % openstack allocation candidate list --resource VGPU=1 > | grep A40 > > | 41 | VGPU=1 | 229bf15f-5689-3d2c-b37b-5c8439ea6a71 | > VGPU=0/1 | OWNER_CYBORG,CUSTOM_NVIDIA_2235_A40_48Q | > > karl at Karls-Air ~ % > > > > > > Am I missing something critical here? Because I cannot seem to figure this > out? have I got a PCI address wrong, or something? > > > > Any help from the Cyborg or Nova teams would be fantastic. > > > > Thanks, > Karl. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Thu Sep 21 15:50:05 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Thu, 21 Sep 2023 08:50:05 -0700 Subject: [all][elections][tc] Technical Committee Election Results In-Reply-To: References: Message-ID: Tony and the rest of the election officials, thank you for the work putting on the election. I encourage everyone in the community -- if you're seeing this email and didn't vote; take action *today* to ensure you'll be setup for the next election. 
We had over 600 contributors in the Antelope development cycle; only 309 of those were eligible to receive a ballot and only 51 of those voted. Over the last year we've heard a number of stories about parts of the OSS -- and cloud ecosystems getting less open. I appreciate that we operate OpenStack under the four opens; but with that comes the responsibility to participate in the processes that help ensure we remain open. All that is to say: the requirements to be eligible to vote are simple, and are enumerated here: https://governance.openstack.org/election/#electorate -- the hard part, contributing to OpenStack, most of you have already done! Please don't find yourself in six months wondering why you weren't able to vote again; let's get it knocked out today! Thanks, Jay Faulkner TC Member Ironic PTL On Wed, Sep 20, 2023 at 5:13?PM Tony Breeds wrote: > Please join me in congratulating the 4 newly elected members of the > Technical Committe (TC). > > Dan Smith (dansmith) > Jens Harbott (frickler) > Ghanshyam Mann (gmann) > Jay Faulkner (JayF) > > Full results: > https://civs1.civs.us/cgi-bin/results.pl?id=E_36ebf2eba022aded > > Election process details and results are also available here: > https://governance.openstack.org/election/ > > Thank you to all of the candidates, having a good group of candidates > helps engage the community in our democratic process. > > Thank you to all who voted and who encouraged others to vote. We need > to ensure your voice is heard. > > Thank you for another great round. > > Yours Tony. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Thu Sep 21 16:11:54 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 21 Sep 2023 09:11:54 -0700 Subject: =?UTF-8?Q?Re:_=E7=AD=94=E5=A4=8D:_[all][elections][ptl]_Project_T?= =?UTF-8?Q?eam_Lead_Election_Conclusion_and_Results?= In-Reply-To: References: <179ce21e3a29c29915df31f5c724c0ce21-9-23gmail.com@g.corp-email.com> Message-ID: <18ab8831bb2.eff9601e1060142.2127745672408878122@ghanshyammann.com> Hi Alex, As next step, TC will be discussing it for leadership, and your late nomination is ack. -gmann ---- On Thu, 21 Sep 2023 00:18:02 -0700 Alex Song (???) wrote --- > > ? > Thanks, ?Dmitriy > ? > ???: Dmitriy Rabotyagov [mailto:noonedeadpunk at gmail.com] > ????: 2023?9?21? 14:52 > ???: Alex Song (???) songwenping at inspur.com> > ??: Tony Breeds tony at bakeyournoodle.com>; openstack-discuss openstack-discuss at lists.openstack.org> > ??: Re: [all][elections][ptl] Project Team Lead Election Conclusion and Results > ? > Hi, > ? > Your candidacy has been submitted after the nomination period is over, which is the reason why it wasn't merged. > During upcoming TC meetings there would be a discussion about leaderless projects, and I assume that you will be appointed by TC to this role as a volunteer to lead the project. > ? > So don't worry much that Cyborg not in the list for now, but kindly push nominations during required timeline in the future:) > On Thu, Sep 21, 2023, 08:46 Alex Song (???) songwenping at inspur.com> wrote: > Hi Tony, > > ? ? ? ? I have voted for the PTL election for Cyborg projects. The commits donnot merged as below, please accept my election. > ? ? ? ? Cyborg: https://review.opendev.org/c/openstack/election/+/893300 > > Thanks, > Alex > > -----????----- > ???: Tony Breeds [mailto:tony at bakeyournoodle.com] > ????: 2023?9?21? 
7:53 > ???: OpenStack Discuss openstack-discuss at lists.openstack.org>; openstack-announce at lists.openstack.org > ??: [all][elections][ptl] Project Team Lead Election Conclusion and Results > > Thank you to the electorate, to all those who voted and to all candidates who put their name forward for Project Team Lead (PTL) in this election. A healthy, open process breeds trust in our decision making capability thank you to all those who make this process possible. > > Now for the results of the PTL election process, please join me in extending congratulations to the following PTLs: > > * Adjutant : Dale Smith > * Barbican : Grzegorz Grasza > * Blazar : Pierre Riteau > * Cinder : Rajat Dhasmana > * Cloudkitty : Rafael Weingartner > * Designate : Michael Johnson > * Freezer : ge cong > * Glance : Pranali Deore > * Heat : Takashi Kajinami > * Horizon : Vishal Manchanda > * Ironic : Jay Faulkner > * Keystone : Dave Wilde > * Kolla : Michal Nasiadka > * Kuryr : Roman Dobosz > * Magnum : Jake Yip > * Manila : Carlos Silva > * Masakari : sam sue > * Murano : Rong Zhu > * Neutron : Brian Haley > * Nova : Sylvain Bauza > * Octavia : Gregory Thiemonge > * OpenStack Charms : Felipe Reyes > * OpenStack Helm : Vladimir Kozhukalov > * OpenStackAnsible : Dmitriy Rabotyagov > * OpenStackSDK : Artem Goncharov > * Puppet OpenStack : Takashi Kajinami > * Quality Assurance : Martin Kopec > * Solum : Rong Zhu > * Storlets : Takashi Kajinami > * Swift : Tim Burke > * Tacker : Yasufumi Ogawa > * Telemetry : Erno Kuvaja > * Vitrage : Dmitriy Rabotyagov > * Watcher : chen ke > * Zaqar : Hao Wang > * Zun : Hongbin Lu > > Elections: > * OpenStack_Helm: https://civs1.civs.us/cgi-bin/results.pl?id=E_3cf498bb10adc5b8 > > Election process details and results are also available here: > https://governance.openstack.org/election/ > > Thank you to all involved in the PTL election process, > > Yours Tony. > From kkloppenborg at resetdata.com.au Thu Sep 21 16:43:13 2023 From: kkloppenborg at resetdata.com.au (Karl Kloppenborg) Date: Thu, 21 Sep 2023 16:43:13 +0000 Subject: Cyborg nova reports mdev-capable resource is not available In-Reply-To: References: Message-ID: Hi Sylvian, Thanks for getting back to me. So the vGPU is available and cyborg is allocating it using ARQ binding. You can see Nova receives this request: 2023-09-21 16:38:51.889 1901814 DEBUG nova.compute.manager [None req-97062e9c-0c44-480e-9918-4a5a810175b2 78e83e5a446e4071ae43e823135dcb3c 21eb701c2a1f48b38dab8f34c0a20902 - - default default] ARQs for spec:{'2d60c353-0419-4b67-8cb7-913fc6f5cef9': {'uuid': '2d60c353-0419-4b67-8cb7-913fc6f5cef9', 'state': 'Bound', 'device_profile_name': 'VGPU_A40-Q48', 'device_profile_group_id': 0, 'hostname': 'gpu-c-01', 'device_rp_uuid': '229bf15f-5689-3d2c-b37b-5c8439ea6a71', 'instance_uuid': '1b090007-791b-4997-af89-0feb886cf11d', 'project_id': None, 'attach_handle_type': 'MDEV', 'attach_handle_uuid': '866bd6a5-b156-4251-a969-64fefb32f16f', 'attach_handle_info': {'asked_type': 'nvidia-566', 'bus': 'ca', 'device': '01', 'domain': '0000', 'function': '1', 'vgpu_mark': 'nvidia-566_0'}, 'links': [{'href': 'http://cyborg-api.openstack.svc.cluster.local:6666/accelerator/v2/accelerator_requests/2d60c353-0419-4b67-8cb7-913fc6f5cef9', 'rel': 'self'}], 'created_at': '2023-09-21T16:38:42+00:00', 'updated_at': '2023-09-21T16:38:42+00:00'}}, ARQs for network:{} _build_resources /var/lib/openstack/lib/python3.10/site-packages/nova/compute/manager.py:2680 So the mdev is then allocated in the resource providers at that point. 
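For completeness, this is how I have been checking the remaining capacity on the hypervisor by hand. The PCI address (0000:ca:01.1) and the type (nvidia-566) are the ones from the ARQ above; the paths follow the standard vfio-mdev sysfs layout, so treat the exact commands as my assumption of where to look rather than gospel:

# existing mdevs on the host
$ ls /sys/bus/mdev/devices/
# remaining capacity per supported type on the VF the ARQ points at
$ cd /sys/bus/pci/devices/0000:ca:01.1/mdev_supported_types
$ for t in nvidia-*; do echo "$t $(cat $t/name) available_instances=$(cat $t/available_instances)"; done

If available_instances for nvidia-566 were already 0 on every VF that would at least explain the "No free space found" message, but that is not what I expect to see on a card with only this one vGPU allocated.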
Is there some cyborg nova patching code I am missing? From: Sylvain Bauza Date: Friday, 22 September 2023 at 1:49 am To: Karl Kloppenborg Cc: openstack-discuss at lists.openstack.org Subject: Re: Cyborg nova reports mdev-capable resource is not available Le jeu. 21 sept. 2023 ? 17:27, Karl Kloppenborg > a ?crit : Hi Cyborg Team! Karl from Helm Team. When creating a VM with the correct flavor, the mdev gets created by cyborg agent and I can see it in the nodedev-list --cap mdev. However Nova then fails with: nova.virt.libvirt.driver [- - default default] Searching for available mdevs... _get_existing_mdevs_not_assigned /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py :8357 2023-09-21 14:34:47.808 1901814 INFO nova.virt.libvirt.driver [ - - default default] Available mdevs at: set(). 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ - - default default] No available mdevs where found. Creating an new one... _allocate_mdevs /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driv er.py:8496 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ - - default default] Attempting to create new mdev... _create_new_mediated_device /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py:8385 2023-09-21 14:34:48.455 1901814 INFO nova.virt.libvirt.driver [ - - default default] Failed to create mdev. No free space found among the following devices: ['pci_0000_4b_03_1', ? ]. 2023-09-21 14:34:48.456 1901814 ERROR nova.compute.manager [ - - default default] [instance: 2026e2a2-b17a-43ab-adcb-62a907f58b51] Instance failed to spawn: nova.exception.ComputeResourcesUnavailable: Insufficient compute resources: mdev-capable resource is not available. I don't exactly remember how Cyborg passes the devices to nova/libvirt but this exception is because none of the available GPUs have either existing mdevs or capability for creating mdevs. You should first check sysfs to double-check the state of our GPU devices in order to understand how much of vGPU capacity you still have. -Sylvain Once this happened, ARQ removes the mdev and cleans up. I?ve got Cyborg 2023.2 running and have a device profile like so: karl at Karls-Air ~ % openstack accelerator device profile show e2b07e11-fe69-4f33-83fc-0f9e38adb7ae +-------------+---------------------------------------------------------------------------+ | Field | Value | +-------------+---------------------------------------------------------------------------+ | created_at | 2023-09-21 13:30:05+00:00 | | updated_at | None | | uuid | e2b07e11-fe69-4f33-83fc-0f9e38adb7ae | | name | VGPU_A40-Q48 | | groups | [{'resources:VGPU': '1', 'trait:CUSTOM_NVIDIA_2235_A40_48Q': 'required'}] | | description | None | +-------------+---------------------------------------------------------------------------+ karl at Karls-Air ~ % I can see the allocation candidate: karl at Karls-Air ~ % openstack allocation candidate list --resource VGPU=1 | grep A40 | 41 | VGPU=1 | 229bf15f-5689-3d2c-b37b-5c8439ea6a71 | VGPU=0/1 | OWNER_CYBORG,CUSTOM_NVIDIA_2235_A40_48Q | karl at Karls-Air ~ % Am I missing something critical here? Because I cannot seem to figure this out? have I got a PCI address wrong, or something? Any help from the Cyborg or Nova teams would be fantastic. Thanks, Karl. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ekuvaja at redhat.com Thu Sep 21 16:55:51 2023 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 21 Sep 2023 17:55:51 +0100 Subject: Requirements FFE to return Sphinx to known working version Message-ID: Hi all, I'm writing to request FFE for https://review.opendev.org/c/openstack/requirements/+/896018 Revert "Bump Sphinx to the latest version" as it broke the gate of at least following projects: Aodh https://zuul.opendev.org/t/openstack/build/5df111c602944396bf6e334e1bec4e42 Solum https://zuul.opendev.org/t/openstack/build/e4b170cc25f2428caadce8870d27c62e Mistral https://zuul.opendev.org/t/openstack/build/a3e00a6312cf46059fa81c56fe9b5e11 Stephen kindly provided monkeypathing for Aodh as alternative https://review.opendev.org/c/openstack/aodh/+/896125 With kind regards that we shouldn't revert this breaking patch due to it being breaking via wsme dependency that is apparently not really maintained anymore. I'd be easily agreeing with that if Aodh was the only project suffering the breakage or using it at all, but the lib is in use for about dozen openstack projects and this dependency bump broke the gates for three of them. Other alternative for this is that wsme would get released (the fix is merged already) and we would bump that in the upper constraints. I would personally prefer any other option than needing to monkeypatch/vendor the wsme code in the projects long enough that we can find a solution to remove it as dependency across. Thanks for looking into this, Erno "jokke" Kuvaja -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay at gr-oss.io Thu Sep 21 18:57:44 2023 From: jay at gr-oss.io (Jay Faulkner) Date: Thu, 21 Sep 2023 11:57:44 -0700 Subject: [ironic] [ptl] Announcement of non-candidacy for D / 2024.2 Message-ID: Hi all, I was just re-elected (thank you!) for my third team as Ironic PTL. By the time the D series starts, it'll be 18 months serving as PTL of the Ironic team. I believe it may cause harm to our project to have the PTL position stay on one individual for too long. For that reason, I'm announcing my non-candidacy for the next PTL election, for the "D" cycle, 2024.2. Ironic contributors who have an interest in being PTL, please reach out to me -- I'd be happy to work with you to ensure there is a good transition. Thanks, Jay Faulkner Ironic PTL -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Thu Sep 21 21:27:48 2023 From: johnsomor at gmail.com (Michael Johnson) Date: Thu, 21 Sep 2023 14:27:48 -0700 Subject: Requirements FFE to return Sphinx to known working version In-Reply-To: References: Message-ID: There are a few of us that keep an eye on wsme, so I would not say it's unmaintained. I too would lean towards a wsme release, but changing requirements this late is problematic. I think if we want to do this release we should do it now so that the project teams can get any fixes needed before the final RCs next week. 
Michael On Thu, Sep 21, 2023 at 10:01?AM Erno Kuvaja wrote: > > Hi all, > > I'm writing to request FFE for https://review.opendev.org/c/openstack/requirements/+/896018 > Revert "Bump Sphinx to the latest version" as it broke the gate of at least following projects: > Aodh https://zuul.opendev.org/t/openstack/build/5df111c602944396bf6e334e1bec4e42 > Solum https://zuul.opendev.org/t/openstack/build/e4b170cc25f2428caadce8870d27c62e > Mistral https://zuul.opendev.org/t/openstack/build/a3e00a6312cf46059fa81c56fe9b5e11 > > > Stephen kindly provided monkeypathing for Aodh as alternative > https://review.opendev.org/c/openstack/aodh/+/896125 With kind regards that we shouldn't > revert this breaking patch due to it being breaking via wsme dependency that is apparently > not really maintained anymore. I'd be easily agreeing with that if Aodh was the only project > suffering the breakage or using it at all, but the lib is in use for about dozen openstack > projects and this dependency bump broke the gates for three of them. > > Other alternative for this is that wsme would get released (the fix is merged already) and we > would bump that in the upper constraints. I would personally prefer any other option than > needing to monkeypatch/vendor the wsme code in the projects long enough that we can find a > solution to remove it as dependency across. > > Thanks for looking into this, > > Erno "jokke" Kuvaja From nguyenhuukhoinw at gmail.com Fri Sep 22 04:59:34 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Fri, 22 Sep 2023 11:59:34 +0700 Subject: [kolla-ansible][keystone][skyline] Keystone use SSO for authentication Message-ID: Hello guys. I want to share a lab to help newbies understand how setup Keystone and Skyline to use Google SSO for authentication, https://github.com/ngyenhuukhoi/openstack-note/blob/main/skyline%20with%20sso.md I will try to contribute what I can for newbies and community. Nguyen Huu Khoi -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Fri Sep 22 07:33:52 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 22 Sep 2023 09:33:52 +0200 Subject: [neutron] Neutron drivers meeting cancelled Message-ID: Hello Neutrinos: Due to the lack of agenda [1], today's meeting is cancelled. Have a nice weekend. [1]https://wiki.openstack.org/wiki/Meetings/NeutronDrivers -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Fri Sep 22 08:11:00 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 22 Sep 2023 10:11:00 +0200 Subject: [neutron] Thanks for the Bobcat release (and now let's continue working in Caracal) Message-ID: Hello Neutrinos: Before stepping down as Neutron PTL, I would like to thank you for your commitment and work done during the last two releases. Let's keep Neutron, the stadium projects and OpenStack in general healthy and growing every day. We are now in the RC phase and we should be focused on delivering the best possible code for Bobcat cycle. So far I think we are going to deliver a good quality release. And in advance for the next cycle, let me bring here the hot topics we have right now in gerrit. I would ask you, as I've done during the last year, to spend some time reviewing them in order to continue the development of these features. The sooner we merge them in the Caracal cycle, the better we'll test it during the release development period. 
Please remember this list is just a brief summary of the most relevant features under review. ** Specs ** https://review.opendev.org/c/openstack/neutron-specs/+/891204: Add spec for partial support for OVN Interconnect RFE ** RFEs and important bugs ** https://review.opendev.org/q/topic:neutron_ovn_l3_scheduler: a series of patches related to LP#2023993, that will improve the OVN L3 scheduler. https://review.opendev.org/c/openstack/neutron/+/894026: support for IPv6 in the OVN metadata, closing a present gap between ML2/OVS and ML2/OVN. https://review.opendev.org/c/openstack/neutron/+/882832: provide a new parameter to define the port HW offload capability, instead of writing directly on the port binding profile. https://review.opendev.org/q/topic:distributed_metadata_data_path: a series of patches for ML2/OVS, to provide distributed metadata capabilities. https://review.opendev.org/q/topic:2023-aa-l3-gw-multihoming: a series of patches to provide to ML2/OVN multihoming (active-active) L3 GW capabilities. Regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephenfin at redhat.com Fri Sep 22 09:58:03 2023 From: stephenfin at redhat.com (Stephen Finucane) Date: Fri, 22 Sep 2023 10:58:03 +0100 Subject: [aodh][mistral][solum][release][requirements] Feature freeze exception for wsme, sphinxcontrib-pecanwsme Message-ID: <7628cd578cb9fc1a2299afd44cbbec01802f1da4.camel@redhat.com> o/ I'd like to request a feature freeze exception for the aforementioned packages to include fixes we recently merged: * WSME 0.12.1 includes two fixes to the Sphinx extension to fix compatibility with recent Sphinx versions. There are also additional fixes to address compatibility with Python 3.11 (by switching from nose to pytest, by replacing use of pkg_utils with importlib.metadata, and by replacing use of inspect.getargspec with inspect.getfullargspec). * sphinxcontrib-pecanwsme 0.11.0 includes a fix for Python 3.11 compatibility (replacing use of inspect.getargspec with inspect.getfullargspec), drops support for Python < 3.8, and removes the use of six. I've included a summary of the git logs below. I would argue that both version bumps are low-risk and address issues we're seeing in the gates of multiple projects. I would also argue that bumping these versions is a lower risk strategy than the alternative proposal, namely the reintroduction of a cap on the Sphinx version used to a version released nearly 18 months ago [2] (!!). Finally, I would argue that this would be the preferred approach by our packagers (zigo, are you there?). The change is available [3] and waiting for reviews. Cheers, Stephen PS: Why WSME 0.12.1 rather than 0.12.0, you ask? You can blame a typo [4] for that ? [1] https://lists.openstack.org/pipermail/openstack-discuss/2023-September/035190.html [2] https://pypi.org/project/Sphinx/4.5.0/ [3] https://review.opendev.org/c/openstack/requirements/+/896195 [4] https://review.opendev.org/c/x/wsme/+/896153 From jerome.becot at deveryware.com Fri Sep 22 10:49:07 2023 From: jerome.becot at deveryware.com (=?UTF-8?B?SsOpcsO0bWUgQkVDT1Q=?=) Date: Fri, 22 Sep 2023 12:49:07 +0200 Subject: [Nova][Neutron] Help troubleshooting metadata service error 404 (Ussuri) Message-ID: <1b4afce6-d860-821c-4909-546e0e8c1729@deveryware.com> Hello, I am operating several OpenStack Ussuri clouds and we lately have random troubles with the metadata service. We run all our networks as provider networks. 
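Since there is no router involved, metadata is served from the DHCP namespaces (the 169.254.169.254 route below points at the DHCP port). For reference, this is the kind of configuration our DHCP agents run with; I am quoting it from memory, so take the exact values as an assumption rather than a copy of our files:

$ grep -i metadata /etc/neutron/dhcp_agent.ini
enable_isolated_metadata = true
force_metadata = true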
The metadata service works fine for months some hosts can't get their metadata with a 404 error I restarted nova-api, neutron-dhcp-agent and neutron-metadata-agent on all controllers but we still experience 404 on random nodes: ssh 10.60.51.130 $ ip r s ... 169.254.169.254 via 10.60.51.121 dev eth0 $ curl -s http://169.254.169.254 ? ? 404 Not Found ? ? ?

404 Not Found
The resource could not be found.
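One thing I still want to rule out on the affected guests is a local proxy setting hijacking the request; the check I have in mind is along these lines (hypothetical commands, not output captured on the broken host):

$ env | grep -i proxy
$ curl -s --noproxy 169.254.169.254 http://169.254.169.254/openstack/latest/meta_data.json

For comparison, the same request from a working guest on the same network: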
? ssh 10.60.51.131 $ ip r s ... 169.254.169.254 via 10.60.51.121 dev eth0 $ curl -s http://169.254.169.254 {"uuid": "ee6e8dea-796b-4a56-bb1e-1c38f4ac9030", "meta": {"Created By": "terraform"}...} $ grep "10.60.51.131" /var/log/neutron/neutron-metadata-agent.log 2023-09-22 12:41:10.687 21954 INFO eventlet.wsgi.server [-] 10.60.51.131, "GET /openstack/latest/meta_data.json HTTP/1.1" status: 200? len: 2729 time: 0.3817778 $ grep "10.60.51.130" /var/log/neutron/neutron-metadata-agent.log $ grep "10.60.51.130" /var/log/nova/nova-api.log # on all controllers $ grep "10.60.51.131" /var/log/nova/nova-api.log # on all controllers 2023-09-22 12:41:10.682 57433 INFO nova.metadata.wsgi.server [req-1d222a75-ccb4-49bc-8b17-2e2e81dc5f9b - - - - -] 10.60.51.131,1.2.3.4 "GET /openstack/latest/meta_data.json HTTP/1.1" status: 200 len: 2710 time: 0.2587621 Can you help me to troubleshoot this behaviour for some hosts ? Regards -- *J?r?me B* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jerome.becot at deveryware.com Fri Sep 22 11:41:42 2023 From: jerome.becot at deveryware.com (=?UTF-8?B?SsOpcsO0bWUgQkVDT1Q=?=) Date: Fri, 22 Sep 2023 13:41:42 +0200 Subject: [Nova][Neutron] Help troubleshooting metadata service error 404 (Ussuri) In-Reply-To: <1b4afce6-d860-821c-4909-546e0e8c1729@deveryware.com> References: <1b4afce6-d860-821c-4909-546e0e8c1729@deveryware.com> Message-ID: <6c2c4707-27e8-0459-1367-94e0b3c3274e@deveryware.com> After spending some time it seems that it is a problem in the instances... but still I don't have a clue: it works when running as root or sudo and doesn't with regular users ... Le 22/09/2023 ? 12:49, J?r?me BECOT a ?crit?: > > Hello, > > I am operating several OpenStack Ussuri clouds and we lately have > random troubles with the metadata service. We run all our networks as > provider networks. The metadata service works fine for months some > hosts can't get their metadata with a 404 error > > I restarted nova-api, neutron-dhcp-agent and neutron-metadata-agent on > all controllers but we still experience 404 on random nodes: > > ssh 10.60.51.130 > $ ip r s > ... > 169.254.169.254 via 10.60.51.121 dev eth0 > $ curl -s http://169.254.169.254 > > ? > ? 404 Not Found > ? > ? > ?

> 404 Not Found
> The resource could not be found.
> ? > > > ssh 10.60.51.131 > $ ip r s > ... > 169.254.169.254 via 10.60.51.121 dev eth0 > $ curl -s http://169.254.169.254 > {"uuid": "ee6e8dea-796b-4a56-bb1e-1c38f4ac9030", "meta": {"Created > By": "terraform"}...} > > $ grep "10.60.51.131" /var/log/neutron/neutron-metadata-agent.log > 2023-09-22 12:41:10.687 21954 INFO eventlet.wsgi.server [-] > 10.60.51.131, "GET /openstack/latest/meta_data.json HTTP/1.1" > status: 200? len: 2729 time: 0.3817778 > $ grep "10.60.51.130" /var/log/neutron/neutron-metadata-agent.log > $ grep "10.60.51.130" /var/log/nova/nova-api.log # on all controllers > $ grep "10.60.51.131" /var/log/nova/nova-api.log # on all controllers > 2023-09-22 12:41:10.682 57433 INFO nova.metadata.wsgi.server > [req-1d222a75-ccb4-49bc-8b17-2e2e81dc5f9b - - - - -] > 10.60.51.131,1.2.3.4 "GET /openstack/latest/meta_data.json HTTP/1.1" > status: 200 len: 2710 time: 0.2587621 > > Can you help me to troubleshoot this behaviour for some hosts ? > > Regards > -- > *J?r?me B* -------------- next part -------------- An HTML attachment was scrubbed... URL: From auniyal at redhat.com Fri Sep 22 11:53:51 2023 From: auniyal at redhat.com (Amit Uniyal) Date: Fri, 22 Sep 2023 17:23:51 +0530 Subject: [Nova][Neutron] Help troubleshooting metadata service error 404 (Ussuri) In-Reply-To: <6c2c4707-27e8-0459-1367-94e0b3c3274e@deveryware.com> References: <1b4afce6-d860-821c-4909-546e0e8c1729@deveryware.com> <6c2c4707-27e8-0459-1367-94e0b3c3274e@deveryware.com> Message-ID: Hello, IIUC, you are running this from the guest instance, to see the metadata of this instance, but in your working example, you get it without full URI, i.e. /opestack/latest/metadata.json. Its 400, Can you please try $ curl 169.254.169.254/openstack once, along with ovn-metadata-agent.log check for nova-metadata-api logs as well to see what request went. Also, can you tell how your setup is deployed? such as tripleO or killa-ansible, etc.. Regards On Fri, Sep 22, 2023 at 5:19?PM J?r?me BECOT wrote: > After spending some time it seems that it is a problem in the instances... > but still I don't have a clue: it works when running as root or sudo and > doesn't with regular users ... > Le 22/09/2023 ? 12:49, J?r?me BECOT a ?crit : > > Hello, > > I am operating several OpenStack Ussuri clouds and we lately have random > troubles with the metadata service. We run all our networks as provider > networks. The metadata service works fine for months some hosts can't get > their metadata with a 404 error > > I restarted nova-api, neutron-dhcp-agent and neutron-metadata-agent on all > controllers but we still experience 404 on random nodes: > > ssh 10.60.51.130 > $ ip r s > ... > 169.254.169.254 via 10.60.51.121 dev eth0 > $ curl -s http://169.254.169.254 > > > 404 Not Found > > >

> 404 Not Found
> The resource could not be found.
> > > ssh 10.60.51.131 > $ ip r s > ... > 169.254.169.254 via 10.60.51.121 dev eth0 > $ curl -s http://169.254.169.254 > {"uuid": "ee6e8dea-796b-4a56-bb1e-1c38f4ac9030", "meta": {"Created By": > "terraform"}...} > > $ grep "10.60.51.131" /var/log/neutron/neutron-metadata-agent.log > 2023-09-22 12:41:10.687 21954 INFO eventlet.wsgi.server [-] > 10.60.51.131, "GET /openstack/latest/meta_data.json HTTP/1.1" > status: 200 len: 2729 time: 0.3817778 > $ grep "10.60.51.130" /var/log/neutron/neutron-metadata-agent.log > $ grep "10.60.51.130" /var/log/nova/nova-api.log # on all controllers > $ grep "10.60.51.131" /var/log/nova/nova-api.log # on all controllers > 2023-09-22 12:41:10.682 57433 INFO nova.metadata.wsgi.server > [req-1d222a75-ccb4-49bc-8b17-2e2e81dc5f9b - - - - -] 10.60.51.131,1.2.3.4 > "GET /openstack/latest/meta_data.json HTTP/1.1" status: 200 len: 2710 time: > 0.2587621 > > Can you help me to troubleshoot this behaviour for some hosts ? > > Regards > -- > *J?r?me B* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkloppenborg at resetdata.com.au Thu Sep 21 15:15:17 2023 From: kkloppenborg at resetdata.com.au (Karl Kloppenborg) Date: Thu, 21 Sep 2023 15:15:17 +0000 Subject: Cyborg nova reports mdev-capable resource is not available Message-ID: Hi Cyborg Team! Karl from Helm Team. When creating a VM with the correct flavor, the mdev gets created by cyborg agent and I can see it in the nodedev-list --cap mdev. However Nova then fails with: nova.virt.libvirt.driver [- - default default] Searching for available mdevs... _get_existing_mdevs_not_assigned /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py :8357 2023-09-21 14:34:47.808 1901814 INFO nova.virt.libvirt.driver [ - - default default] Available mdevs at: set(). 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ - - default default] No available mdevs where found. Creating an new one... _allocate_mdevs /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driv er.py:8496 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ - - default default] Attempting to create new mdev... _create_new_mediated_device /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py:8385 2023-09-21 14:34:48.455 1901814 INFO nova.virt.libvirt.driver [ - - default default] Failed to create mdev. No free space found among the following devices: ['pci_0000_4b_03_1', ? ]. 2023-09-21 14:34:48.456 1901814 ERROR nova.compute.manager [ - - default default] [instance: 2026e2a2-b17a-43ab-adcb-62a907f58b51] Instance failed to spawn: nova.exception.ComputeResourcesUnavailable: Insufficient compute resources: mdev-capable resource is not available. Once this happened, ARQ removes the mdev and cleans up. 
I?ve got Cyborg 2023.2 running and have a device profile like so: karl at Karls-Air ~ % openstack accelerator device profile show e2b07e11-fe69-4f33-83fc-0f9e38adb7ae +-------------+---------------------------------------------------------------------------+ | Field | Value | +-------------+---------------------------------------------------------------------------+ | created_at | 2023-09-21 13:30:05+00:00 | | updated_at | None | | uuid | e2b07e11-fe69-4f33-83fc-0f9e38adb7ae | | name | VGPU_A40-Q48 | | groups | [{'resources:VGPU': '1', 'trait:CUSTOM_NVIDIA_2235_A40_48Q': 'required'}] | | description | None | +-------------+---------------------------------------------------------------------------+ karl at Karls-Air ~ % I can see the allocation candidate: karl at Karls-Air ~ % openstack allocation candidate list --resource VGPU=1 | grep A40 | 41 | VGPU=1 | 229bf15f-5689-3d2c-b37b-5c8439ea6a71 | VGPU=0/1 | OWNER_CYBORG,CUSTOM_NVIDIA_2235_A40_48Q | karl at Karls-Air ~ % Am I missing something critical here? Because I cannot seem to figure this out? have I got a PCI address wrong, or something? Any help from the Cyborg or Nova teams would be fantastic. Thanks, Karl. -------------- next part -------------- An HTML attachment was scrubbed... URL: From songwenping at inspur.com Fri Sep 22 02:17:08 2023 From: songwenping at inspur.com (=?gb2312?B?QWxleCBTb25nICjLzs7Exr0p?=) Date: Fri, 22 Sep 2023 02:17:08 +0000 Subject: =?gb2312?B?tPC4tDogQ3lib3JnIG5vdmEgcmVwb3J0cyBtZGV2LWNhcGFibGUgcmVzb3Vy?= =?gb2312?Q?ce_is_not_available?= In-Reply-To: References: Message-ID: <6550e9110ec14bfb848af296389b8b3a@inspur.com> A non-text attachment was scrubbed... Name: smime.p7m Type: application/pkcs7-mime Size: 36613 bytes Desc: not available URL: From kkloppenborg at resetdata.com.au Fri Sep 22 02:53:26 2023 From: kkloppenborg at resetdata.com.au (Karl Kloppenborg) Date: Fri, 22 Sep 2023 02:53:26 +0000 Subject: Cyborg nova reports mdev-capable resource is not available In-Reply-To: <6550e9110ec14bfb848af296389b8b3a@inspur.com> References: <6550e9110ec14bfb848af296389b8b3a@inspur.com> Message-ID: Ah thank you for pointing me towards that Alex. I guess, I should probably look at the MIG pathway. I wonder if it?s possible to do vGPU profiles in MIG configuration. Have you any experience with this? Thanks, Karl. From: Alex Song (???) Date: Friday, 22 September 2023 at 12:17 pm To: Karl Kloppenborg , sbauza at redhat.com Cc: openstack-discuss at lists.openstack.org Subject: ??: Cyborg nova reports mdev-capable resource is not available Hi Karl, Your problem is similar with the bug: https://bugs.launchpad.net/nova/+bug/2015892 I guess you don?t split the mig if using A serial card. ???: Karl Kloppenborg [mailto:kkloppenborg at resetdata.com.au] ????: 2023?9?22? 0:43 ???: Sylvain Bauza ??: openstack-discuss at lists.openstack.org ??: Re: Cyborg nova reports mdev-capable resource is not available Hi Sylvian, Thanks for getting back to me. So the vGPU is available and cyborg is allocating it using ARQ binding. 
You can see Nova receives this request: 2023-09-21 16:38:51.889 1901814 DEBUG nova.compute.manager [None req-97062e9c-0c44-480e-9918-4a5a810175b2 78e83e5a446e4071ae43e823135dcb3c 21eb701c2a1f48b38dab8f34c0a20902 - - default default] ARQs for spec:{'2d60c353-0419-4b67-8cb7-913fc6f5cef9': {'uuid': '2d60c353-0419-4b67-8cb7-913fc6f5cef9', 'state': 'Bound', 'device_profile_name': 'VGPU_A40-Q48', 'device_profile_group_id': 0, 'hostname': 'gpu-c-01', 'device_rp_uuid': '229bf15f-5689-3d2c-b37b-5c8439ea6a71', 'instance_uuid': '1b090007-791b-4997-af89-0feb886cf11d', 'project_id': None, 'attach_handle_type': 'MDEV', 'attach_handle_uuid': '866bd6a5-b156-4251-a969-64fefb32f16f', 'attach_handle_info': {'asked_type': 'nvidia-566', 'bus': 'ca', 'device': '01', 'domain': '0000', 'function': '1', 'vgpu_mark': 'nvidia-566_0'}, 'links': [{'href': 'http://cyborg-api.openstack.svc.cluster.local:6666/accelerator/v2/accelerator_requests/2d60c353-0419-4b67-8cb7-913fc6f5cef9', 'rel': 'self'}], 'created_at': '2023-09-21T16:38:42+00:00', 'updated_at': '2023-09-21T16:38:42+00:00'}}, ARQs for network:{} _build_resources /var/lib/openstack/lib/python3.10/site-packages/nova/compute/manager.py:2680 So the mdev is then allocated in the resource providers at that point. Is there some cyborg nova patching code I am missing? From: Sylvain Bauza > Date: Friday, 22 September 2023 at 1:49 am To: Karl Kloppenborg > Cc: openstack-discuss at lists.openstack.org > Subject: Re: Cyborg nova reports mdev-capable resource is not available Le jeu. 21 sept. 2023 ? 17:27, Karl Kloppenborg > a ?crit : Hi Cyborg Team! Karl from Helm Team. When creating a VM with the correct flavor, the mdev gets created by cyborg agent and I can see it in the nodedev-list --cap mdev. However Nova then fails with: nova.virt.libvirt.driver [- - default default] Searching for available mdevs... _get_existing_mdevs_not_assigned /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py :8357 2023-09-21 14:34:47.808 1901814 INFO nova.virt.libvirt.driver [ - - default default] Available mdevs at: set(). 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ - - default default] No available mdevs where found. Creating an new one... _allocate_mdevs /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driv er.py:8496 2023-09-21 14:34:47.809 1901814 DEBUG nova.virt.libvirt.driver [ - - default default] Attempting to create new mdev... _create_new_mediated_device /var/lib/openstack/lib/python3.10/site-packages/nova/virt/libvirt/driver.py:8385 2023-09-21 14:34:48.455 1901814 INFO nova.virt.libvirt.driver [ - - default default] Failed to create mdev. No free space found among the following devices: ['pci_0000_4b_03_1', ? ]. 2023-09-21 14:34:48.456 1901814 ERROR nova.compute.manager [ - - default default] [instance: 2026e2a2-b17a-43ab-adcb-62a907f58b51] Instance failed to spawn: nova.exception.ComputeResourcesUnavailable: Insufficient compute resources: mdev-capable resource is not available. I don't exactly remember how Cyborg passes the devices to nova/libvirt but this exception is because none of the available GPUs have either existing mdevs or capability for creating mdevs. You should first check sysfs to double-check the state of our GPU devices in order to understand how much of vGPU capacity you still have. -Sylvain Once this happened, ARQ removes the mdev and cleans up. 
I?ve got Cyborg 2023.2 running and have a device profile like so: karl at Karls-Air ~ % openstack accelerator device profile show e2b07e11-fe69-4f33-83fc-0f9e38adb7ae +-------------+---------------------------------------------------------------------------+ | Field | Value | +-------------+---------------------------------------------------------------------------+ | created_at | 2023-09-21 13:30:05+00:00 | | updated_at | None | | uuid | e2b07e11-fe69-4f33-83fc-0f9e38adb7ae | | name | VGPU_A40-Q48 | | groups | [{'resources:VGPU': '1', 'trait:CUSTOM_NVIDIA_2235_A40_48Q': 'required'}] | | description | None | +-------------+---------------------------------------------------------------------------+ karl at Karls-Air ~ % I can see the allocation candidate: karl at Karls-Air ~ % openstack allocation candidate list --resource VGPU=1 | grep A40 | 41 | VGPU=1 | 229bf15f-5689-3d2c-b37b-5c8439ea6a71 | VGPU=0/1 | OWNER_CYBORG,CUSTOM_NVIDIA_2235_A40_48Q | karl at Karls-Air ~ % Am I missing something critical here? Because I cannot seem to figure this out? have I got a PCI address wrong, or something? Any help from the Cyborg or Nova teams would be fantastic. Thanks, Karl. -------------- next part -------------- An HTML attachment was scrubbed... URL: From songwenping at inspur.com Fri Sep 22 05:49:55 2023 From: songwenping at inspur.com (=?gb2312?B?QWxleCBTb25nICjLzs7Exr0p?=) Date: Fri, 22 Sep 2023 05:49:55 +0000 Subject: =?gb2312?B?tPC4tDogQ3lib3JnIG5vdmEgcmVwb3J0cyBtZGV2LWNhcGFibGUgcmVzb3Vy?= =?gb2312?Q?ce_is_not_available?= In-Reply-To: References: <30e9e7f98b79f82f215acaa9bafe220922-9-23resetdata.com.au@g.corp-email.com> Message-ID: A non-text attachment was scrubbed... Name: smime.p7m Type: application/pkcs7-mime Size: 45996 bytes Desc: not available URL: From jerome.becot at deveryware.com Fri Sep 22 11:39:50 2023 From: jerome.becot at deveryware.com (=?UTF-8?B?SsOpcsO0bWUgQkVDT1Q=?=) Date: Fri, 22 Sep 2023 13:39:50 +0200 Subject: [Nova][Neutron] Help troubleshooting metadata service error 404 (Ussuri) In-Reply-To: <1b4afce6-d860-821c-4909-546e0e8c1729@deveryware.com> References: <1b4afce6-d860-821c-4909-546e0e8c1729@deveryware.com> Message-ID: <89ae9d42-1a2e-37b4-837f-59e6ac06f4ad@deveryware.com> After spending some time it seems that it is a problem in the instances... but still I don't have a clue: it works when running as root and doesn't with other users ... Le 22/09/2023 ? 12:49, J?r?me BECOT a ?crit?: > > Hello, > > I am operating several OpenStack Ussuri clouds and we lately have > random troubles with the metadata service. We run all our networks as > provider networks. The metadata service works fine for months some > hosts can't get their metadata with a 404 error > > I restarted nova-api, neutron-dhcp-agent and neutron-metadata-agent on > all controllers but we still experience 404 on random nodes: > > ssh 10.60.51.130 > $ ip r s > ... > 169.254.169.254 via 10.60.51.121 dev eth0 > $ curl -s http://169.254.169.254 > > ? > ? 404 Not Found > ? > ? > ?

> 404 Not Found
> The resource could not be found.
>
> ssh 10.60.51.131
> $ ip r s
> ...
> 169.254.169.254 via 10.60.51.121 dev eth0
> $ curl -s http://169.254.169.254
> {"uuid": "ee6e8dea-796b-4a56-bb1e-1c38f4ac9030", "meta": {"Created By": "terraform"}...}
>
> $ grep "10.60.51.131" /var/log/neutron/neutron-metadata-agent.log
> 2023-09-22 12:41:10.687 21954 INFO eventlet.wsgi.server [-] 10.60.51.131, "GET /openstack/latest/meta_data.json HTTP/1.1" status: 200 len: 2729 time: 0.3817778
> $ grep "10.60.51.130" /var/log/neutron/neutron-metadata-agent.log
> $ grep "10.60.51.130" /var/log/nova/nova-api.log # on all controllers
> $ grep "10.60.51.131" /var/log/nova/nova-api.log # on all controllers
> 2023-09-22 12:41:10.682 57433 INFO nova.metadata.wsgi.server [req-1d222a75-ccb4-49bc-8b17-2e2e81dc5f9b - - - - -] 10.60.51.131,1.2.3.4 "GET /openstack/latest/meta_data.json HTTP/1.1" status: 200 len: 2710 time: 0.2587621
>
> Can you help me to troubleshoot this behaviour for some hosts ?
>
> Regards
> --
> *Jérôme B*

--
*Jérôme BECOT*
Ingénieur DevOps Infrastructure
Téléphone fixe: 01 82 28 37 06
Mobile : +33 757 173 193
Deveryware - 43 rue Taitbout - 75009 PARIS
https://www.deveryware.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: baniere_signature_dw_2022.png
Type: image/png
Size: 471995 bytes
Desc: not available
URL: 

From finarffin at gmail.com Fri Sep 22 13:04:05 2023
From: finarffin at gmail.com (Jan Wasilewski)
Date: Fri, 22 Sep 2023 15:04:05 +0200
Subject: Problem with Barbican
In-Reply-To: <1503834122.6866945.1695286409426@mail.yahoo.com>
References: <1503834122.6866945.1695286409426.ref@mail.yahoo.com>
 <1503834122.6866945.1695286409426@mail.yahoo.com>
Message-ID: 

Hi Derek,

Could you please share your 'barbican.conf' file along with information
about the operating system on which Docker and Barbican are running? This
additional information would shed more light on the topic and help with
further troubleshooting. There may be a configuration issue that needs to
be addressed.

/Jan Wasilewski

czw., 21 wrz 2023 o 11:54 Derek O keeffe napisał(a):

> Hi all,
>
> Hopefully someone may be able to give me some info/advice on this one.
> I've been asking in the IRC chat but unfortunately we couldn't get to the
> bottom of it, I'm hoping someone here may not be in the chat but able to
> help :)
>
> So I have an Openstack AIO that I'm trying to test Barbican on with a
> Thales Luna HSM. I get through the instructions to the point where I
> successfully create the HMAK and MKEK keys. After this I try the following
> commands:
>
> openstack secret store --name mysecret1 --payload testPayload
> openstack volume create --size 1 --type LUKS 'encrypted volume'
>
> Both of those fail and the logs from the Barbican container are here:
> Paste #bU1E0joDdf4eWiV8wjlj | LodgeIt!
>
> I have made sure that "library_path: /opt/barbican/libs/libCryptoki2.so"
> in Barbican.conf actually exists so I have no idea really where to look
> after this, I can't see any info as to where Barbican is trying to find the
> crypto plugin or maybe do I need to install something else manually??
>
> Anyway I hope someone might have some useful info or advice to help and
> thanks all for reading.
>
> Regards,
> Derek
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jerome.becot at deveryware.com Fri Sep 22 14:10:54 2023
From: jerome.becot at deveryware.com (=?UTF-8?B?SsOpcsO0bWUgQkVDT1Q=?=)
Date: Fri, 22 Sep 2023 16:10:54 +0200
Subject: [Nova][Neutron] Help troubleshooting metadata service error 404 (Ussuri)
In-Reply-To: <6c2c4707-27e8-0459-1367-94e0b3c3274e@deveryware.com>
References: <1b4afce6-d860-821c-4909-546e0e8c1729@deveryware.com>
 <6c2c4707-27e8-0459-1367-94e0b3c3274e@deveryware.com>
Message-ID: <49e0dba7-e4cd-74dd-5ebe-4ebad183a586@deveryware.com>

It finally turned out that the impacted machines deployed a proxy configuration that is loaded in the environment. We added an exception for 169.254.169.254 and it now works fine.

Regards

Le 22/09/2023 à 13:41, Jérôme BECOT a écrit :
>
> After spending some time it seems that it is a problem in the
> instances... but still I don't have a clue: it works when running as
> root or sudo and doesn't with regular users ...
>
> Le 22/09/2023 à 12:49, Jérôme BECOT a écrit :
>>
>> Hello,
>>
>> I am operating several OpenStack Ussuri clouds and we lately have
>> random troubles with the metadata service. We run all our networks as
>> provider networks. The metadata service works fine for months but some
>> hosts can't get their metadata with a 404 error.
>>
>> I restarted nova-api, neutron-dhcp-agent and neutron-metadata-agent
>> on all controllers but we still experience 404 on random nodes:
>>
>> ssh 10.60.51.130
>> $ ip r s
>> ...
>> 169.254.169.254 via 10.60.51.121 dev eth0
>> $ curl -s http://169.254.169.254
>>

>> 404 Not Found
>> The resource could not be found.
>> ? >> >> >> ssh 10.60.51.131 >> $ ip r s >> ... >> 169.254.169.254 via 10.60.51.121 dev eth0 >> $ curl -s http://169.254.169.254 >> {"uuid": "ee6e8dea-796b-4a56-bb1e-1c38f4ac9030", "meta": {"Created >> By": "terraform"}...} >> >> $ grep "10.60.51.131" /var/log/neutron/neutron-metadata-agent.log >> 2023-09-22 12:41:10.687 21954 INFO eventlet.wsgi.server [-] >> 10.60.51.131, "GET /openstack/latest/meta_data.json HTTP/1.1" >> status: 200? len: 2729 time: 0.3817778 >> $ grep "10.60.51.130" /var/log/neutron/neutron-metadata-agent.log >> $ grep "10.60.51.130" /var/log/nova/nova-api.log # on all controllers >> $ grep "10.60.51.131" /var/log/nova/nova-api.log # on all controllers >> 2023-09-22 12:41:10.682 57433 INFO nova.metadata.wsgi.server >> [req-1d222a75-ccb4-49bc-8b17-2e2e81dc5f9b - - - - -] >> 10.60.51.131,1.2.3.4 "GET /openstack/latest/meta_data.json HTTP/1.1" >> status: 200 len: 2710 time: 0.2587621 >> >> Can you help me to troubleshoot this behaviour for some hosts ? >> >> Regards >> -- >> *J?r?me B* -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Fri Sep 22 14:59:28 2023 From: hberaud at redhat.com (Herve Beraud) Date: Fri, 22 Sep 2023 16:59:28 +0200 Subject: [release] Release countdown for week R-1 Sept 25-29 Message-ID: Development Focus ----------------- We are on the final mile of the 2023.2 Bobcat development cycle! Remember that the 2023.2 Bobcat final release will include the latest release candidate (for cycle-with-rc deliverables) or the latest intermediary release (for cycle-with-intermediary deliverables) available. September 28, 2023 is the deadline for final 2023.2 Bobcat release candidates as well as any last cycle-with-intermediary deliverables. We will then enter a quiet period until we tag the final release on October 4, 2023. Teams should be prioritizing fixing release-critical bugs, before that deadline. Otherwise it's time to start planning the 2024.1 Caracal development cycle, including discussing Forum and PTG sessions content, in preparation of the 2024.1 Caracal Virtual PTG (October 23-27, 2023). Actions ------- Watch for any translation patches coming through on the stable/2023.2 branch and merge them quickly. If you discover a release-critical issue, please make sure to fix it on the master branch first, then backport the bugfix to the stable/2023.2 branch before triggering a new release. Please drop by #openstack-release with any questions or concerns about the upcoming release ! Upcoming Deadlines & Dates -------------------------- Final 2023.2 Bobcat release: October 4, 2023 2024.1 Caracal Virtual PTG: October 23-27, 2023 -- Herv? Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekuvaja at redhat.com Fri Sep 22 15:04:14 2023 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Fri, 22 Sep 2023 16:04:14 +0100 Subject: [aodh][mistral][solum][release][requirements] Feature freeze exception for wsme, sphinxcontrib-pecanwsme In-Reply-To: <7628cd578cb9fc1a2299afd44cbbec01802f1da4.camel@redhat.com> References: <7628cd578cb9fc1a2299afd44cbbec01802f1da4.camel@redhat.com> Message-ID: On Fri, 22 Sept 2023 at 11:04, Stephen Finucane wrote: > o/ > I'd like to request a feature freeze exception for the aforementioned > packages > to include fixes we recently merged: > > * WSME 0.12.1 includes two fixes to the Sphinx extension to fix > compatibility > with recent Sphinx versions. 
There are also additional fixes to address > compatibility with Python 3.11 (by switching from nose to pytest, by > replacing use of pkg_utils with importlib.metadata, and by replacing > use of > inspect.getargspec with inspect.getfullargspec). > > * sphinxcontrib-pecanwsme 0.11.0 includes a fix for Python 3.11 > compatibility > (replacing use of inspect.getargspec with inspect.getfullargspec), drops > support for Python < 3.8, and removes the use of six. > > I've included a summary of the git logs below. I would argue that both > version > bumps are low-risk and address issues we're seeing in the gates of multiple > projects. I would also argue that bumping these versions is a lower risk > strategy than the alternative proposal, namely the reintroduction of a cap > on > the Sphinx version used to a version released nearly 18 months ago [2] > (!!). > Finally, I would argue that this would be the preferred approach by our > packagers (zigo, are you there?). > > The change is available [3] and waiting for reviews. > > Cheers, > Stephen > > PS: Why WSME 0.12.1 rather than 0.12.0, you ask? You can blame a typo [4] > for > that ? > > [1] > https://lists.openstack.org/pipermail/openstack-discuss/2023-September/035190.html > [2] https://pypi.org/project/Sphinx/4.5.0/ > [3] https://review.opendev.org/c/openstack/requirements/+/896195 > [4] https://review.opendev.org/c/x/wsme/+/896153 > > +1 Thanks Stephen for whipping up this all together so quickly! - Erno "jokke" Kuvaja -------------- next part -------------- An HTML attachment was scrubbed... URL: From jerome.becot at deveryware.com Fri Sep 22 14:10:25 2023 From: jerome.becot at deveryware.com (=?UTF-8?B?SsOpcsO0bWUgQkVDT1Q=?=) Date: Fri, 22 Sep 2023 16:10:25 +0200 Subject: [Nova][Neutron] Help troubleshooting metadata service error 404 (Ussuri) In-Reply-To: <6c2c4707-27e8-0459-1367-94e0b3c3274e@deveryware.com> References: <1b4afce6-d860-821c-4909-546e0e8c1729@deveryware.com> <6c2c4707-27e8-0459-1367-94e0b3c3274e@deveryware.com> Message-ID: <05b2792a-f94b-1576-b7d9-90a8ce054260@deveryware.com> It finally turned out that the impacted machines deployed a proxy configuration that is loaded in the environment. We added an exception for 169.254.169.254 and it now works fine. Regards Le 22/09/2023 ? 13:41, J?r?me BECOT a ?crit?: > > After spending some time it seems that it is a problem in the > instances... but still I don't have a clue: it works when running as > root or sudo and doesn't with regular users ... > > Le 22/09/2023 ? 12:49, J?r?me BECOT a ?crit?: >> >> Hello, >> >> I am operating several OpenStack Ussuri clouds and we lately have >> random troubles with the metadata service. We run all our networks as >> provider networks. The metadata service works fine for months some >> hosts can't get their metadata with a 404 error >> >> I restarted nova-api, neutron-dhcp-agent and neutron-metadata-agent >> on all controllers but we still experience 404 on random nodes: >> >> ssh 10.60.51.130 >> $ ip r s >> ... >> 169.254.169.254 via 10.60.51.121 dev eth0 >> $ curl -s http://169.254.169.254 >> >> ? >> ? 404 Not Found >> ? >> ? >> ?

>> 404 Not Found
>> The resource could not be found.
>> ? >> >> >> ssh 10.60.51.131 >> $ ip r s >> ... >> 169.254.169.254 via 10.60.51.121 dev eth0 >> $ curl -s http://169.254.169.254 >> {"uuid": "ee6e8dea-796b-4a56-bb1e-1c38f4ac9030", "meta": {"Created >> By": "terraform"}...} >> >> $ grep "10.60.51.131" /var/log/neutron/neutron-metadata-agent.log >> 2023-09-22 12:41:10.687 21954 INFO eventlet.wsgi.server [-] >> 10.60.51.131, "GET /openstack/latest/meta_data.json HTTP/1.1" >> status: 200? len: 2729 time: 0.3817778 >> $ grep "10.60.51.130" /var/log/neutron/neutron-metadata-agent.log >> $ grep "10.60.51.130" /var/log/nova/nova-api.log # on all controllers >> $ grep "10.60.51.131" /var/log/nova/nova-api.log # on all controllers >> 2023-09-22 12:41:10.682 57433 INFO nova.metadata.wsgi.server >> [req-1d222a75-ccb4-49bc-8b17-2e2e81dc5f9b - - - - -] >> 10.60.51.131,1.2.3.4 "GET /openstack/latest/meta_data.json HTTP/1.1" >> status: 200 len: 2710 time: 0.2587621 >> >> Can you help me to troubleshoot this behaviour for some hosts ? >> >> Regards >> -- >> *J?r?me B* -- *J?r?me BECOT* Ing?nieur DevOps Infrastructure T?l?phone fixe: 01 82 28 37 06 Mobile : +33 757 173 193 Deveryware - 43 rue Taitbout - 75009 PARIS https://www.deveryware.com Deveryware_Logo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: baniere_signature_dw_2022.png Type: image/png Size: 471995 bytes Desc: not available URL: From gmann at ghanshyammann.com Fri Sep 22 19:35:29 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 22 Sep 2023 12:35:29 -0700 Subject: [all][tc] python 3.11 testing plan In-Reply-To: <18a7062b69d.d3f024e6645938.1764325247310912451@ghanshyammann.com> References: <18a19500dda.1179e4341294220.8475553822244831650@ghanshyammann.com> <18a7062b69d.d3f024e6645938.1764325247310912451@ghanshyammann.com> Message-ID: <18abe63da6e.e7d9cd381153099.9192783874404964687@ghanshyammann.com> ---- On Thu, 07 Sep 2023 09:03:52 -0700 Ghanshyam Mann wrote --- > ---- On Mon, 21 Aug 2023 11:16:31 -0700 Ghanshyam Mann wrote --- > > Hi All, > Hello Everyone, > > Many projects still need to fix the py3.11 job[1]. I started fixing a few of them, so changes are up > for review of those projects. > > NOTE: The deadline to fix is the 2023.2 release (Oct 6th); after that, this job will become voting on the > master (2024.1 dev cycle but remain non-voting on stable/2023.2) and will block the master gate. > > [1] https://zuul.openstack.org/builds?job_name=openstack-tox-py311+&result=RETRY_LIMIT&result=RETRY&result=CONFIG_ERROR&result=FAILURE&skip=0&limit=100 Hello Everyone, This a gentle reminder to fix your project py3.11 job if failing ^^ This job will become voting right after the 2023.2 release on October 4th. -gmann > > > -gmann > > > > > Voting in 2024.1 > > -------------------- > > In next cycle (2024.1), I am proposing to make py3.11 testing mandatory [4] and voting (automatically > > via common python job template). You need to fix the failure in this cycle otherwise it will block the > > gate once the next cycle development start (basically once 891238 is merged). 
> > > > [1] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/891227/5 > > [2] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/891146/1 > > [3] https://review.opendev.org/c/openstack/nova/+/891256 > > [4] https://review.opendev.org/c/openstack/governance/+/891225 > > [5] https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/891238 > > > > -gmann > > > > > From satish.txt at gmail.com Sun Sep 24 10:49:37 2023 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 24 Sep 2023 06:49:37 -0400 Subject: [kolla-ansible][keystone][skyline] Keystone use SSO for authentication In-Reply-To: References: Message-ID: <33BC7D20-ABA0-4421-8C60-1C851515337D@gmail.com> Good work man!! Sent from my iPhone > On Sep 22, 2023, at 1:02 AM, Nguy?n H?u Kh?i wrote: > > ? > Hello guys. > > I want to share a lab to help newbies understand how setup Keystone and Skyline to use Google SSO for authentication, > > https://github.com/ngyenhuukhoi/openstack-note/blob/main/skyline%20with%20sso.md > > I will try to contribute what I can for newbies and community. > > > Nguyen Huu Khoi -------------- next part -------------- An HTML attachment was scrubbed... URL: From danish52.jmi at gmail.com Sun Sep 24 17:27:46 2023 From: danish52.jmi at gmail.com (Danish Khan) Date: Sun, 24 Sep 2023 22:57:46 +0530 Subject: [openstack-ansible][manila deployment] error: No valid host was found Message-ID: Dear Team, I have deployed Manila in openstack-ansible and I have created share type and share network. But I am still unable to create a share in it. Below is the journalctl error from manila container : Sep 24 16:45:13 aio1-manila-container-5cce4ee7 manila-scheduler[69]: 2023-09-24 16:45:13.689 69 ERROR manila.scheduler.manager [req-13a8ec7b-2b66-452b-964b-f531b75b699b c0806f7a8b534149a72931b84f7d60af 61c6349cbba74db88d14d3b7348de585 - - -] Failed to schedule create_share: No valid host was found. Failed to find a weighted host, the last executed filter was AvailabilityZoneFilter.: manila.exception.NoValidHost: No valid host was found. Failed to find a weighted host, the last executed filter was AvailabilityZoneFilter. And below is the error of manila-share.service from controller node: Sep 24 16:44:24 aio1 manila-share[250518]: 2023-09-24 16:44:24.016 250518 CRITICAL manila [req-3c0e428b-226b-4bff-ade1-3d4ac5befb99 - - - - -] Unhandled error: manila.exception.ManilaException: Config opt 'driver_handles_share_servers' has improper value - 'None'. Please define it as boolean. Note: manila-share service is continueously giving this error even if there is no resources of manila(even in the setup). I am not sure where I am supposed to define driver_handles_share_servers defautl value if there is no share type. Thanks in advance. Regards, Danish -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkloppenborg at resetdata.com.au Sun Sep 24 23:10:29 2023 From: kkloppenborg at resetdata.com.au (Karl Kloppenborg) Date: Sun, 24 Sep 2023 23:10:29 +0000 Subject: [kolla-ansible][keystone][skyline] Keystone use SSO for authentication In-Reply-To: <33BC7D20-ABA0-4421-8C60-1C851515337D@gmail.com> References: <33BC7D20-ABA0-4421-8C60-1C851515337D@gmail.com> Message-ID: Yeah that?s really good work! Interesting SSO implementation! From: Satish Patel Date: Sunday, 24 September 2023 at 9:01 pm To: Nguy?n H?u Kh?i Cc: OpenStack Discuss Subject: Re: [kolla-ansible][keystone][skyline] Keystone use SSO for authentication Good work man!! 
Sent from my iPhone On Sep 22, 2023, at 1:02 AM, Nguy?n H?u Kh?i wrote: ? Hello guys. I want to share a lab to help newbies understand how setup Keystone and Skyline to use Google SSO for authentication, https://github.com/ngyenhuukhoi/openstack-note/blob/main/skyline%20with%20sso.md I will try to contribute what I can for newbies and community. Nguyen Huu Khoi -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Mon Sep 25 07:42:26 2023 From: pierre at stackhpc.com (Pierre Riteau) Date: Mon, 25 Sep 2023 09:42:26 +0200 Subject: [all][tc] python 3.11 testing plan In-Reply-To: <18abe63da6e.e7d9cd381153099.9192783874404964687@ghanshyammann.com> References: <18a19500dda.1179e4341294220.8475553822244831650@ghanshyammann.com> <18a7062b69d.d3f024e6645938.1764325247310912451@ghanshyammann.com> <18abe63da6e.e7d9cd381153099.9192783874404964687@ghanshyammann.com> Message-ID: Je On Fri, 22 Sept 2023 at 21:39, Ghanshyam Mann wrote: > ---- On Thu, 07 Sep 2023 09:03:52 -0700 Ghanshyam Mann wrote --- > > ---- On Mon, 21 Aug 2023 11:16:31 -0700 Ghanshyam Mann wrote --- > > > Hi All, > > Hello Everyone, > > > > Many projects still need to fix the py3.11 job[1]. I started fixing a > few of them, so changes are up > > for review of those projects. > > > > NOTE: The deadline to fix is the 2023.2 release (Oct 6th); after that, > this job will become voting on the > > master (2024.1 dev cycle but remain non-voting on stable/2023.2) and > will block the master gate. > > > > [1] > https://zuul.openstack.org/builds?job_name=openstack-tox-py311+&result=RETRY_LIMIT&result=RETRY&result=CONFIG_ERROR&result=FAILURE&skip=0&limit=100 > > Hello Everyone, > > This a gentle reminder to fix your project py3.11 job if failing ^^ > > This job will become voting right after the 2023.2 release on October 4th. > > -gmann Many projects initially fail on bindep trying to install mysql-client, which doesn't exist in bookworm. Fixing this was enough to make the py3.11 job successful in Blazar. Are there templates to follow for these bindep requirements? I checked projects such as Cinder, Glance, Neutron and Nova: they are using similar package lists but with small differences. And is it useful to keep testing with MySQL on Ubuntu, so we have coverage for both MySQL and MariaDB? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Mon Sep 25 08:10:58 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Mon, 25 Sep 2023 10:10:58 +0200 Subject: [all][tc] python 3.11 testing plan In-Reply-To: References: <18a19500dda.1179e4341294220.8475553822244831650@ghanshyammann.com> <18a7062b69d.d3f024e6645938.1764325247310912451@ghanshyammann.com> <18abe63da6e.e7d9cd381153099.9192783874404964687@ghanshyammann.com> Message-ID: Hello all: The errors present in this list related to Neutron belong to failed patches. In all cases, the same errors are present in py310 and py38. I've pushed [1] and all UT jobs are passing. Regards. [1]https://review.opendev.org/c/openstack/neutron/+/896351 On Mon, Sep 25, 2023 at 9:44?AM Pierre Riteau wrote: > Je On Fri, 22 Sept 2023 at 21:39, Ghanshyam Mann > wrote: > >> ---- On Thu, 07 Sep 2023 09:03:52 -0700 Ghanshyam Mann wrote --- >> > ---- On Mon, 21 Aug 2023 11:16:31 -0700 Ghanshyam Mann wrote --- >> > > Hi All, >> > Hello Everyone, >> > >> > Many projects still need to fix the py3.11 job[1]. I started fixing a >> few of them, so changes are up >> > for review of those projects. 
>> > >> > NOTE: The deadline to fix is the 2023.2 release (Oct 6th); after that, >> this job will become voting on the >> > master (2024.1 dev cycle but remain non-voting on stable/2023.2) and >> will block the master gate. >> > >> > [1] >> https://zuul.openstack.org/builds?job_name=openstack-tox-py311+&result=RETRY_LIMIT&result=RETRY&result=CONFIG_ERROR&result=FAILURE&skip=0&limit=100 >> >> Hello Everyone, >> >> This a gentle reminder to fix your project py3.11 job if failing ^^ >> >> This job will become voting right after the 2023.2 release on October 4th. >> >> -gmann > > > Many projects initially fail on bindep trying to install mysql-client, > which doesn't exist in bookworm. Fixing this was enough to make the py3.11 > job successful in Blazar. > > Are there templates to follow for these bindep requirements? I checked > projects such as Cinder, Glance, Neutron and Nova: they are using similar > package lists but with small differences. > > And is it useful to keep testing with MySQL on Ubuntu, so we have coverage > for both MySQL and MariaDB? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcafarel at redhat.com Mon Sep 25 08:26:12 2023 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 25 Sep 2023 10:26:12 +0200 Subject: [neutron] Bug deputy report (week starting on September 18) Message-ID: Hey neutrinos, time for a new bug deputy rotation again! Here are the bugs reported between 2023-09-18 and 2023-09-24 Quite a few bugs have patches in progress. For the rest, some failures on neutron-tempest-plugin-openvswitch tempest jobs in the first 2 bugs listed below, possibly a design question for https://bugs.launchpad.net/neutron/+bug/2036705 (different behaviour in ML2/OVN), one RBAC bug on tags in https://bugs.launchpad.net/neutron/+bug/2037002 and 2 other unassigned bugs https://bugs.launchpad.net/neutron/+bug/2037107 and https://bugs.launchpad.net/neutron/+bug/2036877 Critical * neutron-tempest-plugin-openvswitch-* jobs randomly failing in gate - https://bugs.launchpad.net/neutron/+bug/2037239 Just reported by haleyb - tests failing with "x is not active on any of the L3 agents" High * [tempest] Test "test_multicast_between_vms_on_same_network" radonmly failing - https://bugs.launchpad.net/neutron/+bug/2036603 Unassigned. Random recent failure in that test * Neutron tempest plugin zuul job definitions uses deprecated regex syntax - https://bugs.launchpad.net/neutron/+bug/2037034 Fix by ykarel: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/896188 * neutron-ovn-metadata-agent dies on broken namespace - https://bugs.launchpad.net/neutron/+bug/2037102 Not a common case but it can keep the agent down until fixed Patch by felix.huettner: https://review.opendev.org/c/openstack/neutron/+/896251 New * Reader can update object tag - https://bugs.launchpad.net/neutron/+bug/2037002 *_reader roles were not cooperating when I tried to reproduce this in devstack - needs another set of eyes Medium * Neutron fails to respawn radvd due to corrupt pid file - https://bugs.launchpad.net/neutron/+bug/2033980 When radvd crashes (it can happen like in cases described by users in LP), neutron does not respawn it properly - it works fine when we are restarting/killing it Patch by haleyb: https://review.opendev.org/c/openstack/neutron/+/895832 * [OVN] The API worker fails during "post_fork_initialize" call - https://bugs.launchpad.net/neutron/+bug/2036607 Enhancement to better handle restarts (as tested by tobiko). 
Patches by ralonsoh: https://review.opendev.org/c/openstack/neutron-lib/+/895940 https://review.opendev.org/c/openstack/neutron/+/895946 https://review.opendev.org/c/openstack/neutron/+/896009 * [ovn-octavia-provider] Fix issue when LRP has more than one address - https://bugs.launchpad.net/neutron/+bug/2036620 Fix by froyo - https://review.opendev.org/c/openstack/ovn-octavia-provider/+/895826 * A port that is disabled and bound is still ACTIVE with ML2/OVN - https://bugs.launchpad.net/neutron/+bug/2036705 Unassigned. Different behaviour compared to ML2/OVS (where status is down) * [pep8] Pylint error W0105 (pointless-string-statement) in random CI executions - https://bugs.launchpad.net/neutron/+bug/2036763 Random failure, fix by ralonsoh: https://review.opendev.org/c/openstack/neutron/+/895975 * radvd seems to crash when ipv4 addresses are supplied as nameservers to ipv6 subnets - https://bugs.launchpad.net/neutron/+bug/2036877 Unassigned. Follow-up enhancement to https://bugs.launchpad.net/neutron/+bug/2033980 to prevent adding IPv4 servers to radvd config * slow port creation with a large amount of networkrbacs - https://bugs.launchpad.net/neutron/+bug/2037107 Unassigned. We pushed an enhancement for similar slowness in https://bugs.launchpad.net/neutron/+bug/1918145 but not fixing this case. sqlalchemy/SQL experts welcome there Incomplete * subnet's gateway ip can be unset while attached to router - https://bugs.launchpad.net/neutron/+bug/2036423 Regards, -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From dhana.sys at gmail.com Mon Sep 25 13:40:50 2023 From: dhana.sys at gmail.com (Dhanasekar Kandasamy) Date: Mon, 25 Sep 2023 19:10:50 +0530 Subject: [openstack][neutron] OpenStack deployment Message-ID: Hi, I planning to deploy OpenStack for production zed release, OVS with DVR when I went through the reference architectures I can see - 3 controller, 3 network and x compute nodes or - 3 controllers + network and x computes nodes I want to understand what happens if I go with the below configuration OVS with DVR - 3 controller node - x compute + network nodes For example I have 13 nodes, 3 controllers, 10 compute + network nodes what is the advantage/disadvantage for this configuration Thanks & Regards, Dhana -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Mon Sep 25 13:43:24 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 25 Sep 2023 13:43:24 +0000 Subject: [all][tc] python 3.11 testing plan In-Reply-To: References: <18a19500dda.1179e4341294220.8475553822244831650@ghanshyammann.com> <18a7062b69d.d3f024e6645938.1764325247310912451@ghanshyammann.com> <18abe63da6e.e7d9cd381153099.9192783874404964687@ghanshyammann.com> Message-ID: <20230925134323.r6uvp733qneljyyu@yuggoth.org> On 2023-09-25 09:42:26 +0200 (+0200), Pierre Riteau wrote: [...] > Many projects initially fail on bindep trying to install > mysql-client, which doesn't exist in bookworm. Fixing this was > enough to make the py3.11 job successful in Blazar. As you indicate, in that particular case it's nothing to do with the version of Python, but rather changes in the distribution platforms you're testing on. You don't have to remove mysql-client from bindep.txt, for example you can just specify that it should only be installed on Ubuntu and not Debian, and that mariadb-client should be installed only on Debian. 
Alternatively, you can switch to installing default-mysql-client on both Debian and Ubuntu, which is a metapackage depending on mariadb-client on Debian and mysql-client on Ubuntu. > Are there templates to follow for these bindep requirements? I > checked projects such as Cinder, Glance, Neutron and Nova: they > are using similar package lists but with small differences. Not really, no. Every project has different things they need to install for testing purposes, and the requirement description language bindep uses is expressive enough that there are multiple ways to go about describing the relationships you want. The only real guidance that might be important is to, where possible, use exclusions for old platforms where newer package choices aren't available and inclusions for old platforms on the old packages which aren't available on newer platforms, so that you avoid unnecessary churn on existing entries every time you want to start testing a newer release of some distro and it's clearer what you can clean up. A good example can be found in https://review.opendev.org/895699 which installs dnf on dpkg-based platforms except if they're older, and installs yum-utils exclusively on older platforms. In the future, as the old distributions are cleaned up from the bindep.txt, the result will be that dnf is installed on all dpkg-based platforms and yum-utils is no longer listed. > And is it useful to keep testing with MySQL on Ubuntu, so we have > coverage for both MySQL and MariaDB? If projects are using the clients directly, or directly querying databases, then interface and syntax differences between them might matter. Ideally, with everything going through an ORM, you shouldn't have to care which one is used as long as their differences are abstracted away by the ORM (with the expectation that the maintainers of that ORM are thoroughly testing with both supported backends anyway). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From ces.eduardo98 at gmail.com Mon Sep 25 15:19:26 2023 From: ces.eduardo98 at gmail.com (Carlos Silva) Date: Mon, 25 Sep 2023 12:19:26 -0300 Subject: [openstack-ansible][manila deployment] error: No valid host was found In-Reply-To: References: Message-ID: Hello! Em dom., 24 de set. de 2023 ?s 14:33, Danish Khan escreveu: > Dear Team, > > I have deployed Manila in openstack-ansible and I have created share type > and share network. > > But I am still unable to create a share in it. > > Below is the journalctl error from manila container : > > Sep 24 16:45:13 aio1-manila-container-5cce4ee7 manila-scheduler[69]: > 2023-09-24 16:45:13.689 69 ERROR manila.scheduler.manager > [req-13a8ec7b-2b66-452b-964b-f531b75b699b c0806f7a8b534149a72931b84f7d60af > 61c6349cbba74db88d14d3b7348de585 - - -] Failed to schedule create_share: No > valid host was found. Failed to find a weighted host, the last executed > filter was AvailabilityZoneFilter.: manila.exception.NoValidHost: No > valid host was found. Failed to find a weighted host, the last executed > filter was AvailabilityZoneFilter. > > Apparently, it is failing to schedule the shares. One exercise you could do is try to run this command using admin credentials: $ manila pool-list --share-type This will use the scheduler to filter backends matching the share type you provided. If nothing is returned, it means that your share type and backend declarations aren't matching. 
But I believe the issue is related to the other issue you pointed out... > > And below is the error of manila-share.service from controller node: > > Sep 24 16:44:24 aio1 manila-share[250518]: 2023-09-24 16:44:24.016 250518 > CRITICAL manila [req-3c0e428b-226b-4bff-ade1-3d4ac5befb99 - - - - -] > Unhandled error: manila.exception.ManilaException: Config opt > 'driver_handles_share_servers' has improper value - 'None'. Please define > it as boolean. > > Note: manila-share service is continueously giving this error even if > there is no resources of manila(even in the setup). I am not sure where I > am supposed to define driver_handles_share_servers defautl value if there > is no share type. > This `driver_handles_share_servers` config opt should be set either to True or False in the backend configuration. If nothing is set, Manila can't identify which driver mode you will use and will fail to schedule shares, so please ensure that the `driver_handles_share_servers` flag is properly configured in the backend stanza in the manila.conf file.If it is not, it might be missing from your deployment, and you must set it. Also, please ensure that the backend capabilities match your share type capabilities [1]. [1] https://docs.openstack.org/manila/latest/admin/capabilities_and_extra_specs.html Regards, carloss > > Thanks in advance. > > Regards, > Danish > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Sep 25 17:03:35 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 25 Sep 2023 10:03:35 -0700 Subject: [all][tc] python 3.11 testing plan In-Reply-To: References: <18a19500dda.1179e4341294220.8475553822244831650@ghanshyammann.com> <18a7062b69d.d3f024e6645938.1764325247310912451@ghanshyammann.com> <18abe63da6e.e7d9cd381153099.9192783874404964687@ghanshyammann.com> Message-ID: <18acd4bdaa1.adc80cb11260047.6535059815791577073@ghanshyammann.com> ---- On Mon, 25 Sep 2023 00:42:26 -0700 Pierre Riteau wrote --- > Je On Fri, 22 Sept 2023 at 21:39, Ghanshyam Mann gmann at ghanshyammann.com> wrote: > ?---- On Thu, 07 Sep 2023 09:03:52 -0700? Ghanshyam Mann? wrote --- > ?>? ---- On Mon, 21 Aug 2023 11:16:31 -0700? Ghanshyam Mann? wrote --- > ?>? > Hi All, > ?> Hello Everyone, > ?> > ?> Many projects still need to fix the py3.11 job[1]. I started fixing a few of them, so changes are up > ?> for review of those projects. > ?> > ?> NOTE: The deadline to fix is the 2023.2 release (Oct 6th); after that, this job will become voting on the > ?> master (2024.1 dev cycle but remain non-voting on stable/2023.2) and will block the master gate. > ?> > ?> [1] https://zuul.openstack.org/builds?job_name=openstack-tox-py311+&result=RETRY_LIMIT&result=RETRY&result=CONFIG_ERROR&result=FAILURE&skip=0&limit=100 > > Hello Everyone, > > This a gentle reminder to fix your project py3.11 job if failing ^^ > > This job will become voting right after the 2023.2 release on October 4th. > > -gmann > Many projects initially fail on bindep trying to install mysql-client, which doesn't exist in bookworm. Fixing this was enough to make the py3.11 job successful in Blazar. > Are there templates to follow for these bindep requirements? I checked projects such as Cinder, Glance, Neutron and Nova: they are using similar package lists but with small differences. 
Most of it are change in mysql-client not to install for debian and add maridb client, but there are some code failure too for example Tacker - https://review.opendev.org/c/openstack/tacker/+/893867 You can see the required changes with the below gerrit topic and can get idea of what all different changes are applied to projects. - https://review.opendev.org/q/topic:py311 -gmann > And is it useful to keep testing with MySQL on Ubuntu, so we have coverage for both MySQL and MariaDB? From james.page at canonical.com Tue Sep 26 08:30:36 2023 From: james.page at canonical.com (James Page) Date: Tue, 26 Sep 2023 09:30:36 +0100 Subject: [all][elections][ptl] Project Team Lead Election Conclusion and Results In-Reply-To: References: Message-ID: Hi Tony On Thu, Sep 21, 2023 at 12:57?AM Tony Breeds wrote: > Thank you to the electorate, to all those who voted and to all > candidates who put their name forward for Project Team Lead (PTL) in > this election. A healthy, open process breeds trust in our decision > making capability thank you to all those who make this process > possible. > > Now for the results of the PTL election process, please join me in > extending congratulations to the following PTLs: > > * Adjutant : Dale Smith > * Barbican : Grzegorz Grasza > * Blazar : Pierre Riteau > * Cinder : Rajat Dhasmana > * Cloudkitty : Rafael Weingartner > * Designate : Michael Johnson > * Freezer : ge cong > * Glance : Pranali Deore > * Heat : Takashi Kajinami > * Horizon : Vishal Manchanda > * Ironic : Jay Faulkner > * Keystone : Dave Wilde > * Kolla : Michal Nasiadka > * Kuryr : Roman Dobosz > * Magnum : Jake Yip > * Manila : Carlos Silva > * Masakari : sam sue > * Murano : Rong Zhu > * Neutron : Brian Haley > * Nova : Sylvain Bauza > * Octavia : Gregory Thiemonge > * OpenStack Charms : Felipe Reyes > * OpenStack Helm : Vladimir Kozhukalov > * OpenStackAnsible : Dmitriy Rabotyagov > * OpenStackSDK : Artem Goncharov > * Puppet OpenStack : Takashi Kajinami > * Quality Assurance : Martin Kopec > * Solum : Rong Zhu > * Storlets : Takashi Kajinami > * Swift : Tim Burke > * Tacker : Yasufumi Ogawa > * Telemetry : Erno Kuvaja > * Vitrage : Dmitriy Rabotyagov > * Watcher : chen ke > * Zaqar : Hao Wang > * Zun : Hongbin Lu > > Elections: > * OpenStack_Helm: > https://civs1.civs.us/cgi-bin/results.pl?id=E_3cf498bb10adc5b8 > > Election process details and results are also available here: > https://governance.openstack.org/election/ > > Thank you to all involved in the PTL election process, I already mentioned in #openstack-tc that Sunbeam is not leaderless - I just managed to miss that I needed to propose myself as PTL which was an oversight on my behalf. Please consider this email my candidacy for PTL of Sunbeam for the TC to consider. Thanks James -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Tue Sep 26 10:05:02 2023 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 26 Sep 2023 12:05:02 +0200 Subject: [oslo][heat][requirements] RFE requested for oslo.messaging Message-ID: Hey Requirements Team, The previous bump of oslo.messaging (14.4.0) remains blocked [1] by a recent oslo.messaging change [2] that was blocked by heat's cross testing jobs. This issue is now fixed on the oslo.messaging side [3], and in the way to be released [4]. Hence a new requirements update will arrive soon on the requirements side. 
This new version should fix the heat compatibility issue, thus, this will allow us to resync upper-constraints [5] and listed releases [6] before bobcat's final release. [1] https://review.opendev.org/c/openstack/requirements/+/892919 [2] https://review.opendev.org/c/openstack/oslo.messaging/+/891096 [3] https://review.opendev.org/c/openstack/oslo.messaging/+/896422 [4] https://review.opendev.org/c/openstack/releases/+/896498 [5] https://opendev.org/openstack/requirements/src/branch/master/upper-constraints.txt#L164 [6] https://releases.openstack.org/bobcat/index.html#bobcat-oslo-messaging Thank you for considering this request. -- Herv? Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Tue Sep 26 10:45:39 2023 From: hberaud at redhat.com (Herve Beraud) Date: Tue, 26 Sep 2023 12:45:39 +0200 Subject: [release][requirements][TC][oslo] how to manage divergence between runtime and doc? In-Reply-To: References: Message-ID: Hello TC, Please can we have some feedback from the TC concerning the situation described above and especially concerning the masakari issue with oslo.db: - https://lists.openstack.org/pipermail/openstack-discuss/2023-September/035184.html - https://review.opendev.org/q/project:openstack/masakari+topic:sqlalchemy-20 It's too late to abandon masakari for the bobcat series, however, we think that the tc does have authority to request core reviewer permission to any openstack project and approve change, and hence unlock this situation. Le jeu. 21 sept. 2023 ? 16:44, Herve Beraud a ?crit : > We currently face a big issue. An issue which could have huge impacts on > the whole Openstack and on our customers. By customers I mean all the users > outside the upstream community, operators, distros maintainers, IT vendors, > etc. > > Our problem is that, for a given series, Bobcat in our case, there is > divergence between the versions that we announce as supported to our > customers, and the versions really supported in our runtime. > > Let me describe the problem. > > The oslo.db's versions supported within Bobcat's runtime [1] doesn't > reflect the reality of the versions really generated during Bobcat [2]. In > Bobcat's upper-constraints, oslo.db 12.3.2 [1] is the supported version. > This version corresponds in reality to the last version generated during > 2023.1/antelope [3]. All the versions of oslo.db generated during Bobcat > are for now ignored by our runtime. However all these generated versions > are all listed in our technical documentation as supported by Bobcat. > > In fact, the problem is that these oslo.db versions are all stuck in their > upper-constraints upgrade, because some cross-jobs failed and so the > upper-constraints update can't be made. These cross-job are owned by > different services (heat, manila, masakari, etc). We update our technical > documentation each time we produce a new version of a deliverable, so > before upgrading the upper-constraints. This is why the listed versions > diverge from the versions really supported at runtime. > > We also face a similar issue with Castellan, but in the sake of clarity of > description of this problem I'll focus on oslo.db's case during the rest of > this thread. > > From a quantitative point of view, we face this kind of problem, from a > consecutive manner, since 2 series. It seems now that this becomes our > daily life with each new series of openstack. . 
At this rate it is very > likely that we will still be faced with this same problem during the next > series. > > Indeed, during antelope, the same issue was thrown but only within one > deliverable [4][5][6]. With Bobcat this scenario reappears again but now > within two deliverables. The more the changes made in libraries are > important, the more we will face this kind of issues again, and as > everybody knows our libraries are all based on external libraries who could > evolve toward new major releases with breaking changes. That was the case > oslo.db where our goal was to migrate toward sqlalchemy 2.x. Leading to > stuck upper-constraints. > > This problem could also impact all the downstream distros. Some distros > already started facing issues [7] with oslo.db's case. > > We can't exclude that a similar issue will start to appear soon within all > the Openstack deliverables listed in upper-constraints. Oslo's case is the > first fruit. > > From a quality point of view, we also face a real issue. As customers can > establish their choices and their decisions on our technical documentation, > a divergence between officially supported versions and runtime supported > versions can have huge impacts for them. Imagine they decide to install a > specific series led by imposed requirements requested by a government, that > can be really problematic. By reading our technical documentation and our > release notes, they can think that we fulfill those prerequisites. This > kind of government requirement often arrives. It can be requested for a > vendor who wants to be allowed to sell to a government, or to be allowed to > respect some specific IT laws in a given country. > > This last point can completely undermine the quality of the work carried > out upstream within the Openstack community. > > So, now, we have to find the root causes of this problem. > > In the current case, we would think that the root cause lives in the > complexity of oslo.db migration, yet this is not the case. Even if this > migration represents a major change in Openstack, it has been announced two > year ago [8] - the equivalent of 4 series -, leaving a lot of time for > every team to adopt the latest versions of oslo.db and sqlalchemy 2.x. > > Stephen Finucane and Mike Bayer have spent a lot of time on this topic. > Stephen even contributed well beyond the oslo world, by proposing several > patches to migrate services [9]. Unfortunately a lot of these patches > remain yet unmerged and unreviewed [10], which has led us to this situation. > > This migration is therefore by no means the root cause of this problem. > > The root cause of this problem lurks in the volume of maintenance of > services. Indeed the main cause of this issue is that some services are not > able to follow the cadence, and therefore they slow down libraries' > evolutions and maintenance. Hence, their requirements cross job reflect > this fact [11]. This lack of activity is often due to the lack of > maintainers. > > Fortunately Bobcat has been rescued by Stephen's recent fixes [12][13]. > Stephen's elegant solution allowed us to solve failing cross jobs [14] and > hence, allowed us to resync our technical documentation and our runtime. > > However, we can't ignore that the lack of maintainers is a growing trend > within Openstack. As evidenced by the constant decrease in the number of > contributors from series to series [15][16][17][18]. This phenomenon > therefore risks becoming more and more amplified. 
> > So, we must provide a lasting response. A response more based on team > process than on isolated human resources. > > A first solution could be to modify our workflow a little. We could update > our technical documentation by triggering a job with the upper-constraints > update rather than with a new release patch. Hence, the documentation and > the runtime will be well aligned. However, we should notice that not all > deliverables are listed in upper-constraints, hence this is a partial > solution that won't work for our services. > > A second solution would be to monitor teams activity by monitoring the > upper-constraints updates with failing cross-job. That would be a new task > for the requirements team. The goal of this monitoring would be to inform > the TC that some deliverables are not active enough. > > This monitoring would be to analyze, at defined milestones, which > upper-constraints update remains blocked for a while, and then look at the > cross-job failing to see if it is due to a lack of activity from the > service side. For example by analyzing if patches, like those proposed by > Stephen on services, remain unmerged. Then the TC would be informed. > > It would be a kind of signal addressed to the TC. Then the TC would be > free to make a decision (abandoning this deliverable, removing cross-job, > put-your-idea-here). > > The requirements team already provides such great job and expertise. > Without them we wouldn't have solved the oslo.db and castellan case in > time. However, I think we lack of aTC involvement a little bit earlier in > the series to avoid fire fighter moments. The monitoring would officialize > problems with deliverables sooners in the life cycle and would trigger a TC > involvement. > > Here is the opportunity for us to act to better anticipate the growing > phenomenon of lack of maintainers. Here is the opportunity for us to better > anticipate our available human resources. > Here is the opportunity for us to better handle this kind of incident in > the future. > > Thus, we could integrate substantive actions in terms of human resources > management into the life cycle of Openstack. > > It is time to manage this pain point, because in the long term, if nothing > is done now, this problem will repeat itself again and again. > > Concerning the feasibility of this solution, the release team already > created some similar monitoring. This monitoring is made during each series > at specific milestones. > > The requirements team could trigger its monitoring at specific milestones > targets, not too close to the series deadline. Hence we would be able to > anticipate decisions. > > The requirements team could inspire from the release management process > [19] to create their own monitoring. We already own almost the things we > need to create a new process dedicated to this monitoring. > > Hence, this solution is feasible. > > The usefulness of this solution is obvious. Indeed, thus the TC would have > better governance monitoring. A monitoring not based on people elected as > TC members but based on process and so transmissible from a college to > another. > > Therefore, three teams would then work together on the topic of decreasing > activity inside teams. > > From a global point of view, this will allow Openstack to more efficiently > keep pace with the resources available from series to series. > > I would now like to special thank Stephen for his investment throughout > these two years dedicated to the oslo.db migration. 
I would especially like > to congratulate Stephen for the quality of the work carried out. Stephen > helped us to solve the problem in an elegant manner. Without his expertise, > delivering Bobcat would have been really painful. However, we should not > forget that Stephen remains a human resource of Openstack and we should not > forget that his expertise could go away from Openstack one day or one > other. Solving this type of problem cannot only rest on the shoulders of > one person. Let's take collective initiatives now and put in place > safeguards. > > Thanks for your reading and thanks to all the people who helped with this > topic and that I have not cited here. > > I think other solutions surely coexist and I'll be happy to discuss this > topic with you. > > [1] > https://opendev.org/openstack/requirements/src/branch/master/upper-constraints.txt#L482 > [2] https://releases.openstack.org/bobcat/index.html#bobcat-oslo-db > [3] > https://opendev.org/openstack/releases/src/branch/master/deliverables/antelope/oslo.db.yaml#L22 > [4] https://review.opendev.org/c/openstack/requirements/+/873390 > [5] https://review.opendev.org/c/openstack/requirements/+/878130 > [6] https://opendev.org/openstack/oslo.log/compare/5.1.0...5.2.0 > [7] > https://lists.openstack.org/pipermail/openstack-discuss/2023-September/035100.html > [8] > https://lists.openstack.org/pipermail/openstack-discuss/2021-August/024122.html > [9] https://review.opendev.org/q/topic:sqlalchemy-20 > [10] https://review.opendev.org/q/topic:sqlalchemy-20+status:open > [11] https://review.opendev.org/c/openstack/requirements/+/887261 > [12] > https://opendev.org/openstack/oslo.db/commit/115c3247b486c713176139422647144108101ca3 > [13] > https://opendev.org/openstack/oslo.db/commit/4ee79141e601482fcde02f0cecfb561ecb79e1b6 > [14] https://review.opendev.org/c/openstack/requirements/+/896053 > [15] https://www.openstack.org/software/ussuri > [16] https://www.openstack.org/software/victoria > [17] https://www.openstack.org/software/xena > [18] https://www.openstack.org/software/antelope/ > [19] > https://releases.openstack.org/reference/process.html#between-milestone-2-and-milestone-3 > > -- > Herv? Beraud > Senior Software Engineer at Red Hat > irc: hberaud > https://github.com/4383/ > > -- Herv? Beraud Senior Software Engineer at Red Hat irc: hberaud https://github.com/4383/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Tue Sep 26 12:06:47 2023 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 26 Sep 2023 08:06:47 -0400 Subject: [openstack][neutron] OpenStack deployment In-Reply-To: References: Message-ID: <0695D533-7F6D-4B95-8148-E5A2336C8649@gmail.com> Hi, I would never deploy dedicated network node until I?m going to pumps gigs and gigs of traffic in/out. Anyway in DVR ( you can run network node function on each compute nodes) I would go with OVN deployment and skip network nodes. Just controller and compute. If I?m future you can easily add dedicated network node and distribute traffic. If you go with VLAN based provider then you really don?t need network node. Sent from my iPhone > On Sep 25, 2023, at 9:43 AM, Dhanasekar Kandasamy wrote: > > ? 
> Hi, > > I planning to deploy OpenStack for production zed release, OVS with DVR when I went through the reference architectures I can see > 3 controller, 3 network and x compute nodes or > 3 controllers + network and x computes nodes > I want to understand what happens if I go with the below configuration OVS with DVR > 3 controller node > x compute + network nodes > For example I have 13 nodes, 3 controllers, 10 compute + network nodes > what is the advantage/disadvantage for this configuration > > Thanks & Regards, > Dhana -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Tue Sep 26 13:38:18 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Tue, 26 Sep 2023 13:38:18 +0000 Subject: [tc][all] Please welcome Jay Faulkner as new Chair of Technical Committee Message-ID: Hi all, Please join me in congratulating Jay Faulkner (JayF) as the new Chair of the OpenStack Technical Committee. Thank you for stepping up and thank you for your amazing contributions on the Technical Committee as a valued member and as my vice-chair for the past cycle. Kristi Nikolla -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Tue Sep 26 15:43:05 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 26 Sep 2023 17:43:05 +0200 Subject: [tc][all] Please welcome Jay Faulkner as new Chair of Technical Committee In-Reply-To: References: Message-ID: Congratulations Jay! On Tue, Sep 26, 2023 at 3:39?PM Nikolla, Kristi wrote: > Hi all, > > > > Please join me in congratulating Jay Faulkner (JayF) as the new Chair of > the OpenStack Technical Committee. > > > > Thank you for stepping up and thank you for your amazing contributions on > the Technical Committee as a valued member and as my vice-chair for the > past cycle. > > > > Kristi Nikolla > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue Sep 26 16:07:44 2023 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 26 Sep 2023 09:07:44 -0700 Subject: [release][requirements][TC][oslo] how to manage divergence between runtime and doc? In-Reply-To: References: Message-ID: <18ad23f1717.1112162185074.1090350270406337793@ghanshyammann.com> Thanks for raising it. It seems Sue volunteered to PTL for Masakari this month [1] so they should be able to help here. I am adding their email explicitly here in case they did not read the openstack-discuss ML. [1] https://review.opendev.org/c/openstack/election/+/892137 -gmann ---- On Tue, 26 Sep 2023 03:45:39 -0700 Herve Beraud wrote --- > Hello TC, > Please can we have some feedback from the TC concerning the situation described above and especially concerning the masakari issue with oslo.db:- https://lists.openstack.org/pipermail/openstack-discuss/2023-September/035184.html- https://review.opendev.org/q/project:openstack/masakari+topic:sqlalchemy-20 > It's too late to abandon masakari for the bobcat series, however, we think that the tc does have authority to request core reviewer permission to any openstack project and approve change, and hence unlock this situation. > Le?jeu. 21 sept. 2023 ??16:44, Herve Beraud hberaud at redhat.com> a ?crit?: > > > -- > Herv? BeraudSenior Software Engineer at Red Hatirc: hberaudhttps://github.com/4383/ > > We currently face a big issue. An issue which could have huge impacts on the whole Openstack and on our customers. 
By customers I mean all the users outside the upstream community, operators, distros maintainers, IT vendors, etc. > > Our problem is that, for a given series, Bobcat in our case, there is divergence between the versions that we announce as supported to our customers, and the versions really supported in our runtime. > > Let me describe the problem. > > The oslo.db's versions supported within Bobcat's runtime [1] doesn't reflect the reality of the versions really generated during Bobcat [2]. In Bobcat's upper-constraints, oslo.db 12.3.2 [1] is the supported version. This version corresponds in reality to the last version generated during 2023.1/antelope [3]. All the versions of oslo.db generated during Bobcat are for now ignored by our runtime. However all these generated versions are all listed in our technical documentation as supported by Bobcat. > > In fact, the problem is that these oslo.db versions are all stuck in their upper-constraints upgrade, because some cross-jobs failed and so the upper-constraints update can't be made. These cross-job are owned by different services (heat, manila, masakari, etc). We update our technical documentation each time we produce a new version of a deliverable, so before upgrading the upper-constraints. This is why the listed versions diverge from the versions really supported at runtime. > > We also face a similar issue with Castellan, but in the sake of clarity of description of this problem I'll focus on oslo.db's case during the rest of this thread. > > From a quantitative point of view, we face this kind of problem, from a consecutive manner, since 2 series. It seems now that this becomes our daily life with each new series of openstack. . At this rate it is very likely that we will still be faced with this same problem during the next series. > > Indeed, during antelope, the same issue was thrown but only within one deliverable [4][5][6]. With Bobcat this scenario reappears again but now within two deliverables. The more the changes made in libraries are important, the more we will face this kind of issues again, and as everybody knows our libraries are all based on external libraries who could evolve toward new major releases with breaking changes. That was the case oslo.db where our goal was to migrate toward sqlalchemy 2.x. Leading to stuck upper-constraints. > > This problem could also impact all the downstream distros. Some distros already started facing issues [7] with oslo.db's case. > > We can't exclude that a similar issue will start to appear soon within all the Openstack deliverables listed in upper-constraints. Oslo's case is the first fruit. > > From a quality point of view, we also face a real issue. As customers can establish their choices and their decisions on our technical documentation, a divergence between officially supported versions and runtime supported versions can have huge impacts for them. Imagine they decide to install a specific series led by imposed requirements requested by a government, that can be really problematic. By reading our technical documentation and our release notes, they can think that we fulfill those prerequisites. This kind of government requirement often arrives. It can be requested for a vendor who wants to be allowed to sell to a government, or to be allowed to respect some specific IT laws in a given country. > > This last point can completely undermine the quality of the work carried out upstream within the Openstack community. > > So, now, we have to find the root causes of this problem. 
> > In the current case, we would think that the root cause lives in the complexity of oslo.db migration, yet this is not the case. Even if this migration represents a major change in Openstack, it has been announced two year ago [8] - the equivalent of 4 series -, leaving a lot of time for every team to adopt the latest versions of oslo.db and sqlalchemy 2.x. > > Stephen Finucane and Mike Bayer have spent a lot of time on this topic. Stephen even contributed well beyond the oslo world, by proposing several patches to migrate services [9]. Unfortunately a lot of these patches remain yet unmerged and unreviewed [10], which has led us to this situation. > > This migration is therefore by no means the root cause of this problem. > > The root cause of this problem lurks in the volume of maintenance of services. Indeed the main cause of this issue is that some services are not able to follow the cadence, and therefore they slow down libraries' evolutions and maintenance. Hence, their requirements cross job reflect this fact [11]. This lack of activity is often due to the lack of maintainers. > > Fortunately Bobcat has been rescued by Stephen's recent fixes [12][13]. Stephen's elegant solution allowed us to solve failing cross jobs [14] and hence, allowed us to resync our technical documentation and our runtime. > > However, we can't ignore that the lack of maintainers is a growing trend within Openstack. As evidenced by the constant decrease in the number of contributors from series to series [15][16][17][18]. This phenomenon therefore risks becoming more and more amplified. > > So, we must provide a lasting response. A response more based on team process than on isolated human resources. > > A first solution could be to modify our workflow a little. We could update our technical documentation by triggering a job with the upper-constraints update rather than with a new release patch. Hence, the documentation and the runtime will be well aligned. However, we should notice that not all deliverables are listed in upper-constraints, hence this is a partial solution that won't work for our services. > > A second solution would be to monitor teams activity by monitoring the upper-constraints updates with failing cross-job. That would be a new task for the requirements team. The goal of this monitoring would be to inform the TC that some deliverables are not active enough. > This monitoring would be to analyze, at defined milestones, which upper-constraints update remains blocked for a while, and then look at the cross-job failing to see if it is due to a lack of activity from the service side. For example by analyzing if patches, like those proposed by Stephen on services, remain unmerged. Then the TC would be informed. > > It would be a kind of signal addressed to the TC. Then the TC would be free to make a decision (abandoning this deliverable, removing cross-job, put-your-idea-here). > The requirements team already provides such great job and expertise. Without them we wouldn't have solved the oslo.db and castellan case in time. However, I think we lack of aTC involvement a little bit earlier in the series to avoid fire fighter moments. The monitoring would officialize problems with deliverables sooners in the life cycle and would trigger a TC involvement. > > Here is the opportunity for us to act to better anticipate the growing phenomenon of lack of maintainers. Here is the opportunity for us to better anticipate our available human resources. 
> Here is the opportunity for us to better handle this kind of incident in the future. > > Thus, we could integrate substantive actions in terms of human resources management into the ?life cycle of Openstack. > > It is time to manage this pain point, because in the long term, if nothing is done now, this problem will repeat itself again and again. > > Concerning the feasibility of this solution, the release team already created some similar monitoring. This monitoring is made during each series at specific milestones. > > The requirements team could trigger its monitoring at specific milestones targets, not too close to the series deadline. Hence we would be able to anticipate decisions. > > The requirements team could inspire from the release management process [19] to create their own monitoring. We already own almost the things we need to create a new process dedicated to this monitoring. > > Hence, this solution is feasible. > > The usefulness of this solution is obvious. Indeed, thus the TC would have better governance monitoring. A monitoring not based on people elected as TC members but based on process and so transmissible from a college to another. > > Therefore, three teams would then work together on the topic of decreasing activity inside teams. > > From a global point of view, this will allow Openstack to more efficiently keep pace with the resources available from series to series. > > I would now like to special thank Stephen for his investment throughout these two years dedicated to the oslo.db migration. I would especially like to congratulate Stephen for the quality of the work carried out. Stephen helped us to solve the problem in an elegant manner. Without his expertise, delivering Bobcat would have been really painful. However, we should not forget that Stephen remains a human resource of Openstack and we should not forget that his expertise could go away from Openstack one day or one other. Solving this type of problem cannot only rest on the shoulders of one person. Let's take collective initiatives now and put in place safeguards. > > Thanks for your reading and thanks to all the people who helped with this topic and that I have not cited here. > > I think other solutions surely coexist and I'll be happy to discuss this topic with you. 
> > [1] https://opendev.org/openstack/requirements/src/branch/master/upper-constraints.txt#L482 > [2] https://releases.openstack.org/bobcat/index.html#bobcat-oslo-db > [3] https://opendev.org/openstack/releases/src/branch/master/deliverables/antelope/oslo.db.yaml#L22 > [4] https://review.opendev.org/c/openstack/requirements/+/873390 > [5] https://review.opendev.org/c/openstack/requirements/+/878130 > [6] https://opendev.org/openstack/oslo.log/compare/5.1.0...5.2.0 > [7] https://lists.openstack.org/pipermail/openstack-discuss/2023-September/035100.html > [8] https://lists.openstack.org/pipermail/openstack-discuss/2021-August/024122.html > [9] https://review.opendev.org/q/topic:sqlalchemy-20 > [10] https://review.opendev.org/q/topic:sqlalchemy-20+status:open > [11] https://review.opendev.org/c/openstack/requirements/+/887261 > [12] https://opendev.org/openstack/oslo.db/commit/115c3247b486c713176139422647144108101ca3 > [13] https://opendev.org/openstack/oslo.db/commit/4ee79141e601482fcde02f0cecfb561ecb79e1b6 > [14] https://review.opendev.org/c/openstack/requirements/+/896053 > [15] https://www.openstack.org/software/ussuri > [16] https://www.openstack.org/software/victoria > [17] https://www.openstack.org/software/xena > [18] https://www.openstack.org/software/antelope/ > [19] https://releases.openstack.org/reference/process.html#between-milestone-2-and-milestone-3 > > -- > Hervé Beraud, Senior Software Engineer at Red Hat, irc: hberaud, https://github.com/4383/ > > From knikolla at bu.edu Tue Sep 26 16:27:59 2023 From: knikolla at bu.edu (Nikolla, Kristi) Date: Tue, 26 Sep 2023 16:27:59 +0000 Subject: [tc] Technical Committee next weekly meeting Today on September 26, 2023 Message-ID: <690E4235-EADC-4BC7-973A-B06CFA783DB0@bu.edu> Hi all, This is a reminder that the next weekly Technical Committee meeting is to be held today, Tuesday, 26 September 2023, at 1800 UTC on #openstack-tc on OFTC IRC. Please find below the agenda for the meeting: - Roll call - Follow up on past action items - No action items from the Sept 19, 2023 meeting. - Gate health check - Leaderless projects - Call for volunteers for Vice-Chair - Open Discussion and Reviews - Register for the PTG - #link https://openinfra.dev/ptg/ - #link https://review.opendev.org/q/projects:openstack/governance+is:open Thank you, Kristi Nikolla From asma.naz at techavenue.biz Wed Sep 27 04:12:02 2023 From: asma.naz at techavenue.biz (Asma Naz Shariq) Date: Wed, 27 Sep 2023 09:12:02 +0500 Subject: openstack- Deployment through kolla ansible Message-ID: <000401d9f0f8$c51391d0$4f3ab570$@techavenue.biz> Hi all, I have deployed OpenStack through the kolla-ansible deployment tool, and I want to know how to add Firewall as a Service and how to add the BGP router plugins in OpenStack release 2023.1 (antelope). Can anyone please guide me?
From mnasiadka at gmail.com Wed Sep 27 08:16:33 2023 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Wed, 27 Sep 2023 10:16:33 +0200 Subject: [magnum] Weekly meeting today (27th Sep 2023) cancelled Message-ID: <39F22E60-E132-4717-BE11-62080D6A9AF8@gmail.com> Hello, Since key members are absent - let's cancel this week's (27th Sep 2023) meeting.
Best regards, Michal From nguyenhuukhoinw at gmail.com Wed Sep 27 12:23:50 2023 From: nguyenhuukhoinw at gmail.com (=?UTF-8?B?Tmd1eeG7hW4gSOG7r3UgS2jDtGk=?=) Date: Wed, 27 Sep 2023 19:23:50 +0700 Subject: [openstack][nova][cinder] how can we know instance disk usage Message-ID: Hello guys, I am trying to get instance disk usage but I cannot. I tried to use virsh domblkinfo instance-00000df8 vda but it did not show the exact disk allocation. How can we get this metric? I use SAN and Ceph as backend storages. Thank you so much. Nguyen Huu Khoi -------------- next part -------------- An HTML attachment was scrubbed... URL: From mkopec at redhat.com Wed Sep 27 13:19:34 2023 From: mkopec at redhat.com (Martin Kopec) Date: Wed, 27 Sep 2023 15:19:34 +0200 Subject: [qa][ptg] Virtual Caracal vPTG Planning Message-ID: Hello everyone, here is [1] our etherpad for the 2024.1 Caracal virtual PTG. Please add your topics there if there is anything you would like to discuss / propose ... You can also vote for time slots for our sessions so that they fit your schedule at [2]. We will most likely go with a 1-hour slot per day, as they usually fit more easily into everyone's schedule. The number of slots will depend on the number of topics proposed in [1]. - [1] https://etherpad.opendev.org/p/oct2023-ptg-qa [2] https://framadate.org/f26R3EcZ2BOo7r8Q Thanks, -- Martin Kopec Principal Software Quality Engineer Red Hat EMEA IM: kopecmartin -------------- next part -------------- An HTML attachment was scrubbed... URL: From mkopec at redhat.com Wed Sep 27 13:24:35 2023 From: mkopec at redhat.com (Martin Kopec) Date: Wed, 27 Sep 2023 15:24:35 +0200 Subject: [interop][ptg] Virtual Caracal vPTG Planning Message-ID: Hello everyone, here is [1] our etherpad for the 2024.1 Caracal virtual PTG. Please add your topics there if there is anything you would like to discuss / propose ... You can also vote for time slots for our session(s), so that they fit your schedule, at [2]. I have one topic I'd like to discuss - a new view on interoperability testing, see [3] and this [4] article for the background info. [1] https://etherpad.opendev.org/p/oct2023-ptg-interop [2] https://framadate.org/rNFvjKVkMnXPfObf [3] https://etherpad.opendev.org/p/openstack-interop-2.0 [4] https://superuser.openinfra.dev/articles/new-view-on-interoperability-in-openstack/ If you have any questions, feel free to reach out to me. Thanks, -- Martin Kopec Principal Software Quality Engineer Red Hat EMEA IM: kopecmartin -------------- next part -------------- An HTML attachment was scrubbed... URL: From ygk.kmr at gmail.com Wed Sep 27 14:11:13 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Wed, 27 Sep 2023 19:41:13 +0530 Subject: Wsgi in openstack Message-ID: Hi All, I am new to python coding and trying to understand how wsgi is implemented in nova or glance api servers. What would be the best way to understand the design here? Are there any useful documents available over the internet to understand these? I understand the WSGI standard. I can't figure out where the start_response functions are coded in the nova-api wsgi implementation. Any useful tips are highly appreciated. Thanks so much for your time. Thanks Y.G -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Kburall at applied-insight.com Wed Sep 27 14:35:03 2023 From: Kburall at applied-insight.com (Kyle Burall) Date: Wed, 27 Sep 2023 14:35:03 +0000 Subject: [kolla-ansible] Deploying a kolla-ansible all-in-one to test enforce_scope Message-ID: I'm trying to test enabling enforce_scope on a kolla-ansible deployment in prep for rolling it out to an actual deployment, and I am having issues with services registering themselves. The playbook fails at registering nova services, and the keystone log says the nova user doesn't exist, as well as the service project and the admin role. I've added enforce_scope and enforce_new_defaults to global.conf, which causes the referenced error. I assume that I'm missing something in the configuration that is required beyond just setting those two oslo_policy options, but I haven't been able to find anything beyond these two settings. Can anyone let me know if there's something specific that I'm missing? Thanks! This email and its attachments, if any, may contain information that is proprietary, business-confidential, and/or legally protected. If you received this communication in error, please notify me by reply message and delete the email and any attachments without copying, distributing, or otherwise using. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jobernar at redhat.com Wed Sep 27 15:12:05 2023 From: jobernar at redhat.com (Jon Bernard) Date: Wed, 27 Sep 2023 15:12:05 +0000 Subject: Cinder Bug Report 2023-09-27 Message-ID: Hello Argonauts, Cinder Bug Meeting Etherpad Undecided - cinder reimage failure leaves the volume in downloading state - Status: In Progress - Backup using Swift driver fails when TLS is enabled on internal network - Status: New Thanks, -- Jon From stephenfin at redhat.com Wed Sep 27 15:29:43 2023 From: stephenfin at redhat.com (Stephen Finucane) Date: Wed, 27 Sep 2023 15:29:43 +0100 Subject: Wsgi in openstack In-Reply-To: References: Message-ID: <73a9ae969326f53c49bcbb5c520fe1763900914d.camel@redhat.com> On Wed, 2023-09-27 at 19:41 +0530, Gk Gk wrote: > Hi All, > > I am new to python coding and trying to understand how wsgi is implemented in > nova or glance api servers. What would be the best way to understand the > design here? Are there any useful documents available over the internet to > understand these? I understand the WSGI standard. I can't figure out where > the start_response functions are coded in nova-api wsgi implementation. Any > useful tips are highly appreciated. Thanks so much for your time. > > Thanks > Y.G Scripts for WSGI "entry points" are automatically generated by pbr, similar to how standard setuptools builds scripts for console scripts: https://github.com/openstack/pbr/blob/d03d617c09e7ba8ddf62d1e53d71685cd708e2da/pbr/packaging.py#L332-L384 If you build an sdist or wheel and extract it, you'll see the generated scripts in there. You'll also see them in `bin` if you install e.g. nova and look at the `nova-api` script. Stephen -------------- next part -------------- An HTML attachment was scrubbed...
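To make the above a bit more concrete, here is a minimal, self-contained sketch of the WSGI contract and of what such a generated script ultimately exposes. The names hello_app and init_application below are illustrative placeholders, not Nova's actual code: in Nova the module-level application is built by the project's own init_application() helper, and start_response is a callable handed in by the WSGI server (uwsgi, mod_wsgi, etc.) rather than a function defined anywhere in nova-api itself.

# Plain WSGI callable: the server supplies environ and start_response;
# application code calls start_response, it never defines it.
def hello_app(environ, start_response):
    body = b"Hello from a plain WSGI app\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

def init_application():
    # A real project would assemble its middleware and routers here and
    # return the outermost WSGI callable; this stand-in just returns hello_app.
    return hello_app

# Rough shape of what a generated wsgi_scripts file ends up doing: build and
# expose a module-level ``application`` object for the WSGI server to load.
application = init_application()

if __name__ == "__main__":
    # Quick local test without a real WSGI server.
    from wsgiref.simple_server import make_server
    make_server("127.0.0.1", 8000, application).serve_forever()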
URL: From moonpiedumplings2 at gmail.com Wed Sep 27 17:01:59 2023 From: moonpiedumplings2 at gmail.com (Moonpiedumplings TheSecond) Date: Wed, 27 Sep 2023 10:01:59 -0700 Subject: [kolla-ansible][neutron] Kolla-ansible deployment where not all network nodes have access to the same external networks Message-ID: I'm currently prototyping for what will be a two node openstack install via kolla-ansible, with one part on a public vps as a network node, and another on my home server as the main compute/controller node, and an allinone node. I want to use the VPS to give virtual machines public ipv6 addresses. However, the VPS only has one ipv4 address (and I am not willing to buy more). I'm thinking the simplest way to have both ipv4 and ipv6 connectivity is just to have my "home" node be a network node as well, and then just virtual machines a ipv6 floating ip, an access to a normal, internal ipv4 subnet. However, I can't get networking working properly, as I don't know how to force openstack to create a network on a specific node. I have it halfway working, I can create a network on the equivalent of the main node in my prototypes, but not on the "vps" node. I can create a network, but not create a virtual machine on that network, or give a virtual machine a floating ip from that network. In my blog, I have documented what I've tried, but it's somewhat disorganized, as I've documented my entire process, not just my networking troubles: https://moonpiedumplings.github.io/projects/build-server-2/#asking-for-help Thanks in advance for any help you have to offer, From cyril at redhat.com Wed Sep 27 18:28:36 2023 From: cyril at redhat.com (Cyril Roelandt) Date: Wed, 27 Sep 2023 20:28:36 +0200 Subject: [all][tc] python 3.11 testing plan In-Reply-To: References: <18a19500dda.1179e4341294220.8475553822244831650@ghanshyammann.com> Message-ID: Hi, On 2023-08-22 18:05, Thomas Goirand wrote: > This is very nice, though, IMO, it's too late. Bookworm was released with > OpenStack Zed, to which I already added Python 3.11 support (if you guys > have some patches to add on top, let me know, but as much as I know, it was > already functional). > > So now, the current plan is to ... test on py3.11. Yeah, but from the Debian > perspective, we're already on Python 3.12. The RC1 is already in Debian > Experimental, and I expect 3.12 to reach Unstable by the end of this year. > Once again, I'll be the sole person that will experimenting all the > troubles. It's been YEARS like this. It's probably time to address it, no? > > I'd really love it, if we could find a solution so that I stop to be the > only person getting the shit in this world. :) For what it's worth, I try to help test py3XX when the release candidates come out. In the past I was able to find a regression in the sqlite module[1] and propose some of the OpenStack py311 patches[2]. Regarding Python 3.12, I have been able to fix one of our dependencies[3] and request a new package for another[4]. I run my tests by running "tox -repy312" on a bunch of OpenStack projects using a 3.12rc2 container image. I often have to rebuild a few wheels for dependencies that work with Python 3.12, but only on the master/main branch, so I make sure to use these in my testing process. I provide the image[5] for everyone to use, and intend to keep it updated for the next release candidates of Python3.13+. 
Regards, Cyril [1] https://github.com/python/cpython/issues/95132 [2] https://review.opendev.org/q/topic:py311 [3] https://github.com/ncclient/ncclient/pull/567 [4] https://github.com/sumerc/yappi/pull/148 [5] https://github.com/CyrilRoelandteNovance/py3-next-openstack From gregory.orange at pawsey.org.au Thu Sep 28 01:09:08 2023 From: gregory.orange at pawsey.org.au (Gregory Orange) Date: Thu, 28 Sep 2023 09:09:08 +0800 Subject: [kolla] change default domain In-Reply-To: References: Message-ID: <5059bebf-4137-91df-d995-e77dd42eea20@pawsey.org.au> On 20/9/23 19:21, Vivian Rook wrote: > I'm looking for a setting to change the default domain name from > "default" to "kolla" Does such an option exist in globals.yml? Are you talking about the Horizon setting? We put this in globals.yml:

horizon_keystone_domain_choices:
  pawsey: Pawsey
  Default: Default
horizon_keystone_multidomain: True

This results in a dropdown box for Domain at the login screen. Pawsey is selected by default, although a browser cookie remembers which one users have most recently used. HTH, Greg. From pdeore at redhat.com Thu Sep 28 05:49:10 2023 From: pdeore at redhat.com (Pranali Deore) Date: Thu, 28 Sep 2023 11:19:10 +0530 Subject: [Glance] Weekly Meeting Cancelled for Today Message-ID: Hello Everyone, Cancelling today's weekly meeting, since there is not much on the agenda for the week & a few team members are also not around. See you next week! Thanks & Regards, Pranali Deore -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Thu Sep 28 10:08:31 2023 From: zigo at debian.org (Thomas Goirand) Date: Thu, 28 Sep 2023 12:08:31 +0200 Subject: [all][debian] First VM running under Bobcat pings: Debian Bobcat repository is ready! Message-ID: <1fa1f7bb-de02-8778-ae82-e8254659a299@debian.org> Hi, As on each release, my first test is to pop a new OpenStack cluster on my virtualized environment (a single machine running 38 VMs with nested virt). Today, my first VM running under Bobcat was ACTIVE and pinged. All of this is available the usual way, especially with extrepo if you want to use the unofficial Bookworm backport:

apt-get install extrepo
extrepo enable openstack_bobcat
apt-get update

The following days, I'll be uploading all of this from Debian Experimental to Unstable, trying to add as much SQLAlchemy 2.x support as I can, since the packages need to be uploaded to Unstable soon. There are also a few packages that I didn't have time to finish yet (like the last version of Horizon, which has many unit test failures that I didn't have time to look at...). Thanks to everyone that contributed to Bobcat, or that was involved in this release. Cheers, Thomas Goirand (zigo) From vrook at wikimedia.org Thu Sep 28 10:24:43 2023 From: vrook at wikimedia.org (Vivian Rook) Date: Thu, 28 Sep 2023 06:24:43 -0400 Subject: [kolla] change default domain In-Reply-To: <5059bebf-4137-91df-d995-e77dd42eea20@pawsey.org.au> References: <5059bebf-4137-91df-d995-e77dd42eea20@pawsey.org.au> Message-ID: Thank you for your reply! > Are you talking about the Horizon setting?
> We put this in globals.yml No, I mean the openstack domain, the kind that I get when I run:

$ openstack --os-cloud=kolla-admin domain list
+----------------------------------+------------------+---------+-------------------------------------------+
| ID                               | Name             | Enabled | Description                               |
+----------------------------------+------------------+---------+-------------------------------------------+
| 3a9492b526334a418a5ce328786c376c | magnum           | True    | Owns users and projects created by magnum |
| default                          | Default          | True    | The default domain                        |
| ff60f58e888742cd9974cd85d6c24e54 | heat_user_domain | True    |                                           |
+----------------------------------+------------------+---------+-------------------------------------------+

In this case I would like the "default" name to be called "surface". Thank you! On Wed, Sep 27, 2023 at 9:09 PM Gregory Orange wrote: > On 20/9/23 19:21, Vivian Rook wrote: > > I'm looking for a setting to change the default domain name from > > "default" to "kolla" Does such an option exist in globals.yml? > > Are you talking about the Horizon setting? We put this in globals.yml > > > horizon_keystone_domain_choices: > pawsey: Pawsey > Default: Default > horizon_keystone_multidomain: True > > > This results in a dropdown box for Domain at the login screen. Pawsey > selected by default, although a browser cookie remembers which one users > have most recently used. > > HTH, > Greg. > -- *Vivian Rook (They/Them)* Site Reliability Engineer Wikimedia Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonykarera at gmail.com Fri Sep 29 04:43:10 2023 From: tonykarera at gmail.com (Karera Tony) Date: Fri, 29 Sep 2023 06:43:10 +0200 Subject: Masakari Issue Message-ID: Hello Team, I have an OpenStack Wallaby environment deployed using kolla-ansible. So, I shut down all the VMs on the compute nodes because I wanted to upgrade the nodes' kernel. After I rebooted them, all was fine apart from the fact that all the hosts in the instance-ha segment are in maintenance mode. When I try to set maintenance to false, I get the error below. *Error: *Failed to update host. Details ConflictException: 409: Client Error for url: http://x.x.x.x:15868/v1/segments/8d042245-5610-4b84-b611-b633f8f8367c/hosts/066c5654-dd1a-4fa9-a664-dad10b89e202, Host 066c5654-dd1a-4fa9-a664-dad10b89e202 can't be updated as it is in-use to process notifications. Regards Tony Karera -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonykarera at gmail.com Fri Sep 29 07:30:39 2023 From: tonykarera at gmail.com (Karera Tony) Date: Fri, 29 Sep 2023 09:30:39 +0200 Subject: Masakari Issue Message-ID: Hello Team, I have a cluster of 4 compute nodes, but when I shut down the VMs so that I could upgrade the kernel on the hosts, the compute nodes in the instance-ha segment got stuck in maintenance mode. When I try to change them, I get the error below. Error: Failed to update host. Details ConflictException: 409: Client Error for url: http://10.10.13.31:15868/v1/segments/8d042245-5610-4b84-b611-b633f8f8367c/hosts/066c5654-dd1a-4fa9-a664-dad10b89e202, Host 066c5654-dd1a-4fa9-a664-dad10b89e202 can't be updated as it is in-use to process notifications Regards Tony Karera -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Fri Sep 29 09:44:14 2023 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 29 Sep 2023 11:44:14 +0200 Subject: [all][debian] First VM running under Bobcat pings: Debian Bobcat repository is ready!
In-Reply-To: <1fa1f7bb-de02-8778-ae82-e8254659a299@debian.org> References: <1fa1f7bb-de02-8778-ae82-e8254659a299@debian.org> Message-ID: <589722c2-6a3c-aa86-cb1b-038febf4498a@openstack.org> Thomas Goirand wrote: > As on each release, my first test is to pop a new OpenStack cluster on > my visualized environment (a single machine running 38 VMs with nested > virt). Today, my first VM running under Bobcat was ACTIVE and pinged. > [...] Thanks Thomas, it's always great for release managers to have outside confirmation that things we are about to release work in practical implementations beyond our continuous integration testing! > Thanks to everyone that contributed to Bobcat, or that was involved in > this release, Cheers! -- Thierry From ygk.kmr at gmail.com Fri Sep 29 10:35:27 2023 From: ygk.kmr at gmail.com (Gk Gk) Date: Fri, 29 Sep 2023 16:05:27 +0530 Subject: Wsgi in openstack In-Reply-To: <73a9ae969326f53c49bcbb5c520fe1763900914d.camel@redhat.com> References: <73a9ae969326f53c49bcbb5c520fe1763900914d.camel@redhat.com> Message-ID: Thanks for the response. But this is not the answer I was looking for. I am asking how wsgi is implemented in the nova-api server. How to understand these python classes ? How does the flow of control go from one module to another ? Specifically wsgi related. Is there any design doc which talks about this ? Are there any resources on the internet along these lines? On Wed, Sep 27, 2023 at 8:59?PM Stephen Finucane wrote: > On Wed, 2023-09-27 at 19:41 +0530, Gk Gk wrote: > > Hi All, > > I am new to python coding and trying to understand how wsgi is implemented > in nova or glance api servers. What would be the best way to understand the > design here ? Are there any useful documents available over the internet to > understand these ? I understand the WSGI standard. I can't figure out > where the start_response functions are coded in nova-api wsgi > implementation. Any useful tips are highly appreciated. Thanks so much for > your time. > > Thanks > Y.G > > > Script for WSGI "entry points" are automatically generated by pbr, similar > to how standard setuptools builds scripts for console scripts: > > > https://github.com/openstack/pbr/blob/d03d617c09e7ba8ddf62d1e53d71685cd708e2da/pbr/packaging.py#L332-L384 > > If you build an sdist or wheel and extract it, you'll see the generated > scripts in there. You'll also see then in `bin` if you install e.g. nova > and look at the `nova-api` script. > > Stephen > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Fri Sep 29 11:35:12 2023 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 29 Sep 2023 13:35:12 +0200 Subject: [neutron] Neutron drivers meeting Message-ID: Hello neutrinos: Please remember today we have the Neutron drivers meeting. This is the agenda: https://wiki.openstack.org/wiki/Meetings/NeutronDrivers There are two topics: * [fwaas][rfe] support list type of port_range for firewall rule (I can't find this in the meeting logs) * [OVN] Allow scheduling external ports on non-gateway nodes See you in 2,5 hours! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From elod.illes at est.tech Fri Sep 29 14:44:55 2023 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Fri, 29 Sep 2023 14:44:55 +0000 Subject: [release] Release countdown for week R+0, Oct 02-06 Message-ID: Development Focus ----------------- We will be releasing the coordinated OpenStack 2023.2 Bobcat release next week, on Wednesday, October 4th, 2023. Thanks to everyone involved in the 2023.2 Bobcat cycle! We are now in pre-release freeze, so no new deliverable will be created until final release, unless a release-critical regression is spotted. Otherwise, teams attending the virtual PTG should start to plan what they will be discussing there, by creating and filling team etherpads. You can access the list of PTG etherpads at: http://ptg.openstack.org/etherpads.html General Information ------------------- On release day, the release team will produce final versions of deliverables following the cycle-with-rc release model, by re-tagging the commit used for the last RC. A patch doing just that will be proposed. PTLs and release liaisons should watch for that final release patch from the release team. While not required, we would appreciate having an ack from each team before we approve it on the 4th, so that their approval is included in the metadata that goes onto the signed tag. Upcoming Deadlines & Dates -------------------------- Final 2023.2 Bobcat release: October 4th, 2023 2024.1 Caracal Virtual PTG: October 23-27, 2023 El?d Ill?s irc: elodilles @ #openstack-release -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Sep 29 23:55:23 2023 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 29 Sep 2023 23:55:23 +0000 Subject: [all][infra][tact-sig] Mailman v3 migration for OpenStack mailing lists 2023-10-12 Message-ID: <20230929235523.gtxaxdw5jzxkh46n@yuggoth.org> On Thursday, October 12 at 15:30 UTC, the OpenDev Collaboratory systems administrators will be migrating the lists.openstack.org mailing list site from the aging Mailman 2.1.29 server to a newer Mailman 3.3.8 deployment. This maintenance window is expected to last approximately four hours. 
The key takeaways are as follows: * There will be an extended outage for the site while DNS is updated and configuration, subscriber lists and message archives are imported, but incoming messages should correctly queue at the sender's end and arrive at the conclusion of our work * Because this is on a new server, there are new IP addresses from which list mail will be originating: 162.209.78.70 and 2001:4800:7813:516:be76:4eff:fe04:5423 * Moderation queues will not be copied to the new server, so moderators are encouraged to process any held messages prior to the migration time in order to avoid losing them * Anyone wishing to adjust their list subscriptions, or handle moderation or list configuration after the migration, needs to use the Sign Up button on the Web page to create a new account; it will be linked to your imported roles as soon as you confirm by following a URL from an E-mail message the server sends you, and is global for all sites on the server so you only need to do this once * The software providing Web front-ends for list management and archive browsing is entirely new in Mailman v3 and therefore has a much different appearance, though we've added appropriate redirects and frozen copies of old archives in order to accommodate existing hyperlinks If you have any questions or concerns, feel free to follow up on the service-discuss mailing list or find us in the #opendev channel on the OFTC IRC network. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From sbauza at redhat.com Sat Sep 30 12:26:59 2023 From: sbauza at redhat.com (Sylvain Bauza) Date: Sat, 30 Sep 2023 14:26:59 +0200 Subject: Wsgi in openstack In-Reply-To: References: <73a9ae969326f53c49bcbb5c520fe1763900914d.camel@redhat.com> Message-ID: Le ven. 29 sept. 2023, 12:41, Gk Gk a ?crit : > Thanks for the response. But this is not the answer I was looking for. I > am asking how wsgi is implemented in the nova-api server. How to understand > these python classes ? > > How does the flow of control go from one module to another ? Specifically > wsgi related. Is there any design doc which talks about this ? Are there > any resources on the internet along these lines? > >> >> Short answer : nova-api is a WSGI app. We use pastedeploy [1] for providing WSGI pipelines [2]. Each of the pipelines verify WSGI middlewares one after the other, and eventually it goes to the Nova APIRouterV21 WSGI middleware [3] The APIRouterV21 then calls the right HTTP controller related to the link, as you can see in [4]. For example, the call of POST /servers in https://docs.openstack.org/api-ref/compute/#create-server is then eventually calling [5], depending on the microversion of course. HTH, -Sylvain [1] https://pastedeploy.readthedocs.io [2] https://github.com/openstack/nova/blob/87d4807848bb9546c1fca972da2eb2eda13eb08d/etc/nova/api-paste.ini#L18-L43 [3] https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/routes.py [4] https://github.com/openstack/nova/blob/87d4807848bb9546c1fca972da2eb2eda13eb08d/nova/api/openstack/compute/routes.py#L369C4-L842 [5] https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/servers.py#L683 >> On Wed, Sep 27, 2023 at 8:59?PM Stephen Finucane > wrote: > >> On Wed, 2023-09-27 at 19:41 +0530, Gk Gk wrote: >> >> Hi All, >> >> I am new to python coding and trying to understand how wsgi is >> implemented in nova or glance api servers. 
What would be the best way to >> understand the design here ? Are there any useful documents available over >> the internet to understand these ? I understand the WSGI standard. I can't >> figure out where the start_response functions are coded in nova-api wsgi >> implementation. Any useful tips are highly appreciated. Thanks so much for >> your time. >> >> Thanks >> Y.G >> >> >> Script for WSGI "entry points" are automatically generated by pbr, >> similar to how standard setuptools builds scripts for console scripts: >> >> >> https://github.com/openstack/pbr/blob/d03d617c09e7ba8ddf62d1e53d71685cd708e2da/pbr/packaging.py#L332-L384 >> >> If you build an sdist or wheel and extract it, you'll see the generated >> scripts in there. You'll also see then in `bin` if you install e.g. nova >> and look at the `nova-api` script. >> >> Stephen >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: