From pramchan at yahoo.com Sat Aug 1 02:43:36 2020 From: pramchan at yahoo.com (prakash RAMCHANDRAN) Date: Sat, 1 Aug 2020 02:43:36 +0000 (UTC) Subject: [Openstack-Interop] Friday 31 - Updates and Questions for future of Interop? References: <424140393.10386729.1596249816702.ref@mail.yahoo.com> Message-ID: <424140393.10386729.1596249816702@mail.yahoo.com>

Hi all, We looked at the OSF and Open-Infra initiatives, and to sustain and enhance the Interop activity we need your support. The current meeting notes are available on the site, and we will be meeting on alternate Fridays (Aug 14, 28, ...) at 10 AM PST / 17 UTC going forward on meetpad: Link: https://meetpad.opendev.org/Interop-WG-weekly-meeting Details in the etherpad: https://etherpad.opendev.org/p/interop

Do you have suggestions for Branding efforts? A few questions:

1. With the Interop WG being OSF/Board driven, how can Interop work with projects to ensure Branding Integrated Logo Programs, and how can other cloud providers leverage the OpenStack Logo Program?

a. We do have a role in guiding & reviewing refstack reports for Branding. We have decided to decouple the Marketplace & Branding programs. Seeking feedback from the community on proposed add-on programs for Bare metal (Ironic, MaaS, ...) & "Kubernetes-ready OpenStack".

b. Refstack hosting / Refstack-client are currently not maintained due to a lack of volunteers to support them (seeking volunteers for updates - please reach out to @gmann in the TC).

c. We await the annual user & operator survey reports, but if you have any Branding innovation you would like to propose or contribute, please brainstorm your ideas here and bring them to the midweek community meetings next week.

Thanks, For the Interop WG, Prakash -------------- next part -------------- An HTML attachment was scrubbed... URL: From anilj.mailing at gmail.com Sat Aug 1 21:36:37 2020 From: anilj.mailing at gmail.com (Anil Jangam) Date: Sat, 1 Aug 2020 14:36:37 -0700 Subject: Two subnets under same network context. Message-ID:

Hi, I have observed that one can create two subnets under the same network scope. See below an example of the use case. [image: Screen Shot 2020-08-01 at 2.22.15 PM.png] Upon checking the data structures, I saw that the segment type (vlan) and segment id (55) are associated with the "network" object and not with the "subnet" (I was under the impression that the segment type (vlan) and segment id (55) would be allocated to the "subnet").

When I create the VM instances, they always pick the IP address from the SUBNET1-2 IP range. If the segment (vlan 55) is associated with the "network", then what is the reason two "subnets" are allowed under it? Does it mean that VM instances from both these subnets would be configured under the same VLAN?

/anil. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2020-08-01 at 2.22.15 PM.png Type: image/png Size: 65802 bytes Desc: not available URL: From skaplons at redhat.com Sun Aug 2 11:27:58 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Sun, 2 Aug 2020 13:27:58 +0200 Subject: Two subnets under same network context. In-Reply-To: References: Message-ID: <4991a915-210a-ad24-8497-fab029c1a050@redhat.com>

Hi, This is "normal". You can have many subnets (both IPv4 and IPv6) in the one network. By default Neutron will associate to the port an IP address from only one subnet of each type (IPv4/IPv6), but You can change it and tell Neutron to allocate for the port IP addresses from more than one subnet. 
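For example, something along these lines with the CLI (the network and subnet names are illustrative, just following the SUBNET1-1/SUBNET1-2 naming from your screenshot):

# Pre-create a port with one fixed IP from each subnet of the same network,
# then boot the instance on that port instead of on the network:
openstack port create --network NET1 \
    --fixed-ip subnet=SUBNET1-1 --fixed-ip subnet=SUBNET1-2 \
    port-both-subnets
openstack server create --flavor m1.small --image cirros \
    --port port-both-subnets vm-both-subnets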
If You have both IPv4 and IPv6 subnets in the network, Neutron will by default allocate one IPv4 and one IPv6 to each port. But again, You can manually tell Neutron to use e.g. only IPv6 address for specific port. Please check [1] and [2] for more details. [1] https://docs.openstack.org/neutron/latest/admin/intro-os-networking.html [2] https://docs.openstack.org/api-ref/network/v2/ W dniu 01.08.2020 o 23:36, Anil Jangam pisze: > Hi, > > I have observed that one can create two subnets under the same network > scope. See below an example of the use case. > > [image: Screen Shot 2020-08-01 at 2.22.15 PM.png] > Upon checking the data structures, I saw that the segment type (vlan) and > segment id (55) is associated with the "network" object and not with the > "subnet" (I was under impression that the segment type (vlan) and segment > id (55) would be allocated to the "subnet"). > > When I create the VM instances, they always pick the IP address from the > SUBNET1-2 IP range. If the segment (vlan 55) is associated with "network" > then what is the reason two "subnets" are allowed under it? > > Does it mean that VM instances from both these subnets would be configured > under the same VLAN? > > /anil. > -- Slawek Kaplonski Principal software engineer Red Hat From romain.chanu at univ-lyon1.fr Sun Aug 2 10:27:12 2020 From: romain.chanu at univ-lyon1.fr (CHANU ROMAIN) Date: Sun, 2 Aug 2020 10:27:12 +0000 Subject: Two subnets under same network context. In-Reply-To: References: Message-ID: <1596364031962.69223@univ-lyon1.fr> Hello, Network object is an isolation layer, it's defined by the cloud administrator: isolation type (VLAN, VXLAN...), physical NIC.. The subnet is a free value to cloud users, this mechanism allows multiples users to use same L3 networks (overlapping). So the network is used by admin to isolate the client and subnet is used by client to "isolate" his instances (webfont / db ...). Isolation works only on layer3 because all subnets will use the same layer2 (defined by admin). It's very easy to verify: boot one instance on each subnet then capture the traffic: you will see ARP trames. I dont know why Neutron drains all IP from last network but anyway the best practice is to create port then allocate to instance. Does it mean that VM instances from both these subnets would be configured under the same VLAN? > yes Best regards, Romain ________________________________ From: Anil Jangam Sent: Saturday, August 1, 2020 11:36 PM To: openstack-discuss Subject: Two subnets under same network context. Hi, I have observed that one can create two subnets under the same network scope. See below an example of the use case. [Screen Shot 2020-08-01 at 2.22.15 PM.png] Upon checking the data structures, I saw that the segment type (vlan) and segment id (55) is associated with the "network" object and not with the "subnet" (I was under impression that the segment type (vlan) and segment id (55) would be allocated to the "subnet"). When I create the VM instances, they always pick the IP address from the SUBNET1-2 IP range. If the segment (vlan 55) is associated with "network" then what is the reason two "subnets" are allowed under it? Does it mean that VM instances from both these subnets would be configured under the same VLAN? /anil. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Screen Shot 2020-08-01 at 2.22.15 PM.png Type: image/png Size: 65802 bytes Desc: Screen Shot 2020-08-01 at 2.22.15 PM.png URL: From gmann at ghanshyammann.com Mon Aug 3 00:10:36 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 02 Aug 2020 19:10:36 -0500 Subject: [all][tc][goals] Migrate CI/CD jobs to new Ubuntu LTS Focal: Week R-11 Update Message-ID: <173b1a7e32b.adf0e67a81133.1529455632674185621@ghanshyammann.com> Hello Everyone, Please find the week R-11 updates on 'Ubuntu Focal migration' community goal. Tracking: https://storyboard.openstack.org/#!/story/2007865 Progress: ======= * We passed the first deadline which I planned initially but looking at the failure happing it will definitely will take more time. My first goal is "zero downtime in gate", so if we finish it little late (with all repos tested) is ok. * ~80 repos gate have been tested/fixed till now. ** https://review.opendev.org/#/q/topic:migrate-to-focal+(status:abandoned+OR+status:merged) * 115 repos are under test and failing. Debugging and fixing are in progress (If you would like to help, please check your project repos if I am late to fix them): ** https://review.opendev.org/#/q/topic:migrate-to-focal+status:open * Patches ready to merge: ** https://review.opendev.org/#/q/topic:migrate-to-focal+status:open+label%3AVerified%3E%3D1%2Czuul+NOT+label%3AWorkflow%3C%3D-1 Bugs Report: ========== Summary: Total 4 (1 fixed, 3 in-progress). 1. Bug#1882521. (IN-PROGRESS) There is open bug for nova/cinder where three tempest tests are failing for volume detach operation. There is no clear root cause found yet -https://bugs.launchpad.net/cinder/+bug/1882521 We have skipped the tests in tempest base patch to proceed with the other projects testing but this is blocking things for the migration. 2. We encountered the nodeset name conflict with x/tobiko. (FIXED) nodeset conflict is resolved now and devstack provides all focal nodes now. 3. Bug#1886296. (IN-PROGRESS) pyflakes till 2.1.0 is not compatible with python 3.8 which is the default python version on ubuntu focal[1]. With pep8 job running on focal faces the issue and fail. We need to bump the pyflakes to 2.1.1 as min version to run pep8 jobs on py3.8. As of now, many projects are using old hacking version so I am explicitly adding pyflakes>=2.1.1 on the project side[2] but for the long term easy maintenance, I am doing it in 'hacking' requirements.txt[3] nd will release a new hacking version. After that project can move to new hacking and do not need to maintain pyflakes version compatibility. 4. Bug#1886298. (IN-PROGRESS) 'Markupsafe' 1.0 is not compatible with the latest version of setuptools[4], We need to bump the lower-constraint for Markupsafe to 1.1.1 to make it work. There are a few more issues[5] with lower-constraint jobs which I am debugging. What work to be done on the project side: ================================ This goal is more of testing the jobs on focal and fixing bugs if any otherwise migrate jobs by switching the nodeset to focal node sets defined in devstack. 1. Start a patch in your repo by making depends-on on either of below: devstack base patch if you are using only devstack base jobs not tempest: Depends-on: https://review.opendev.org/#/c/731207/ OR tempest base patch if you are using the tempest base job (like devstack-tempest): Depends-on: https://review.opendev.org/#/c/734700/ Both have depends-on on the series where I am moving unit/functional/doc/cover/nodejs tox jobs to focal. 
So you can test the complete gate jobs(unit/functional/doc/integration) together. This and its base patches - https://review.opendev.org/#/c/738328/ Example: https://review.opendev.org/#/c/738126/ 2. If none of your project jobs override the nodeset then above patch will be testing patch(do not merge) otherwise change the nodeset to focal. Example: https://review.opendev.org/#/c/737370/ 3. If the jobs are defined in branchless repo and override the nodeset then you need to override the branches variant to adjust the nodeset so that those jobs run on Focal on victoria onwards only. If no nodeset is overridden then devstack being branched and stable base job using bionic/xenial will take care of this. Example: https://review.opendev.org/#/c/744056/2 4. If no updates need you can abandon the testing patch (https://review.opendev.org/#/c/744341/). If it need updates then modify the same patch with proper commit msg, once it pass the gate then remove the Depends-On so that you can merge your patch before base jobs are switched to focal. This way we make sure no gate downtime in this migration. Example: https://review.opendev.org/#/c/744056/1..2//COMMIT_MSG Once we finish the testing on projects side and no failure then we will merge the devstack and tempest base patches. Important things to note: =================== * Do not forgot to add the story and task link to your patch so that we can track it smoothly. * Use gerrit topic 'migrate-to-focal' * Do not backport any of the patches. References: ========= Goal doc: https://governance.openstack.org/tc/goals/selected/victoria/migrate-ci-cd-jobs-to-ubuntu-focal.html Storyboard tracking: https://storyboard.openstack.org/#!/story/2007865 [1] https://github.com/PyCQA/pyflakes/issues/367 [2] https://review.opendev.org/#/c/739315/ [3] https://review.opendev.org/#/c/739334/ [4] https://github.com/pallets/markupsafe/issues/116 [5] https://zuul.opendev.org/t/openstack/build/7ecd9cf100194bc99b3b70fa1e6de032 -gmann From zhangbailin at inspur.com Mon Aug 3 00:10:59 2020 From: zhangbailin at inspur.com (=?utf-8?B?QnJpbiBaaGFuZyjlvKDnmb7mnpcp?=) Date: Mon, 3 Aug 2020 00:10:59 +0000 Subject: =?utf-8?B?562U5aSNOiBbbGlzdHMub3BlbnN0YWNrLm9yZ+S7o+WPkV1SZTogW0dsYW5j?= =?utf-8?Q?e]_Proposing_Dan_Smith_for_glance_core?= In-Reply-To: <8635120d-11d6-136e-2581-40d3d451d1aa@gmail.com> References: <8635120d-11d6-136e-2581-40d3d451d1aa@gmail.com> Message-ID: <03ece5d405c74b2d9292301c2e3be7b8@inspur.com> +1 发件人: Jay Bryant [mailto:jungleboyj at gmail.com] 发送时间: 2020年7月31日 23:39 收件人: openstack-discuss at lists.openstack.org 主题: [lists.openstack.org代发]Re: [Glance] Proposing Dan Smith for glance core On 7/31/2020 8:10 AM, Sean McGinnis wrote: On 7/30/20 10:25 AM, Abhishek Kekane wrote: Hi All, I'd like to propose adding Dan Smith to the glance core group. Dan Smith has contributed to stabilize image import workflow as well as multiple stores of glance. He is also contributing in tempest and nova to set up CI/tempest jobs around image import and multiple stores. Being involved on the mailing-list and IRC channels, Dan is always helpful to the community and here to help. Please respond with +1/-1 until 03rd August, 2020 1400 UTC. Cheers, Abhishek +1 Not a Glance core but definitely +1 from me. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Mon Aug 3 07:05:58 2020 From: zigo at debian.org (Thomas Goirand) Date: Mon, 3 Aug 2020 09:05:58 +0200 Subject: Two subnets under same network context. 
In-Reply-To: <1596364031962.69223@univ-lyon1.fr> References: <1596364031962.69223@univ-lyon1.fr> Message-ID: <01345dce-deb8-de63-4a01-bf86c6ac893e@debian.org> On 8/2/20 12:27 PM, CHANU ROMAIN wrote: > Hello, > > > Network object is an isolation layer, it's defined by the cloud > administrator: isolation type (VLAN, VXLAN...), physical NIC.. The > subnet is a free value to cloud users, this mechanism allows multiples > users to use same L3 networks (overlapping). So the network is used by > admin to isolate the client and subnet is used by client to "isolate" > his instances (webfont  / db ...). No, this isn't the way it works. Thomas From zigo at debian.org Mon Aug 3 07:14:19 2020 From: zigo at debian.org (Thomas Goirand) Date: Mon, 3 Aug 2020 09:14:19 +0200 Subject: Two subnets under same network context. In-Reply-To: References: Message-ID: On 8/1/20 11:36 PM, Anil Jangam wrote: > Hi,  > > I have observed that one can create two subnets under the same network > scope. See below an example of the use case.  > > Screen Shot 2020-08-01 at 2.22.15 PM.png > Upon checking the data structures, I saw that the segment type (vlan) > and segment id (55) is associated with the "network" object and not with > the "subnet" (I was under impression that the segment type (vlan) and > segment id (55) would be allocated to the "subnet").  > > When I create the VM instances, they always pick the IP address from the > SUBNET1-2 IP range. If the segment (vlan 55) is associated with > "network" then what is the reason two "subnets" are allowed under it?  > > Does it mean that VM instances from both these subnets would be > configured under the same VLAN?  > > /anil. Hi, If you want to use segments, with a different address range depending on where a compute is physically located (for example, a rack...), then you should first set a different name for the physical network of your nodes. This is done by tweaking these: [ml2_type_flat] flat_networks = rack-number-1 [ml2_type_vlan] network_vlan_ranges = rack-number-1 Then you can: 1/ create a network scope 2/ create a network using that scope, a vlan and "--provider-physical-network rack-number-1" and --provider-segment 3/ create a subnet pool using the network scope created above 4/ create a subnet attached to the subnet pool and network segment Then you can create more network segment + subnet couples addressing different location. Once you're done, VMs will get a different range depending on the rack they are in. 
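A rough CLI sketch of those four steps (VLAN IDs, names and ranges below are illustrative, not taken from this thread):

# 1/ address scope
openstack address scope create --ip-version 4 scope-racks

# 2/ network bound to the per-rack physnet, plus one extra segment per rack
openstack network create --provider-network-type vlan \
    --provider-physical-network rack-number-1 --provider-segment 55 net-racks
openstack network segment create --network net-racks --network-type vlan \
    --physical-network rack-number-2 --segment 56 segment-rack-2

# 3/ subnet pool tied to the address scope
openstack subnet pool create --address-scope scope-racks \
    --pool-prefix 10.55.0.0/16 pool-racks

# 4/ one subnet per segment, allocated from the pool
openstack subnet create --network net-racks --subnet-pool pool-racks \
    --prefix-length 24 --network-segment segment-rack-2 subnet-rack-2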
Cheers, Thomas Goirand (zigo) From ignaziocassano at gmail.com Mon Aug 3 07:53:37 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 3 Aug 2020 09:53:37 +0200 Subject: [openstack][stein][manila-ui] error Message-ID: Hello, I installed manila on openstack stein and it works by command line mat the manila ui does not work and in httpd error log I read: [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR django.request Internal Server Error: /dashboard/project/shares/ [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback (most recent call last): [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line 41, in inner [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] response = get_response(request) [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, in _get_response [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] response = self.process_exception_by_middleware(e, request) [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, in _get_response [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] response = wrapped_callback(request, *callback_args, **callback_kwargs) [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return view_func(request, *args, **kwargs) [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return view_func(request, *args, **kwargs) [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return view_func(request, *args, **kwargs) [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return view_func(request, *args, **kwargs) [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return view_func(request, *args, **kwargs) [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, in view [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return self.dispatch(request, *args, **kwargs) [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, in dispatch [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return handler(request, *args, **kwargs) [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] handled = self.construct_tables() [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in construct_tables [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] handled = self.handle_table(table) [Mon Aug 03 
07:45:26.697537 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in handle_table [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = self._get_data_dict() [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in _get_data_dict [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] data.extend(func()) [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in wrapped [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = cache[key] = func(*args, **kwargs) [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", line 57, in get_shares_data [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] share_nets = manila.share_network_list(self.request) [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in share_network_list [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return manilaclient(request).share_networks.list(detailed=detailed, [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] AttributeError: 'NoneType' object has no attribute 'share_networks' Please, anyone could help ? Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Mon Aug 3 08:13:31 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Mon, 3 Aug 2020 10:13:31 +0200 Subject: [tripleo][ansible] the current action plugins use patterns are suboptimal? Message-ID: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> There is a trend of writing action plugins, see [0], for simple things, like just calling a module in a loop. I'm not sure that is the direction TripleO should go. If ansible is inefficient in this sort of tasks without custom python code written, we should fix ansible. Otherwise, what is the ultimate goal of that trend? Is that having only action plugins in roles and playbooks? Please kindly asking the community to stop that, make a step back and reiterate with the taken approach. Thank you. [0] https://review.opendev.org/716108 -- Best regards, Bogdan Dobrelya, Irc #bogdando From thierry at openstack.org Mon Aug 3 09:56:16 2020 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 3 Aug 2020 11:56:16 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: Message-ID: <274f7289-136a-e829-bdf5-1c819355ce77@openstack.org> Sean McGinnis wrote: > Posting here to raise awareness, and start discussion about next steps. > > It appears there is no one working on Cloudkitty anymore. No patches > have been merged for several months now, including simple bot proposed > patches. It would appear no one is maintaining this project anymore. > [...] Thanks for raising this, Sean. I reached out to the maintainers at Objectif Libre to check on their status. Maybe it's just a COVID19 + summer vacancy situation... Let's see what they say. 
-- Thierry Carrez (ttx) From thierry at openstack.org Mon Aug 3 10:15:06 2020 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 3 Aug 2020 12:15:06 +0200 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> Message-ID: <88c24f3a-7d29-aa39-ed12-803279cc90c1@openstack.org> Ken Giusti wrote: > On Mon, Jul 27, 2020 at 1:18 PM Dan Smith > wrote: >> The primary concern was about something other than nova sitting on our >> bus making calls to our internal services. I imagine that the proposal >> to bake it into oslo.messaging is for the same purpose, and I'd probably >> have the same concern. At the time I think we agreed that if we were >> going to support direct-to-service health checks, they should be teensy >> HTTP servers with oslo healthchecks middleware. Further loading down >> rabbit with those pings doesn't seem like the best plan to >> me. Especially since Nova (compute) services already check in over RPC >> periodically and the success of that is discoverable en masse through >> the API. > > While initially in favor of this feature Dan's concern has me > reconsidering this. > > Now I believe that if the purpose of this feature is to check the > operational health of a service _using_ oslo.messaging, then I'm against > it.   A naked ping to a generic service point in an application doesn't > prove the operating health of that application beyond its connection to > rabbit. While I understand the need to further avoid loading down Rabbit, I like the universality of this solution, solving a real operational issue. Obviously that creates a trade-off (further loading rabbit to get more operational insights), but nobody forces you to run those ping calls, they would be opt-in. So the proposed code in itself does not weigh down Rabbit, or make anything sit on the bus. > Connectivity monitoring between an application and rabbit is > done using the keepalive connection heartbeat mechanism built into the > rabbit protocol, which O.M. supports today. I'll let Arnaud answer, but I suspect the operational need is code-external checking of the rabbit->agent chain, not code-internal checking of the agent->rabbit chain. The heartbeat mechanism is used by the agent to keep the Rabbit connection alive, ensuring it works in most of the cases. The check described above is to catch the corner cases where it still doesn't. -- Thierry Carrez (ttx) From sshnaidm at redhat.com Mon Aug 3 10:36:12 2020 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Mon, 3 Aug 2020 13:36:12 +0300 Subject: [tripleo][ansible] the current action plugins use patterns are suboptimal? In-Reply-To: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> References: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> Message-ID: Hi, Bogdan thanks for raising this up, although I'm not sure I understand what it is the problem with using action plugins. Action plugins are well known official extensions for Ansible, as any other plugins - callback, strategy, inventory etc [1]. It is not any hack or unsupported workaround, it's a known and official feature of Ansible. Why can't we use it? What makes it different from filter, lookup, inventory or any other plugin we already use? Action plugins are also used wide in Ansible itself, for example templates plugin is implemented with action plugin [2]. If Ansible can use it, why can't we? 
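For readers who have not written one, a minimal action plugin skeleton looks roughly like this (illustrative only - this is not the actual tripleo-ansible plugin, and the 'containers' task argument is just an example):

from ansible.plugins.action import ActionBase


class ActionModule(ActionBase):
    """Run a module several times over one already-established connection."""

    def run(self, tmp=None, task_vars=None):
        result = super(ActionModule, self).run(tmp, task_vars)
        results = []
        # _execute_module() reuses the task's already-open connection, so the
        # per-task setup overhead is paid once per host rather than once per
        # loop iteration.
        for name in self._task.args.get('containers', []):
            results.append(self._execute_module(
                module_name='podman_container',
                module_args={'name': name, 'state': 'started'},
                task_vars=task_vars))
        result['results'] = results
        result['changed'] = any(r.get('changed') for r in results)
        return result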
I don't think there is something with "fixing" Ansible, it's not a bug, this is a useful extension. What regards the mentioned action plugin for podman containers, it allows to spawn containers remotely while skipping the connection part for every cycle. I'm not sure you can "fix" Ansible not to do that, it's not a bug. We may not see the difference in a few hosts in CI, but it might be very efficient when we deploy on 100+ hosts oro even 1000+ hosts. In order to evaluate this on bigger setups to understand its value we configured both options - to use action plugin or usual module. If better performance of action plugin will be proven, we can switch to use it, if it doesn't make a difference on bigger setups - then I think we can easily switch back to using an usual module. Thanks [1] https://docs.ansible.com/ansible/latest/plugins/plugins.html [2] https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/action/template.py On Mon, Aug 3, 2020 at 11:19 AM Bogdan Dobrelya wrote: > There is a trend of writing action plugins, see [0], for simple things, > like just calling a module in a loop. I'm not sure that is the direction > TripleO should go. If ansible is inefficient in this sort of tasks > without custom python code written, we should fix ansible. Otherwise, > what is the ultimate goal of that trend? Is that having only action > plugins in roles and playbooks? > > Please kindly asking the community to stop that, make a step back and > reiterate with the taken approach. Thank you. > > [0] https://review.opendev.org/716108 > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Mon Aug 3 11:12:18 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?q?Bal=E1zs?= Gibizer) Date: Mon, 03 Aug 2020 13:12:18 +0200 Subject: [nova] If any spec freeze exception now? In-Reply-To: References: Message-ID: On Fri, Jul 31, 2020 at 17:23, Rambo wrote: > Hi,all: > I have a spec which is support volume backed server > rebuild[0].This spec was accepted in Stein, but some of the work did > not finish, so repropose it for Victoria.And this spec is depend on > the cinder reimage api [1], now the reimage api is almost all > completed. So I sincerely wish this spec will approved in Victoria. > If this spec is approved, I will achieve it at once. > I was +2 before on this spec. I see the value of it. I see that Lee, Sylvain and Sean had negative comments after my review. Also I saw that the spec was updated since. It would be good the get a re-review from those folks to see if their issues has been resolved. In general let's try to decide on the feature freeze exception for this on the weekly meeting. I hope folks will re-review the spec until Thursday. I saw you added it as a topic to the meeting agenda, thanks. (I moved that topic to the Open Discussion section) Cheers, gibi > > > Ref: > > [0]:https://blueprints.launchpad.net/nova/+spec/volume-backed-server-rebuild > > [1]:https://blueprints.launchpad.net/cinder/+spec/add-volume-re-image-api > > Best Regards > Rambo From balazs.gibizer at est.tech Mon Aug 3 11:16:26 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?q?Bal=E1zs?= Gibizer) Date: Mon, 03 Aug 2020 13:16:26 +0200 Subject: [nova] Victoria Milestone 2 Spec Freeze Message-ID: Hi, Last Thursday we reached Milestone 2 which means Nova is in Spec Freeze now. 
If you have a spec close to be approved and you wish to request a spec freeze exception then please send a mail to the ML about it. We will make the final decision on the weekly meeting on Thursday. Cheers, gibi From bdobreli at redhat.com Mon Aug 3 12:25:37 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Mon, 3 Aug 2020 14:25:37 +0200 Subject: [tripleo][ansible] the current action plugins use patterns are suboptimal? In-Reply-To: References: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> Message-ID: On 8/3/20 12:36 PM, Sagi Shnaidman wrote: > Hi, Bogdan > > thanks for raising this up, although I'm not sure I understand what it > is the problem with using action plugins. > Action plugins are well known official extensions for Ansible, as any > other plugins - callback, strategy, inventory etc [1]. It is not any > hack or unsupported workaround, it's a known and official feature of > Ansible. Why can't we use it? What makes it different from filter, I believe the cases that require the use of those should be justified. For the given example, that manages containers in a loop via calling a module, what the written custom callback plugin buys for us? That brings code to maintain, extra complexity, like handling possible corner cases in async mode, dry-run mode etc. But what is justification aside of looks handy? > lookup, inventory or any other plugin we already use? > Action plugins are also used wide in Ansible itself, for example > templates plugin is implemented with action plugin [2]. If Ansible can > use it, why can't we? I don't think there is something with "fixing" > Ansible, it's not a bug, this is a useful extension. > What regards the mentioned action plugin for podman containers, it > allows to spawn containers remotely while skipping the connection part > for every cycle. I'm not sure you can "fix" Ansible not to do that, it's > not a bug. We may not see the difference in a few hosts in CI, but it > might be very efficient when we deploy on 100+ hosts oro even 1000+ > hosts. In order to evaluate this on bigger setups to understand its > value we configured both options - to use action plugin or usual module. > If better performance of action plugin will be proven, we can switch to > use it, if it doesn't make a difference on bigger setups - then I think > we can easily switch back to using an usual module. > > Thanks > > [1] https://docs.ansible.com/ansible/latest/plugins/plugins.html > [2] > https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/action/template.py > > On Mon, Aug 3, 2020 at 11:19 AM Bogdan Dobrelya > wrote: > > There is a trend of writing action plugins, see [0], for simple things, > like just calling a module in a loop. I'm not sure that is the > direction > TripleO should go. If ansible is inefficient in this sort of tasks > without custom python code written, we should fix ansible. Otherwise, > what is the ultimate goal of that trend? Is that having only action > plugins in roles and playbooks? > > Please kindly asking the community to stop that, make a step back and > reiterate with the taken approach. Thank you. 
> > [0] https://review.opendev.org/716108 > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > > > > -- > Best regards > Sagi Shnaidman -- Best regards, Bogdan Dobrelya, Irc #bogdando From monika.samal at outlook.com Mon Aug 3 08:38:43 2020 From: monika.samal at outlook.com (Monika Samal) Date: Mon, 3 Aug 2020 08:38:43 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , Message-ID: Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal Cc: Fabian Zimmermann ; Amy Marrich ; openstack-discuss ; community at lists.openstack.org Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > Cc: Michael Johnson ; Amy Marrich ; openstack-discuss ; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal schrieb am Do., 30. Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > Cc: Michael Johnson ; Amy Marrich ; openstack-discuss ; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? > > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: From monika.samal at outlook.com Mon Aug 3 08:53:17 2020 From: monika.samal at outlook.com (Monika Samal) Date: Mon, 3 Aug 2020 08:53:17 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , , Message-ID: After Michael suggestion I was able to create load balancer but there is error in status. 
[cid:de900175-3754-4942-a53d-43c78e425e62] PFB the error link: http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ ________________________________ From: Monika Samal Sent: Monday, August 3, 2020 2:08 PM To: Michael Johnson Cc: Fabian Zimmermann ; Amy Marrich ; openstack-discuss ; community at lists.openstack.org Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal Cc: Fabian Zimmermann ; Amy Marrich ; openstack-discuss ; community at lists.openstack.org Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > Cc: Michael Johnson ; Amy Marrich ; openstack-discuss ; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal schrieb am Do., 30. Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > Cc: Michael Johnson ; Amy Marrich ; openstack-discuss ; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? > > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 26283 bytes Desc: image.png URL: From lyarwood at redhat.com Mon Aug 3 12:55:22 2020 From: lyarwood at redhat.com (Lee Yarwood) Date: Mon, 3 Aug 2020 13:55:22 +0100 Subject: [nova] openstack-tox-lower-constraints broken Message-ID: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> Hello all, $subject, I've raised the following bug: openstack-tox-lower-constraints failing due to unmet dependency on decorator==4.0.0 https://launchpad.net/bugs/1890123 I'm trying to resolve this below but I honestly feel like I'm going around in circles: https://review.opendev.org/#/q/topic:bug/1890123 If anyone has any tooling and/or recommendations for resolving issues like this I'd appreciate it! Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From dev.faz at gmail.com Mon Aug 3 13:38:21 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Mon, 3 Aug 2020 15:38:21 +0200 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: Did you check the (nova) flavor you use in octavia. Fabian Monika Samal schrieb am Mo., 3. Aug. 2020, 10:53: > After Michael suggestion I was able to create load balancer but there is > error in status. > > > > PFB the error link: > > http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ > ------------------------------ > *From:* Monika Samal > *Sent:* Monday, August 3, 2020 2:08 PM > *To:* Michael Johnson > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Thanks a ton Michael for helping me out > ------------------------------ > *From:* Michael Johnson > *Sent:* Friday, July 31, 2020 3:57 AM > *To:* Monika Samal > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Just to close the loop on this, the octavia.conf file had > "project_name = admin" instead of "project_name = service" in the > [service_auth] section. This was causing the keystone errors when > Octavia was communicating with neutron. > > I don't know if that is a bug in kolla-ansible or was just a local > configuration issue. > > Michael > > On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > > > Hello Fabian,, > > > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > > > Regards, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > Hi, > > > > just to debug, could you replace the auth_type password with v3password? > > > > And do a curl against your :5000 and :35357 urls and paste the output. > > > > Fabian > > > > Monika Samal schrieb am Do., 30. Juli 2020, > 22:15: > > > > Hello Fabian, > > > > http://paste.openstack.org/show/796477/ > > > > Thanks, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > The sections should be > > > > service_auth > > keystone_authtoken > > > > if i read the docs correctly. Maybe you can just paste your config > (remove/change passwords) to paste.openstack.org and post the link? > > > > Fabian > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 26283 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 26283 bytes Desc: not available URL: From sean.mcginnis at gmx.com Mon Aug 3 14:00:52 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 3 Aug 2020 09:00:52 -0500 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> Message-ID: On 8/3/20 7:55 AM, Lee Yarwood wrote: > Hello all, > > $subject, I've raised the following bug: > > openstack-tox-lower-constraints failing due to unmet dependency on decorator==4.0.0 > https://launchpad.net/bugs/1890123 > > I'm trying to resolve this below but I honestly feel like I'm going > around in circles: > > https://review.opendev.org/#/q/topic:bug/1890123 > > If anyone has any tooling and/or recommendations for resolving issues > like this I'd appreciate it! > > Cheers, This appears to be broken for everyone. I initially saw the decorator thing with Cinder, but after looking closer realized it's not that package. The root issue (or at least one level closer to the root issue, that seems to be causing the decorator failure) is that the lower-constraints are not actually being enforced. Even though the logs should it is passing "-c [path to lower-constraints.txt]". So even though things should be constrained to a lower version, presumably a version that works with a different version of decorator, pip is still installing a newer package than what the constraints should allow. There was a pip release on the 28th. Things don't look like they started failing until the 31st for us though, so either that is not it, or there was just a delay before our nodes started picking up the newer version. I tested locally, and at least with version 19.3.1, I am getting the correctly constrained packages installed. Still looking, but thought I would share in case that info triggers any ideas for anyone else. Sean From sean.mcginnis at gmx.com Mon Aug 3 14:12:47 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 3 Aug 2020 09:12:47 -0500 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> Message-ID: <6a92c9c8-9cc5-4b8e-4204-13545b40e5a2@gmx.com> > The root issue (or at least one level closer to the root issue, that > seems to be causing the decorator failure) is that the lower-constraints > are not actually being enforced. Even though the logs should it is > passing "-c [path to lower-constraints.txt]". So even though things > should be constrained to a lower version, presumably a version that > works with a different version of decorator, pip is still installing a > newer package than what the constraints should allow. > > There was a pip release on the 28th. Things don't look like they started > failing until the 31st for us though, so either that is not it, or there > was just a delay before our nodes started picking up the newer version. > > I tested locally, and at least with version 19.3.1, I am getting the > correctly constrained packages installed. > > Still looking, but thought I would share in case that info triggers any > ideas for anyone else. > I upgraded my pip and rebuilt the venv. The new pip has some good warnings emitted about some incompatible conflicts, so that part is good, but it did not change the package installation behavior. I am still able to get the correctly constrained packages installed locally on Fedora 32. 
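For reference, a minimal local check along these lines (assuming the usual lower-constraints.txt layout at the repo root) shows whether the constraints are actually being honoured:

python3 -m venv .lc-venv
.lc-venv/bin/pip install -U pip
.lc-venv/bin/pip install -c lower-constraints.txt \
    -r requirements.txt -r test-requirements.txt
.lc-venv/bin/pip freeze | grep -i decorator   # should report the pinned lower bound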
So at least so far, it doesn't appear to be a pip issue. From mnaser at vexxhost.com Mon Aug 3 14:21:33 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 3 Aug 2020 10:21:33 -0400 Subject: [largescale-sig] RPC ping In-Reply-To: <20200727095744.GK31915@sync> References: <20200727095744.GK31915@sync> Message-ID: I have a few operational suggestions on how I think we could do this best: 1. I think exposing a healthcheck endpoint that _actually_ runs the ping and responds with a 200 OK makes a lot more sense in terms of being able to run it inside something like Kubernetes, you end up with a "who makes the ping and who responds to it" type of scenario which can be tricky though I'm sure we can figure that out 2. I've found that newer releases of RabbitMQ really help with those un-usable queues after a split, I haven't had any issues at all with newer releases, so that could be something to help your life be a lot easier. 3. You mentioned you're moving towards Kubernetes, we're doing the same and building an operator: https://opendev.org/vexxhost/openstack-operator -- Because the operator manages the whole thing and Kubernetes does it's thing too, we started moving towards 1 (single) rabbitmq per service, which reaaaaaaally helped a lot in stabilizing things. Oslo messaging is a lot better at recovering when a single service IP is pointing towards it because it doesn't do weird things like have threads trying to connect to other Rabbit ports. Just a thought. 4. In terms of telemetry and making sure you avoid that issue, we track the consumption rates of queues inside OpenStack. OpenStack consumption rate should be constant and never growing, anytime it grows, we instantly detect that something is fishy. However, the other issue comes in that when you restart any openstack service, it 'forgets' all it's existing queues and then you have a set of building up queues until they automatically expire which happens around 30 minutes-ish, so it makes that alarm of "things are not being consumed" a little noisy if you're restarting services Sorry for the wall of super unorganized text, all over the place here but thought I'd chime in with my 2 cents :) On Mon, Jul 27, 2020 at 6:04 AM Arnaud Morin wrote: > > Hey all, > > TLDR: I propose a change to oslo_messaging to allow doing a ping over RPC, > this is useful to monitor liveness of agents. > > > Few weeks ago, I proposed a patch to oslo_messaging [1], which is adding a > ping endpoint to RPC dispatcher. > It means that every openstack service which is using oslo_messaging RPC > endpoints (almosts all OpenStack services and agents - e.g. neutron > server + agents, nova + computes, etc.) will then be able to answer to a > specific "ping" call over RPC. > > I decided to propose this patch in my company mainly for 2 reasons: > 1 - we are struggling monitoring our nova compute and neutron agents in a > correct way: > > 1.1 - sometimes our agents are disconnected from RPC, but the python process > is still running. > 1.2 - sometimes the agent is still connected, but the queue / binding on > rabbit cluster is not working anymore (after a rabbit split for > example). This one is very hard to debug, because the agent is still > reporting health correctly on neutron server, but it's not able to > receive messages anymore. > > > 2 - we are trying to monitor agents running in k8s pods: > when running a python agent (neutron l3-agent for example) in a k8s pod, we > wanted to find a way to monitor if it is still live of not. 
> > > Adding a RPC ping endpoint could help us solve both these issues. > Note that we still need an external mechanism (out of OpenStack) to do this > ping. > We also think it could be nice for other OpenStackers, and especially > large scale ops. > > Feel free to comment. > > > [1] https://review.opendev.org/#/c/735385/ > > > -- > Arnaud Morin > > -- Mohammed Naser VEXXHOST, Inc. From iurygregory at gmail.com Mon Aug 3 14:42:38 2020 From: iurygregory at gmail.com (Iury Gregory) Date: Mon, 3 Aug 2020 16:42:38 +0200 Subject: [ironic] let's talk about grenade In-Reply-To: References: Message-ID: Hello Everyone, We will meet this Thursday (August 6th) at 2pm - 3pm UTC Time on bluejeans [1]. Thank you! [1] https://bluejeans.com/imelofer Em qua., 29 de jul. de 2020 às 20:37, Iury Gregory escreveu: > Hello everyone, > > Since we didn't get many responses I will keep the doodle open till Friday > =) > > Em seg., 27 de jul. de 2020 às 17:55, Iury Gregory > escreveu: > >> Hello everyone, >> >> I'm still on the fight to move our ironic-grenade-dsvm-multinode-multitenant >> to zuulv3 [1], you can find some of my findings on the etherpad [2] under `Move >> to Zuul v3 Jobs (Iurygregory)`. >> >> If you are interested in helping out we are going to schedule a meeting >> to discuss about this, please use the doodle in [3]. I will close the >> doodle on Wed July 29. >> >> Thanks! >> >> [1] https://review.opendev.org/705030 >> [2] https://etherpad.openstack.org/p/IronicWhiteBoard >> [3] https://doodle.com/poll/m69b5zwnsbgcysct >> >> -- >> >> >> *Att[]'sIury Gregory Melo Ferreira * >> *MSc in Computer Science at UFCG* >> *Part of the puppet-manager-core team in OpenStack* >> *Software Engineer at Red Hat Czech* >> *Social*: https://www.linkedin.com/in/iurygregory >> *E-mail: iurygregory at gmail.com * >> > > > -- > > > *Att[]'sIury Gregory Melo Ferreira * > *MSc in Computer Science at UFCG* > *Part of the puppet-manager-core team in OpenStack* > *Software Engineer at Red Hat Czech* > *Social*: https://www.linkedin.com/in/iurygregory > *E-mail: iurygregory at gmail.com * > -- *Att[]'sIury Gregory Melo Ferreira * *MSc in Computer Science at UFCG* *Part of the puppet-manager-core team in OpenStack* *Software Engineer at Red Hat Czech* *Social*: https://www.linkedin.com/in/iurygregory *E-mail: iurygregory at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Mon Aug 3 14:46:57 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Mon, 3 Aug 2020 16:46:57 +0200 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: Seems like the flavor is missing or empty '' - check for typos and enable debug. Check if the nova req contains valid information/flavor. Fabian Monika Samal schrieb am Mo., 3. Aug. 2020, 15:46: > It's registered > > Get Outlook for Android > ------------------------------ > *From:* Fabian Zimmermann > *Sent:* Monday, August 3, 2020 7:08:21 PM > *To:* Monika Samal ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Did you check the (nova) flavor you use in octavia. > > Fabian > > Monika Samal schrieb am Mo., 3. Aug. 2020, > 10:53: > > After Michael suggestion I was able to create load balancer but there is > error in status. 
> > > > PFB the error link: > > http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ > ------------------------------ > *From:* Monika Samal > *Sent:* Monday, August 3, 2020 2:08 PM > *To:* Michael Johnson > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Thanks a ton Michael for helping me out > ------------------------------ > *From:* Michael Johnson > *Sent:* Friday, July 31, 2020 3:57 AM > *To:* Monika Samal > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Just to close the loop on this, the octavia.conf file had > "project_name = admin" instead of "project_name = service" in the > [service_auth] section. This was causing the keystone errors when > Octavia was communicating with neutron. > > I don't know if that is a bug in kolla-ansible or was just a local > configuration issue. > > Michael > > On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > > > Hello Fabian,, > > > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > > > Regards, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > Hi, > > > > just to debug, could you replace the auth_type password with v3password? > > > > And do a curl against your :5000 and :35357 urls and paste the output. > > > > Fabian > > > > Monika Samal schrieb am Do., 30. Juli 2020, > 22:15: > > > > Hello Fabian, > > > > http://paste.openstack.org/show/796477/ > > > > Thanks, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > The sections should be > > > > service_auth > > keystone_authtoken > > > > if i read the docs correctly. Maybe you can just paste your config > (remove/change passwords) to paste.openstack.org and post the link? > > > > Fabian > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Mon Aug 3 15:31:52 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 3 Aug 2020 10:31:52 -0500 Subject: [release] Release countdown for week R-10 August 3 - 7 Message-ID: <20200803153152.GA3471444@sm-workstation> Development Focus ----------------- We are now past the Victoria-2 milestone, and entering the last development phase of the cycle. Teams should be focused on implementing planned work for the cycle. Now is a good time to review those plans and reprioritize anything if needed based on the what progress has been made and what looks realistic to complete in the next few weeks. General Information ------------------- Looking ahead to the end of the release cycle, please be aware of the feature freeze dates. Those vary depending on deliverable type: * General libraries (except client libraries) need to have their last feature release before Non-client library freeze (September 3). Their stable branches are cut early. 
* Client libraries (think python-*client libraries) need to have their last feature release before Client library freeze (September 10) * Deliverables following a cycle-with-rc model (that would be most services) observe a Feature freeze on that same date, September 10. Any feature addition beyond that date should be discussed on the mailing-list and get PTL approval. After feature freeze, cycle-with-rc deliverables need to produce a first release candidate (and a stable branch) before RC1 deadline (September 24) * Deliverables following cycle-with-intermediary model can release as necessary, but in all cases before Final RC deadline (October 8) Upcoming Deadlines & Dates -------------------------- Ussuri Cycle-trailing deadline: August 13 (R-9 week) Non-client library freeze: September 3 (R-6 week) Client library freeze: September 10 (R-5 week) Ussuri-3 milestone: September 10 (R-5 week) Victoria release: October 14 From johnsomor at gmail.com Mon Aug 3 15:40:40 2020 From: johnsomor at gmail.com (Michael Johnson) Date: Mon, 3 Aug 2020 08:40:40 -0700 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: Yeah, it looks like nova is failing to boot the instance. Check this setting in your octavia.conf files: https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id Also, if kolla-ansible didn't set both of these values correctly, please open bug reports for kolla-ansible. These all should have been configured by the deployment tool. Michael On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann wrote: > Seems like the flavor is missing or empty '' - check for typos and enable > debug. > > Check if the nova req contains valid information/flavor. > > Fabian > > Monika Samal schrieb am Mo., 3. Aug. 2020, > 15:46: > >> It's registered >> >> Get Outlook for Android >> ------------------------------ >> *From:* Fabian Zimmermann >> *Sent:* Monday, August 3, 2020 7:08:21 PM >> *To:* Monika Samal ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Did you check the (nova) flavor you use in octavia. >> >> Fabian >> >> Monika Samal schrieb am Mo., 3. Aug. 2020, >> 10:53: >> >> After Michael suggestion I was able to create load balancer but there is >> error in status. >> >> >> >> PFB the error link: >> >> http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ >> ------------------------------ >> *From:* Monika Samal >> *Sent:* Monday, August 3, 2020 2:08 PM >> *To:* Michael Johnson >> *Cc:* Fabian Zimmermann ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Thanks a ton Michael for helping me out >> ------------------------------ >> *From:* Michael Johnson >> *Sent:* Friday, July 31, 2020 3:57 AM >> *To:* Monika Samal >> *Cc:* Fabian Zimmermann ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Just to close the loop on this, the octavia.conf file had >> "project_name = admin" instead of "project_name = service" in the >> [service_auth] section. This was causing the keystone errors when >> Octavia was communicating with neutron. >> >> I don't know if that is a bug in kolla-ansible or was just a local >> configuration issue. 
>> >> Michael >> >> On Thu, Jul 30, 2020 at 1:39 PM Monika Samal >> wrote: >> > >> > Hello Fabian,, >> > >> > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ >> > >> > Regards, >> > Monika >> > ________________________________ >> > From: Fabian Zimmermann >> > Sent: Friday, July 31, 2020 1:57 AM >> > To: Monika Samal >> > Cc: Michael Johnson ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> > Subject: Re: [openstack-community] Octavia :; Unable to create load >> balancer >> > >> > Hi, >> > >> > just to debug, could you replace the auth_type password with v3password? >> > >> > And do a curl against your :5000 and :35357 urls and paste the output. >> > >> > Fabian >> > >> > Monika Samal schrieb am Do., 30. Juli 2020, >> 22:15: >> > >> > Hello Fabian, >> > >> > http://paste.openstack.org/show/796477/ >> > >> > Thanks, >> > Monika >> > ________________________________ >> > From: Fabian Zimmermann >> > Sent: Friday, July 31, 2020 1:38 AM >> > To: Monika Samal >> > Cc: Michael Johnson ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> > Subject: Re: [openstack-community] Octavia :; Unable to create load >> balancer >> > >> > The sections should be >> > >> > service_auth >> > keystone_authtoken >> > >> > if i read the docs correctly. Maybe you can just paste your config >> (remove/change passwords) to paste.openstack.org and post the link? >> > >> > Fabian >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashlee at openstack.org Mon Aug 3 18:12:07 2020 From: ashlee at openstack.org (Ashlee Ferguson) Date: Mon, 3 Aug 2020 13:12:07 -0500 Subject: CFP Deadline Tomorrow - Virtual Open Infrastructure Summit Message-ID: <3353421C-E038-47F5-B68F-3828232639EB@openstack.org> Hi everyone, It’s time to submit your Open Infrastructure virtual Summit presentations[1]! The CFP deadline is tomorrow. Submit sessions featuring open source projects including Airship, Ansible, Ceph, Kata Containers, Kubernetes, ONAP, OpenStack, OPNFV, StarlingX and Zuul. As a reminder, these are the 2020 Tracks: 5G, NFV & Edge AI, Machine Learning & HPC CI/CD Container Infrastructure Getting Started Hands-on Workshops Open Development Private & Hybrid Cloud Public Cloud Security Get your presentations, panels, and workshops in before August 4 at 11:59 pm PT (August 5 at 6:59 am UTC). The content submission process for the Forum and Project Teams Gathering (PTG) will be managed separately in the upcoming months. The Summit Programming Committee has shared topics by Track for community members interested in speaking at the upcoming Summit. Check out the submission tips[2]! Then don’t forget to register[3] for the virtual Open Infrastructure Summit taking place October 19-23, 2020 at no cost to you. Need more time? Reach out to speakersupport at openstack.org with any questions or concerns. Cheers, Ashlee [1] https://cfp.openstack.org/ [2] https://superuser.openstack.org/articles/virtual-open-infrastructure-summit-cfp/ [3] https://openinfrasummit2020.eventbrite.com Ashlee Ferguson Community & Events Coordinator OpenStack Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Mon Aug 3 18:14:21 2020 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Mon, 3 Aug 2020 20:14:21 +0200 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." Message-ID: We have just updated a small OpenStack cluster to Train. 
Everything seems working, but "cinder-status upgrade check" complains that services and volumes must have a service UUID [*]. What does this exactly mean? Thanks, Massimo [*] +--------------------------------------------------------------------+ | Check: Service UUIDs | | Result: Failure | | Details: Services and volumes must have a service UUID. Please fix | | this issue by running Queens online data migrations. | -------------- next part -------------- An HTML attachment was scrubbed... URL: From victoria at vmartinezdelacruz.com Mon Aug 3 19:12:06 2020 From: victoria at vmartinezdelacruz.com (=?UTF-8?Q?Victoria_Mart=C3=ADnez_de_la_Cruz?=) Date: Mon, 3 Aug 2020 16:12:06 -0300 Subject: [manila] Doc-a-thon event coming up next Thursday (Aug 6th) In-Reply-To: References: Message-ID: Hi everybody, An update on this. We decided to take over the upstream meeting directly and start *at* the slot of the Manila weekly meeting. We will join the Jitsi bridge [0] at 3pm UTC time and start going through the list of bugs we have in [1]. There is no finish time, you can join and leave the bridge freely. We will also use IRC Freenode channel #openstack-manila if needed. If the time slot doesn't work for you (we are aware this is not a friendly slot for EMEA/APAC), you can still go through the bug list in [1], claim a bug and work on it. If things go well, we plan to do this again in a different slot so everybody that wants to collaborate can do it. Looking forward to see you there, Cheers, V [0] https://meetpad.opendev.org/ManilaV-ReleaseDocAThon [1] https://ethercalc.openstack.org/ur17jprbprxx On Fri, Jul 31, 2020 at 2:05 PM Victoria Martínez de la Cruz < victoria at vmartinezdelacruz.com> wrote: > Hi folks, > > We will be organizing a doc-a-thon next Thursday, August 6th, with the > main goal of improving our docs for the next release. We will be gathering > on our Freenode channel #openstack-manila after our weekly meeting (3pm > UTC) and also using a videoconference tool (exact details TBC) to go over a > curated list of opened doc bugs we have here [0]. > > *Your* participation is truly valued, being you an already Manila > contributor or if you are interested in contributing and you didn't know > how, so looking forward to seeing you there :) > > Cheers, > > Victoria > > [0] https://ethercalc.openstack.org/ur17jprbprxx > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gagehugo at gmail.com Mon Aug 3 19:19:07 2020 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 3 Aug 2020 14:19:07 -0500 Subject: [openstack-helm] IRC Meeting Canceled 08/04 Message-ID: Hello everyone, Since I will be unavailable tomorrow and there's currently no agenda, I am going to cancel the meeting for tomorrow. We will meet again next week at the regular time. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From victoria at vmartinezdelacruz.com Mon Aug 3 19:21:16 2020 From: victoria at vmartinezdelacruz.com (=?UTF-8?Q?Victoria_Mart=C3=ADnez_de_la_Cruz?=) Date: Mon, 3 Aug 2020 16:21:16 -0300 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: Hi Ignazio, How did you deploy Manila and Manila UI? Can you point me toward the docs you used? Also, which is the specific workflow you are following to reach that trace? Just opening the dashboard and clicking on the Shares tab? 
Cheers, V On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano wrote: > Hello, I installed manila on openstack stein and it works by command line > mat the manila ui does not work and in httpd error log I read: > > [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR > django.request Internal Server Error: /dashboard/project/shares/ > [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback (most > recent call last): > [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line > 41, in inner > [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] response = > get_response(request) > [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, > in _get_response > [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] response = > self.process_exception_by_middleware(e, request) > [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, > in _get_response > [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] response = > wrapped_callback(request, *callback_args, **callback_kwargs) > [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec > [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return > view_func(request, *args, **kwargs) > [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec > [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return > view_func(request, *args, **kwargs) > [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec > [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return > view_func(request, *args, **kwargs) > [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec > [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return > view_func(request, *args, **kwargs) > [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec > [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return > view_func(request, *args, **kwargs) > [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, > in view > [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return > self.dispatch(request, *args, **kwargs) > [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, > in dispatch > [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return > handler(request, *args, **kwargs) > [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get > [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] handled = > self.construct_tables() > [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in > construct_tables > [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] handled = > self.handle_table(table) > [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File > 
"/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in > handle_table > [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = > self._get_data_dict() > [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in > _get_data_dict > [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] > data.extend(func()) > [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in > wrapped > [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = > cache[key] = func(*args, **kwargs) > [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", > line 57, in get_shares_data > [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] share_nets = > manila.share_network_list(self.request) > [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File > "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in > share_network_list > [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return > manilaclient(request).share_networks.list(detailed=detailed, > [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] AttributeError: > 'NoneType' object has no attribute 'share_networks' > > Please, anyone could help ? > Ignazio > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Mon Aug 3 19:28:51 2020 From: aschultz at redhat.com (Alex Schultz) Date: Mon, 3 Aug 2020 13:28:51 -0600 Subject: [tripleo][ansible] the current action plugins use patterns are suboptimal? In-Reply-To: References: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> Message-ID: On Mon, Aug 3, 2020 at 6:34 AM Bogdan Dobrelya wrote: > > On 8/3/20 12:36 PM, Sagi Shnaidman wrote: > > Hi, Bogdan > > > > thanks for raising this up, although I'm not sure I understand what it > > is the problem with using action plugins. > > Action plugins are well known official extensions for Ansible, as any > > other plugins - callback, strategy, inventory etc [1]. It is not any > > hack or unsupported workaround, it's a known and official feature of > > Ansible. Why can't we use it? What makes it different from filter, > > I believe the cases that require the use of those should be justified. > For the given example, that manages containers in a loop via calling a > module, what the written custom callback plugin buys for us? That brings > code to maintain, extra complexity, like handling possible corner cases > in async mode, dry-run mode etc. But what is justification aside of > looks handy? I disagree that we shouldn't use action plugins or modules. Tasks themselves are expensive at scale. We saw that when we switched away from paunch to container management in pure ansible tasks. This exposed that looping tasks are even more expensive and complex error handling and workflows are better suited for modules or action plugins than a series of tasks. This is not something to be "fixed in ansible". This is the nature of the executor and strategy related interactions. Should everything be converted to modules and plugins? no. Should everything be tasks only? no. It's a balance that must be struck between when a specific set of complex tasks need extra data processing or error handling. Switching to modules or action plugins allows us to unit test our logic. 
Using tasks do not have such a concept outside of writing complex molecule testing. IMHO it's safer to switch to modules/action plugins than writing task logic. IMHO the issue that I see with the switch to Action plugins is the increased load on the ansible "controller" node during execution. Modules may be better depending on the task being managed. But I believe with unit testing, action plugins or modules provide a cleaner and more testable solution than writing roles consisting only of tasks. > > > lookup, inventory or any other plugin we already use? > > Action plugins are also used wide in Ansible itself, for example > > templates plugin is implemented with action plugin [2]. If Ansible can > > use it, why can't we? I don't think there is something with "fixing" > > Ansible, it's not a bug, this is a useful extension. > > What regards the mentioned action plugin for podman containers, it > > allows to spawn containers remotely while skipping the connection part > > for every cycle. I'm not sure you can "fix" Ansible not to do that, it's > > not a bug. We may not see the difference in a few hosts in CI, but it > > might be very efficient when we deploy on 100+ hosts oro even 1000+ > > hosts. In order to evaluate this on bigger setups to understand its > > value we configured both options - to use action plugin or usual module. > > If better performance of action plugin will be proven, we can switch to > > use it, if it doesn't make a difference on bigger setups - then I think > > we can easily switch back to using an usual module. > > > > Thanks > > > > [1] https://docs.ansible.com/ansible/latest/plugins/plugins.html > > [2] > > https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/action/template.py > > > > On Mon, Aug 3, 2020 at 11:19 AM Bogdan Dobrelya > > wrote: > > > > There is a trend of writing action plugins, see [0], for simple things, > > like just calling a module in a loop. I'm not sure that is the > > direction > > TripleO should go. If ansible is inefficient in this sort of tasks > > without custom python code written, we should fix ansible. Otherwise, > > what is the ultimate goal of that trend? Is that having only action > > plugins in roles and playbooks? > > > > Please kindly asking the community to stop that, make a step back and > > reiterate with the taken approach. Thank you. > > > > [0] https://review.opendev.org/716108 > > > > > > -- > > Best regards, > > Bogdan Dobrelya, > > Irc #bogdando > > > > > > > > > > -- > > Best regards > > Sagi Shnaidman > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > From ignaziocassano at gmail.com Mon Aug 3 19:32:25 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 3 Aug 2020 21:32:25 +0200 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: Hello Victoria, I installed manila with yum on centos 7. Yes, I open the dashboard and I click on shares tab. I think the problem is I not using share networks because I am using netapp drivers without share management option. Looking at the code the dashboard check if there are shares under shared networks. My understading is that shared networks should be created only when shared management option is true. Ignazio Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < victoria at vmartinezdelacruz.com> ha scritto: > Hi Ignazio, > > How did you deploy Manila and Manila UI? Can you point me toward the docs > you used? > > Also, which is the specific workflow you are following to reach that > trace? 
Just opening the dashboard and clicking on the Shares tab? > > Cheers, > > V > > On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano > wrote: > >> Hello, I installed manila on openstack stein and it works by command line >> mat the manila ui does not work and in httpd error log I read: >> >> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >> django.request Internal Server Error: /dashboard/project/shares/ >> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback (most >> recent call last): >> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >> 41, in inner >> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] response = >> get_response(request) >> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >> in _get_response >> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] response = >> self.process_exception_by_middleware(e, request) >> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >> in _get_response >> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] response = >> wrapped_callback(request, *callback_args, **callback_kwargs) >> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >> in view >> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return >> self.dispatch(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >> in dispatch >> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return >> handler(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >> [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] handled = >> self.construct_tables() >> [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >> construct_tables >> [Mon Aug 03 07:45:26.697533 
2020] [:error] [pid 3506291] handled = >> self.handle_table(table) >> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >> handle_table >> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = >> self._get_data_dict() >> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >> _get_data_dict >> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >> data.extend(func()) >> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >> wrapped >> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = >> cache[key] = func(*args, **kwargs) >> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >> line 57, in get_shares_data >> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] share_nets = >> manila.share_network_list(self.request) >> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >> share_network_list >> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return >> manilaclient(request).share_networks.list(detailed=detailed, >> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] AttributeError: >> 'NoneType' object has no attribute 'share_networks' >> >> Please, anyone could help ? >> Ignazio >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Aug 3 19:34:34 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 3 Aug 2020 21:34:34 +0200 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: PS I followed installation guide under docs.openstack.org. Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < victoria at vmartinezdelacruz.com> ha scritto: > Hi Ignazio, > > How did you deploy Manila and Manila UI? Can you point me toward the docs > you used? > > Also, which is the specific workflow you are following to reach that > trace? Just opening the dashboard and clicking on the Shares tab? 
> > Cheers, > > V > > On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano > wrote: > >> Hello, I installed manila on openstack stein and it works by command line >> mat the manila ui does not work and in httpd error log I read: >> >> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >> django.request Internal Server Error: /dashboard/project/shares/ >> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback (most >> recent call last): >> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >> 41, in inner >> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] response = >> get_response(request) >> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >> in _get_response >> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] response = >> self.process_exception_by_middleware(e, request) >> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >> in _get_response >> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] response = >> wrapped_callback(request, *callback_args, **callback_kwargs) >> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return >> view_func(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >> in view >> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return >> self.dispatch(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >> in dispatch >> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return >> handler(request, *args, **kwargs) >> [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >> [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] handled = >> self.construct_tables() >> [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >> construct_tables >> [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] handled = >> 
self.handle_table(table) >> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >> handle_table >> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = >> self._get_data_dict() >> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >> _get_data_dict >> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >> data.extend(func()) >> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >> wrapped >> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = >> cache[key] = func(*args, **kwargs) >> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >> line 57, in get_shares_data >> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] share_nets = >> manila.share_network_list(self.request) >> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >> share_network_list >> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return >> manilaclient(request).share_networks.list(detailed=detailed, >> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] AttributeError: >> 'NoneType' object has no attribute 'share_networks' >> >> Please, anyone could help ? >> Ignazio >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Aug 3 19:41:34 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 3 Aug 2020 21:41:34 +0200 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: PS ps Sorry If aI am writing again. The command: manila list let me to show shares I created with command line. The dashboard gives errors I reported in my first email. Looking at manila.py line 280 it checks shares under share networks. Ignazio Il Lun 3 Ago 2020, 21:34 Ignazio Cassano ha scritto: > PS > I followed installation guide under docs.openstack.org. > > > Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < > victoria at vmartinezdelacruz.com> ha scritto: > >> Hi Ignazio, >> >> How did you deploy Manila and Manila UI? Can you point me toward the docs >> you used? >> >> Also, which is the specific workflow you are following to reach that >> trace? Just opening the dashboard and clicking on the Shares tab? 
>> >> Cheers, >> >> V >> >> On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano >> wrote: >> >>> Hello, I installed manila on openstack stein and it works by command >>> line mat the manila ui does not work and in httpd error log I read: >>> >>> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >>> django.request Internal Server Error: /dashboard/project/shares/ >>> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback (most >>> recent call last): >>> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >>> 41, in inner >>> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] response = >>> get_response(request) >>> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >>> in _get_response >>> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] response = >>> self.process_exception_by_middleware(e, request) >>> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >>> in _get_response >>> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] response = >>> wrapped_callback(request, *callback_args, **callback_kwargs) >>> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >>> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return >>> view_func(request, *args, **kwargs) >>> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return >>> view_func(request, *args, **kwargs) >>> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return >>> view_func(request, *args, **kwargs) >>> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >>> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return >>> view_func(request, *args, **kwargs) >>> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >>> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return >>> view_func(request, *args, **kwargs) >>> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >>> in view >>> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return >>> self.dispatch(request, *args, **kwargs) >>> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >>> in dispatch >>> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return >>> handler(request, *args, **kwargs) >>> [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >>> [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] handled = >>> self.construct_tables() >>> [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >>> construct_tables >>> [Mon Aug 03 
07:45:26.697533 2020] [:error] [pid 3506291] handled = >>> self.handle_table(table) >>> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >>> handle_table >>> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = >>> self._get_data_dict() >>> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >>> _get_data_dict >>> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >>> data.extend(func()) >>> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >>> wrapped >>> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = >>> cache[key] = func(*args, **kwargs) >>> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >>> line 57, in get_shares_data >>> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] share_nets >>> = manila.share_network_list(self.request) >>> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >>> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >>> share_network_list >>> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return >>> manilaclient(request).share_networks.list(detailed=detailed, >>> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] AttributeError: >>> 'NoneType' object has no attribute 'share_networks' >>> >>> Please, anyone could help ? >>> Ignazio >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Mon Aug 3 20:05:25 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 3 Aug 2020 13:05:25 -0700 Subject: [ironic][stable] Include ironic-core in ironic-stable-maint ? Message-ID: Greetings awesome humans, I have a conundrum, and largely it is over stable branch maintenance. In essence, our stable branch approvers are largely down to Dmitry, Riccardo, and Myself. I think this needs to change and I'd like to propose that we go ahead and change ironic-stable-maint to just include ironic-core in order to prevent the bottleneck and conflict and risk which this presents. I strongly believe that our existing cores would all do the right thing if presented with the question of if a change needed to be merged. So honestly I'm not concerned by this proposal. Plus, some of our sub-projects have operated this way for quite some time. Thoughts, concerns, worries? -Julia From ignaziocassano at gmail.com Mon Aug 3 20:25:55 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 3 Aug 2020 22:25:55 +0200 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: I mean I am using dhss false Il Lun 3 Ago 2020, 21:41 Ignazio Cassano ha scritto: > PS ps > Sorry If aI am writing again. > The command: > manila list let me to show shares I created with command line. > The dashboard gives errors I reported in my first email. > Looking at manila.py line 280 it checks shares under share networks. > Ignazio > > > Il Lun 3 Ago 2020, 21:34 Ignazio Cassano ha > scritto: > >> PS >> I followed installation guide under docs.openstack.org. >> >> >> Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < >> victoria at vmartinezdelacruz.com> ha scritto: >> >>> Hi Ignazio, >>> >>> How did you deploy Manila and Manila UI? 
Can you point me toward the >>> docs you used? >>> >>> Also, which is the specific workflow you are following to reach that >>> trace? Just opening the dashboard and clicking on the Shares tab? >>> >>> Cheers, >>> >>> V >>> >>> On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano >>> wrote: >>> >>>> Hello, I installed manila on openstack stein and it works by command >>>> line mat the manila ui does not work and in httpd error log I read: >>>> >>>> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >>>> django.request Internal Server Error: /dashboard/project/shares/ >>>> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback >>>> (most recent call last): >>>> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >>>> 41, in inner >>>> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] response = >>>> get_response(request) >>>> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >>>> in _get_response >>>> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] response = >>>> self.process_exception_by_middleware(e, request) >>>> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >>>> in _get_response >>>> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] response = >>>> wrapped_callback(request, *callback_args, **callback_kwargs) >>>> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >>>> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return >>>> view_func(request, *args, **kwargs) >>>> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return >>>> view_func(request, *args, **kwargs) >>>> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return >>>> view_func(request, *args, **kwargs) >>>> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >>>> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return >>>> view_func(request, *args, **kwargs) >>>> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >>>> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return >>>> view_func(request, *args, **kwargs) >>>> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >>>> in view >>>> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return >>>> self.dispatch(request, *args, **kwargs) >>>> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >>>> in dispatch >>>> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return >>>> handler(request, *args, **kwargs) >>>> [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >>>> [Mon Aug 03 
07:45:26.697526 2020] [:error] [pid 3506291] handled = >>>> self.construct_tables() >>>> [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >>>> construct_tables >>>> [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] handled = >>>> self.handle_table(table) >>>> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >>>> handle_table >>>> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = >>>> self._get_data_dict() >>>> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >>>> _get_data_dict >>>> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >>>> data.extend(func()) >>>> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >>>> wrapped >>>> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = >>>> cache[key] = func(*args, **kwargs) >>>> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >>>> line 57, in get_shares_data >>>> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] share_nets >>>> = manila.share_network_list(self.request) >>>> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >>>> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >>>> share_network_list >>>> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return >>>> manilaclient(request).share_networks.list(detailed=detailed, >>>> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] >>>> AttributeError: 'NoneType' object has no attribute 'share_networks' >>>> >>>> Please, anyone could help ? >>>> Ignazio >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Mon Aug 3 21:00:53 2020 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Mon, 3 Aug 2020 14:00:53 -0700 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: On Mon, Aug 3, 2020 at 1:31 PM Ignazio Cassano wrote: > I mean I am using dhss false > > Il Lun 3 Ago 2020, 21:41 Ignazio Cassano ha > scritto: > >> PS ps >> Sorry If aI am writing again. >> The command: >> manila list let me to show shares I created with command line. >> The dashboard gives errors I reported in my first email. >> Looking at manila.py line 280 it checks shares under share networks. >> Ignazio >> >> >> Il Lun 3 Ago 2020, 21:34 Ignazio Cassano ha >> scritto: >> >>> PS >>> I followed installation guide under docs.openstack.org. >>> >>> >>> Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < >>> victoria at vmartinezdelacruz.com> ha scritto: >>> >>>> Hi Ignazio, >>>> >>>> How did you deploy Manila and Manila UI? Can you point me toward the >>>> docs you used? >>>> >>>> Also, which is the specific workflow you are following to reach that >>>> trace? Just opening the dashboard and clicking on the Shares tab? 
>>>> >>>> Cheers, >>>> >>>> V >>>> >>>> On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Hello, I installed manila on openstack stein and it works by command >>>>> line mat the manila ui does not work and in httpd error log I read: >>>>> >>>>> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >>>>> django.request Internal Server Error: /dashboard/project/shares/ >>>>> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback >>>>> (most recent call last): >>>>> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >>>>> 41, in inner >>>>> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] response >>>>> = get_response(request) >>>>> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >>>>> in _get_response >>>>> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] response >>>>> = self.process_exception_by_middleware(e, request) >>>>> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >>>>> in _get_response >>>>> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] response >>>>> = wrapped_callback(request, *callback_args, **callback_kwargs) >>>>> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >>>>> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return >>>>> view_func(request, *args, **kwargs) >>>>> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return >>>>> view_func(request, *args, **kwargs) >>>>> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return >>>>> view_func(request, *args, **kwargs) >>>>> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >>>>> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return >>>>> view_func(request, *args, **kwargs) >>>>> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >>>>> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return >>>>> view_func(request, *args, **kwargs) >>>>> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >>>>> in view >>>>> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return >>>>> self.dispatch(request, *args, **kwargs) >>>>> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >>>>> in dispatch >>>>> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return >>>>> handler(request, *args, **kwargs) >>>>> [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >>>>> [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] handled = >>>>> self.construct_tables() >>>>> [Mon Aug 03 
07:45:26.697530 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >>>>> construct_tables >>>>> [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] handled = >>>>> self.handle_table(table) >>>>> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >>>>> handle_table >>>>> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = >>>>> self._get_data_dict() >>>>> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >>>>> _get_data_dict >>>>> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >>>>> data.extend(func()) >>>>> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >>>>> wrapped >>>>> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = >>>>> cache[key] = func(*args, **kwargs) >>>>> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >>>>> line 57, in get_shares_data >>>>> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] >>>>> share_nets = manila.share_network_list(self.request) >>>>> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >>>>> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >>>>> share_network_list >>>>> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return >>>>> manilaclient(request).share_networks.list(detailed=detailed, >>>>> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] >>>>> AttributeError: 'NoneType' object has no attribute 'share_networks' >>>>> >>>> Looking at the error here, and the code - it could be that the UI isn't able to retrieve the manila service endpoint from the service catalog. If this is the case, you must be able to see a "DEBUG" level log in your httpd error log with "no share service configured". Do you see it? As the user you're using on horizon, can you perform "openstack catalog list" and check whether the "sharev2" service type exists in that list? > >>>>> Please, anyone could help ? >>>>> Ignazio >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Aug 3 21:45:55 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 3 Aug 2020 23:45:55 +0200 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: Hello Goutham,tomorrow I will check the catalog. Must I enable the debug option in dashboard local_setting or in manila.conf? Thanks Ignazio Il Lun 3 Ago 2020, 23:01 Goutham Pacha Ravi ha scritto: > > > > On Mon, Aug 3, 2020 at 1:31 PM Ignazio Cassano > wrote: > >> I mean I am using dhss false >> >> Il Lun 3 Ago 2020, 21:41 Ignazio Cassano ha >> scritto: >> >>> PS ps >>> Sorry If aI am writing again. >>> The command: >>> manila list let me to show shares I created with command line. >>> The dashboard gives errors I reported in my first email. >>> Looking at manila.py line 280 it checks shares under share networks. >>> Ignazio >>> >>> >>> Il Lun 3 Ago 2020, 21:34 Ignazio Cassano ha >>> scritto: >>> >>>> PS >>>> I followed installation guide under docs.openstack.org. 
>>>> >>>> >>>> Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < >>>> victoria at vmartinezdelacruz.com> ha scritto: >>>> >>>>> Hi Ignazio, >>>>> >>>>> How did you deploy Manila and Manila UI? Can you point me toward the >>>>> docs you used? >>>>> >>>>> Also, which is the specific workflow you are following to reach that >>>>> trace? Just opening the dashboard and clicking on the Shares tab? >>>>> >>>>> Cheers, >>>>> >>>>> V >>>>> >>>>> On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Hello, I installed manila on openstack stein and it works by command >>>>>> line mat the manila ui does not work and in httpd error log I read: >>>>>> >>>>>> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >>>>>> django.request Internal Server Error: /dashboard/project/shares/ >>>>>> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback >>>>>> (most recent call last): >>>>>> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >>>>>> 41, in inner >>>>>> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] response >>>>>> = get_response(request) >>>>>> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >>>>>> in _get_response >>>>>> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] response >>>>>> = self.process_exception_by_middleware(e, request) >>>>>> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >>>>>> in _get_response >>>>>> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] response >>>>>> = wrapped_callback(request, *callback_args, **callback_kwargs) >>>>>> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >>>>>> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return >>>>>> view_func(request, *args, **kwargs) >>>>>> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>>> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return >>>>>> view_func(request, *args, **kwargs) >>>>>> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>>> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return >>>>>> view_func(request, *args, **kwargs) >>>>>> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >>>>>> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return >>>>>> view_func(request, *args, **kwargs) >>>>>> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >>>>>> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return >>>>>> view_func(request, *args, **kwargs) >>>>>> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >>>>>> in view >>>>>> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return >>>>>> self.dispatch(request, *args, **kwargs) >>>>>> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >>>>>> 
"/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >>>>>> in dispatch >>>>>> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return >>>>>> handler(request, *args, **kwargs) >>>>>> [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >>>>>> [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] handled >>>>>> = self.construct_tables() >>>>>> [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >>>>>> construct_tables >>>>>> [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] handled >>>>>> = self.handle_table(table) >>>>>> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >>>>>> handle_table >>>>>> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = >>>>>> self._get_data_dict() >>>>>> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >>>>>> _get_data_dict >>>>>> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >>>>>> data.extend(func()) >>>>>> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >>>>>> wrapped >>>>>> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = >>>>>> cache[key] = func(*args, **kwargs) >>>>>> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >>>>>> line 57, in get_shares_data >>>>>> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] >>>>>> share_nets = manila.share_network_list(self.request) >>>>>> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >>>>>> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >>>>>> share_network_list >>>>>> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return >>>>>> manilaclient(request).share_networks.list(detailed=detailed, >>>>>> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] >>>>>> AttributeError: 'NoneType' object has no attribute 'share_networks' >>>>>> >>>>> > Looking at the error here, and the code - it could be that the UI isn't > able to retrieve the manila service endpoint from the service catalog. If > this is the case, you must be able to see a "DEBUG" level log in your httpd > error log with "no share service configured". Do you see it? > > As the user you're using on horizon, can you perform "openstack catalog > list" and check whether the "sharev2" service type exists in that list? > > >> >>>>>> Please, anyone could help ? >>>>>> Ignazio >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From victoria at vmartinezdelacruz.com Tue Aug 4 00:53:09 2020 From: victoria at vmartinezdelacruz.com (=?UTF-8?Q?Victoria_Mart=C3=ADnez_de_la_Cruz?=) Date: Mon, 3 Aug 2020 21:53:09 -0300 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: In local_settings.py under openstack-dashboard. And then restart the webserver. Did you copy the enable and local files from manila-ui under Horizon's namespace? Check out https://docs.openstack.org/manila-ui/latest/install/installation.html We can continue debugging tomorrow, we will find out what is going on. 
Cheers, V On Mon, Aug 3, 2020, 6:46 PM Ignazio Cassano wrote: > Hello Goutham,tomorrow I will check the catalog. > Must I enable the debug option in dashboard local_setting or in > manila.conf? > Thanks > Ignazio > > > Il Lun 3 Ago 2020, 23:01 Goutham Pacha Ravi ha > scritto: > >> >> >> >> On Mon, Aug 3, 2020 at 1:31 PM Ignazio Cassano >> wrote: >> >>> I mean I am using dhss false >>> >>> Il Lun 3 Ago 2020, 21:41 Ignazio Cassano ha >>> scritto: >>> >>>> PS ps >>>> Sorry If aI am writing again. >>>> The command: >>>> manila list let me to show shares I created with command line. >>>> The dashboard gives errors I reported in my first email. >>>> Looking at manila.py line 280 it checks shares under share networks. >>>> Ignazio >>>> >>>> >>>> Il Lun 3 Ago 2020, 21:34 Ignazio Cassano ha >>>> scritto: >>>> >>>>> PS >>>>> I followed installation guide under docs.openstack.org. >>>>> >>>>> >>>>> Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < >>>>> victoria at vmartinezdelacruz.com> ha scritto: >>>>> >>>>>> Hi Ignazio, >>>>>> >>>>>> How did you deploy Manila and Manila UI? Can you point me toward the >>>>>> docs you used? >>>>>> >>>>>> Also, which is the specific workflow you are following to reach that >>>>>> trace? Just opening the dashboard and clicking on the Shares tab? >>>>>> >>>>>> Cheers, >>>>>> >>>>>> V >>>>>> >>>>>> On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Hello, I installed manila on openstack stein and it works by command >>>>>>> line mat the manila ui does not work and in httpd error log I read: >>>>>>> >>>>>>> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >>>>>>> django.request Internal Server Error: /dashboard/project/shares/ >>>>>>> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback >>>>>>> (most recent call last): >>>>>>> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >>>>>>> 41, in inner >>>>>>> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] >>>>>>> response = get_response(request) >>>>>>> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >>>>>>> in _get_response >>>>>>> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] >>>>>>> response = self.process_exception_by_middleware(e, request) >>>>>>> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >>>>>>> in _get_response >>>>>>> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] >>>>>>> response = wrapped_callback(request, *callback_args, **callback_kwargs) >>>>>>> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >>>>>>> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return >>>>>>> view_func(request, *args, **kwargs) >>>>>>> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>>>> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return >>>>>>> view_func(request, *args, **kwargs) >>>>>>> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>>>> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return >>>>>>> 
view_func(request, *args, **kwargs) >>>>>>> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >>>>>>> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return >>>>>>> view_func(request, *args, **kwargs) >>>>>>> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >>>>>>> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return >>>>>>> view_func(request, *args, **kwargs) >>>>>>> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >>>>>>> in view >>>>>>> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return >>>>>>> self.dispatch(request, *args, **kwargs) >>>>>>> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >>>>>>> in dispatch >>>>>>> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return >>>>>>> handler(request, *args, **kwargs) >>>>>>> [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >>>>>>> [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] handled >>>>>>> = self.construct_tables() >>>>>>> [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >>>>>>> construct_tables >>>>>>> [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] handled >>>>>>> = self.handle_table(table) >>>>>>> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >>>>>>> handle_table >>>>>>> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = >>>>>>> self._get_data_dict() >>>>>>> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >>>>>>> _get_data_dict >>>>>>> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >>>>>>> data.extend(func()) >>>>>>> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >>>>>>> wrapped >>>>>>> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value = >>>>>>> cache[key] = func(*args, **kwargs) >>>>>>> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >>>>>>> line 57, in get_shares_data >>>>>>> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] >>>>>>> share_nets = manila.share_network_list(self.request) >>>>>>> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >>>>>>> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >>>>>>> share_network_list >>>>>>> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return >>>>>>> manilaclient(request).share_networks.list(detailed=detailed, >>>>>>> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] >>>>>>> AttributeError: 'NoneType' object has no attribute 'share_networks' >>>>>>> >>>>>> >> Looking at the error here, and the code - it could be that the UI isn't >> able to retrieve the manila service endpoint from the service catalog. 
If >> this is the case, you must be able to see a "DEBUG" level log in your httpd >> error log with "no share service configured". Do you see it? >> >> As the user you're using on horizon, can you perform "openstack catalog >> list" and check whether the "sharev2" service type exists in that list? >> >> >>> >>>>>>> Please, anyone could help ? >>>>>>> Ignazio >>>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.king at gmail.com Mon Aug 3 21:58:53 2020 From: thomas.king at gmail.com (Thomas King) Date: Mon, 3 Aug 2020 15:58:53 -0600 Subject: [Openstack-mentoring] Neutron subnet with DHCP relay - continued In-Reply-To: References: Message-ID: I've been using named physical networks so long, I completely forgot using wildcards! Is this the answer???? https://docs.openstack.org/mitaka/config-reference/networking/networking_options_reference.html#modular-layer-2-ml2-flat-type-configuration-options Tom King On Tue, Jul 28, 2020 at 3:46 PM Thomas King wrote: > Ruslanas has been a tremendous help. To catch up the discussion lists... > 1. I enabled Neutron segments. > 2. I renamed the existing segments for each network so they'll make sense. > 3. I attempted to create a segment for a remote subnet (it is using DHCP > relay) and this was the error that is blocking me. This is where the docs > do not cover: > [root at sea-maas-controller ~(keystone_admin)]# openstack network segment > create --physical-network remote146-30-32 --network-type flat --network > baremetal seg-remote-146-30-32 > BadRequestException: 400: Client Error for url: > http://10.146.30.65:9696/v2.0/segments, Invalid input for operation: > physical_network 'remote146-30-32' unknown for flat provider network. > > I've asked Ruslanas to clarify how their physical networks correspond to > their remote networks. They have a single provider network and multiple > segments tied to multiple physical networks. > > However, if anyone can shine some light on this, I would greatly > appreciate it. How should neutron's configurations accommodate remote > networks<->Neutron segments when I have only one physical network > attachment for provisioning? > > Thanks! > Tom King > > On Wed, Jul 15, 2020 at 3:33 PM Thomas King wrote: > >> That helps a lot, thank you! >> >> "I use only one network..." >> This bit seems to go completely against the Neutron segments >> documentation. When you have access, please let me know if Triple-O is >> using segments or some other method. >> >> I greatly appreciate this, this is a tremendous help. >> >> Tom King >> >> On Wed, Jul 15, 2020 at 1:07 PM Ruslanas Gžibovskis >> wrote: >> >>> Hi Thomas, >>> >>> I have a bit complicated setup from tripleo side :) I use only one >>> network (only ControlPlane). thanks to Harold, he helped to make it work >>> for me. >>> >>> Yes, as written in the tripleo docs for leaf networks, it use the same >>> neutron network, different subnets. so neutron network is ctlplane (I >>> think) and have ctlplane-subnet, remote-provision and remote-KI :)) that >>> generates additional lines in "ip r s" output for routing "foreign" subnets >>> through correct gw, if you would have isolated networks, by vlans and ports >>> this would apply for each subnet different gw... I believe you >>> know/understand that part. >>> >>> remote* subnets have dhcp-relay setup by network team... do not ask >>> details for that. 
I do not know how to, but can ask :) >>> >>> >>> in undercloud/tripleo i have 2 dhcp servers, one is for introspection, >>> another for provide/cleanup and deployment process. >>> >>> all of those subnets have organization level tagged networks and are >>> tagged on network devices, but they are untagged on provisioning >>> interfaces/ports, as in general pxe should be untagged, but some nic's can >>> do vlan untag on nic/bios level. but who cares!? >>> >>> I just did a brief check on your first post, I think I have simmilar >>> setup to yours :)) I will check in around 12hours :)) more deaply, as will >>> be at work :))) >>> >>> >>> P.S. sorry for wrong terms, I am bad at naming. >>> >>> >>> On Wed, 15 Jul 2020, 21:13 Thomas King, wrote: >>> >>>> Ruslanas, that would be excellent! >>>> >>>> I will reply to you directly for details later unless the maillist >>>> would like the full thread. >>>> >>>> Some preliminary questions: >>>> >>>> - Do you have a separate physical interface for the segment(s) used >>>> for your remote subnets? >>>> The docs state each segment must have a unique physical network >>>> name, which suggests a separate physical interface for each segment unless >>>> I'm misunderstanding something. >>>> - Are your provisioning segments all on the same Neutron network? >>>> - Are you using tagged switchports or access switchports to your >>>> Ironic server(s)? >>>> >>>> Thanks, >>>> Tom King >>>> >>>> On Wed, Jul 15, 2020 at 12:26 AM Ruslanas Gžibovskis >>>> wrote: >>>> >>>>> I have deployed that with tripleO, but now we are recabling and >>>>> redeploying it. So once I have it running I can share my configs, just name >>>>> which you want :) >>>>> >>>>> On Tue, 14 Jul 2020 at 18:40, Thomas King >>>>> wrote: >>>>> >>>>>> I have. That's the Triple-O docs and they don't go through the normal >>>>>> .conf files to explain how it works outside of Triple-O. It has some ideas >>>>>> but no running configurations. >>>>>> >>>>>> Tom King >>>>>> >>>>>> On Tue, Jul 14, 2020 at 3:01 AM Ruslanas Gžibovskis >>>>>> wrote: >>>>>> >>>>>>> hi, have you checked: >>>>>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/routed_spine_leaf_network.html >>>>>>> ? >>>>>>> I am following this link. I only have one network, having different >>>>>>> issues tho ;) >>>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From akekane at redhat.com Tue Aug 4 04:37:11 2020 From: akekane at redhat.com (Abhishek Kekane) Date: Tue, 4 Aug 2020 10:07:11 +0530 Subject: =?UTF-8?Q?Re=3A_=5Blists=2Eopenstack=2Eorg=E4=BB=A3=E5=8F=91=5DRe=3A_=5BGlance=5D_Proposin?= =?UTF-8?Q?g_Dan_Smith_for_glance_core?= In-Reply-To: <03ece5d405c74b2d9292301c2e3be7b8@inspur.com> References: <8635120d-11d6-136e-2581-40d3d451d1aa@gmail.com> <03ece5d405c74b2d9292301c2e3be7b8@inspur.com> Message-ID: Hi All, After hearing only positive responses, I have added Dan to the Core members list. Welcome aboard Dan. Cheers, Abhishek On Mon, 3 Aug, 2020, 05:44 Brin Zhang(张百林), wrote: > +1 > > > > *发件人:* Jay Bryant [mailto:jungleboyj at gmail.com] > *发送时间:* 2020年7月31日 23:39 > *收件人:* openstack-discuss at lists.openstack.org > *主题:* [lists.openstack.org代发]Re: [Glance] Proposing Dan Smith for glance > core > > > > On 7/31/2020 8:10 AM, Sean McGinnis wrote: > > On 7/30/20 10:25 AM, Abhishek Kekane wrote: > > Hi All, > > I'd like to propose adding Dan Smith to the glance core group. 
> > > > Dan Smith has contributed to stabilize image import workflow as well as > multiple stores of glance. > > He is also contributing in tempest and nova to set up CI/tempest jobs > around image import and multiple stores. > > > > Being involved on the mailing-list and IRC channels, Dan is always helpful > to the community and here to help. > > Please respond with +1/-1 until 03rd August, 2020 1400 UTC. > > Cheers, > Abhishek > > +1 > > Not a Glance core but definitely +1 from me. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Tue Aug 4 05:02:49 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Tue, 4 Aug 2020 00:02:49 -0500 Subject: [nova] Hyper-V hosts Message-ID: <046E9C0290DD9149B106B72FC9156BEA04814461@gmsxchsvr01.thecreation.com> Hi, I thought I'd look into support of Hyper-V hosts for Windows Server environments, but it looks like the latest cloudbase Windows Hyper-V OpenStack Installer is for Train, and nothing seems to discuss the use of Hyper-V in Windows Server 2019. Has it been abandoned? Is anyone using Hyper-V with OpenStack successfully? One of the reasons we thought we might support it is to provide nested support for VMs with GPUs and/or vGPUs, and thought this would work better than with KVM, specifically with AMD EPYC systems. It seems that when "options kvm-amd nested=1" is used in a modprobe.d config file, Windows machines lock up when started. I think this has been an issue for a while with AMD processors, but thought it was fixed recently (I don't remember where I saw this, though). Would love to hear about any experiences related to Hyper-V and/or nested hypervisor support on AMD EPYC processors. Thanks! Eric From ignaziocassano at gmail.com Tue Aug 4 05:49:48 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 4 Aug 2020 07:49:48 +0200 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: Hello Victoria and Goutham, thank you for your great help. Unfortunately I made I mistake in my ansible playbook for installing manila: it created manila services more times, so some entries in the catalog did not have an endpoint associated. I removed the duplicated service entries where catalog was absent and now it works. Many thanks Ignazio Il giorno mar 4 ago 2020 alle ore 02:53 Victoria Martínez de la Cruz < victoria at vmartinezdelacruz.com> ha scritto: > In local_settings.py under openstack-dashboard. And then restart the > webserver. > > Did you copy the enable and local files from manila-ui under Horizon's > namespace? Check out > https://docs.openstack.org/manila-ui/latest/install/installation.html > > We can continue debugging tomorrow, we will find out what is going on. > > Cheers, > > V > > > On Mon, Aug 3, 2020, 6:46 PM Ignazio Cassano > wrote: > >> Hello Goutham,tomorrow I will check the catalog. >> Must I enable the debug option in dashboard local_setting or in >> manila.conf? >> Thanks >> Ignazio >> >> >> Il Lun 3 Ago 2020, 23:01 Goutham Pacha Ravi ha >> scritto: >> >>> >>> >>> >>> On Mon, Aug 3, 2020 at 1:31 PM Ignazio Cassano >>> wrote: >>> >>>> I mean I am using dhss false >>>> >>>> Il Lun 3 Ago 2020, 21:41 Ignazio Cassano ha >>>> scritto: >>>> >>>>> PS ps >>>>> Sorry If aI am writing again. >>>>> The command: >>>>> manila list let me to show shares I created with command line. >>>>> The dashboard gives errors I reported in my first email. >>>>> Looking at manila.py line 280 it checks shares under share networks. 
>>>>> Ignazio >>>>> >>>>> >>>>> Il Lun 3 Ago 2020, 21:34 Ignazio Cassano >>>>> ha scritto: >>>>> >>>>>> PS >>>>>> I followed installation guide under docs.openstack.org. >>>>>> >>>>>> >>>>>> Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < >>>>>> victoria at vmartinezdelacruz.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio, >>>>>>> >>>>>>> How did you deploy Manila and Manila UI? Can you point me toward the >>>>>>> docs you used? >>>>>>> >>>>>>> Also, which is the specific workflow you are following to reach that >>>>>>> trace? Just opening the dashboard and clicking on the Shares tab? >>>>>>> >>>>>>> Cheers, >>>>>>> >>>>>>> V >>>>>>> >>>>>>> On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Hello, I installed manila on openstack stein and it works by >>>>>>>> command line mat the manila ui does not work and in httpd error log I read: >>>>>>>> >>>>>>>> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >>>>>>>> django.request Internal Server Error: /dashboard/project/shares/ >>>>>>>> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback >>>>>>>> (most recent call last): >>>>>>>> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >>>>>>>> 41, in inner >>>>>>>> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] >>>>>>>> response = get_response(request) >>>>>>>> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >>>>>>>> in _get_response >>>>>>>> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] >>>>>>>> response = self.process_exception_by_middleware(e, request) >>>>>>>> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >>>>>>>> in _get_response >>>>>>>> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] >>>>>>>> response = wrapped_callback(request, *callback_args, **callback_kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >>>>>>>> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] return >>>>>>>> view_func(request, *args, **kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>>>>> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] return >>>>>>>> view_func(request, *args, **kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>>>>> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] return >>>>>>>> view_func(request, *args, **kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >>>>>>>> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] return >>>>>>>> view_func(request, *args, **kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >>>>>>>> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] return >>>>>>>> view_func(request, *args, **kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >>>>>>>> 
"/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >>>>>>>> in view >>>>>>>> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] return >>>>>>>> self.dispatch(request, *args, **kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >>>>>>>> in dispatch >>>>>>>> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] return >>>>>>>> handler(request, *args, **kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697523 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >>>>>>>> [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] >>>>>>>> handled = self.construct_tables() >>>>>>>> [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >>>>>>>> construct_tables >>>>>>>> [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] >>>>>>>> handled = self.handle_table(table) >>>>>>>> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >>>>>>>> handle_table >>>>>>>> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data = >>>>>>>> self._get_data_dict() >>>>>>>> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >>>>>>>> _get_data_dict >>>>>>>> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >>>>>>>> data.extend(func()) >>>>>>>> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >>>>>>>> wrapped >>>>>>>> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value >>>>>>>> = cache[key] = func(*args, **kwargs) >>>>>>>> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >>>>>>>> line 57, in get_shares_data >>>>>>>> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] >>>>>>>> share_nets = manila.share_network_list(self.request) >>>>>>>> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >>>>>>>> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >>>>>>>> share_network_list >>>>>>>> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] return >>>>>>>> manilaclient(request).share_networks.list(detailed=detailed, >>>>>>>> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] >>>>>>>> AttributeError: 'NoneType' object has no attribute 'share_networks' >>>>>>>> >>>>>>> >>> Looking at the error here, and the code - it could be that the UI isn't >>> able to retrieve the manila service endpoint from the service catalog. If >>> this is the case, you must be able to see a "DEBUG" level log in your httpd >>> error log with "no share service configured". Do you see it? >>> >>> As the user you're using on horizon, can you perform "openstack catalog >>> list" and check whether the "sharev2" service type exists in that list? >>> >>> >>>> >>>>>>>> Please, anyone could help ? >>>>>>>> Ignazio >>>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Tue Aug 4 05:54:49 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 4 Aug 2020 07:54:49 +0200 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." 
In-Reply-To: References: Message-ID: Hi, i never had this issue, but did you run the post upgrade data migrations? Fabian Massimo Sgaravatto schrieb am Mo., 3. Aug. 2020, 20:21: > We have just updated a small OpenStack cluster to Train. > Everything seems working, but "cinder-status upgrade check" complains that > services and volumes must have a service UUID [*]. > What does this exactly mean? > > Thanks, Massimo > > [*] > +--------------------------------------------------------------------+ > | Check: Service UUIDs | > | Result: Failure | > | Details: Services and volumes must have a service UUID. Please fix | > | this issue by running Queens online data migrations. | > -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Tue Aug 4 07:14:00 2020 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 4 Aug 2020 09:14:00 +0200 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." In-Reply-To: References: Message-ID: Do you mean "su -s /bin/sh -c "cinder-manage db sync" cinder" ? Yes: this was run Cheers, Massimo On Tue, Aug 4, 2020 at 7:54 AM Fabian Zimmermann wrote: > Hi, > > i never had this issue, but did you run the post upgrade data migrations? > > Fabian > > Massimo Sgaravatto schrieb am Mo., 3. Aug. > 2020, 20:21: > >> We have just updated a small OpenStack cluster to Train. >> Everything seems working, but "cinder-status upgrade check" complains >> that services and volumes must have a service UUID [*]. >> What does this exactly mean? >> >> Thanks, Massimo >> >> [*] >> +--------------------------------------------------------------------+ >> | Check: Service UUIDs | >> | Result: Failure | >> | Details: Services and volumes must have a service UUID. Please fix | >> | this issue by running Queens online data migrations. | >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Tue Aug 4 07:20:09 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 4 Aug 2020 09:20:09 +0200 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." In-Reply-To: References: Message-ID: Hi, No i mean the "online data migrations" https://docs.openstack.org/cinder/rocky/upgrade.html Fabian Massimo Sgaravatto schrieb am Di., 4. Aug. 2020, 09:14: > Do you mean "su -s /bin/sh -c "cinder-manage db sync" cinder" ? > Yes: this was run > > Cheers, Massimo > > On Tue, Aug 4, 2020 at 7:54 AM Fabian Zimmermann > wrote: > >> Hi, >> >> i never had this issue, but did you run the post upgrade data migrations? >> >> Fabian >> >> Massimo Sgaravatto schrieb am Mo., 3. >> Aug. 2020, 20:21: >> >>> We have just updated a small OpenStack cluster to Train. >>> Everything seems working, but "cinder-status upgrade check" complains >>> that services and volumes must have a service UUID [*]. >>> What does this exactly mean? >>> >>> Thanks, Massimo >>> >>> [*] >>> +--------------------------------------------------------------------+ >>> | Check: Service UUIDs | >>> | Result: Failure | >>> | Details: Services and volumes must have a service UUID. Please fix | >>> | this issue by running Queens online data migrations. | >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Tue Aug 4 07:46:35 2020 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 4 Aug 2020 09:46:35 +0200 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." 
In-Reply-To: References: Message-ID: Thanks. I tried but it says there is nothing to migrate: +-----------------------------------------+--------------+-----------+ | Migration | Total Needed | Completed | +-----------------------------------------+--------------+-----------+ | untyped_snapshots_online_data_migration | 0 | 0 | | untyped_volumes_online_data_migration | 0 | 0 | +-----------------------------------------+--------------+-----------+ On Tue, Aug 4, 2020 at 9:20 AM Fabian Zimmermann wrote: > Hi, > > No i mean the "online data migrations" > > https://docs.openstack.org/cinder/rocky/upgrade.html > > Fabian > > Massimo Sgaravatto schrieb am Di., 4. Aug. > 2020, 09:14: > >> Do you mean "su -s /bin/sh -c "cinder-manage db sync" cinder" ? >> Yes: this was run >> >> Cheers, Massimo >> >> On Tue, Aug 4, 2020 at 7:54 AM Fabian Zimmermann >> wrote: >> >>> Hi, >>> >>> i never had this issue, but did you run the post upgrade data migrations? >>> >>> Fabian >>> >>> Massimo Sgaravatto schrieb am Mo., 3. >>> Aug. 2020, 20:21: >>> >>>> We have just updated a small OpenStack cluster to Train. >>>> Everything seems working, but "cinder-status upgrade check" complains >>>> that services and volumes must have a service UUID [*]. >>>> What does this exactly mean? >>>> >>>> Thanks, Massimo >>>> >>>> [*] >>>> +--------------------------------------------------------------------+ >>>> | Check: Service UUIDs | >>>> | Result: Failure | >>>> | Details: Services and volumes must have a service UUID. Please fix | >>>> | this issue by running Queens online data migrations. | >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Tue Aug 4 08:08:19 2020 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 4 Aug 2020 09:08:19 +0100 Subject: [ironic][stable] Include ironic-core in ironic-stable-maint ? In-Reply-To: References: Message-ID: On Mon, 3 Aug 2020 at 21:06, Julia Kreger wrote: > > Greetings awesome humans, > > I have a conundrum, and largely it is over stable branch maintenance. > > In essence, our stable branch approvers are largely down to Dmitry, > Riccardo, and Myself. I think this needs to change and I'd like to > propose that we go ahead and change ironic-stable-maint to just > include ironic-core in order to prevent the bottleneck and conflict > and risk which this presents. > > I strongly believe that our existing cores would all do the right > thing if presented with the question of if a change needed to be > merged. So honestly I'm not concerned by this proposal. Plus, some of > our sub-projects have operated this way for quite some time. > > Thoughts, concerns, worries? > Makes sense to me. We operate this way in Kolla. It might be good to make sure that current cores are all aware of what 'the right thing' is, that it is written down, and that we include it in the core onboarding process. > -Julia > From mark at stackhpc.com Tue Aug 4 08:11:39 2020 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 4 Aug 2020 09:11:39 +0100 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: Message-ID: On Thu, 30 Jul 2020 at 14:43, Rafael Weingärtner wrote: > > We are working on it. So far we have 3 open proposals there, but we do not have enough karma to move things along. > Besides these 3 open proposals, we do have more ongoing extensions that have not yet been proposed to the community. It's good to hear you want to help improve cloudkitty, however it sounds like what is required is help with maintaining the project. 
Is that something you could be involved with? Mark > > On Thu, Jul 30, 2020 at 10:22 AM Sean McGinnis wrote: >> >> Posting here to raise awareness, and start discussion about next steps. >> >> It appears there is no one working on Cloudkitty anymore. No patches >> have been merged for several months now, including simple bot proposed >> patches. It would appear no one is maintaining this project anymore. >> >> I know there is a need out there for this type of functionality, so >> maybe this will raise awareness and get some attention to it. But >> barring that, I am wondering if we should start the process to retire >> this project. >> >> From a Victoria release perspective, it is milestone-2 week, so we >> should make a decision if any of the Cloudkitty deliverables should be >> included in this release or not. We can certainly force releases of >> whatever is the latest, but I think that is a bit risky since these >> repos have never merged the job template change for victoria and >> therefore are not even testing with Python 3.8. That is an official >> runtime for Victoria, so we run the risk of having issues with the code >> if someone runs under 3.8 but we have not tested to make sure there are >> no problems doing so. >> >> I am hoping this at least starts the discussion. I will not propose any >> release patches to remove anything until we have had a chance to discuss >> the situation. >> >> Sean >> >> > > > -- > Rafael Weingärtner From dev.faz at gmail.com Tue Aug 4 08:17:45 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 4 Aug 2020 10:17:45 +0200 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." In-Reply-To: References: Message-ID: Hmm, the err msg tells to run the queens version of the tool. Maybe something went wrong, but the db version got incremented? Just guessing. Did you try to find the commit/change that introduced the msg? Maybe it refers to the action required to fix it / or check the db online migrarions scripts what they would/should do. Fabian Massimo Sgaravatto schrieb am Di., 4. Aug. 2020, 09:46: > Thanks. > I tried but it says there is nothing to migrate: > > +-----------------------------------------+--------------+-----------+ > | Migration | Total Needed | Completed | > +-----------------------------------------+--------------+-----------+ > | untyped_snapshots_online_data_migration | 0 | 0 | > | untyped_volumes_online_data_migration | 0 | 0 | > +-----------------------------------------+--------------+-----------+ > > On Tue, Aug 4, 2020 at 9:20 AM Fabian Zimmermann > wrote: > >> Hi, >> >> No i mean the "online data migrations" >> >> https://docs.openstack.org/cinder/rocky/upgrade.html >> >> Fabian >> >> Massimo Sgaravatto schrieb am Di., 4. >> Aug. 2020, 09:14: >> >>> Do you mean "su -s /bin/sh -c "cinder-manage db sync" cinder" ? >>> Yes: this was run >>> >>> Cheers, Massimo >>> >>> On Tue, Aug 4, 2020 at 7:54 AM Fabian Zimmermann >>> wrote: >>> >>>> Hi, >>>> >>>> i never had this issue, but did you run the post upgrade data >>>> migrations? >>>> >>>> Fabian >>>> >>>> Massimo Sgaravatto schrieb am Mo., 3. >>>> Aug. 2020, 20:21: >>>> >>>>> We have just updated a small OpenStack cluster to Train. >>>>> Everything seems working, but "cinder-status upgrade check" complains >>>>> that services and volumes must have a service UUID [*]. >>>>> What does this exactly mean? 
>>>>> >>>>> Thanks, Massimo >>>>> >>>>> [*] >>>>> +--------------------------------------------------------------------+ >>>>> | Check: Service UUIDs | >>>>> | Result: Failure | >>>>> | Details: Services and volumes must have a service UUID. Please fix | >>>>> | this issue by running Queens online data migrations. | >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Tue Aug 4 08:54:39 2020 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 4 Aug 2020 10:54:39 +0200 Subject: [ironic][stable] Include ironic-core in ironic-stable-maint ? In-Reply-To: References: Message-ID: <01691703-aa8a-c24b-bc8e-7671a55d1d34@openstack.org> Julia Kreger wrote: > [...] > In essence, our stable branch approvers are largely down to Dmitry, > Riccardo, and Myself. I think this needs to change and I'd like to > propose that we go ahead and change ironic-stable-maint to just > include ironic-core in order to prevent the bottleneck and conflict > and risk which this presents. > > I strongly believe that our existing cores would all do the right > thing if presented with the question of if a change needed to be > merged. So honestly I'm not concerned by this proposal. Plus, some of > our sub-projects have operated this way for quite some time. > > Thoughts, concerns, worries? Sounds good to me. Stable branch backport approvals follow different rules from development branch changes, which is why historically we used separate groups -- so that all -core do not need to know the stable policy rules. But today -core groups evolve less quickly and can probably be taught the stable policy, so I'm not too concerned either. Maybe it's a good time to remind them of the stable policy doc though, in particular the "appropriate fixes" section: https://docs.openstack.org/project-team-guide/stable-branches.html Cheers, -- Thierry From jesse at odyssey4.me Tue Aug 4 09:23:30 2020 From: jesse at odyssey4.me (Jesse Pretorius) Date: Tue, 4 Aug 2020 09:23:30 +0000 Subject: [tripleo][ansible] the current action plugins use patterns are suboptimal? In-Reply-To: References: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> Message-ID: <2231a9649ccef5f1c712ac509fb22cb199f6b88b.camel@odyssey4.me> On Mon, 2020-08-03 at 13:28 -0600, Alex Schultz wrote: On Mon, Aug 3, 2020 at 6:34 AM Bogdan Dobrelya < bdobreli at redhat.com > wrote: On 8/3/20 12:36 PM, Sagi Shnaidman wrote: Hi, Bogdan thanks for raising this up, although I'm not sure I understand what it is the problem with using action plugins. Action plugins are well known official extensions for Ansible, as any other plugins - callback, strategy, inventory etc [1]. It is not any hack or unsupported workaround, it's a known and official feature of Ansible. Why can't we use it? What makes it different from filter, I believe the cases that require the use of those should be justified. For the given example, that manages containers in a loop via calling a module, what the written custom callback plugin buys for us? That brings code to maintain, extra complexity, like handling possible corner cases in async mode, dry-run mode etc. But what is justification aside of looks handy? I disagree that we shouldn't use action plugins or modules. Tasks themselves are expensive at scale. We saw that when we switched away from paunch to container management in pure ansible tasks. 
This exposed that looping tasks are even more expensive and complex error handling and workflows are better suited for modules or action plugins than a series of tasks. This is not something to be "fixed in ansible". This is the nature of the executor and strategy related interactions. Should everything be converted to modules and plugins? no. Should everything be tasks only? no. It's a balance that must be struck between when a specific set of complex tasks need extra data processing or error handling. Switching to modules or action plugins allows us to unit test our logic. Using tasks do not have such a concept outside of writing complex molecule testing. IMHO it's safer to switch to modules/action plugins than writing task logic. I agree with Alex. Writing complex logic or trying to do error handling in tasks or jinja is not only very slow in execution, but gives us no way to properly test. Using ansible extensions like modules, action plugins, filters, etc gives us something that we can unit test, do better error handling with and therefore provides the abilty to produce a better quality result. While it is true that it does give us more of a downstream burden, our community is well versed in reading python code and testing python code properly. Sometimes it might seem easier to an author to prototype something using ansible/jinja, but if the result is complex then an extension of some kind should be considered as a iteration and it should be unit tested. I'd go as far as to say that if we add module, we should force the requirement for unit testing through some means to ensure good code quality and maintainability. Another benefit to using modules is that the Ansible tasks read more like a sequence of events that need to happen, which is exactly the spirit that Ansible has always advocated. When complex logic is implemented in tasks or in jinja, trying to follow the orchestration sequence becomes a *lot* harder. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Tue Aug 4 09:35:06 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Tue, 4 Aug 2020 11:35:06 +0200 Subject: [tripleo][ansible] the current action plugins use patterns are suboptimal? In-Reply-To: References: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> Message-ID: <385dc8d7-198f-64ce-908f-49ab823ed229@redhat.com> On 8/3/20 9:28 PM, Alex Schultz wrote: > On Mon, Aug 3, 2020 at 6:34 AM Bogdan Dobrelya wrote: >> >> On 8/3/20 12:36 PM, Sagi Shnaidman wrote: >>> Hi, Bogdan >>> >>> thanks for raising this up, although I'm not sure I understand what it >>> is the problem with using action plugins. >>> Action plugins are well known official extensions for Ansible, as any >>> other plugins - callback, strategy, inventory etc [1]. It is not any >>> hack or unsupported workaround, it's a known and official feature of >>> Ansible. Why can't we use it? What makes it different from filter, >> >> I believe the cases that require the use of those should be justified. >> For the given example, that manages containers in a loop via calling a >> module, what the written custom callback plugin buys for us? That brings >> code to maintain, extra complexity, like handling possible corner cases >> in async mode, dry-run mode etc. But what is justification aside of >> looks handy? > > I disagree that we shouldn't use action plugins or modules. Tasks > themselves are expensive at scale. We saw that when we switched away > from paunch to container management in pure ansible tasks. 
This > exposed that looping tasks are even more expensive and complex error > handling and workflows are better suited for modules or action plugins > than a series of tasks. This is not something to be "fixed in > ansible". This is the nature of the executor and strategy related > interactions. Should everything be converted to modules and plugins? > no. Should everything be tasks only? no. It's a balance that must be > struck between when a specific set of complex tasks need extra data > processing or error handling. Switching to modules or action plugins > allows us to unit test our logic. Using tasks do not have such a I can understand that ansible should not be fixed for some composition tasks what require iterations and have complex logic for its "unit of work". And such ones also should be unit tested indeed. What I do not fully understand though is then what abandoning paunch for its action plugin had bought for us in the end? Paunch was self-contained and had no external dependencies on fast-changing ansible frameworks. There was also no need for paunch to handle the ansible-specific execution strategies and nuances, like "what if that action plugin is called in async or in the check-mode?" Unit tests exited in paunch as well. It was easy to backport changes within a single code base. So, looking back retrospectively, was rewriting paunch as an action plugin a simplification of the deployment framework? Please reply to yourself honestly. It does pretty same things but differently and added external framework. It is now also self-contained action plugin, since traditional tasks cannot be used in loops for this goal because of performance reasons. To summarize, action plugins may be a good solution indeed, but perhaps we should go back and use paunch instead of ansible? Same applies for *some* other tasks? That would also provide a balance, for action plugins, tasks and common sense. > concept outside of writing complex molecule testing. IMHO it's safer > to switch to modules/action plugins than writing task logic. > > IMHO the issue that I see with the switch to Action plugins is the > increased load on the ansible "controller" node during execution. > Modules may be better depending on the task being managed. But I > believe with unit testing, action plugins or modules provide a cleaner > and more testable solution than writing roles consisting only of > tasks. > > > >> >>> lookup, inventory or any other plugin we already use? >>> Action plugins are also used wide in Ansible itself, for example >>> templates plugin is implemented with action plugin [2]. If Ansible can >>> use it, why can't we? I don't think there is something with "fixing" >>> Ansible, it's not a bug, this is a useful extension. >>> What regards the mentioned action plugin for podman containers, it >>> allows to spawn containers remotely while skipping the connection part >>> for every cycle. I'm not sure you can "fix" Ansible not to do that, it's >>> not a bug. We may not see the difference in a few hosts in CI, but it >>> might be very efficient when we deploy on 100+ hosts oro even 1000+ >>> hosts. In order to evaluate this on bigger setups to understand its >>> value we configured both options - to use action plugin or usual module. >>> If better performance of action plugin will be proven, we can switch to >>> use it, if it doesn't make a difference on bigger setups - then I think >>> we can easily switch back to using an usual module. 
>>> >>> Thanks >>> >>> [1] https://docs.ansible.com/ansible/latest/plugins/plugins.html >>> [2] >>> https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/action/template.py >>> >>> On Mon, Aug 3, 2020 at 11:19 AM Bogdan Dobrelya >> > wrote: >>> >>> There is a trend of writing action plugins, see [0], for simple things, >>> like just calling a module in a loop. I'm not sure that is the >>> direction >>> TripleO should go. If ansible is inefficient in this sort of tasks >>> without custom python code written, we should fix ansible. Otherwise, >>> what is the ultimate goal of that trend? Is that having only action >>> plugins in roles and playbooks? >>> >>> Please kindly asking the community to stop that, make a step back and >>> reiterate with the taken approach. Thank you. >>> >>> [0] https://review.opendev.org/716108 >>> >>> >>> -- >>> Best regards, >>> Bogdan Dobrelya, >>> Irc #bogdando >>> >>> >>> >>> >>> -- >>> Best regards >>> Sagi Shnaidman >> >> >> -- >> Best regards, >> Bogdan Dobrelya, >> Irc #bogdando >> >> > -- Best regards, Bogdan Dobrelya, Irc #bogdando From sshnaidm at redhat.com Tue Aug 4 09:38:42 2020 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Tue, 4 Aug 2020 12:38:42 +0300 Subject: [tripleo][ansible] the current action plugins use patterns are suboptimal? In-Reply-To: <385dc8d7-198f-64ce-908f-49ab823ed229@redhat.com> References: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> <385dc8d7-198f-64ce-908f-49ab823ed229@redhat.com> Message-ID: Hi, Actually this discussion prompted me to investigate more how to optimize containers setup on a large number of hosts. As I saw from action plugin work, it still copies the module file each loop cycle, which is not very efficient behavior. That's why I started work on a podman container module which can start a bunch of containers in one call and accepts a list of containers as an input. In this case the module file will be transferred to the remote host only once and all containers execution will be done by python on the remote host. That way we'll avoid unnecessary establishing connections, copying files, setting permissions etc what happens every time we cycle over data. It will be done only once. I'll send a patch soon for review. Thanks On Tue, Aug 4, 2020 at 12:35 PM Bogdan Dobrelya wrote: > On 8/3/20 9:28 PM, Alex Schultz wrote: > > On Mon, Aug 3, 2020 at 6:34 AM Bogdan Dobrelya > wrote: > >> > >> On 8/3/20 12:36 PM, Sagi Shnaidman wrote: > >>> Hi, Bogdan > >>> > >>> thanks for raising this up, although I'm not sure I understand what it > >>> is the problem with using action plugins. > >>> Action plugins are well known official extensions for Ansible, as any > >>> other plugins - callback, strategy, inventory etc [1]. It is not any > >>> hack or unsupported workaround, it's a known and official feature of > >>> Ansible. Why can't we use it? What makes it different from filter, > >> > >> I believe the cases that require the use of those should be justified. > >> For the given example, that manages containers in a loop via calling a > >> module, what the written custom callback plugin buys for us? That brings > >> code to maintain, extra complexity, like handling possible corner cases > >> in async mode, dry-run mode etc. But what is justification aside of > >> looks handy? > > > > I disagree that we shouldn't use action plugins or modules. Tasks > > themselves are expensive at scale. We saw that when we switched away > > from paunch to container management in pure ansible tasks. 
This > > exposed that looping tasks are even more expensive and complex error > > handling and workflows are better suited for modules or action plugins > > than a series of tasks. This is not something to be "fixed in > > ansible". This is the nature of the executor and strategy related > > interactions. Should everything be converted to modules and plugins? > > no. Should everything be tasks only? no. It's a balance that must be > > struck between when a specific set of complex tasks need extra data > > processing or error handling. Switching to modules or action plugins > > allows us to unit test our logic. Using tasks do not have such a > > I can understand that ansible should not be fixed for some composition > tasks what require iterations and have complex logic for its "unit of > work". And such ones also should be unit tested indeed. What I do not > fully understand though is then what abandoning paunch for its action > plugin had bought for us in the end? > > Paunch was self-contained and had no external dependencies on > fast-changing ansible frameworks. There was also no need for paunch to > handle the ansible-specific execution strategies and nuances, like "what > if that action plugin is called in async or in the check-mode?" Unit > tests exited in paunch as well. It was easy to backport changes within a > single code base. > > So, looking back retrospectively, was rewriting paunch as an action > plugin a simplification of the deployment framework? Please reply to > yourself honestly. It does pretty same things but differently and added > external framework. It is now also self-contained action plugin, since > traditional tasks cannot be used in loops for this goal because of > performance reasons. > > To summarize, action plugins may be a good solution indeed, but perhaps > we should go back and use paunch instead of ansible? Same applies for > *some* other tasks? That would also provide a balance, for action > plugins, tasks and common sense. > > > concept outside of writing complex molecule testing. IMHO it's safer > > to switch to modules/action plugins than writing task logic. > > > > IMHO the issue that I see with the switch to Action plugins is the > > increased load on the ansible "controller" node during execution. > > Modules may be better depending on the task being managed. But I > > believe with unit testing, action plugins or modules provide a cleaner > > and more testable solution than writing roles consisting only of > > tasks. > > > > > > > >> > >>> lookup, inventory or any other plugin we already use? > >>> Action plugins are also used wide in Ansible itself, for example > >>> templates plugin is implemented with action plugin [2]. If Ansible can > >>> use it, why can't we? I don't think there is something with "fixing" > >>> Ansible, it's not a bug, this is a useful extension. > >>> What regards the mentioned action plugin for podman containers, it > >>> allows to spawn containers remotely while skipping the connection part > >>> for every cycle. I'm not sure you can "fix" Ansible not to do that, > it's > >>> not a bug. We may not see the difference in a few hosts in CI, but it > >>> might be very efficient when we deploy on 100+ hosts oro even 1000+ > >>> hosts. In order to evaluate this on bigger setups to understand its > >>> value we configured both options - to use action plugin or usual > module. 
> >>> If better performance of action plugin will be proven, we can switch to > >>> use it, if it doesn't make a difference on bigger setups - then I think > >>> we can easily switch back to using an usual module. > >>> > >>> Thanks > >>> > >>> [1] https://docs.ansible.com/ansible/latest/plugins/plugins.html > >>> [2] > >>> > https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/action/template.py > >>> > >>> On Mon, Aug 3, 2020 at 11:19 AM Bogdan Dobrelya >>> > wrote: > >>> > >>> There is a trend of writing action plugins, see [0], for simple > things, > >>> like just calling a module in a loop. I'm not sure that is the > >>> direction > >>> TripleO should go. If ansible is inefficient in this sort of tasks > >>> without custom python code written, we should fix ansible. > Otherwise, > >>> what is the ultimate goal of that trend? Is that having only > action > >>> plugins in roles and playbooks? > >>> > >>> Please kindly asking the community to stop that, make a step back > and > >>> reiterate with the taken approach. Thank you. > >>> > >>> [0] https://review.opendev.org/716108 > >>> > >>> > >>> -- > >>> Best regards, > >>> Bogdan Dobrelya, > >>> Irc #bogdando > >>> > >>> > >>> > >>> > >>> -- > >>> Best regards > >>> Sagi Shnaidman > >> > >> > >> -- > >> Best regards, > >> Bogdan Dobrelya, > >> Irc #bogdando > >> > >> > > > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From marino.mrc at gmail.com Tue Aug 4 09:55:52 2020 From: marino.mrc at gmail.com (Marco Marino) Date: Tue, 4 Aug 2020 11:55:52 +0200 Subject: [ironic][tripleo][ussuri] Problem with bare metal provisioning and old RAID controllers Message-ID: Hi, I'm trying to install openstack Ussuri on Centos 8 hardware using tripleo. I'm using a relatively old hardware (dell PowerEdge R620) with old RAID controllers, deprecated in RHEL8/Centos8. Here is some basic information: # lspci | grep -i raid 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05) 02:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS 2008 [Falcon] (rev 03) I'm able to manually install centos 8 using DUD driver from here -> https://elrepo.org/linux/dud/el8/x86_64/dd-megaraid_sas-07.710.50.00-1.el8_2.elrepo.iso (basically I add inst.dd and I use an usb pendrive with iso). Is there a way to do bare metal provisioning using openstack on this kind of server? 
At the moment, when I launch "openstack overcloud node introspect --provide controller1" it doesn't recognize disks (local_gb = 0 in properties) and in inspector logs I see: Jun 22 11:12:42 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:42.261 1543 DEBUG root [-] Still waiting for the root device to appear, attempt 1 of 10 wait_for_disks /usr/lib/python3.6/site-packages/ironic_python_agent/hardware.py:652 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.299 1543 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): udevadm settle execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.357 1543 DEBUG oslo_concurrency.processutils [-] CMD "udevadm settle" returned: 0 in 0.058s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.392 1543 DEBUG ironic_lib.utils [-] Execution completed, command line is "udevadm settle" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:101 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.426 1543 DEBUG ironic_lib.utils [-] Command stdout is: "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.460 1543 DEBUG ironic_lib.utils [-] Command stderr is: "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:104 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.496 1543 WARNING root [-] Path /dev/disk/by-path is inaccessible, /dev/disk/by-path/* version of block device name is unavailable Cause: [Errno 2] No such file or directory: '/dev/disk/by-path': FileNotFoundError: [Errno 2] No such file or directory: '/dev/disk/by-path' Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.549 1543 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.647 1543 DEBUG oslo_concurrency.processutils [-] CMD "lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE" returned: 0 in 0.097s execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.683 1543 DEBUG ironic_lib.utils [-] Execution completed, command line is "lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:101 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.719 1543 DEBUG ironic_lib.utils [-] Command stdout is: "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103 Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: 2018-06-22 11:12:45.755 1543 DEBUG ironic_lib.utils [-] Command stderr is: "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:104 Is there a way to solve the issue? For example, can I modify ramdisk and include DUD driver? I tried this guide: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/partner_integration/overcloud_images#initrd_modifying_the_initial_ramdisks but I don't know how to include an ISO instead of an rpm packet as described in the example. 
Thank you, Marco -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Tue Aug 4 10:30:04 2020 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 4 Aug 2020 12:30:04 +0200 Subject: [ironic][tripleo][ussuri] Problem with bare metal provisioning and old RAID controllers In-Reply-To: References: Message-ID: Hi, On Tue, Aug 4, 2020 at 11:58 AM Marco Marino wrote: > Hi, I'm trying to install openstack Ussuri on Centos 8 hardware using > tripleo. I'm using a relatively old hardware (dell PowerEdge R620) with old > RAID controllers, deprecated in RHEL8/Centos8. Here is some basic > information: > # lspci | grep -i raid > 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset > SATA RAID Controller (rev 05) > 02:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS 2008 [Falcon] > (rev 03) > > I'm able to manually install centos 8 using DUD driver from here -> > https://elrepo.org/linux/dud/el8/x86_64/dd-megaraid_sas-07.710.50.00-1.el8_2.elrepo.iso > (basically I add inst.dd and I use an usb pendrive with iso). > Is there a way to do bare metal provisioning using openstack on this kind > of server? At the moment, when I launch "openstack overcloud node > introspect --provide controller1" it doesn't recognize disks (local_gb = 0 > in properties) and in inspector logs I see: > Jun 22 11:12:42 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:42.261 1543 DEBUG root [-] Still waiting for the root > device to appear, attempt 1 of 10 wait_for_disks > /usr/lib/python3.6/site-packages/ironic_python_agent/hardware.py:652 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.299 1543 DEBUG oslo_concurrency.processutils [-] > Running cmd (subprocess): udevadm settle execute > /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.357 1543 DEBUG oslo_concurrency.processutils [-] CMD > "udevadm settle" returned: 0 in 0.058s execute > /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.392 1543 DEBUG ironic_lib.utils [-] Execution > completed, command line is "udevadm settle" execute > /usr/lib/python3.6/site-packages/ironic_lib/utils.py:101 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.426 1543 DEBUG ironic_lib.utils [-] Command stdout is: > "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.460 1543 DEBUG ironic_lib.utils [-] Command stderr is: > "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:104 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.496 1543 WARNING root [-] Path /dev/disk/by-path is > inaccessible, /dev/disk/by-path/* version of block device name is > unavailable Cause: [Errno 2] No such file or directory: > '/dev/disk/by-path': FileNotFoundError: [Errno 2] No such file or > directory: '/dev/disk/by-path' > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.549 1543 DEBUG oslo_concurrency.processutils [-] > Running cmd (subprocess): lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE execute > /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 
2018-06-22 11:12:45.647 1543 DEBUG oslo_concurrency.processutils [-] CMD > "lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE" returned: 0 in 0.097s execute > /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.683 1543 DEBUG ironic_lib.utils [-] Execution > completed, command line is "lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE" > execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:101 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.719 1543 DEBUG ironic_lib.utils [-] Command stdout is: > "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103 > Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: > 2018-06-22 11:12:45.755 1543 DEBUG ironic_lib.utils [-] Command stderr is: > "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:104 > > Is there a way to solve the issue? For example, can I modify ramdisk and > include DUD driver? I tried this guide: > https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/partner_integration/overcloud_images#initrd_modifying_the_initial_ramdisks > > but I don't know how to include an ISO instead of an rpm packet as > described in the example. > Indeed, I don't think you can use ISO as it is, you'll need to figure out what is inside. If it's an RPM (as I assume), you'll need to extract it and install into the ramdisk. If nothing helps, you can try building a ramdisk with CentOS 7, the (very) recent versions of ironic-python-agent-builder allow using Python 3 on CentOS 7. Dmitry > Thank you, > Marco > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marino.mrc at gmail.com Tue Aug 4 10:57:13 2020 From: marino.mrc at gmail.com (Marco Marino) Date: Tue, 4 Aug 2020 12:57:13 +0200 Subject: [ironic][tripleo][ussuri] Problem with bare metal provisioning and old RAID controllers In-Reply-To: References: Message-ID: Here is what I did: # /usr/lib/dracut/skipcpio /home/stack/images/ironic-python-agent.initramfs | zcat | cpio -ivd | pax -r # mount dd-megaraid_sas-07.710.50.00-1.el8_2.elrepo.iso /mnt/ # rpm2cpio /mnt/rpms/x86_64/kmod-megaraid_sas-07.710.50.00-1.el8_2.elrepo.x86_64.rpm | pax -r # find . 
2>/dev/null | cpio --quiet -c -o | gzip -8 > /home/stack/images/ironic-python-agent.initramfs # chown stack: /home/stack/images/ironic-python-agent.initramfs (undercloud) [stack at undercloud ~]$ openstack overcloud image upload --update-existing --image-path /home/stack/images/ At this point I checked that agent.ramdisk in /var/lib/ironic/httpboot has an update timestamp Then (undercloud) [stack at undercloud ~]$ openstack overcloud node introspect --provide controller2 /usr/lib64/python3.6/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__ return f(*args, **kwds) PLAY [Baremetal Introspection for multiple Ironic Nodes] *********************** 2020-08-04 12:32:26.684368 | ecf4bbd2-e605-20dd-3da9-000000000008 | TASK | Check for required inputs 2020-08-04 12:32:26.739797 | ecf4bbd2-e605-20dd-3da9-000000000008 | SKIPPED | Check for required inputs | localhost | item=node_uuids 2020-08-04 12:32:26.746684 | ecf4bbd2-e605-20dd-3da9-00000000000a | TASK | Set node_uuids_intro fact [WARNING]: Failure using method (v2_playbook_on_task_start) in callback plugin (): maximum recursion depth exceeded while calling a Python object 2020-08-04 12:32:26.828985 | ecf4bbd2-e605-20dd-3da9-00000000000a | OK | Set node_uuids_intro fact | localhost 2020-08-04 12:32:26.834281 | ecf4bbd2-e605-20dd-3da9-00000000000c | TASK | Notice 2020-08-04 12:32:26.911106 | ecf4bbd2-e605-20dd-3da9-00000000000c | SKIPPED | Notice | localhost 2020-08-04 12:32:26.916344 | ecf4bbd2-e605-20dd-3da9-00000000000e | TASK | Set concurrency fact 2020-08-04 12:32:26.994087 | ecf4bbd2-e605-20dd-3da9-00000000000e | OK | Set concurrency fact | localhost 2020-08-04 12:32:27.005932 | ecf4bbd2-e605-20dd-3da9-000000000010 | TASK | Check if validation enabled 2020-08-04 12:32:27.116425 | ecf4bbd2-e605-20dd-3da9-000000000010 | SKIPPED | Check if validation enabled | localhost 2020-08-04 12:32:27.129120 | ecf4bbd2-e605-20dd-3da9-000000000011 | TASK | Run Validations 2020-08-04 12:32:27.239850 | ecf4bbd2-e605-20dd-3da9-000000000011 | SKIPPED | Run Validations | localhost 2020-08-04 12:32:27.251796 | ecf4bbd2-e605-20dd-3da9-000000000012 | TASK | Fail if validations are disabled 2020-08-04 12:32:27.362050 | ecf4bbd2-e605-20dd-3da9-000000000012 | SKIPPED | Fail if validations are disabled | localhost 2020-08-04 12:32:27.373947 | ecf4bbd2-e605-20dd-3da9-000000000014 | TASK | Start baremetal introspection 2020-08-04 12:48:19.944028 | ecf4bbd2-e605-20dd-3da9-000000000014 | CHANGED | Start baremetal introspection | localhost 2020-08-04 12:48:19.966517 | ecf4bbd2-e605-20dd-3da9-000000000015 | TASK | Nodes that passed introspection 2020-08-04 12:48:20.130913 | ecf4bbd2-e605-20dd-3da9-000000000015 | OK | Nodes that passed introspection | localhost | result={ "changed": false, "msg": " 00c5e81b-1e5d-442b-b64f-597a604051f7" } 2020-08-04 12:48:20.142919 | ecf4bbd2-e605-20dd-3da9-000000000016 | TASK | Nodes that failed introspection 2020-08-04 12:48:20.305004 | ecf4bbd2-e605-20dd-3da9-000000000016 | OK | Nodes that failed introspection | localhost | result={ "changed": false, "failed_when_result": false, "msg": " All nodes completed introspection successfully!" 
} 2020-08-04 12:48:20.316860 | ecf4bbd2-e605-20dd-3da9-000000000017 | TASK | Node introspection failed and no results are provided 2020-08-04 12:48:20.427675 | ecf4bbd2-e605-20dd-3da9-000000000017 | SKIPPED | Node introspection failed and no results are provided | localhost PLAY RECAP ********************************************************************* localhost : ok=5 changed=1 unreachable=0 failed=0 skipped=6 rescued=0 ignored=0 [WARNING]: Failure using method (v2_playbook_on_stats) in callback plugin (): _output() missing 1 required positional argument: 'color' Successfully introspected nodes: ['controller2'] Exception occured while running the command Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/ansible_runner/runner_config.py", line 340, in prepare_command cmdline_args = self.loader.load_file('args', string_types, encoding=None) File "/usr/lib/python3.6/site-packages/ansible_runner/loader.py", line 164, in load_file contents = parsed_data = self.get_contents(path) File "/usr/lib/python3.6/site-packages/ansible_runner/loader.py", line 98, in get_contents raise ConfigurationError('specified path does not exist %s' % path) ansible_runner.exceptions.ConfigurationError: specified path does not exist /tmp/tripleop89yr8i8/args During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 34, in run super(Command, self).run(parsed_args) File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 41, in run return super(Command, self).run(parsed_args) File "/usr/lib/python3.6/site-packages/cliff/command.py", line 187, in run return_code = self.take_action(parsed_args) or 0 File "/usr/lib/python3.6/site-packages/tripleoclient/v2/overcloud_node.py", line 210, in take_action node_uuids=parsed_args.node_uuids, File "/usr/lib/python3.6/site-packages/tripleoclient/workflows/baremetal.py", line 134, in provide 'node_uuids': node_uuids File "/usr/lib/python3.6/site-packages/tripleoclient/utils.py", line 659, in run_ansible_playbook runner_config.prepare() File "/usr/lib/python3.6/site-packages/ansible_runner/runner_config.py", line 174, in prepare self.prepare_command() File "/usr/lib/python3.6/site-packages/ansible_runner/runner_config.py", line 346, in prepare_command self.command = self.generate_ansible_command() File "/usr/lib/python3.6/site-packages/ansible_runner/runner_config.py", line 415, in generate_ansible_command v = 'v' * self.verbosity TypeError: can't multiply sequence by non-int of type 'ClientManager' can't multiply sequence by non-int of type 'ClientManager' (undercloud) [stack at undercloud ~]$ and (undercloud) [stack at undercloud ~]$ openstack baremetal node show controller2 .... 
| properties | {'local_gb': '0', 'cpus': '24', 'cpu_arch': 'x86_64', 'memory_mb': '32768', 'capabilities': 'cpu_vt:true,cpu_aes:true,cpu_hugepages:true,cpu_hugepages_1g:true,cpu_txt:true'} It seems that megaraid driver is correctly inserted in ramdisk: # lsinitrd /var/lib/ironic/httpboot/agent.ramdisk | grep megaraid /bin/lsinitrd: line 276: warning: command substitution: ignored null byte in input -rw-r--r-- 1 root root 50 Apr 28 21:55 etc/depmod.d/kmod-megaraid_sas.conf drwxr-xr-x 2 root root 0 Aug 4 12:13 usr/lib/modules/4.18.0-193.6.3.el8_2.x86_64/kernel/drivers/scsi/megaraid -rw-r--r-- 1 root root 68240 Aug 4 12:13 usr/lib/modules/4.18.0-193.6.3.el8_2.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz drwxr-xr-x 2 root root 0 Apr 28 21:55 usr/lib/modules/4.18.0-193.el8.x86_64/extra/megaraid_sas -rw-r--r-- 1 root root 309505 Apr 28 21:55 usr/lib/modules/4.18.0-193.el8.x86_64/extra/megaraid_sas/megaraid_sas.ko drwxr-xr-x 2 root root 0 Apr 28 21:55 usr/share/doc/kmod-megaraid_sas-07.710.50.00 -rw-r--r-- 1 root root 18092 Apr 28 21:55 usr/share/doc/kmod-megaraid_sas-07.710.50.00/GPL-v2.0.txt -rw-r--r-- 1 root root 1152 Apr 28 21:55 usr/share/doc/kmod-megaraid_sas-07.710.50.00/greylist.txt If the solution is to use a Centos7 ramdisk, please can you give me some hint? I have no idea on how to build a new ramdisk from scratch Thank you Il giorno mar 4 ago 2020 alle ore 12:33 Dmitry Tantsur ha scritto: > Hi, > > On Tue, Aug 4, 2020 at 11:58 AM Marco Marino wrote: > >> Hi, I'm trying to install openstack Ussuri on Centos 8 hardware using >> tripleo. I'm using a relatively old hardware (dell PowerEdge R620) with old >> RAID controllers, deprecated in RHEL8/Centos8. Here is some basic >> information: >> # lspci | grep -i raid >> 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset >> SATA RAID Controller (rev 05) >> 02:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS 2008 [Falcon] >> (rev 03) >> >> I'm able to manually install centos 8 using DUD driver from here -> >> https://elrepo.org/linux/dud/el8/x86_64/dd-megaraid_sas-07.710.50.00-1.el8_2.elrepo.iso >> (basically I add inst.dd and I use an usb pendrive with iso). >> Is there a way to do bare metal provisioning using openstack on this kind >> of server? 
At the moment, when I launch "openstack overcloud node >> introspect --provide controller1" it doesn't recognize disks (local_gb = 0 >> in properties) and in inspector logs I see: >> Jun 22 11:12:42 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:42.261 1543 DEBUG root [-] Still waiting for the root >> device to appear, attempt 1 of 10 wait_for_disks >> /usr/lib/python3.6/site-packages/ironic_python_agent/hardware.py:652 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.299 1543 DEBUG oslo_concurrency.processutils [-] >> Running cmd (subprocess): udevadm settle execute >> /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.357 1543 DEBUG oslo_concurrency.processutils [-] CMD >> "udevadm settle" returned: 0 in 0.058s execute >> /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.392 1543 DEBUG ironic_lib.utils [-] Execution >> completed, command line is "udevadm settle" execute >> /usr/lib/python3.6/site-packages/ironic_lib/utils.py:101 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.426 1543 DEBUG ironic_lib.utils [-] Command stdout is: >> "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.460 1543 DEBUG ironic_lib.utils [-] Command stderr is: >> "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:104 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.496 1543 WARNING root [-] Path /dev/disk/by-path is >> inaccessible, /dev/disk/by-path/* version of block device name is >> unavailable Cause: [Errno 2] No such file or directory: >> '/dev/disk/by-path': FileNotFoundError: [Errno 2] No such file or >> directory: '/dev/disk/by-path' >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.549 1543 DEBUG oslo_concurrency.processutils [-] >> Running cmd (subprocess): lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE execute >> /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.647 1543 DEBUG oslo_concurrency.processutils [-] CMD >> "lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE" returned: 0 in 0.097s execute >> /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.683 1543 DEBUG ironic_lib.utils [-] Execution >> completed, command line is "lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE" >> execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:101 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.719 1543 DEBUG ironic_lib.utils [-] Command stdout is: >> "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103 >> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >> 2018-06-22 11:12:45.755 1543 DEBUG ironic_lib.utils [-] Command stderr is: >> "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:104 >> >> Is there a way to solve the issue? For example, can I modify ramdisk and >> include DUD driver? 
I tried this guide: >> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/partner_integration/overcloud_images#initrd_modifying_the_initial_ramdisks >> >> but I don't know how to include an ISO instead of an rpm packet as >> described in the example. >> > > Indeed, I don't think you can use ISO as it is, you'll need to figure out > what is inside. If it's an RPM (as I assume), you'll need to extract it and > install into the ramdisk. > > If nothing helps, you can try building a ramdisk with CentOS 7, the (very) > recent versions of ironic-python-agent-builder allow using Python 3 on > CentOS 7. > > Dmitry > > >> Thank you, >> Marco >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From victoria at vmartinezdelacruz.com Tue Aug 4 11:43:05 2020 From: victoria at vmartinezdelacruz.com (=?UTF-8?Q?Victoria_Mart=C3=ADnez_de_la_Cruz?=) Date: Tue, 4 Aug 2020 08:43:05 -0300 Subject: [openstack][stein][manila-ui] error In-Reply-To: References: Message-ID: Glad to hear it is working ok now! Cheers, V On Tue, Aug 4, 2020 at 2:50 AM Ignazio Cassano wrote: > Hello Victoria and Goutham, thank you for your great help. > Unfortunately I made I mistake in my ansible playbook for installing > manila: it created manila services more times, so some entries in the > catalog did not have an endpoint associated. > I removed the duplicated service entries where catalog was absent and now > it works. > Many thanks > Ignazio > > Il giorno mar 4 ago 2020 alle ore 02:53 Victoria Martínez de la Cruz < > victoria at vmartinezdelacruz.com> ha scritto: > >> In local_settings.py under openstack-dashboard. And then restart the >> webserver. >> >> Did you copy the enable and local files from manila-ui under Horizon's >> namespace? Check out >> https://docs.openstack.org/manila-ui/latest/install/installation.html >> >> We can continue debugging tomorrow, we will find out what is going on. >> >> Cheers, >> >> V >> >> >> On Mon, Aug 3, 2020, 6:46 PM Ignazio Cassano >> wrote: >> >>> Hello Goutham,tomorrow I will check the catalog. >>> Must I enable the debug option in dashboard local_setting or in >>> manila.conf? >>> Thanks >>> Ignazio >>> >>> >>> Il Lun 3 Ago 2020, 23:01 Goutham Pacha Ravi ha >>> scritto: >>> >>>> >>>> >>>> >>>> On Mon, Aug 3, 2020 at 1:31 PM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> I mean I am using dhss false >>>>> >>>>> Il Lun 3 Ago 2020, 21:41 Ignazio Cassano >>>>> ha scritto: >>>>> >>>>>> PS ps >>>>>> Sorry If aI am writing again. >>>>>> The command: >>>>>> manila list let me to show shares I created with command line. >>>>>> The dashboard gives errors I reported in my first email. >>>>>> Looking at manila.py line 280 it checks shares under share networks. >>>>>> Ignazio >>>>>> >>>>>> >>>>>> Il Lun 3 Ago 2020, 21:34 Ignazio Cassano >>>>>> ha scritto: >>>>>> >>>>>>> PS >>>>>>> I followed installation guide under docs.openstack.org. >>>>>>> >>>>>>> >>>>>>> Il Lun 3 Ago 2020, 21:21 Victoria Martínez de la Cruz < >>>>>>> victoria at vmartinezdelacruz.com> ha scritto: >>>>>>> >>>>>>>> Hi Ignazio, >>>>>>>> >>>>>>>> How did you deploy Manila and Manila UI? Can you point me toward >>>>>>>> the docs you used? >>>>>>>> >>>>>>>> Also, which is the specific workflow you are following to reach >>>>>>>> that trace? Just opening the dashboard and clicking on the Shares tab? 
>>>>>>>> >>>>>>>> Cheers, >>>>>>>> >>>>>>>> V >>>>>>>> >>>>>>>> On Mon, Aug 3, 2020 at 4:55 AM Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hello, I installed manila on openstack stein and it works by >>>>>>>>> command line mat the manila ui does not work and in httpd error log I read: >>>>>>>>> >>>>>>>>> [Mon Aug 03 07:45:26.697408 2020] [:error] [pid 3506291] ERROR >>>>>>>>> django.request Internal Server Error: /dashboard/project/shares/ >>>>>>>>> [Mon Aug 03 07:45:26.697437 2020] [:error] [pid 3506291] Traceback >>>>>>>>> (most recent call last): >>>>>>>>> [Mon Aug 03 07:45:26.697442 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/exception.py", line >>>>>>>>> 41, in inner >>>>>>>>> [Mon Aug 03 07:45:26.697446 2020] [:error] [pid 3506291] >>>>>>>>> response = get_response(request) >>>>>>>>> [Mon Aug 03 07:45:26.697450 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, >>>>>>>>> in _get_response >>>>>>>>> [Mon Aug 03 07:45:26.697453 2020] [:error] [pid 3506291] >>>>>>>>> response = self.process_exception_by_middleware(e, request) >>>>>>>>> [Mon Aug 03 07:45:26.697466 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, >>>>>>>>> in _get_response >>>>>>>>> [Mon Aug 03 07:45:26.697471 2020] [:error] [pid 3506291] >>>>>>>>> response = wrapped_callback(request, *callback_args, **callback_kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697475 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 52, in dec >>>>>>>>> [Mon Aug 03 07:45:26.697479 2020] [:error] [pid 3506291] >>>>>>>>> return view_func(request, *args, **kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697482 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>>>>>> [Mon Aug 03 07:45:26.697485 2020] [:error] [pid 3506291] >>>>>>>>> return view_func(request, *args, **kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697489 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 36, in dec >>>>>>>>> [Mon Aug 03 07:45:26.697492 2020] [:error] [pid 3506291] >>>>>>>>> return view_func(request, *args, **kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697496 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 113, in dec >>>>>>>>> [Mon Aug 03 07:45:26.697499 2020] [:error] [pid 3506291] >>>>>>>>> return view_func(request, *args, **kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697502 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/decorators.py", line 84, in dec >>>>>>>>> [Mon Aug 03 07:45:26.697506 2020] [:error] [pid 3506291] >>>>>>>>> return view_func(request, *args, **kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697509 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 68, >>>>>>>>> in view >>>>>>>>> [Mon Aug 03 07:45:26.697513 2020] [:error] [pid 3506291] >>>>>>>>> return self.dispatch(request, *args, **kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697516 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/django/views/generic/base.py", line 88, >>>>>>>>> in dispatch >>>>>>>>> [Mon Aug 03 07:45:26.697520 2020] [:error] [pid 3506291] >>>>>>>>> return handler(request, *args, **kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697523 
2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 223, in get >>>>>>>>> [Mon Aug 03 07:45:26.697526 2020] [:error] [pid 3506291] >>>>>>>>> handled = self.construct_tables() >>>>>>>>> [Mon Aug 03 07:45:26.697530 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 214, in >>>>>>>>> construct_tables >>>>>>>>> [Mon Aug 03 07:45:26.697533 2020] [:error] [pid 3506291] >>>>>>>>> handled = self.handle_table(table) >>>>>>>>> [Mon Aug 03 07:45:26.697537 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 123, in >>>>>>>>> handle_table >>>>>>>>> [Mon Aug 03 07:45:26.697540 2020] [:error] [pid 3506291] data >>>>>>>>> = self._get_data_dict() >>>>>>>>> [Mon Aug 03 07:45:26.697544 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/tables/views.py", line 43, in >>>>>>>>> _get_data_dict >>>>>>>>> [Mon Aug 03 07:45:26.697547 2020] [:error] [pid 3506291] >>>>>>>>> data.extend(func()) >>>>>>>>> [Mon Aug 03 07:45:26.697550 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/horizon/utils/memoized.py", line 109, in >>>>>>>>> wrapped >>>>>>>>> [Mon Aug 03 07:45:26.697554 2020] [:error] [pid 3506291] value >>>>>>>>> = cache[key] = func(*args, **kwargs) >>>>>>>>> [Mon Aug 03 07:45:26.697557 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/manila_ui/dashboards/project/shares/views.py", >>>>>>>>> line 57, in get_shares_data >>>>>>>>> [Mon Aug 03 07:45:26.697561 2020] [:error] [pid 3506291] >>>>>>>>> share_nets = manila.share_network_list(self.request) >>>>>>>>> [Mon Aug 03 07:45:26.697564 2020] [:error] [pid 3506291] File >>>>>>>>> "/usr/lib/python2.7/site-packages/manila_ui/api/manila.py", line 280, in >>>>>>>>> share_network_list >>>>>>>>> [Mon Aug 03 07:45:26.697568 2020] [:error] [pid 3506291] >>>>>>>>> return manilaclient(request).share_networks.list(detailed=detailed, >>>>>>>>> [Mon Aug 03 07:45:26.697571 2020] [:error] [pid 3506291] >>>>>>>>> AttributeError: 'NoneType' object has no attribute 'share_networks' >>>>>>>>> >>>>>>>> >>>> Looking at the error here, and the code - it could be that the UI isn't >>>> able to retrieve the manila service endpoint from the service catalog. If >>>> this is the case, you must be able to see a "DEBUG" level log in your httpd >>>> error log with "no share service configured". Do you see it? >>>> >>>> As the user you're using on horizon, can you perform "openstack catalog >>>> list" and check whether the "sharev2" service type exists in that list? >>>> >>>> >>>>> >>>>>>>>> Please, anyone could help ? >>>>>>>>> Ignazio >>>>>>>>> >>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... 
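P.S. For anyone else who hits this trace: a rough sketch of how to spot the kind of duplicate, endpoint-less manila service entry described above (service and type names assumed from a default install):

openstack catalog list
openstack service list --long | grep -i share
openstack endpoint list --service sharev2
# if a duplicate share/sharev2 service entry has no endpoints behind it, remove it
openstack service delete <duplicate-service-id>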
URL: From yan.y.zhao at intel.com Tue Aug 4 08:37:08 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Tue, 4 Aug 2020 16:37:08 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200730112930.6f4c5762@x1.home> References: <20200716083230.GA25316@joy-OptiPlex-7040> <20200717101258.65555978@x1.home> <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200729131255.68730f68@x1.home> <20200730034104.GB32327@joy-OptiPlex-7040> <20200730112930.6f4c5762@x1.home> Message-ID: <20200804083708.GA30485@joy-OptiPlex-7040> > > yes, include a device_api field is better. > > for mdev, "device_type=vfio-mdev", is it right? > > No, vfio-mdev is not a device API, it's the driver that attaches to the > mdev bus device to expose it through vfio. The device_api exposes the > actual interface of the vfio device, it's also vfio-pci for typical > mdev devices found on x86, but may be vfio-ccw, vfio-ap, etc... See > VFIO_DEVICE_API_PCI_STRING and friends. > ok. got it. > > > > > device_id=8086591d > > > > > > Is device_id interpreted relative to device_type? How does this > > > relate to mdev_type? If we have an mdev_type, doesn't that fully > > > defined the software API? > > > > > it's parent pci id for mdev actually. > > If we need to specify the parent PCI ID then something is fundamentally > wrong with the mdev_type. The mdev_type should define a unique, > software compatible interface, regardless of the parent device IDs. If > a i915-GVTg_V5_2 means different things based on the parent device IDs, > then then different mdev_types should be reported for those parent > devices. > hmm, then do we allow vendor specific fields? or is it a must that a vendor specific field should have corresponding vendor attribute? another thing is that the definition of mdev_type in GVT only corresponds to vGPU computing ability currently, e.g. i915-GVTg_V5_2, is 1/2 of a gen9 IGD, i915-GVTg_V4_2 is 1/2 of a gen8 IGD. It is too coarse-grained to live migration compatibility. Do you think we need to update GVT's definition of mdev_type? And is there any guide in mdev_type definition? > > > > > mdev_type=i915-GVTg_V5_2 > > > > > > And how are non-mdev devices represented? > > > > > non-mdev can opt to not include this field, or as you said below, a > > vendor signature. > > > > > > > aggregator=1 > > > > > pv_mode="none+ppgtt+context" > > > > > > These are meaningless vendor specific matches afaict. > > > > > yes, pv_mode and aggregator are vendor specific fields. > > but they are important to decide whether two devices are compatible. > > pv_mode means whether a vGPU supports guest paravirtualized api. > > "none+ppgtt+context" means guest can not use pv, or use ppgtt mode pv or > > use context mode pv. > > > > > > > interface_version=3 > > > > > > Not much granularity here, I prefer Sean's previous > > > .[.bugfix] scheme. > > > > > yes, .[.bugfix] scheme may be better, but I'm not sure if > > it works for a complicated scenario. > > e.g for pv_mode, > > (1) initially, pv_mode is not supported, so it's pv_mode=none, it's 0.0.0, > > (2) then, pv_mode=ppgtt is supported, pv_mode="none+ppgtt", it's 0.1.0, > > indicating pv_mode=none can migrate to pv_mode="none+ppgtt", but not vice versa. > > (3) later, pv_mode=context is also supported, > > pv_mode="none+ppgtt+context", so it's 0.2.0. > > > > But if later, pv_mode=ppgtt is removed. 
pv_mode="none+context", how to > > name its version? "none+ppgtt" (0.1.0) is not compatible to > > "none+context", but "none+ppgtt+context" (0.2.0) is compatible to > > "none+context". > > If pv_mode=ppgtt is removed, then the compatible versions would be > 0.0.0 or 1.0.0, ie. the major version would be incremented due to > feature removal. > > > Maintain such scheme is painful to vendor driver. > > Migration compatibility is painful, there's no way around that. I > think the version scheme is an attempt to push some of that low level > burden on the vendor driver, otherwise the management tools need to > work on an ever growing matrix of vendor specific features which is > going to become unwieldy and is largely meaningless outside of the > vendor driver. Instead, the vendor driver can make strategic decisions > about where to continue to maintain a support burden and make explicit > decisions to maintain or break compatibility. The version scheme is a > simplification and abstraction of vendor driver features in order to > create a small, logical compatibility matrix. Compromises necessarily > need to be made for that to occur. > ok. got it. > > > > > COMPATIBLE: > > > > > device_type=pci > > > > > device_id=8086591d > > > > > mdev_type=i915-GVTg_V5_{val1:int:1,2,4,8} > > > > this mixed notation will be hard to parse so i would avoid that. > > > > > > Some background, Intel has been proposing aggregation as a solution to > > > how we scale mdev devices when hardware exposes large numbers of > > > assignable objects that can be composed in essentially arbitrary ways. > > > So for instance, if we have a workqueue (wq), we might have an mdev > > > type for 1wq, 2wq, 3wq,... Nwq. It's not really practical to expose a > > > discrete mdev type for each of those, so they want to define a base > > > type which is composable to other types via this aggregation. This is > > > what this substitution and tagging is attempting to accomplish. So > > > imagine this set of values for cases where it's not practical to unroll > > > the values for N discrete types. > > > > > > > > aggregator={val1}/2 > > > > > > So the {val1} above would be substituted here, though an aggregation > > > factor of 1/2 is a head scratcher... > > > > > > > > pv_mode={val2:string:"none+ppgtt","none+context","none+ppgtt+context"} > > > > > > I'm lost on this one though. I think maybe it's indicating that it's > > > compatible with any of these, so do we need to list it? Couldn't this > > > be handled by Sean's version proposal where the minor version > > > represents feature compatibility? > > yes, it's indicating that it's compatible with any of these. > > Sean's version proposal may also work, but it would be painful for > > vendor driver to maintain the versions when multiple similar features > > are involved. > > This is something vendor drivers need to consider when adding and > removing features. > > > > > > interface_version={val3:int:2,3} > > > > > > What does this turn into in a few years, 2,7,12,23,75,96,... > > > > > is a range better? > > I was really trying to point out that sparseness becomes an issue if > the vendor driver is largely disconnected from how their feature > addition and deprecation affects migration support. Thanks, > ok. we'll use the x.y.z scheme then. 
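(As a reference point for where such fields would live: the per-type attributes the current mdev sysfs ABI already exposes can be dumped with something like the following on a host whose parent device registers mdev types; purely illustrative.)

for t in /sys/class/mdev_bus/*/mdev_supported_types/*; do
    printf '%s\n  name=%s device_api=%s available_instances=%s\n' \
        "$t" \
        "$(cat "$t/name" 2>/dev/null)" \
        "$(cat "$t/device_api" 2>/dev/null)" \
        "$(cat "$t/available_instances" 2>/dev/null)"
done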
Thanks Yan From moreira.belmiro.email.lists at gmail.com Tue Aug 4 08:55:05 2020 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Tue, 4 Aug 2020 10:55:05 +0200 Subject: [TC] [PTG] Victoria vPTG Summary of Conversations and Action Items In-Reply-To: References: Message-ID: Hi everyone, the problem described in the "OpenStack User-facing APIs" is something that we face daily in our deployment. Different CLIs for different operations. I'm really interested in driving this action item. Belmiro On Fri, Jun 12, 2020 at 9:38 PM Kendall Nelson wrote: > Hello Everyone! > > I hope you all had a productive and enjoyable PTG! While it’s still > reasonably fresh, I wanted to take a moment to summarize discussions and > actions that came out of TC discussions. > > If there is a particular action item you are interested in taking, please > reply on this thread! > > For the long version, check out the etherpad from the PTG[1]. > > Tuesday > > ====== > > Ussuri Retrospective > > ---------------------------- > > As usual we accomplished a lot. Some of the things we accomplished were > around enumerating operating systems per release (again), removing python2 > support, and adding the ideas repository. Towards the end of the release, > we had a lot of discussions around what to do with leaderless projects, the > role of PTLs, and what to do with projects that were missing PTL candidates > for the next release. We discussed office hours, their history and reason > for existence, and clarified how we can strengthen communication amongst > ourselves, the projects, and the larger community. > > TC Onboarding > > -------------------- > > It was brought up that those elected most recently (and even new members > the election before) felt like there wasn’t enough onboarding into the TC. > Through discussion about what we can do to better support returning members > is to better document the daily, weekly and monthly tasks TC members are > supposed to be doing. Kendall Nelson proposed a patch to start adding more > detail to a guide for TC members already[2]. It was also proposed that we > have a sort of mentorship or shadow program for people interested in > joining the TC or new TC members by more experienced TC members. The > discussion about the shadow/mentorship program is to be continued. > > TC/UC Merge > > ------------------ > > Thierry gave an update on the merge of the committees. The simplified > version is that the current proposal is that UC members are picked from TC > members, the UC operates within the TC, and that we are already setup for > this given the number of TC members that have AUC status. None of this > requires a by-laws change. One next step that has already begun is the > merging of the openstack-users ML into openstack-discuss ML. Other next > steps are to decide when to do the actual transition (disbanding the > separate UC, probably at the next election?) and when to setup AUC’s to be > defined as extra-ATC’s to be included in the electorate for elections. For > more detail, check out the openstack-discuss ML thread[3]. > > Wednesday > > ========= > > Help Wanted List > > ----------------------- > > We settled on a format for the job postings and have several on the list. > We talked about how often we want to look through, update or add to it. The > proposal is to do this yearly. 
We need to continue pushing on the board to > dedicate contributors at their companies to work on these items, and get > them to understand that it's an investment that will take longer than a > year in a lot of cases; interns are great, but not enough. > > TC Position on Foundation Member Community Contributions > > > ---------------------------------------------------------------------------------- > > The discussion started with a state of things today - the expectations of > platinum members, the benefits the members get being on the board and why > they should donate contributor resources for these benefits, etc. A variety > of proposals were made: either enforce or remove the minimum contribution > level, give gold members the chance to have increased visibility (perhaps > giving them some of the platinum member advantages) if they supplement > their monetary contributions with contributor contributions, etc. The > #ACTION that was decided was for Mohammed to take these ideas to the board > and see what they think. > > OpenStack User-facing APIs > > -------------------------------------- > > Users are confused about the state of the user facing API’s; they’ve been > told to use the OpenStackClient(OSC) but upon use, they discover that there > are features missing that exist in the python-*clients. Partial > implementation in the OSC is worse than if the service only used their > specific CLI. Members of the OpenStackSDK joined discussions and explained > that many of the barriers that projects used to have behind implementing > certain commands have been resolved. The proposal is to create a pop up > team and that they start with fully migrating Nova, documenting the process > and collecting any other unresolved blocking issues with the hope that one > day we can set the migration of the remaining projects as a community goal. > Supplementally, a new idea was proposed- enforcing new functionality to > services is only added to the SDK (and optionally the OSC) and not the > project’s specific CLI to stop increasing the disparity between the two. > The #ACTION here is to start the pop up team, if you are interested, please > reply! Additionally, if you disagree with this kind of enforcement, please > contact the TC as soon as possible and explain your concerns. > > PTL Role in OpenStack today & Leaderless Projects > > --------------------------------------------------------------------- > > This was a veeeeeeeerrrry long conversation that went in circles a few > times. The very short version is that we, the TC, are willing to let > project teams decide for themselves if they want to have a more > deconstructed kind of PTL role by breaking it into someone responsible for > releases and someone responsible for security issues. This new format also > comes with setting the expectation that for things like project updates and > signing up for PTG time, if someone on the team doesn’t actively take that > on, the default assumption is that the project won’t participate. The > #ACTION we need someone to take on is to write a resolution about how this > will work and how it can be done. Ideally, this would be done before the > next technical election, so that teams can choose it at that point. If you > are interested in taking on the writing of this resolution, please speak up! > > Cross Project Work > > ------------------------- > > -Pop Up Teams- > > The two teams we have right now are Encryption and Secure Consistent > Policy Groups. Both are making slow progress and will continue. 
> > > > -Reducing Community Goals Per Cycle- > > Historically we have had two goals per cycle, but for smaller teams this > can be a HUGE lift. The #ACTION is to clearly outline the documentation for > the goal proposal and selection process to clarify that selecting only one > goal is fine. No one has claimed this action item yet. > > -Victoria Goal Finalization- > > Currently, we have three proposals and one accepted goal. If we are going > to select a second goal, it needs to be done ASAP as Victoria development > has already begun. All TC members should review the last proposal > requesting selection[4]. > > -Wallaby Cycle Goal Discussion Kick Off- > > Firstly, there is a #ACTION that one or two TC members are needed to guide > the W goal selection. If you are interested, please reply to this thread! > There were a few proposed goals for VIctoria that didn’t make it that could > be the starting point for W discussions, in particular, the rootwrap goal > which would be good for operators. The OpenStackCLI might be another goal > to propose for Wallaby. > > Detecting Unmaintained Projects Early > > --------------------------------------------------- > > The TC liaisons program had been created a few releases ago, but the > initial load on TC members was large. We discussed bringing this program > back and making the project health checks happen twice a release, either > the start or end of the release and once in the middle. TC liaisons will > look at previously proposed releases, release activity of the team, the > state of tempest plugins, if regular meetings are happening, if there are > patches in progress and how busy the project’s IRC channel is to make a > determination. Since more than one liaison will be assigned to each > project, those liaisons can divvy up the work how they see fit. The other > aspect that still needs to be decided is where the health checks will be > recorded- in a wiki? In a meeting and meeting logs? That decision is still > to be continued. The current #ACTION currently unassigned is that we need > to assign liaisons for the Victoria cycle and decide when to do the first > health check. > > Friday > > ===== > > Reducing Systems and Friction to Drive Change > > ---------------------------------------------------------------- > > This was another conversation that went in circles a bit before realizing > that we should make a list of the more specific problems we want to address > and then brainstorm solutions for them. The list we created (including > things already being worked) are as follows: > > - > > TC separate from UC (solution in progress) > - > > Stable releases being approved by a separate team (solution in > progress) > - > > Making repository creation faster (especially for established project > teams) > - > > Create a process blueprint for project team mergers > - > > Requirements Team being one person > - > > Stable Team > - > > Consolidate the agent experience > - > > Figure out how to improve project <--> openstack client/sdk > interaction. > > If you feel compelled to pick one of these things up and start proposing > solutions or add to the list, please do! 
> > Monitoring in OpenStack (Ceilometer + Telemetry + Gnocchi State) > > > ----------------------------------------------------------------------------------------- > > This conversation is also ongoing, but essentially we talked about the > state of things right now- largely they are not well maintained and there > is added complexity with Ceilometers being partially dependent on Gnocchi. > There are a couple of ideas to look into like using oslo.metrics for the > interface between all the tools or using Ceilometer without Gnocchi if we > can clean up those dependencies. No specific action items here, just please > share your thoughts if you have them. > > Ideas Repo Next Steps > > ------------------------------- > > Out of the Ussuri retrospective, it was brought up that we probably needed > to talk a little more about what we wanted for this repo. Essentially we > just want it to be a place to collect ideas into without worrying about the > how. It should be a place to document ideas we have had (old and new) and > keep all the discussion in one place as opposed to historic email threads, > meetings logs, other IRC logs, etc. We decided it would be good to > periodically go through this repo, likely as a forum session at a summit to > see if there is any updating that could happen or promotion of ideas to > community goals, etc. > > ‘tc:approved-release’ Tag > > --------------------------------- > > This topic was proposed by the Manila team from a discussion they had > earlier in the week. We talked about the history of the tag and how usage > of tags has evolved. At this point, the proposal is to remove the tag as > anything in the releases repo is essentially tc-approved. Ghanshyam has > volunteered to document this and do the removal. The board also needs to be > notified of this and to look at projects.yaml in the governance repo as the > source of truth for TC approved projects. The unassigned #ACTION item is to > review remaining tags and see if there are others that need to be > modified/removed/added to drive common behavior across OpenSack > components. > > Board Proposals > > ---------------------- > > This was a pretty quick summary of all discussions we had that had any > impact on the board and largely decided who would mention them. > > > > Session Feedback > > ------------------------ > > This was also a pretty quick topic compared to many of the others, we > talked about how things went across all our discussions (largely we called > the PTG a success) logistically. We tried to make good use of the raising > hands feature which mostly worked, but it lacks context and its possible > that the conversation has moved on by the time it’s your turn (if you even > remember what you want to say). > > OpenStack 2.0: k8s Native > > ----------------------------------- > > This topic was brought up at the end of our time so we didn’t have time to > discuss it really. Basically Mohammed wanted to start the conversation > about adding k8s as a base service[5] and what we would do if a project > proposed required k8s. Adding services that work with k8s could open a door > to new innovation in OpenStack. Obviously this topic will need to be > discussed further as we barely got started before we had to wrap things up. > > > So. 
> > > The tldr; > > > Here are the #ACTION items we need owners for: > > - > > Start the User Facing API Pop Up Team > - > > Write a resolution about how the deconstructed PTL roles will work > - > > Update Goal Selection docs to explain that one or more goals is fine; > it doesn’t have to be more than one > - > > Two volunteers to start the W goal selection process > - > > Assign two TC liaisons per project > - > > Review Tags to make sure they are still good for driving common > behavior across all openstack projects > > > Here are the things EVERYONE needs to do: > > - > > Review last goal proposal so that we can decide to accept or reject it > for the V release[4] > - > > Add systems that are barriers to progress in openstack to the Reducing > Systems and Friction list > - > > Continue conversations you find important > > > > Thanks everyone for your hard work and great conversations :) > > Enjoy the attached (photoshopped) team photo :) > > -Kendall (diablo_rojo) > > > > [1] TC PTG Etherpad: https://etherpad.opendev.org/p/tc-victoria-ptg > > [2] TC Guide Patch: https://review.opendev.org/#/c/732983/ > > [3] UC TC Merge Thread: > http://lists.openstack.org/pipermail/openstack-discuss/2020-May/014736.html > > > [4] Proposed V Goal: https://review.opendev.org/#/c/731213/ > > [5] Base Service Description: > https://governance.openstack.org/tc/reference/base-services.html > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Tue Aug 4 12:22:19 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 4 Aug 2020 07:22:19 -0500 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." In-Reply-To: References: Message-ID: On 8/4/20 3:17 AM, Fabian Zimmermann wrote: > Hmm, the err msg tells to run the queens version of the tool. > > Maybe something went wrong, but the db version got incremented? Just > guessing. > > Did you try to find the commit/change that introduced the msg? [snip] > Massimo Sgaravatto > schrieb am Mo., > 3. Aug. 2020, 20:21: > > We have just updated a small OpenStack cluster to > Train. > Everything seems working, but "cinder-status > upgrade check" complains that services and volumes > must have a service UUID [*]. > What does this exactly mean? > > Thanks, Massimo > > [*] > +--------------------------------------------------------------------+ > | Check: Service UUIDs                         | > | Result: Failure                          | > | Details: Services and volumes must have a > service UUID. Please fix | > |   this issue by running Queens online data > migrations.             | > Hmm, this does look concerning. If you are now on Train but a migration is missing from Queens, that would seem to indicate some migrations were missed along the way. Were migrations run in each release prior to getting to Train? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From geguileo at redhat.com Tue Aug 4 12:26:06 2020 From: geguileo at redhat.com (Gorka Eguileor) Date: Tue, 4 Aug 2020 14:26:06 +0200 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> Message-ID: <20200804122606.dctnfvxqytfv22ws@localhost> On 03/08, Sean McGinnis wrote: > On 8/3/20 7:55 AM, Lee Yarwood wrote: > > Hello all, > > > > $subject, I've raised the following bug: > > > > openstack-tox-lower-constraints failing due to unmet dependency on decorator==4.0.0 > > https://launchpad.net/bugs/1890123 > > > > I'm trying to resolve this below but I honestly feel like I'm going > > around in circles: > > > > https://review.opendev.org/#/q/topic:bug/1890123 > > > > If anyone has any tooling and/or recommendations for resolving issues > > like this I'd appreciate it! > > > > Cheers, > > This appears to be broken for everyone. I initially saw the decorator > thing with Cinder, but after looking closer realized it's not that package. > > The root issue (or at least one level closer to the root issue, that > seems to be causing the decorator failure) is that the lower-constraints > are not actually being enforced. Even though the logs should it is > passing "-c [path to lower-constraints.txt]". So even though things > should be constrained to a lower version, presumably a version that > works with a different version of decorator, pip is still installing a > newer package than what the constraints should allow. > > There was a pip release on the 28th. Things don't look like they started > failing until the 31st for us though, so either that is not it, or there > was just a delay before our nodes started picking up the newer version. > > I tested locally, and at least with version 19.3.1, I am getting the > correctly constrained packages installed. > > Still looking, but thought I would share in case that info triggers any > ideas for anyone else. > > Sean > > Hi, Looking at one of my patches I see that the right version of dogpile.cache==0.6.5 is being installed [1], but then at another step we download [2] and install [3] version 1.0.1, and we can see that pip is actually complaining that we have incompatibilities [4]. As far as I can see this is because in that pip install we requested to wipe existing installed packages [6] and we are not passing any constraints in that call. I don't know why or where we are doing that though. Cheers, Gorka. 
[1]: https://zuul.opendev.org/t/openstack/build/49f226f8efb94c088cb2b22c46565d97/log/tox/lower-constraints-1.log#235-236 [2]: https://zuul.opendev.org/t/openstack/build/49f226f8efb94c088cb2b22c46565d97/log/tox/lower-constraints-2.log#148-149 [3]: https://zuul.opendev.org/t/openstack/build/49f226f8efb94c088cb2b22c46565d97/log/tox/lower-constraints-2.log#168-174 [4]: https://zuul.opendev.org/t/openstack/build/49f226f8efb94c088cb2b22c46565d97/log/tox/lower-constraints-2.log#202-203 [5]: https://zuul.opendev.org/t/openstack/build/49f226f8efb94c088cb2b22c46565d97/log/tox/lower-constraints-2.log#3 From fungi at yuggoth.org Tue Aug 4 12:39:22 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 4 Aug 2020 12:39:22 +0000 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: <20200804122606.dctnfvxqytfv22ws@localhost> References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> <20200804122606.dctnfvxqytfv22ws@localhost> Message-ID: <20200804123922.tcxglphtn6x2yona@yuggoth.org> On 2020-08-04 14:26:06 +0200 (+0200), Gorka Eguileor wrote: [...] > Looking at one of my patches I see that the right version of > dogpile.cache==0.6.5 is being installed [1], but then at another step we > download [2] and install [3] version 1.0.1, and we can see that pip is > actually complaining that we have incompatibilities [4]. > > As far as I can see this is because in that pip install we requested to > wipe existing installed packages [6] and we are not passing any > constraints in that call. > > I don't know why or where we are doing that though. [...] Yes, I started digging into this yesterday too. It's affecting all tox jobs, not just lower-constraints jobs (upper-constraints is close enough to unconstrained that this isn't immediately apparent for master branch jobs, but the divergence becomes obvious in stable branch jobs and it's breaking lots of them). It seems this started roughly a week ago. I don't think we're explicitly doing it, this seems to be a behavior baked into tox itself. Most projects are currently applying constraints via the deps parameter in their tox.ini, and tox appears to invoke pip twice: once to install your deps, and then a second time to install the project being tested. The latter phase does not use the deps parameter, and so no constraints get applied. We might be able to work around this by going back to overriding install_command and putting the -c option there instead, but I haven't had an opportunity to test that theory yet. If anyone else has time to pursue this line of investigation, I'd be curious to hear whether it helps. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From sean.mcginnis at gmx.com Tue Aug 4 13:01:16 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 4 Aug 2020 08:01:16 -0500 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." In-Reply-To: References: Message-ID: <199d0036-99c2-0be0-c464-39dcda8368ca@gmx.com> [adding back the ML] On 8/4/20 7:48 AM, Massimo Sgaravatto wrote: > I am afraid I never ran the online data migration > > This cluster ran Ocata > Then we updated to Rocky. We went though Pike and Queens but just to > run the db-syncs > Then we updated from Rocky to train (again, we went though Stein  but > just to run the db-syncs) > > Am I in troubles now ? 
> > Thanks, Massimo I know some folks were handling parts of this by running each version in a container. That may be an option to quickly go through the DB migrations. Let's see if anyone responds with any tips to make this easy. From massimo.sgaravatto at gmail.com Tue Aug 4 13:08:51 2020 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 4 Aug 2020 15:08:51 +0200 Subject: [ops] [cinder[ "Services and volumes must have a service UUID." In-Reply-To: <199d0036-99c2-0be0-c464-39dcda8368ca@gmx.com> References: <199d0036-99c2-0be0-c464-39dcda8368ca@gmx.com> Message-ID: Shouldn't the db sync fail if a needed online data migrations was not done ? PS: Updating is becoming a nightmare: some services now require online data migration, while for others only the db syncs should be done. On Tue, Aug 4, 2020 at 3:01 PM Sean McGinnis wrote: > [adding back the ML] > > On 8/4/20 7:48 AM, Massimo Sgaravatto wrote: > > I am afraid I never ran the online data migration > > > > This cluster ran Ocata > > Then we updated to Rocky. We went though Pike and Queens but just to > > run the db-syncs > > Then we updated from Rocky to train (again, we went though Stein but > > just to run the db-syncs) > > > > Am I in troubles now ? > > > > Thanks, Massimo > > I know some folks were handling parts of this by running each version in > a container. That may be an option to quickly go through the DB migrations. > > Let's see if anyone responds with any tips to make this easy. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Aug 4 13:11:03 2020 From: smooney at redhat.com (Sean Mooney) Date: Tue, 04 Aug 2020 14:11:03 +0100 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: <20200804123922.tcxglphtn6x2yona@yuggoth.org> References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> <20200804122606.dctnfvxqytfv22ws@localhost> <20200804123922.tcxglphtn6x2yona@yuggoth.org> Message-ID: <38f40d528b2689996b9114157e2c578c71e37942.camel@redhat.com> On Tue, 2020-08-04 at 12:39 +0000, Jeremy Stanley wrote: > On 2020-08-04 14:26:06 +0200 (+0200), Gorka Eguileor wrote: > [...] > > Looking at one of my patches I see that the right version of > > dogpile.cache==0.6.5 is being installed [1], but then at another step we > > download [2] and install [3] version 1.0.1, and we can see that pip is > > actually complaining that we have incompatibilities [4]. > > > > As far as I can see this is because in that pip install we requested to > > wipe existing installed packages [6] and we are not passing any > > constraints in that call. > > > > I don't know why or where we are doing that though. > > [...] > > Yes, I started digging into this yesterday too. It's affecting all > tox jobs, not just lower-constraints jobs (upper-constraints is > close enough to unconstrained that this isn't immediately apparent > for master branch jobs, but the divergence becomes obvious in stable > branch jobs and it's breaking lots of them). It seems this started > roughly a week ago. > > I don't think we're explicitly doing it, this seems to be a behavior > baked into tox itself. Most projects are currently applying > constraints via the deps parameter in their tox.ini, and tox appears > to invoke pip twice: once to install your deps, and then a second > time to install the project being tested. The latter phase does not > use the deps parameter, and so no constraints get applied. 
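PS2: If I understand Sean's container suggestion correctly, would the fix be to run, for each release we skipped, something like this from a node or container carrying that release's cinder code and our cinder.conf (untested on my side):

# repeat with Queens, Rocky and Stein code until it reports nothing left to migrate
cinder-manage db online_data_migrations

and then re-run "cinder-status upgrade check" on Train to confirm the warning is gone?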
> > We might be able to work around this by going back to overriding > install_command and putting the -c option there instead, right so stephen asked me to remove that override in one of my recent patches to os-vif that is under view since he made the comment the command we were using was more or less the same as the default we currently set teh -c in deps. so if i understand the workaound correclty we woudl add -c {env:CONSTRAINTS_OPT} to install_command so "install_command = pip install -U {opts} {packages} -c {env:CONSTRAINTS_OPT}" in our case and then for the lower contriats jobs in stead of deps = -c{toxinidir}/lower-constraints.txt -r{toxinidir}/requirements.txt -r{toxinidir}/test-requirements.txt -r{toxinidir}/doc/requirements.txt we would do setenv = CONSTRAINTS_OPT=-c{toxinidir}/lower-constraints.txt deps = -r{toxinidir}/requirements.txt -r{toxinidir}/test-requirements.txt -r{toxinidir}/doc/requirements.txt that way we can keep the same install command for both but use the correct constrint file. > but I > haven't had an opportunity to test that theory yet. If anyone else > has time to pursue this line of investigation, I'd be curious to > hear whether it helps. From fungi at yuggoth.org Tue Aug 4 13:16:43 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 4 Aug 2020 13:16:43 +0000 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: <38f40d528b2689996b9114157e2c578c71e37942.camel@redhat.com> References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> <20200804122606.dctnfvxqytfv22ws@localhost> <20200804123922.tcxglphtn6x2yona@yuggoth.org> <38f40d528b2689996b9114157e2c578c71e37942.camel@redhat.com> Message-ID: <20200804131643.dsdpoea4proojeky@yuggoth.org> On 2020-08-04 14:11:03 +0100 (+0100), Sean Mooney wrote: [...] > so if i understand the workaound correclty we woudl add -c > {env:CONSTRAINTS_OPT} to install_command so "install_command = pip > install -U {opts} {packages} -c {env:CONSTRAINTS_OPT}" in our case > and then for the lower contriats jobs in stead of > > deps = > -c{toxinidir}/lower-constraints.txt > -r{toxinidir}/requirements.txt > -r{toxinidir}/test-requirements.txt > -r{toxinidir}/doc/requirements.txt > > we would do > > setenv = > CONSTRAINTS_OPT=-c{toxinidir}/lower-constraints.txt > deps = > -r{toxinidir}/requirements.txt > -r{toxinidir}/test-requirements.txt > -r{toxinidir}/doc/requirements.txt > > that way we can keep the same install command for both but use the > correct constrint file. [...] Yep, Sean McGinnis is trying a variant of that in https://review.opendev.org/744698 now to see if it alters tox's behavior like we expect. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rafaelweingartner at gmail.com Tue Aug 4 13:20:06 2020 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Tue, 4 Aug 2020 10:20:06 -0300 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: Message-ID: I am not sure how the projects/communities here in OpenStack are maintained and conducted, but I could for sure help. I am a committer and PMC for some Apache projects; therefore, I am a bit familiar with some processes in OpenSource communities. On Tue, Aug 4, 2020 at 5:11 AM Mark Goddard wrote: > On Thu, 30 Jul 2020 at 14:43, Rafael Weingärtner > wrote: > > > > We are working on it. 
So far we have 3 open proposals there, but we do > not have enough karma to move things along. > > Besides these 3 open proposals, we do have more ongoing extensions that > have not yet been proposed to the community. > > It's good to hear you want to help improve cloudkitty, however it > sounds like what is required is help with maintaining the project. Is > that something you could be involved with? > Mark > > > > > On Thu, Jul 30, 2020 at 10:22 AM Sean McGinnis > wrote: > >> > >> Posting here to raise awareness, and start discussion about next steps. > >> > >> It appears there is no one working on Cloudkitty anymore. No patches > >> have been merged for several months now, including simple bot proposed > >> patches. It would appear no one is maintaining this project anymore. > >> > >> I know there is a need out there for this type of functionality, so > >> maybe this will raise awareness and get some attention to it. But > >> barring that, I am wondering if we should start the process to retire > >> this project. > >> > >> From a Victoria release perspective, it is milestone-2 week, so we > >> should make a decision if any of the Cloudkitty deliverables should be > >> included in this release or not. We can certainly force releases of > >> whatever is the latest, but I think that is a bit risky since these > >> repos have never merged the job template change for victoria and > >> therefore are not even testing with Python 3.8. That is an official > >> runtime for Victoria, so we run the risk of having issues with the code > >> if someone runs under 3.8 but we have not tested to make sure there are > >> no problems doing so. > >> > >> I am hoping this at least starts the discussion. I will not propose any > >> release patches to remove anything until we have had a chance to discuss > >> the situation. > >> > >> Sean > >> > >> > > > > > > -- > > Rafael Weingärtner > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From CAPSEY at augusta.edu Tue Aug 4 13:39:35 2020 From: CAPSEY at augusta.edu (Apsey, Christopher) Date: Tue, 4 Aug 2020 13:39:35 +0000 Subject: [nova] Hyper-V hosts In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04814461@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04814461@gmsxchsvr01.thecreation.com> Message-ID: You currently need a hyper-v host that is running at least Windows Insider Build 19640 in order to use Epyc with nested virtualization [1]. See if the beta compute driver works[2]. The hyper-v driver for ussuri has release notes[3], so it should be OK, although I haven't personally tried it. Chris Apsey [1] https://github.com/MicrosoftDocs/Virtualization-Documentation/issues/1276 [2] https://www.cloudbase.it/downloads/HyperVNovaCompute_Beta.msi [3] https://docs.openstack.org/releasenotes/compute-hyperv/ussuri.html -----Original Message----- From: Eric K. Miller Sent: Tuesday, August 4, 2020 1:03 AM To: openstack-discuss at lists.openstack.org Subject: [EXTERNAL] [nova] Hyper-V hosts CAUTION: EXTERNAL SENDER This email originated from an external source. Please exercise caution before opening attachments, clicking links, replying, or providing information to the sender. 
If you believe it to be fraudulent, contact the AU Cybersecurity Hotline at 72-CYBER (2-9237 / 706-722-9237) or 72CYBER at augusta.edu Hi, I thought I'd look into support of Hyper-V hosts for Windows Server environments, but it looks like the latest cloudbase Windows Hyper-V OpenStack Installer is for Train, and nothing seems to discuss the use of Hyper-V in Windows Server 2019. Has it been abandoned? Is anyone using Hyper-V with OpenStack successfully? One of the reasons we thought we might support it is to provide nested support for VMs with GPUs and/or vGPUs, and thought this would work better than with KVM, specifically with AMD EPYC systems. It seems that when "options kvm-amd nested=1" is used in a modprobe.d config file, Windows machines lock up when started. I think this has been an issue for a while with AMD processors, but thought it was fixed recently (I don't remember where I saw this, though). Would love to hear about any experiences related to Hyper-V and/or nested hypervisor support on AMD EPYC processors. Thanks! Eric From rosmaita.fossdev at gmail.com Tue Aug 4 13:48:37 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 4 Aug 2020 09:48:37 -0400 Subject: [cinder] victoria virtual mid-cycle next week Message-ID: <64a4d8e5-d271-8b96-eed2-167f8daf900a@gmail.com> The date/time selection poll has closed, and I am happy to announce the unanimous choice. Session Two of the Cinder Victoria virtual mid-cycle will be held: DATE: 12 August 2020 TIME: 1400-1600 UTC LOCATION: https://bluejeans.com/3228528973 The meeting will be recorded. Please add topics to the etherpad: https://etherpad.opendev.org/p/cinder-victoria-mid-cycles cheers, brian From amotoki at gmail.com Tue Aug 4 14:01:44 2020 From: amotoki at gmail.com (Akihiro Motoki) Date: Tue, 4 Aug 2020 23:01:44 +0900 Subject: [neutron] bug deputy report (Jul 27 - Aug 2) Message-ID: Hi, Sorry for late just before the meeting. This is my bug deputy report last week. General questions ================= * l3-dvr-backlog tag was originally introduced to identify DVR feature gaps. What should we use for OVN L3 feature gaps? * We have no volunteer for FWaaS now. How should we triage fwaas bugs? Needs attentions ================ Both affects neutron behaviors and they are not simple bugs. More opinions would be appreciated. https://bugs.launchpad.net/neutron/+bug/1889631 [OVS][FW] Multicast non-IGMP traffic is allowed by default, not in iptables FW New, Undecided It might be worth RFE. It affects existing deployments. We need more detail discussion on this. https://bugs.launchpad.net/neutron/+bug/1889454 br-int has an unpredictable MTU New, Undecided This is an interesting topic. More opinions would be appreciated. 
Confirmed ========= [FT][Fullstack] Timeout during OVS bridge creation transaction https://bugs.launchpad.net/neutron/+bug/1889453 Critical, Confirmed In Progress =========== Functional tests neutron.tests.functional.agent.linux.test_linuxbridge_arp_protect failing on Ubuntu 20.04 https://bugs.launchpad.net/neutron/+bug/1889779 High, In Progress CI failture: ImportError: cannot import decorate https://bugs.launchpad.net/neutron/+bug/1890064 High, In Progress, The fix is https://review.opendev.org/#/c/744465/ but blocked by other gate failures Validate subnet when plugging to the router don't works when plugging port https://bugs.launchpad.net/neutron/+bug/1889619 Low, In Progress Won't Fix ========= Functional tests on Ubuntu 20.04 are timed out https://bugs.launchpad.net/neutron/+bug/1889781 High, Won't Fix, it doesn't look like an issue after fixing other issues per slawek ONV feature gaps or cleanups ============================ https://bugs.launchpad.net/neutron/+bug/1889737 [OVN] Stop using neutron.api.rpc.handlers.resources_rpc with OVN as a backend Medium, Confirmed, a kind of cleanup around OVN https://bugs.launchpad.net/neutron/+bug/1889738 [OVN] Stop doing PgDelPortCommand on each router port update Low, Confirmed [OVN] access between Floatings ip and instance with Direct External IP https://bugs.launchpad.net/neutron/+bug/1889388 New, Undecided, OVN feature gap Q: l3-dvr-backlog tag was originally introduced to identify DVR feature gaps. What should we use for OVN L3 feature gaps? FWaaS ====== neutron_tempest_plugin.fwaas.api.test_fwaasv2_extensions failed https://bugs.launchpad.net/neutron/+bug/1889730 New, Undecided, we have no volunteer for FWaaS now. How should we triage fwaas bugs? From mnaser at vexxhost.com Tue Aug 4 14:47:51 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 4 Aug 2020 10:47:51 -0400 Subject: [tc] weekly update Message-ID: Hi everyone, Here’s an update for what happened in the OpenStack TC this week. You can get more information by checking for changes in openstack/governance repository. We've also included a few references to some important mailing list threads that you should check out. 
# Patches ## Open Reviews - Declare supported runtimes for Wallaby release https://review.opendev.org/743847 - [draft] Add assert:supports-standalone https://review.opendev.org/722399 - Create starter-kit:kubernetes-in-virt tag https://review.opendev.org/736369 - migrate testing to ubuntu focal https://review.opendev.org/740851 ## Project Updates - Add Keystone Kerberos charm to OpenStack charms https://review.opendev.org/743769 - Deprecate os_congress project https://review.opendev.org/742533 - Add Ceph iSCSI charm to OpenStack charms https://review.opendev.org/744480 ## General Changes - Cleanup the remaining osf repos and their data https://review.opendev.org/739291 - [manila] assert:supports-accessible-upgrade https://review.opendev.org/740509 - V goals, Zuul v3 migration: update links and grenade https://review.opendev.org/741987 ## Abandoned Changes - DNM: testing gate on ubuntu focal https://review.opendev.org/743249 # Email Threads - Legacy Zuul Jobs Update 1: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016058.html - Community PyCharm Licenses: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016039.html - Release Countdown R-10: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016243.html - CloudKitty Status: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016171.html - Migrate CI Jobs to Ubuntu Focal Update: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016222.html - TC Monthly Meeting Reminder: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016196.html # Other Reminders - Aug 4: CFP for Open Infra Summit Closes Thanks for reading! Mohammed & Kendall -- Mohammed Naser VEXXHOST, Inc. From mkopec at redhat.com Tue Aug 4 15:10:35 2020 From: mkopec at redhat.com (Martin Kopec) Date: Tue, 4 Aug 2020 17:10:35 +0200 Subject: [openstack][tempest] Deprecation of scenario.img_dir option Message-ID: Hello all, we deprecated *scenario.img_dir* option in Tempest by this patch [1]. *scenario.img_file* should contain the full path of an image from now on. However, to make the transition easier for all the Tempest dependent projects, Tempest currently accepts both behaviors - the old where the path to an image consisted of *scenario.img_dir* + *scenario.img_file* and the new one where *scenario.img_file* contains the full path to an image. It will be accepting both ways for one whole release - 25. I proposed patches to projects I found they use scenario.img_dir option, see this link [2]. If you maintain a project from the list [2], please review. If your project somehow uses scenario.img_dir or img_file option and is not in the list [2], please make appropriate changes. [1] https://review.opendev.org/#/c/710996 [2] https://review.opendev.org/#/q/topic:remove_img_dir+(status:open+OR+status:merged) Regards, -- Martin Kopec Quality Engineer Red Hat EMEA -------------- next part -------------- An HTML attachment was scrubbed... URL: From monika.samal at outlook.com Tue Aug 4 15:28:35 2020 From: monika.samal at outlook.com (Monika Samal) Date: Tue, 4 Aug 2020 15:28:35 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , Message-ID: Hello Guys, With Michaels help I was able to solve the problem but now there is another error I was able to create my network on vlan but still error persist. 
PFB the logs: http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ Kindly help regards, Monika ________________________________ From: Michael Johnson Sent: Monday, August 3, 2020 9:10 PM To: Fabian Zimmermann Cc: Monika Samal ; openstack-discuss Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Yeah, it looks like nova is failing to boot the instance. Check this setting in your octavia.conf files: https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id Also, if kolla-ansible didn't set both of these values correctly, please open bug reports for kolla-ansible. These all should have been configured by the deployment tool. Michael On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: Seems like the flavor is missing or empty '' - check for typos and enable debug. Check if the nova req contains valid information/flavor. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 15:46: It's registered Get Outlook for Android ________________________________ From: Fabian Zimmermann > Sent: Monday, August 3, 2020 7:08:21 PM To: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Did you check the (nova) flavor you use in octavia. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 10:53: After Michael suggestion I was able to create load balancer but there is error in status. [X] PFB the error link: http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ ________________________________ From: Monika Samal > Sent: Monday, August 3, 2020 2:08 PM To: Michael Johnson > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson > Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal > schrieb am Do., 30. 
Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? > > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralonsoh at redhat.com Tue Aug 4 17:05:49 2020 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Tue, 4 Aug 2020 18:05:49 +0100 Subject: [neutron][OVS firewall] Multicast non-IGMP traffic is allowed by default, not in iptables FW (LP#1889631) Message-ID: Hello all: First of all, the link: https://bugs.launchpad.net/neutron/+bug/1889631 To sum up the bug: in iptables FW, the non-IGMP multicast traffic from 224.0.0.x was blocked; this is not happening in OVS FW. That was discussed today in the Neutron meeting today [1]. We face two possible situations here: - If we block this traffic now, some deployments using the OVS FW will experience an unexpected network blockage. - Deployments migrating from iptables to OVS FW, now won't be able to explicitly allow this traffic (or block it by default). This also breaks the current API, because some rules won't have any effect (those ones allowing this traffic). A possible solution is to add a new knob in the FW configuration; this config option will allow to block or not this traffic by default. Remember that the FW can only create permissive rules, not blocking ones. Any feedback is welcome! Regards. [1] http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-08-04-14.00.log.html#l-136 -------------- next part -------------- An HTML attachment was scrubbed... URL: From elfosardo at gmail.com Tue Aug 4 17:09:33 2020 From: elfosardo at gmail.com (Riccardo Pittau) Date: Tue, 4 Aug 2020 19:09:33 +0200 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: <20200804131643.dsdpoea4proojeky@yuggoth.org> References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> <20200804122606.dctnfvxqytfv22ws@localhost> <20200804123922.tcxglphtn6x2yona@yuggoth.org> <38f40d528b2689996b9114157e2c578c71e37942.camel@redhat.com> <20200804131643.dsdpoea4proojeky@yuggoth.org> Message-ID: Hi all! After a very interesting and enlightening discussion with Sean and Clark on IRC (thanks!), we were able to test and verify that the issue is related to the latest released version of virtualenv, v2.0.29, that embeds pip 2.20, apparently the real offender here. I submitted a bug to virtualenv [1] for that, the fix is included in pip 2.20.1. The bump in virtualenv is already up [2] and merged and a new version has been released, v2.0.30 [3], that should solve this issue. [1] https://github.com/pypa/virtualenv/issues/1914 [2] https://github.com/pypa/virtualenv/pull/1915 [3] https://pypi.org/project/virtualenv/20.0.30/ A si biri, Riccardo On Tue, Aug 4, 2020 at 3:26 PM Jeremy Stanley wrote: > On 2020-08-04 14:11:03 +0100 (+0100), Sean Mooney wrote: > [...] 
> > so if i understand the workaound correclty we woudl add -c > > {env:CONSTRAINTS_OPT} to install_command so "install_command = pip > > install -U {opts} {packages} -c {env:CONSTRAINTS_OPT}" in our case > > and then for the lower contriats jobs in stead of > > > > deps = > > -c{toxinidir}/lower-constraints.txt > > -r{toxinidir}/requirements.txt > > -r{toxinidir}/test-requirements.txt > > -r{toxinidir}/doc/requirements.txt > > > > we would do > > > > setenv = > > CONSTRAINTS_OPT=-c{toxinidir}/lower-constraints.txt > > deps = > > -r{toxinidir}/requirements.txt > > -r{toxinidir}/test-requirements.txt > > -r{toxinidir}/doc/requirements.txt > > > > that way we can keep the same install command for both but use the > > correct constrint file. > [...] > > Yep, Sean McGinnis is trying a variant of that in > https://review.opendev.org/744698 now to see if it alters tox's > behavior like we expect. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue Aug 4 17:32:00 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 4 Aug 2020 17:32:00 +0000 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> <20200804122606.dctnfvxqytfv22ws@localhost> <20200804123922.tcxglphtn6x2yona@yuggoth.org> <38f40d528b2689996b9114157e2c578c71e37942.camel@redhat.com> <20200804131643.dsdpoea4proojeky@yuggoth.org> Message-ID: <20200804173200.7fkfmnwr3qofwjsp@yuggoth.org> On 2020-08-04 19:09:33 +0200 (+0200), Riccardo Pittau wrote: > After a very interesting and enlightening discussion with Sean and > Clark on IRC (thanks!), we were able to test and verify that the > issue is related to the latest released version of virtualenv, > v2.0.29, that embeds pip 2.20, apparently the real offender here. [...] That was indeed confusing. Until I skimmed virtualenv's changelog it hadn't dawned on me that all the problem libraries had a "." in their names. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Tue Aug 4 17:38:48 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 4 Aug 2020 17:38:48 +0000 Subject: [nova] openstack-tox-lower-constraints broken In-Reply-To: <20200804173200.7fkfmnwr3qofwjsp@yuggoth.org> References: <20200803125522.rjso5tafqzt3sjoh@lyarwood.usersys.redhat.com> <20200804122606.dctnfvxqytfv22ws@localhost> <20200804123922.tcxglphtn6x2yona@yuggoth.org> <38f40d528b2689996b9114157e2c578c71e37942.camel@redhat.com> <20200804131643.dsdpoea4proojeky@yuggoth.org> <20200804173200.7fkfmnwr3qofwjsp@yuggoth.org> Message-ID: <20200804173848.foqxbkdy72z36dcp@yuggoth.org> On 2020-08-04 17:32:00 +0000 (+0000), Jeremy Stanley wrote: > On 2020-08-04 19:09:33 +0200 (+0200), Riccardo Pittau wrote: > > After a very interesting and enlightening discussion with Sean and > > Clark on IRC (thanks!), we were able to test and verify that the > > issue is related to the latest released version of virtualenv, > > v2.0.29, that embeds pip 2.20, apparently the real offender here. > [...] > > That was indeed confusing. Until I skimmed virtualenv's changelog it > hadn't dawned on me that all the problem libraries had a "." in > their names. Er, pip's changelog I meant, of course. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From cohuck at redhat.com Tue Aug 4 16:35:03 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Tue, 4 Aug 2020 18:35:03 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200729080503.GB28676@joy-OptiPlex-7040> References: <20200713232957.GD5955@joy-OptiPlex-7040> <9bfa8700-91f5-ebb4-3977-6321f0487a63@redhat.com> <20200716083230.GA25316@joy-OptiPlex-7040> <20200717101258.65555978@x1.home> <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> Message-ID: <20200804183503.39f56516.cohuck@redhat.com> [sorry about not chiming in earlier] On Wed, 29 Jul 2020 16:05:03 +0800 Yan Zhao wrote: > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: (...) > > Based on the feedback we've received, the previously proposed interface > > is not viable. I think there's agreement that the user needs to be > > able to parse and interpret the version information. Using json seems > > viable, but I don't know if it's the best option. Is there any > > precedent of markup strings returned via sysfs we could follow? I don't think encoding complex information in a sysfs file is a viable approach. Quoting Documentation/filesystems/sysfs.rst: "Attributes should be ASCII text files, preferably with only one value per file. It is noted that it may not be efficient to contain only one value per file, so it is socially acceptable to express an array of values of the same type. Mixing types, expressing multiple lines of data, and doing fancy formatting of data is heavily frowned upon." Even though this is an older file, I think these restrictions still apply. > I found some examples of using formatted string under /sys, mostly under > tracing. maybe we can do a similar implementation. > > #cat /sys/kernel/debug/tracing/events/kvm/kvm_mmio/format Note that this is *not* sysfs (anything under debug/ follows different rules anyway!) > > name: kvm_mmio > ID: 32 > format: > field:unsigned short common_type; offset:0; size:2; signed:0; > field:unsigned char common_flags; offset:2; size:1; signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > field:int common_pid; offset:4; size:4; signed:1; > > field:u32 type; offset:8; size:4; signed:0; > field:u32 len; offset:12; size:4; signed:0; > field:u64 gpa; offset:16; size:8; signed:0; > field:u64 val; offset:24; size:8; signed:0; > > print fmt: "mmio %s len %u gpa 0x%llx val 0x%llx", __print_symbolic(REC->type, { 0, "unsatisfied-read" }, { 1, "read" }, { 2, "write" }), REC->len, REC->gpa, REC->val > > > #cat /sys/devices/pci0000:00/0000:00:02.0/uevent 'uevent' can probably be considered a special case, I would not really want to copy it. > DRIVER=vfio-pci > PCI_CLASS=30000 > PCI_ID=8086:591D > PCI_SUBSYS_ID=8086:2212 > PCI_SLOT_NAME=0000:00:02.0 > MODALIAS=pci:v00008086d0000591Dsv00008086sd00002212bc03sc00i00 > (...) > what about a migration_compatible attribute under device node like > below? 
> > #cat /sys/bus/pci/devices/0000\:00\:02.0/UUID1/migration_compatible > SELF: > device_type=pci > device_id=8086591d > mdev_type=i915-GVTg_V5_2 > aggregator=1 > pv_mode="none+ppgtt+context" > interface_version=3 > COMPATIBLE: > device_type=pci > device_id=8086591d > mdev_type=i915-GVTg_V5_{val1:int:1,2,4,8} > aggregator={val1}/2 > pv_mode={val2:string:"none+ppgtt","none+context","none+ppgtt+context"} > interface_version={val3:int:2,3} > COMPATIBLE: > device_type=pci > device_id=8086591d > mdev_type=i915-GVTg_V5_{val1:int:1,2,4,8} > aggregator={val1}/2 > pv_mode="" #"" meaning empty, could be absent in a compatible device > interface_version=1 I'd consider anything of a comparable complexity to be a big no-no. If anything, this needs to be split into individual files (with many of them being vendor driver specific anyway.) I think we can list compatible versions in a range/list format, though. Something like cat interface_version 2.1.3 cat interface_version_compatible 2.0.2-2.0.4,2.1.0- (indicating that versions 2.0.{2,3,4} and all versions after 2.1.0 are compatible, considering versions <2 and >2 incompatible by default) Possible compatibility between different mdev types feels a bit odd to me, and should not be included by default (only if it makes sense for a particular vendor driver.) From melwittt at gmail.com Tue Aug 4 21:08:46 2020 From: melwittt at gmail.com (melanie witt) Date: Tue, 4 Aug 2020 14:08:46 -0700 Subject: [placement][gate] functional tests failing Message-ID: <6486f281-5124-4566-af62-55c8a71905bf@gmail.com> Hi all, I recently proposed a change to openstack/placement and found that the functional tests are currently failing. It's because of a recent-ish bump to upper-constraints to allow os-traits 2.4.0: https://review.opendev.org/739330 and placement has a func test that asserts the number of standard traits (more traits are available in 2.4.0). I've proposed a fix for the func test here if anyone could please help review: https://review.opendev.org/744790 Cheers, -melanie From kennelson11 at gmail.com Tue Aug 4 21:22:04 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 4 Aug 2020 14:22:04 -0700 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: Message-ID: I think the majority of 'maintenance' activities at the moment for Cloudkitty are the reviewing of open patches in gerrit [1] and triaging bugs that are reported in Launchpad[2] as they come in. When things come up on this mailing list that have the cloudkitty tag in the subject line (like this email), weighing in on them would also be helpful. If you need help getting setup with gerrit, I am happy to assist anyway I can :) -Kendall Nelson (diablo_rojo) [1] https://review.opendev.org/#/q/project:openstack/cloudkitty+OR+project:openstack/python-cloudkittyclient+OR+project:openstack/cloudkitty-dashboard [2] https://launchpad.net/cloudkitty On Tue, Aug 4, 2020 at 6:21 AM Rafael Weingärtner < rafaelweingartner at gmail.com> wrote: > I am not sure how the projects/communities here in OpenStack are > maintained and conducted, but I could for sure help. > I am a committer and PMC for some Apache projects; therefore, I am a bit > familiar with some processes in OpenSource communities. > > On Tue, Aug 4, 2020 at 5:11 AM Mark Goddard wrote: > >> On Thu, 30 Jul 2020 at 14:43, Rafael Weingärtner >> wrote: >> > >> > We are working on it. So far we have 3 open proposals there, but we do >> not have enough karma to move things along. 
>> > Besides these 3 open proposals, we do have more ongoing extensions that >> have not yet been proposed to the community. >> >> It's good to hear you want to help improve cloudkitty, however it >> sounds like what is required is help with maintaining the project. Is >> that something you could be involved with? >> Mark >> >> > >> > On Thu, Jul 30, 2020 at 10:22 AM Sean McGinnis >> wrote: >> >> >> >> Posting here to raise awareness, and start discussion about next steps. >> >> >> >> It appears there is no one working on Cloudkitty anymore. No patches >> >> have been merged for several months now, including simple bot proposed >> >> patches. It would appear no one is maintaining this project anymore. >> >> >> >> I know there is a need out there for this type of functionality, so >> >> maybe this will raise awareness and get some attention to it. But >> >> barring that, I am wondering if we should start the process to retire >> >> this project. >> >> >> >> From a Victoria release perspective, it is milestone-2 week, so we >> >> should make a decision if any of the Cloudkitty deliverables should be >> >> included in this release or not. We can certainly force releases of >> >> whatever is the latest, but I think that is a bit risky since these >> >> repos have never merged the job template change for victoria and >> >> therefore are not even testing with Python 3.8. That is an official >> >> runtime for Victoria, so we run the risk of having issues with the code >> >> if someone runs under 3.8 but we have not tested to make sure there are >> >> no problems doing so. >> >> >> >> I am hoping this at least starts the discussion. I will not propose any >> >> release patches to remove anything until we have had a chance to >> discuss >> >> the situation. >> >> >> >> Sean >> >> >> >> >> > >> > >> > -- >> > Rafael Weingärtner >> > > > -- > Rafael Weingärtner > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.king at gmail.com Tue Aug 4 19:41:48 2020 From: thomas.king at gmail.com (Thomas King) Date: Tue, 4 Aug 2020 13:41:48 -0600 Subject: [Openstack-mentoring] Neutron subnet with DHCP relay - continued In-Reply-To: References: Message-ID: Changing the ml2 flat_networks from specific physical networks to a wildcard allowed me to create a segment. I may be unstuck. New config: [ml2_type_flat] flat_networks=* Now to try creating the subnet and try a remote provision. Tom King On Mon, Aug 3, 2020 at 3:58 PM Thomas King wrote: > I've been using named physical networks so long, I completely forgot using > wildcards! > > Is this the answer???? > > https://docs.openstack.org/mitaka/config-reference/networking/networking_options_reference.html#modular-layer-2-ml2-flat-type-configuration-options > > Tom King > > On Tue, Jul 28, 2020 at 3:46 PM Thomas King wrote: > >> Ruslanas has been a tremendous help. To catch up the discussion lists... >> 1. I enabled Neutron segments. >> 2. I renamed the existing segments for each network so they'll make >> sense. >> 3. I attempted to create a segment for a remote subnet (it is using DHCP >> relay) and this was the error that is blocking me. 
This is where the docs >> do not cover: >> [root at sea-maas-controller ~(keystone_admin)]# openstack network segment >> create --physical-network remote146-30-32 --network-type flat --network >> baremetal seg-remote-146-30-32 >> BadRequestException: 400: Client Error for url: >> http://10.146.30.65:9696/v2.0/segments, Invalid input for operation: >> physical_network 'remote146-30-32' unknown for flat provider network. >> >> I've asked Ruslanas to clarify how their physical networks correspond to >> their remote networks. They have a single provider network and multiple >> segments tied to multiple physical networks. >> >> However, if anyone can shine some light on this, I would greatly >> appreciate it. How should neutron's configurations accommodate remote >> networks<->Neutron segments when I have only one physical network >> attachment for provisioning? >> >> Thanks! >> Tom King >> >> On Wed, Jul 15, 2020 at 3:33 PM Thomas King >> wrote: >> >>> That helps a lot, thank you! >>> >>> "I use only one network..." >>> This bit seems to go completely against the Neutron segments >>> documentation. When you have access, please let me know if Triple-O is >>> using segments or some other method. >>> >>> I greatly appreciate this, this is a tremendous help. >>> >>> Tom King >>> >>> On Wed, Jul 15, 2020 at 1:07 PM Ruslanas Gžibovskis >>> wrote: >>> >>>> Hi Thomas, >>>> >>>> I have a bit complicated setup from tripleo side :) I use only one >>>> network (only ControlPlane). thanks to Harold, he helped to make it work >>>> for me. >>>> >>>> Yes, as written in the tripleo docs for leaf networks, it use the same >>>> neutron network, different subnets. so neutron network is ctlplane (I >>>> think) and have ctlplane-subnet, remote-provision and remote-KI :)) that >>>> generates additional lines in "ip r s" output for routing "foreign" subnets >>>> through correct gw, if you would have isolated networks, by vlans and ports >>>> this would apply for each subnet different gw... I believe you >>>> know/understand that part. >>>> >>>> remote* subnets have dhcp-relay setup by network team... do not ask >>>> details for that. I do not know how to, but can ask :) >>>> >>>> >>>> in undercloud/tripleo i have 2 dhcp servers, one is for introspection, >>>> another for provide/cleanup and deployment process. >>>> >>>> all of those subnets have organization level tagged networks and are >>>> tagged on network devices, but they are untagged on provisioning >>>> interfaces/ports, as in general pxe should be untagged, but some nic's can >>>> do vlan untag on nic/bios level. but who cares!? >>>> >>>> I just did a brief check on your first post, I think I have simmilar >>>> setup to yours :)) I will check in around 12hours :)) more deaply, as will >>>> be at work :))) >>>> >>>> >>>> P.S. sorry for wrong terms, I am bad at naming. >>>> >>>> >>>> On Wed, 15 Jul 2020, 21:13 Thomas King, wrote: >>>> >>>>> Ruslanas, that would be excellent! >>>>> >>>>> I will reply to you directly for details later unless the maillist >>>>> would like the full thread. >>>>> >>>>> Some preliminary questions: >>>>> >>>>> - Do you have a separate physical interface for the segment(s) >>>>> used for your remote subnets? >>>>> The docs state each segment must have a unique physical network >>>>> name, which suggests a separate physical interface for each segment unless >>>>> I'm misunderstanding something. >>>>> - Are your provisioning segments all on the same Neutron network? 
>>>>> - Are you using tagged switchports or access switchports to your >>>>> Ironic server(s)? >>>>> >>>>> Thanks, >>>>> Tom King >>>>> >>>>> On Wed, Jul 15, 2020 at 12:26 AM Ruslanas Gžibovskis >>>>> wrote: >>>>> >>>>>> I have deployed that with tripleO, but now we are recabling and >>>>>> redeploying it. So once I have it running I can share my configs, just name >>>>>> which you want :) >>>>>> >>>>>> On Tue, 14 Jul 2020 at 18:40, Thomas King >>>>>> wrote: >>>>>> >>>>>>> I have. That's the Triple-O docs and they don't go through the >>>>>>> normal .conf files to explain how it works outside of Triple-O. It has some >>>>>>> ideas but no running configurations. >>>>>>> >>>>>>> Tom King >>>>>>> >>>>>>> On Tue, Jul 14, 2020 at 3:01 AM Ruslanas Gžibovskis < >>>>>>> ruslanas at lpic.lt> wrote: >>>>>>> >>>>>>>> hi, have you checked: >>>>>>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/routed_spine_leaf_network.html >>>>>>>> ? >>>>>>>> I am following this link. I only have one network, having different >>>>>>>> issues tho ;) >>>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Tue Aug 4 23:21:35 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Tue, 4 Aug 2020 18:21:35 -0500 Subject: [nova] Hyper-V hosts In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04814461@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA0481446B@gmsxchsvr01.thecreation.com> > You currently need a hyper-v host that is running at least Windows Insider > Build 19640 in order to use Epyc with nested virtualization [1]. See if the beta > compute driver works[2]. The hyper-v driver for ussuri has release notes[3], > so it should be OK, although I haven't personally tried it. > > Chris Apsey Thank you Chris! I must have seen the Windows Insider Build notes somewhere about this. Thanks for the link! Glad to see that development continues on the cloudbase components. We'll take a test run with this in the near future when we have some hardware dedicated to this. Eric From thomas.king at gmail.com Tue Aug 4 22:22:11 2020 From: thomas.king at gmail.com (Thomas King) Date: Tue, 4 Aug 2020 16:22:11 -0600 Subject: [Openstack-mentoring] Neutron subnet with DHCP relay - continued In-Reply-To: References: Message-ID: Getting closer. I was able to create the segment and the subnet for the remote network on that segment. When I attempted to provide the baremetal node, Neutron is unable to create/attach a port to the remote node: WARNING ironic.common.neutron [req-b3f373fc-e76a-4c13-9ebb-41cfc682d31b 4946f15716c04f8585d013e364802c6c 1664a38fc668432ca6bee9189be142d9 - default default] The local_link_connection is required for 'neutron' network interface and is not present in the nodes 3ed87e51-00c5-4b27-95c0-665c8337e49b port ccc335c6-3521-48a5-927d-d7ee13f7f05b I changed its network interface from neutron back to flat and it went past this. I'm now waiting to see if the node will PXE boot. On Tue, Aug 4, 2020 at 1:41 PM Thomas King wrote: > Changing the ml2 flat_networks from specific physical networks to a > wildcard allowed me to create a segment. I may be unstuck. > > New config: > [ml2_type_flat] > flat_networks=* > > Now to try creating the subnet and try a remote provision. > > Tom King > > On Mon, Aug 3, 2020 at 3:58 PM Thomas King wrote: > >> I've been using named physical networks so long, I completely forgot >> using wildcards! >> >> Is this the answer???? 
>> >> https://docs.openstack.org/mitaka/config-reference/networking/networking_options_reference.html#modular-layer-2-ml2-flat-type-configuration-options >> >> Tom King >> >> On Tue, Jul 28, 2020 at 3:46 PM Thomas King >> wrote: >> >>> Ruslanas has been a tremendous help. To catch up the discussion lists... >>> 1. I enabled Neutron segments. >>> 2. I renamed the existing segments for each network so they'll make >>> sense. >>> 3. I attempted to create a segment for a remote subnet (it is using DHCP >>> relay) and this was the error that is blocking me. This is where the docs >>> do not cover: >>> [root at sea-maas-controller ~(keystone_admin)]# openstack network segment >>> create --physical-network remote146-30-32 --network-type flat --network >>> baremetal seg-remote-146-30-32 >>> BadRequestException: 400: Client Error for url: >>> http://10.146.30.65:9696/v2.0/segments, Invalid input for operation: >>> physical_network 'remote146-30-32' unknown for flat provider network. >>> >>> I've asked Ruslanas to clarify how their physical networks correspond to >>> their remote networks. They have a single provider network and multiple >>> segments tied to multiple physical networks. >>> >>> However, if anyone can shine some light on this, I would greatly >>> appreciate it. How should neutron's configurations accommodate remote >>> networks<->Neutron segments when I have only one physical network >>> attachment for provisioning? >>> >>> Thanks! >>> Tom King >>> >>> On Wed, Jul 15, 2020 at 3:33 PM Thomas King >>> wrote: >>> >>>> That helps a lot, thank you! >>>> >>>> "I use only one network..." >>>> This bit seems to go completely against the Neutron segments >>>> documentation. When you have access, please let me know if Triple-O is >>>> using segments or some other method. >>>> >>>> I greatly appreciate this, this is a tremendous help. >>>> >>>> Tom King >>>> >>>> On Wed, Jul 15, 2020 at 1:07 PM Ruslanas Gžibovskis >>>> wrote: >>>> >>>>> Hi Thomas, >>>>> >>>>> I have a bit complicated setup from tripleo side :) I use only one >>>>> network (only ControlPlane). thanks to Harold, he helped to make it work >>>>> for me. >>>>> >>>>> Yes, as written in the tripleo docs for leaf networks, it use the same >>>>> neutron network, different subnets. so neutron network is ctlplane (I >>>>> think) and have ctlplane-subnet, remote-provision and remote-KI :)) that >>>>> generates additional lines in "ip r s" output for routing "foreign" subnets >>>>> through correct gw, if you would have isolated networks, by vlans and ports >>>>> this would apply for each subnet different gw... I believe you >>>>> know/understand that part. >>>>> >>>>> remote* subnets have dhcp-relay setup by network team... do not ask >>>>> details for that. I do not know how to, but can ask :) >>>>> >>>>> >>>>> in undercloud/tripleo i have 2 dhcp servers, one is for introspection, >>>>> another for provide/cleanup and deployment process. >>>>> >>>>> all of those subnets have organization level tagged networks and are >>>>> tagged on network devices, but they are untagged on provisioning >>>>> interfaces/ports, as in general pxe should be untagged, but some nic's can >>>>> do vlan untag on nic/bios level. but who cares!? >>>>> >>>>> I just did a brief check on your first post, I think I have simmilar >>>>> setup to yours :)) I will check in around 12hours :)) more deaply, as will >>>>> be at work :))) >>>>> >>>>> >>>>> P.S. sorry for wrong terms, I am bad at naming. 
>>>>> >>>>> >>>>> On Wed, 15 Jul 2020, 21:13 Thomas King, wrote: >>>>> >>>>>> Ruslanas, that would be excellent! >>>>>> >>>>>> I will reply to you directly for details later unless the maillist >>>>>> would like the full thread. >>>>>> >>>>>> Some preliminary questions: >>>>>> >>>>>> - Do you have a separate physical interface for the segment(s) >>>>>> used for your remote subnets? >>>>>> The docs state each segment must have a unique physical network >>>>>> name, which suggests a separate physical interface for each segment unless >>>>>> I'm misunderstanding something. >>>>>> - Are your provisioning segments all on the same Neutron network? >>>>>> - Are you using tagged switchports or access switchports to your >>>>>> Ironic server(s)? >>>>>> >>>>>> Thanks, >>>>>> Tom King >>>>>> >>>>>> On Wed, Jul 15, 2020 at 12:26 AM Ruslanas Gžibovskis < >>>>>> ruslanas at lpic.lt> wrote: >>>>>> >>>>>>> I have deployed that with tripleO, but now we are recabling and >>>>>>> redeploying it. So once I have it running I can share my configs, just name >>>>>>> which you want :) >>>>>>> >>>>>>> On Tue, 14 Jul 2020 at 18:40, Thomas King >>>>>>> wrote: >>>>>>> >>>>>>>> I have. That's the Triple-O docs and they don't go through the >>>>>>>> normal .conf files to explain how it works outside of Triple-O. It has some >>>>>>>> ideas but no running configurations. >>>>>>>> >>>>>>>> Tom King >>>>>>>> >>>>>>>> On Tue, Jul 14, 2020 at 3:01 AM Ruslanas Gžibovskis < >>>>>>>> ruslanas at lpic.lt> wrote: >>>>>>>> >>>>>>>>> hi, have you checked: >>>>>>>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/routed_spine_leaf_network.html >>>>>>>>> ? >>>>>>>>> I am following this link. I only have one network, having >>>>>>>>> different issues tho ;) >>>>>>>>> >>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Aug 5 01:50:40 2020 From: melwittt at gmail.com (melanie witt) Date: Tue, 4 Aug 2020 18:50:40 -0700 Subject: [placement][gate] functional tests failing In-Reply-To: <6486f281-5124-4566-af62-55c8a71905bf@gmail.com> References: <6486f281-5124-4566-af62-55c8a71905bf@gmail.com> Message-ID: On 8/4/20 14:08, melanie witt wrote: > Hi all, > > I recently proposed a change to openstack/placement and found that the > functional tests are currently failing. It's because of a recent-ish > bump to upper-constraints to allow os-traits 2.4.0: > > https://review.opendev.org/739330 > > and placement has a func test that asserts the number of standard traits > (more traits are available in 2.4.0). > > I've proposed a fix for the func test here if anyone could please help > review: > > https://review.opendev.org/744790 The fix has merged and the placement gate is all clear! -melanie From jasonanderson at uchicago.edu Wed Aug 5 03:49:55 2020 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Wed, 5 Aug 2020 03:49:55 +0000 Subject: [swift][ceph] Container ACLs don't seem to be respected on Ceph RGW Message-ID: <757BCAB6-CA22-439E-9C0C-BE4DEC7B7927@uchicago.edu> Hi all, Just scratching my head at this for a while and though I’d ask here in case it saves some time. I’m running a Ceph cluster on the Nautilus release and it’s running Swift via the rgw. I have Keystone authentication turned on. Everything works fine in the normal case of creating containers, uploading files, listing containers, etc. However, I notice that ACLs don’t seem to work. I am not overriding "rgw enforce swift acls”, so it is set to the default of true. 
I can’t seem to share a container or make it public. (Side note, confusingly, the Ceph implementation has a different syntax for public read/write containers, ‘*’ as opposed to ‘*:*’ for public write for example.) Here’s what I’m doing (as admin) swift post —write-acl ‘*’ —read-acl ‘*’ public-container swift stat public-container Account: v1 Container: public-container Objects: 1 Bytes: 5801 Read ACL: * Write ACL: * Sync To: Sync Key: X-Timestamp: 1595883106.23179 X-Container-Bytes-Used-Actual: 8192 X-Storage-Policy: default-placement X-Storage-Class: STANDARD Last-Modified: Wed, 05 Aug 2020 03:42:11 GMT X-Trans-Id: tx000000000000000662156-005f2a2bea-23478-default X-Openstack-Request-Id: tx000000000000000662156-005f2a2bea-23478-default Accept-Ranges: bytes Content-Type: text/plain; charset=utf-8 (as non-admin) swift upload public-container test.txt Warning: failed to create container 'public-container': 409 Conflict: BucketAlreadyExists Object HEAD failed: https://ceph.example.org:7480/swift/v1/public-container/README.md 403 Forbidden swift list public-container Container GET failed: https://ceph.example.org:7480/swift/v1/public-container?format=json 403 Forbidden [first 60 chars of response] b'{"Code":"AccessDenied","BucketName”:”public-container","RequestId":"tx0' Failed Transaction ID: tx000000000000000662162-005f2a2c2a-23478-default What am I missing? Thanks in advance! /Jason From mark at stackhpc.com Wed Aug 5 07:53:20 2020 From: mark at stackhpc.com (Mark Goddard) Date: Wed, 5 Aug 2020 08:53:20 +0100 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: On Tue, 4 Aug 2020 at 16:58, Monika Samal wrote: > Hello Guys, > > With Michaels help I was able to solve the problem but now there is > another error I was able to create my network on vlan but still error > persist. PFB the logs: > > http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ > > Kindly help > > regards, > Monika > ------------------------------ > *From:* Michael Johnson > *Sent:* Monday, August 3, 2020 9:10 PM > *To:* Fabian Zimmermann > *Cc:* Monika Samal ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Yeah, it looks like nova is failing to boot the instance. > > Check this setting in your octavia.conf files: > https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id > > Also, if kolla-ansible didn't set both of these values correctly, please > open bug reports for kolla-ansible. These all should have been configured > by the deployment tool. > > I wasn't following this thread due to no [kolla] tag, but here are the recently added docs for Octavia in kolla [1]. Note the octavia_service_auth_project variable which was added to migrate from the admin project to the service project for octavia resources. We're lacking proper automation for the flavor, image etc, but it is being worked on in Victoria [2]. [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html [2] https://review.opendev.org/740180 Michael > > On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: > > Seems like the flavor is missing or empty '' - check for typos and enable > debug. > > Check if the nova req contains valid information/flavor. > > Fabian > > Monika Samal schrieb am Mo., 3. Aug. 
2020, > 15:46: > > It's registered > > Get Outlook for Android > ------------------------------ > *From:* Fabian Zimmermann > *Sent:* Monday, August 3, 2020 7:08:21 PM > *To:* Monika Samal ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Did you check the (nova) flavor you use in octavia. > > Fabian > > Monika Samal schrieb am Mo., 3. Aug. 2020, > 10:53: > > After Michael suggestion I was able to create load balancer but there is > error in status. > > > > PFB the error link: > > http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ > ------------------------------ > *From:* Monika Samal > *Sent:* Monday, August 3, 2020 2:08 PM > *To:* Michael Johnson > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Thanks a ton Michael for helping me out > ------------------------------ > *From:* Michael Johnson > *Sent:* Friday, July 31, 2020 3:57 AM > *To:* Monika Samal > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Just to close the loop on this, the octavia.conf file had > "project_name = admin" instead of "project_name = service" in the > [service_auth] section. This was causing the keystone errors when > Octavia was communicating with neutron. > > I don't know if that is a bug in kolla-ansible or was just a local > configuration issue. > > Michael > > On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > > > Hello Fabian,, > > > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > > > Regards, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > Hi, > > > > just to debug, could you replace the auth_type password with v3password? > > > > And do a curl against your :5000 and :35357 urls and paste the output. > > > > Fabian > > > > Monika Samal schrieb am Do., 30. Juli 2020, > 22:15: > > > > Hello Fabian, > > > > http://paste.openstack.org/show/796477/ > > > > Thanks, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > The sections should be > > > > service_auth > > keystone_authtoken > > > > if i read the docs correctly. Maybe you can just paste your config > (remove/change passwords) to paste.openstack.org and post the link? > > > > Fabian > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Wed Aug 5 09:41:10 2020 From: emiller at genesishosting.com (Eric K. 
Miller) Date: Wed, 5 Aug 2020 04:41:10 -0500 Subject: [cinder][nova] Local storage in compute node Message-ID: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> Hi, I'm research methods to get around high storage latency for some applications where redundancy does not matter, so using local NVMe drives in compute nodes seems to be the practical choice. However, there does not appear to be a good solution from what I have read. For example, BlockDeviceDriver has been deprecated/removed, LVM is only supported via iSCSI (which is slow) and localization of LVM volumes onto the same compute node as VMs is impossible, and other methods (PCI pass-through, etc.) would require direct access to the local drives, where device cleansing would need to occur after a device was removed from a VM, and I don't believe there is a hook for this. Ephemeral storage appears to be an option, but I believe it has the same issue as PCI pass-through, in that there is no abiilty to automatically cleanse a device after it has been used. In our default configuration, ephemeral storage is redirected to use Ceph, which solves the cleansing issue, but isn't suitable due to its high latency. Also, ephemeral storage appears as a second device, not the root disk, so that complicates a few configurations we have. Is there any other way to write an operating system image onto a local drive and boot from it? Or preferably assign an LVM /dev/mapper path as a device in libvirt (no iSCSI) after configuring a logical volume? or am I missing something? Thanks! Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Wed Aug 5 10:03:29 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 5 Aug 2020 05:03:29 -0500 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> In case this is the answer, I found that in nova.conf, under the [libvirt] stanza, images_type can be set to "lvm". This looks like it may do the trick - using the compute node's LVM to provision and mount a logical volume, for either persistent or ephemeral storage defined in the flavor. Can anyone validate that this is the right approach according to our needs? Also, I have read about the LVM device filters - which is important to avoid the host's LVM from seeing the guest's volumes, in case anyone else finds this message. Thanks! Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From lyarwood at redhat.com Wed Aug 5 11:19:34 2020 From: lyarwood at redhat.com (Lee Yarwood) Date: Wed, 5 Aug 2020 12:19:34 +0100 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> Message-ID: <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> On 05-08-20 05:03:29, Eric K. Miller wrote: > In case this is the answer, I found that in nova.conf, under the > [libvirt] stanza, images_type can be set to "lvm". 
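(Concretely, something along these lines in nova.conf on the compute node -- a hedged sketch, where "vg_instances" is only an example name and the volume group has to be created on the local NVMe drives beforehand:)

    [libvirt]
    images_type = lvm
    images_volume_group = vg_instances

With that in place the libvirt driver carves root and ephemeral disks out of that volume group as logical volumes instead of qcow2/raw files under the instances directory.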
This looks like it > may do the trick - using the compute node's LVM to provision and mount a > logical volume, for either persistent or ephemeral storage defined in > the flavor. > > Can anyone validate that this is the right approach according to our > needs? I'm not sure if it is given your initial requirements. Do you need full host block devices to be provided to the instance? The LVM imagebackend will just provision LVs on top of the provided VG so there's no direct mapping to a full host block device with this approach. That said there's no real alternative available at the moment. > Also, I have read about the LVM device filters - which is important to > avoid the host's LVM from seeing the guest's volumes, in case anyone > else finds this message. Yeah that's a common pitfall when using LVM based ephemeral disks that contain additional LVM PVs/VGs/LVs etc. You need to ensure that the host is configured to not scan these LVs in order for their PVs/VGs/LVs etc to remain hidden from the host: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lvm_filters -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From smooney at redhat.com Wed Aug 5 11:40:34 2020 From: smooney at redhat.com (Sean Mooney) Date: Wed, 05 Aug 2020 12:40:34 +0100 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> Message-ID: <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> On Wed, 2020-08-05 at 12:19 +0100, Lee Yarwood wrote: > On 05-08-20 05:03:29, Eric K. Miller wrote: > > In case this is the answer, I found that in nova.conf, under the > > [libvirt] stanza, images_type can be set to "lvm". This looks like it > > may do the trick - using the compute node's LVM to provision and mount a > > logical volume, for either persistent or ephemeral storage defined in > > the flavor. > > > > Can anyone validate that this is the right approach according to our > > needs? > > I'm not sure if it is given your initial requirements. > > Do you need full host block devices to be provided to the instance? > > The LVM imagebackend will just provision LVs on top of the provided VG > so there's no direct mapping to a full host block device with this > approach. > > That said there's no real alternative available at the moment. well one alternitive to nova providing local lvm storage is to use the cinder lvm driver but install it on all compute nodes then use the cidner InstanceLocalityFilter to ensure the volume is alocated form the host the vm is on. https://docs.openstack.org/cinder/latest/configuration/block-storage/scheduler-filters.html#instancelocalityfilter on drawback to this is that if the if the vm is moved i think you would need to also migrate the cinder volume seperatly afterwards. > > > Also, I have read about the LVM device filters - which is important to > > avoid the host's LVM from seeing the guest's volumes, in case anyone > > else finds this message. > > > Yeah that's a common pitfall when using LVM based ephemeral disks that > contain additional LVM PVs/VGs/LVs etc. 
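A rough sketch of the cinder side of Sean's suggestion above, assuming a cinder-volume service runs on every compute node with its own local volume group; the backend label and volume group name here are assumptions:

[DEFAULT]
enabled_backends = lvm-local
# InstanceLocalityFilter is not in the default filter list and has to be added
scheduler_default_filters = AvailabilityZoneFilter,CapacityFilter,CapabilitiesFilter,InstanceLocalityFilter

[lvm-local]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_group = cinder-local
volume_backend_name = LVM_LOCAL

The volume would then be created with the locality scheduler hint, e.g. "openstack volume create --size 100 --hint local_to_instance=<instance-uuid> scratch", so the scheduler places it on the host that is already running that instance.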
You need to ensure that the host > is configured to not scan these LVs in order for their PVs/VGs/LVs etc > to remain hidden from the host: > > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lvm_filters  > > From smooney at redhat.com Wed Aug 5 11:45:47 2020 From: smooney at redhat.com (Sean Mooney) Date: Wed, 05 Aug 2020 12:45:47 +0100 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> Message-ID: <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> On Wed, 2020-08-05 at 12:40 +0100, Sean Mooney wrote: > On Wed, 2020-08-05 at 12:19 +0100, Lee Yarwood wrote: > > On 05-08-20 05:03:29, Eric K. Miller wrote: > > > In case this is the answer, I found that in nova.conf, under the > > > [libvirt] stanza, images_type can be set to "lvm". This looks like it > > > may do the trick - using the compute node's LVM to provision and mount a > > > logical volume, for either persistent or ephemeral storage defined in > > > the flavor. > > > > > > Can anyone validate that this is the right approach according to our > > > needs? > > > > I'm not sure if it is given your initial requirements. > > > > Do you need full host block devices to be provided to the instance? > > > > The LVM imagebackend will just provision LVs on top of the provided VG > > so there's no direct mapping to a full host block device with this > > approach. > > > > That said there's no real alternative available at the moment. > > well one alternitive to nova providing local lvm storage is to use > the cinder lvm driver but install it on all compute nodes then > use the cidner InstanceLocalityFilter to ensure the volume is alocated form the host > the vm is on. > https://docs.openstack.org/cinder/latest/configuration/block-storage/scheduler-filters.html#instancelocalityfilter > on drawback to this is that if the if the vm is moved i think you would need to also migrate the cinder volume > seperatly afterwards. by the way if you were to take this approch i think there is an nvmeof driver so you can use nvme over rdma instead of iscsi. > > > > > > Also, I have read about the LVM device filters - which is important to > > > avoid the host's LVM from seeing the guest's volumes, in case anyone > > > else finds this message. > > > > > > Yeah that's a common pitfall when using LVM based ephemeral disks that > > contain additional LVM PVs/VGs/LVs etc. 
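As a concrete illustration of that host-side filtering, /etc/lvm/lvm.conf on the compute node can restrict scanning to the host's own physical volumes so that guest LVM metadata inside instance disks is never activated on the host; the device paths below are assumptions and depend on the local disk layout:

devices {
    # accept only the host's own PVs, reject everything else
    global_filter = [ "a|^/dev/md0$|", "a|^/dev/nvme0n1p3$|", "r|.*|" ]
}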
You need to ensure that the host > > is configured to not scan these LVs in order for their PVs/VGs/LVs etc > > to remain hidden from the host: > > > > > > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lvm_filters > > > > > > > From donny at fortnebula.com Wed Aug 5 12:36:18 2020 From: donny at fortnebula.com (Donny Davis) Date: Wed, 5 Aug 2020 08:36:18 -0400 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> Message-ID: I use local nvme to drive the CI workload for the openstack community for the last year or so. It seems to work pretty well. I just created a filesystem (xfs) and mounted it to /var/lib/nova/instances I moved glance to using my swift backend and it really made the download of the images much faster. It depends on if the workload is going to handle HA or you are expecting to migrate machines. If the workload is ephemeral or HA can be handled in the app I think local storage is still a very viable option. Simpler is better IMO On Wed, Aug 5, 2020 at 7:48 AM Sean Mooney wrote: > On Wed, 2020-08-05 at 12:40 +0100, Sean Mooney wrote: > > On Wed, 2020-08-05 at 12:19 +0100, Lee Yarwood wrote: > > > On 05-08-20 05:03:29, Eric K. Miller wrote: > > > > In case this is the answer, I found that in nova.conf, under the > > > > [libvirt] stanza, images_type can be set to "lvm". This looks like > it > > > > may do the trick - using the compute node's LVM to provision and > mount a > > > > logical volume, for either persistent or ephemeral storage defined in > > > > the flavor. > > > > > > > > Can anyone validate that this is the right approach according to our > > > > needs? > > > > > > I'm not sure if it is given your initial requirements. > > > > > > Do you need full host block devices to be provided to the instance? > > > > > > The LVM imagebackend will just provision LVs on top of the provided VG > > > so there's no direct mapping to a full host block device with this > > > approach. > > > > > > That said there's no real alternative available at the moment. > > > > well one alternitive to nova providing local lvm storage is to use > > the cinder lvm driver but install it on all compute nodes then > > use the cidner InstanceLocalityFilter to ensure the volume is alocated > form the host > > the vm is on. > > > https://docs.openstack.org/cinder/latest/configuration/block-storage/scheduler-filters.html#instancelocalityfilter > > on drawback to this is that if the if the vm is moved i think you would > need to also migrate the cinder volume > > seperatly afterwards. > by the way if you were to take this approch i think there is an nvmeof > driver so you can use nvme over rdma > instead of iscsi. > > > > > > > > > Also, I have read about the LVM device filters - which is important > to > > > > avoid the host's LVM from seeing the guest's volumes, in case anyone > > > > else finds this message. > > > > > > > > > Yeah that's a common pitfall when using LVM based ephemeral disks that > > > contain additional LVM PVs/VGs/LVs etc. 
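For reference, the simple layout Donny describes above boils down to something like the following on each compute node, with the default qcow2 image backend left in place; the device name is an assumption:

mkfs.xfs /dev/nvme0n1
echo '/dev/nvme0n1 /var/lib/nova/instances xfs defaults,noatime 0 0' >> /etc/fstab
mount /var/lib/nova/instances
chown nova:nova /var/lib/nova/instances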
You need to ensure that the > host > > is configured to not scan these LVs in order for their PVs/VGs/LVs etc > > to remain hidden from the host: > > > > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lvm_filters > > > > > > > -- ~/DonnyD C: 805 814 6800 "No mission too difficult. No sacrifice too great. Duty First" -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Aug 5 13:01:12 2020 From: smooney at redhat.com (Sean Mooney) Date: Wed, 05 Aug 2020 14:01:12 +0100 Subject: [cinder][nova] Local storage in compute node In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> Message-ID: <4f025d444406898903dabf3049ed021822cce19b.camel@redhat.com> On Wed, 2020-08-05 at 08:36 -0400, Donny Davis wrote: > I use local nvme to drive the CI workload for the openstack community for > the last year or so. It seems to work pretty well. I just created a > filesystem (xfs) and mounted it to /var/lib/nova/instances > I moved glance to using my swift backend and it really made the download of > the images much faster. > > It depends on if the workload is going to handle HA or you are expecting to > migrate machines. If the workload is ephemeral or HA can be handled in the > app I think local storage is still a very viable option. > > Simpler is better IMO yes, that works well with the default flat/qcow file format; I assume there was a reason this was not the starting point. the nova lvm backend, I think, does not support thin provisioning, so if you did the same thing creating the volume group on the nvme device you would technically get better write performance after the vm is booted, but the vm spawn is slower since we can't take advantage of thin provisioning and each root disk needs to be copied from the cached image. so just mounting the nova data directory on an nvme drive or a raid of nvme drives works well and is simple to do. I take a slightly more complex approach for my home cluster, where I put the nova data directory on a bcache block device, which puts an nvme pci ssd as a cache in front of my raid 10 of HDDs to accelerate it. from nova's point of view there is nothing special about this setup; it just works. the drawback to this is that you can't change the storage available to a vm without creating a new flavor. exposing the nvme devices, or a subsection of them, via cinder has the advantage of allowing you to use the volume api to tailor the amount of storage per vm rather than creating a bunch of different flavors, but with the overhead of needing to connect to the storage over a network protocol. so there are trade-offs with both approaches. generally I recommend using local storage, e.g. the vm root disk or ephemeral disk, for fast scratchpad space to work on data, but persisting all relevant data permanently via cinder volumes. that requires you to understand which block devices are local and which are remote, but it gives you the best of both worlds. > > > > On Wed, Aug 5, 2020 at 7:48 AM Sean Mooney wrote: > > > On Wed, 2020-08-05 at 12:40 +0100, Sean Mooney wrote: > > > > On Wed, 2020-08-05 at 12:19 +0100, Lee Yarwood wrote: > > > > > On 05-08-20 05:03:29, Eric K.
Miller wrote: > > > > > In case this is the answer, I found that in nova.conf, under the > > > > > [libvirt] stanza, images_type can be set to "lvm". This looks like > > > > it > > > > > may do the trick - using the compute node's LVM to provision and > > > > mount a > > > > > logical volume, for either persistent or ephemeral storage defined in > > > > > the flavor. > > > > > > > > > > Can anyone validate that this is the right approach according to our > > > > > needs? > > > > > > > > I'm not sure if it is given your initial requirements. > > > > > > > > Do you need full host block devices to be provided to the instance? > > > > > > > > The LVM imagebackend will just provision LVs on top of the provided VG > > > > so there's no direct mapping to a full host block device with this > > > > approach. > > > > > > > > That said there's no real alternative available at the moment. > > > > > > well one alternitive to nova providing local lvm storage is to use > > > the cinder lvm driver but install it on all compute nodes then > > > use the cidner InstanceLocalityFilter to ensure the volume is alocated > > > > form the host > > > the vm is on. > > > > > > > https://docs.openstack.org/cinder/latest/configuration/block-storage/scheduler-filters.html#instancelocalityfilter > > > on drawback to this is that if the if the vm is moved i think you would > > > > need to also migrate the cinder volume > > > seperatly afterwards. > > > > by the way if you were to take this approch i think there is an nvmeof > > driver so you can use nvme over rdma > > instead of iscsi. > > > > > > > > > > > > Also, I have read about the LVM device filters - which is important > > > > to > > > > > avoid the host's LVM from seeing the guest's volumes, in case anyone > > > > > else finds this message. > > > > > > > > > > > > Yeah that's a common pitfall when using LVM based ephemeral disks that > > > > contain additional LVM PVs/VGs/LVs etc. You need to ensure that the > > > > host > > > > is configured to not scan these LVs in order for their PVs/VGs/LVs etc > > > > to remain hidden from the host: > > > > > > > > > > > > > > > > > > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lvm_filters > > > > > > > > > > > > > > > > > > > > > > > > > From donny at fortnebula.com Wed Aug 5 13:22:58 2020 From: donny at fortnebula.com (Donny Davis) Date: Wed, 5 Aug 2020 09:22:58 -0400 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <4f025d444406898903dabf3049ed021822cce19b.camel@redhat.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> <4f025d444406898903dabf3049ed021822cce19b.camel@redhat.com> Message-ID: On Wed, Aug 5, 2020 at 9:01 AM Sean Mooney wrote: > On Wed, 2020-08-05 at 08:36 -0400, Donny Davis wrote: > > I use local nvme to drive the CI workload for the openstack community for > > the last year or so. It seems to work pretty well. I just created a > > filesystem (xfs) and mounted it to /var/lib/nova/instances > > I moved glance to using my swift backend and it really made the download > of > > the images much faster. > > > > It depends on if the workload is going to handle HA or you are expecting > to > > migrate machines. 
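Earlier in the thread Sean mentions fronting an HDD RAID with an NVMe cache via bcache and pointing the nova instances directory at it; a rough sketch of that layering, with all device names being assumptions:

make-bcache -C /dev/nvme0n1        # create the cache device on the NVMe
make-bcache -B /dev/md0            # create the backing device on the HDD RAID
# attach the backing device to the cache set (UUID from bcache-super-show /dev/nvme0n1)
echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach
mkfs.xfs /dev/bcache0
mount /dev/bcache0 /var/lib/nova/instances

As noted, nova itself needs no special configuration for this; it just sees a normal filesystem.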
If the workload is ephemeral or HA can be handled in > the > > app I think local storage is still a very viable option. > > > > Simpler is better IMO > yes that works well with the default flat/qcow file format > i assume there was a reason this was not the starting point. > the nova lvm backend i think does not supprot thin provisioning > so fi you did the same thing creating the volume group on the nvme deivce > you would technically get better write performance after the vm is booted > but > the vm spwan is slower since we cant take advantage of thin providioning > and > each root disk need to be copided form the cahced image. > > so just monting the nova data directory on an nvme driver or a raid of > nvme drives > works well and is simple to do. > > i take a slightly more complex approach from my home cluster wehre i put > the > nova data directory on a bcache block device which puts an nvme pci ssd as > a cache > infront of my raid 10 fo HDDs to acclerate it. from nova point of view > there is nothing special > about this setup it just works. > > the draw back to this is you cant change teh stroage avaiable to a vm > without creating a new flaovr. > exposing the nvme deivce or subsection of them via cinder has the > advantage of allowing you to use > teh vloume api to tailor the amount of storage per vm rather then creating > a bunch of different flavors > but with the over head fo needing to connect to the storage over a network > protocol. > > so there are trade off with both appoches. > generally i recommend using local sotrage e.g. the vm root disk or > ephemeral disk for fast scratchpad space > to work on data bug persitie all relevent data permently via cinder > volumes. that requires you to understand which block > devices a local and which are remote but it give you the best of both > worlds. > > > > > > > > > On Wed, Aug 5, 2020 at 7:48 AM Sean Mooney wrote: > > > > > On Wed, 2020-08-05 at 12:40 +0100, Sean Mooney wrote: > > > > On Wed, 2020-08-05 at 12:19 +0100, Lee Yarwood wrote: > > > > > On 05-08-20 05:03:29, Eric K. Miller wrote: > > > > > > In case this is the answer, I found that in nova.conf, under the > > > > > > [libvirt] stanza, images_type can be set to "lvm". This looks > like > > > > > > it > > > > > > may do the trick - using the compute node's LVM to provision and > > > > > > mount a > > > > > > logical volume, for either persistent or ephemeral storage > defined in > > > > > > the flavor. > > > > > > > > > > > > Can anyone validate that this is the right approach according to > our > > > > > > needs? > > > > > > > > > > I'm not sure if it is given your initial requirements. > > > > > > > > > > Do you need full host block devices to be provided to the instance? > > > > > > > > > > The LVM imagebackend will just provision LVs on top of the > provided VG > > > > > so there's no direct mapping to a full host block device with this > > > > > approach. > > > > > > > > > > That said there's no real alternative available at the moment. > > > > > > > > well one alternitive to nova providing local lvm storage is to use > > > > the cinder lvm driver but install it on all compute nodes then > > > > use the cidner InstanceLocalityFilter to ensure the volume is > alocated > > > > > > form the host > > > > the vm is on. 
> > > > > > > > > > > https://docs.openstack.org/cinder/latest/configuration/block-storage/scheduler-filters.html#instancelocalityfilter > > > > on drawback to this is that if the if the vm is moved i think you > would > > > > > > need to also migrate the cinder volume > > > > seperatly afterwards. > > > > > > by the way if you were to take this approch i think there is an nvmeof > > > driver so you can use nvme over rdma > > > instead of iscsi. > > > > > > > > > > > > > > > Also, I have read about the LVM device filters - which is > important > > > > > > to > > > > > > avoid the host's LVM from seeing the guest's volumes, in case > anyone > > > > > > else finds this message. > > > > > > > > > > > > > > > Yeah that's a common pitfall when using LVM based ephemeral disks > that > > > > > contain additional LVM PVs/VGs/LVs etc. You need to ensure that the > > > > > > host > > > > > is configured to not scan these LVs in order for their PVs/VGs/LVs > etc > > > > > to remain hidden from the host: > > > > > > > > > > > > > > > > > > > > > > > > > > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lvm_filters > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have been through just about every possible nvme backend option for nova and the one that has turned up to be the most reliable and predictable has been simple defaults so far. Right now I am giving an nvme + nfs backend a spin. It doesn't perform badly, but it is not a local nvme. One of the things I have found with nvme is the mdadm raid driver is just not fast enough to keep up if you use anything other than raid0/1 (10) - I have a raid5 array I have got working pretty good - but its still limited. I don't have any vroc capable equipment, so maybe that will make a difference if implemented. I also have an all nvme ceph cluster I plan to test using cephfs (i know rbd is an option, but where is the fun in that). From my experience over the last two years in working with nvme only things, it seems that nothing comes close to matching the performance of what a couple local nvme drives in raid0 can do. NVME is so fast that the rest of my (old) equipment just can't keep up, it really does push things to the limits of what is possible. The all nvme ceph cluster does push my 40G network to its limits, but I had to create multiple OSD's per nvme to get there - for my gear (intel DC p3600's) I ended up at 3 OSD's per nvme. It seems to me to be limited by network performance. If you have any other questions I am happy to help where I can - I have been working with all nvme stuff for the last couple years and have gotten something into prod for about 1 year with it (maybe a little longer). >From what I can tell, getting max performance from nvme for an instance is a non-trivial task because it's just so much faster than the rest of the stack and careful considerations must be taken to get the most out of it. I am curious to see where you take this Eric -- ~/DonnyD C: 805 814 6800 "No mission too difficult. No sacrifice too great. Duty First" -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at gmx.net Wed Aug 5 14:33:22 2020 From: openstack at gmx.net (Marc Vorwerk) Date: Wed, 05 Aug 2020 16:33:22 +0200 Subject: [nova] Change Volume Type Properties Message-ID: <24E5E9E3-6BF6-492C-BBBB-670DC070CF15@gmx.net> Hi, I'm looking for a way to add the property volume_backend_name to an existing Volume Type which is in use. 
If I try to change this, I got the following error: root at control01rt:~# openstack volume type show test-type +--------------------+--------------------------------------+ | Field              | Value                                | +--------------------+--------------------------------------+ | access_project_ids | None                                 | | description        | None                                 | | id                 | 68febdad-e7b1-4d41-ba11-72d0e1a1cce0 | | is_public          | True                                 | | name               | test-type                            | | properties         |                                      | | qos_specs_id       | None                                 | +--------------------+--------------------------------------+ root at control01rt:~# openstack volume type set --property volume_backend_name=ceph test-type Failed to set volume type property: Volume Type is currently in use. (HTTP 400) (Request-ID: req-2b8f3829-5c16-42c3-ac57-01199688bd58) Command Failed: One or more of the operations failed root at control01rt:~# Problem what I see is, that there are instances/volumes which use this volume type. Have anybody an idea, how I can add the volume_backend_name property to the existing Volume Type? thanks in advance! Regards Marc -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Wed Aug 5 15:14:49 2020 From: mark at stackhpc.com (Mark Goddard) Date: Wed, 5 Aug 2020 16:14:49 +0100 Subject: =?UTF-8?Q?Re=3A_=5Bkolla=5D_Proposing_Micha=C5=82_Nasiadka_for_kayobe=2Dco?= =?UTF-8?Q?re?= In-Reply-To: References: Message-ID: On Tue, 28 Jul 2020 at 16:08, Doug Szumski wrote: > > > On 28/07/2020 15:50, Mark Goddard wrote: > > Hi, > > > > I'd like to propose adding Michał Nasiadka to the kayobe-core group. > > Michał is a valued member of the Kolla core team, and has been > > providing some good patches and reviews for Kayobe too. > > > > Kayobians, please respond with +1/-1. It's been a week, with only approvals - welcome to the core team Michał! > Sounds excellent, +1 for Michał! > > > > Cheers, > > Mark > > From johnsomor at gmail.com Wed Aug 5 15:16:23 2020 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 5 Aug 2020 08:16:23 -0700 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: Looking at that error, it appears that the lb-mgmt-net is not setup correctly. The Octavia controller containers are not able to reach the amphora instances on the lb-mgmt-net subnet. I don't know how kolla is setup to connect the containers to the neutron lb-mgmt-net network. Maybe the above documents will help with that. Michael On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard wrote: > > > On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: > >> Hello Guys, >> >> With Michaels help I was able to solve the problem but now there is >> another error I was able to create my network on vlan but still error >> persist. PFB the logs: >> >> http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ >> >> Kindly help >> >> regards, >> Monika >> ------------------------------ >> *From:* Michael Johnson >> *Sent:* Monday, August 3, 2020 9:10 PM >> *To:* Fabian Zimmermann >> *Cc:* Monika Samal ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Yeah, it looks like nova is failing to boot the instance. 
>> >> Check this setting in your octavia.conf files: >> https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id >> >> Also, if kolla-ansible didn't set both of these values correctly, please >> open bug reports for kolla-ansible. These all should have been configured >> by the deployment tool. >> >> > I wasn't following this thread due to no [kolla] tag, but here are the > recently added docs for Octavia in kolla [1]. Note > the octavia_service_auth_project variable which was added to migrate from > the admin project to the service project for octavia resources. We're > lacking proper automation for the flavor, image etc, but it is being worked > on in Victoria [2]. > > [1] > https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html > [2] https://review.opendev.org/740180 > > Michael >> >> On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann >> wrote: >> >> Seems like the flavor is missing or empty '' - check for typos and enable >> debug. >> >> Check if the nova req contains valid information/flavor. >> >> Fabian >> >> Monika Samal schrieb am Mo., 3. Aug. 2020, >> 15:46: >> >> It's registered >> >> Get Outlook for Android >> ------------------------------ >> *From:* Fabian Zimmermann >> *Sent:* Monday, August 3, 2020 7:08:21 PM >> *To:* Monika Samal ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Did you check the (nova) flavor you use in octavia. >> >> Fabian >> >> Monika Samal schrieb am Mo., 3. Aug. 2020, >> 10:53: >> >> After Michael suggestion I was able to create load balancer but there is >> error in status. >> >> >> >> PFB the error link: >> >> http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ >> ------------------------------ >> *From:* Monika Samal >> *Sent:* Monday, August 3, 2020 2:08 PM >> *To:* Michael Johnson >> *Cc:* Fabian Zimmermann ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Thanks a ton Michael for helping me out >> ------------------------------ >> *From:* Michael Johnson >> *Sent:* Friday, July 31, 2020 3:57 AM >> *To:* Monika Samal >> *Cc:* Fabian Zimmermann ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Just to close the loop on this, the octavia.conf file had >> "project_name = admin" instead of "project_name = service" in the >> [service_auth] section. This was causing the keystone errors when >> Octavia was communicating with neutron. >> >> I don't know if that is a bug in kolla-ansible or was just a local >> configuration issue. >> >> Michael >> >> On Thu, Jul 30, 2020 at 1:39 PM Monika Samal >> wrote: >> > >> > Hello Fabian,, >> > >> > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ >> > >> > Regards, >> > Monika >> > ________________________________ >> > From: Fabian Zimmermann >> > Sent: Friday, July 31, 2020 1:57 AM >> > To: Monika Samal >> > Cc: Michael Johnson ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> > Subject: Re: [openstack-community] Octavia :; Unable to create load >> balancer >> > >> > Hi, >> > >> > just to debug, could you replace the auth_type password with v3password? >> > >> > And do a curl against your :5000 and :35357 urls and paste the output. 
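For reference, the two octavia.conf pieces pointed at above look roughly like this; the flavor ID is a placeholder, and on a kolla-ansible deployment these values should come from the deployment tooling rather than hand edits:

[service_auth]
# must be the service project, not admin, or calls to neutron/nova fail
project_name = service

[controller_worker]
# nova flavor used to boot the amphora instances
amp_flavor_id = <amphora-flavor-id>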
>> > >> > Fabian >> > >> > Monika Samal schrieb am Do., 30. Juli 2020, >> 22:15: >> > >> > Hello Fabian, >> > >> > http://paste.openstack.org/show/796477/ >> > >> > Thanks, >> > Monika >> > ________________________________ >> > From: Fabian Zimmermann >> > Sent: Friday, July 31, 2020 1:38 AM >> > To: Monika Samal >> > Cc: Michael Johnson ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> > Subject: Re: [openstack-community] Octavia :; Unable to create load >> balancer >> > >> > The sections should be >> > >> > service_auth >> > keystone_authtoken >> > >> > if i read the docs correctly. Maybe you can just paste your config >> (remove/change passwords) to paste.openstack.org and post the link? >> > >> > Fabian >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnasiadka at gmail.com Wed Aug 5 15:28:52 2020 From: mnasiadka at gmail.com (=?utf-8?Q?Micha=C5=82_Nasiadka?=) Date: Wed, 5 Aug 2020 17:28:52 +0200 Subject: =?utf-8?Q?Re=3A_=5Bkolla=5D_Proposing_Micha=C5=82_Nasiadka_for_ka?= =?utf-8?Q?yobe-core?= In-Reply-To: References: Message-ID: Hi, Thanks for being a part of such a great team! Best regards, Michal > On 5 Aug 2020, at 17:14, Mark Goddard wrote: > > On Tue, 28 Jul 2020 at 16:08, Doug Szumski wrote: >> >> >> On 28/07/2020 15:50, Mark Goddard wrote: >>> Hi, >>> >>> I'd like to propose adding Michał Nasiadka to the kayobe-core group. >>> Michał is a valued member of the Kolla core team, and has been >>> providing some good patches and reviews for Kayobe too. >>> >>> Kayobians, please respond with +1/-1. > It's been a week, with only approvals - welcome to the core team Michał! >> Sounds excellent, +1 for Michał! >>> >>> Cheers, >>> Mark >>> > From jasowang at redhat.com Wed Aug 5 02:22:15 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 5 Aug 2020 10:22:15 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200804183503.39f56516.cohuck@redhat.com> References: <20200713232957.GD5955@joy-OptiPlex-7040> <9bfa8700-91f5-ebb4-3977-6321f0487a63@redhat.com> <20200716083230.GA25316@joy-OptiPlex-7040> <20200717101258.65555978@x1.home> <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> Message-ID: On 2020/8/5 上午12:35, Cornelia Huck wrote: > [sorry about not chiming in earlier] > > On Wed, 29 Jul 2020 16:05:03 +0800 > Yan Zhao wrote: > >> On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: > (...) > >>> Based on the feedback we've received, the previously proposed interface >>> is not viable. I think there's agreement that the user needs to be >>> able to parse and interpret the version information. Using json seems >>> viable, but I don't know if it's the best option. Is there any >>> precedent of markup strings returned via sysfs we could follow? > I don't think encoding complex information in a sysfs file is a viable > approach. Quoting Documentation/filesystems/sysfs.rst: > > "Attributes should be ASCII text files, preferably with only one value > per file. It is noted that it may not be efficient to contain only one > value per file, so it is socially acceptable to express an array of > values of the same type. > > Mixing types, expressing multiple lines of data, and doing fancy > formatting of data is heavily frowned upon." 
> > Even though this is an older file, I think these restrictions still > apply. +1, that's another reason why devlink(netlink) is better. Thanks From yan.y.zhao at intel.com Wed Aug 5 02:16:54 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Wed, 5 Aug 2020 10:16:54 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200713232957.GD5955@joy-OptiPlex-7040> <9bfa8700-91f5-ebb4-3977-6321f0487a63@redhat.com> <20200716083230.GA25316@joy-OptiPlex-7040> <20200717101258.65555978@x1.home> <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> Message-ID: <20200805021654.GB30485@joy-OptiPlex-7040> On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote: > > On 2020/8/5 上午12:35, Cornelia Huck wrote: > > [sorry about not chiming in earlier] > > > > On Wed, 29 Jul 2020 16:05:03 +0800 > > Yan Zhao wrote: > > > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: > > (...) > > > > > > Based on the feedback we've received, the previously proposed interface > > > > is not viable. I think there's agreement that the user needs to be > > > > able to parse and interpret the version information. Using json seems > > > > viable, but I don't know if it's the best option. Is there any > > > > precedent of markup strings returned via sysfs we could follow? > > I don't think encoding complex information in a sysfs file is a viable > > approach. Quoting Documentation/filesystems/sysfs.rst: > > > > "Attributes should be ASCII text files, preferably with only one value > > per file. It is noted that it may not be efficient to contain only one > > value per file, so it is socially acceptable to express an array of > > values of the same type. > > Mixing types, expressing multiple lines of data, and doing fancy > > formatting of data is heavily frowned upon." > > > > Even though this is an older file, I think these restrictions still > > apply. > > > +1, that's another reason why devlink(netlink) is better. > hi Jason, do you have any materials or sample code about devlink, so we can have a good study of it? I found some kernel docs about it but my preliminary study didn't show me the advantage of devlink. Thanks Yan From jasowang at redhat.com Wed Aug 5 02:41:54 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 5 Aug 2020 10:41:54 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200805021654.GB30485@joy-OptiPlex-7040> References: <20200713232957.GD5955@joy-OptiPlex-7040> <9bfa8700-91f5-ebb4-3977-6321f0487a63@redhat.com> <20200716083230.GA25316@joy-OptiPlex-7040> <20200717101258.65555978@x1.home> <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> Message-ID: <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> On 2020/8/5 上午10:16, Yan Zhao wrote: > On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote: >> On 2020/8/5 上午12:35, Cornelia Huck wrote: >>> [sorry about not chiming in earlier] >>> >>> On Wed, 29 Jul 2020 16:05:03 +0800 >>> Yan Zhao wrote: >>> >>>> On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: >>> (...) 
>>> >>>>> Based on the feedback we've received, the previously proposed interface >>>>> is not viable. I think there's agreement that the user needs to be >>>>> able to parse and interpret the version information. Using json seems >>>>> viable, but I don't know if it's the best option. Is there any >>>>> precedent of markup strings returned via sysfs we could follow? >>> I don't think encoding complex information in a sysfs file is a viable >>> approach. Quoting Documentation/filesystems/sysfs.rst: >>> >>> "Attributes should be ASCII text files, preferably with only one value >>> per file. It is noted that it may not be efficient to contain only one >>> value per file, so it is socially acceptable to express an array of >>> values of the same type. >>> Mixing types, expressing multiple lines of data, and doing fancy >>> formatting of data is heavily frowned upon." >>> >>> Even though this is an older file, I think these restrictions still >>> apply. >> >> +1, that's another reason why devlink(netlink) is better. >> > hi Jason, > do you have any materials or sample code about devlink, so we can have a good > study of it? > I found some kernel docs about it but my preliminary study didn't show me the > advantage of devlink. CC Jiri and Parav for a better answer for this. My understanding is that the following advantages are obvious (as I replied in another thread): - existing users (NIC, crypto, SCSI, ib), mature and stable - much better error reporting (ext_ack other than string or errno) - namespace aware - do not couple with kobject Thanks > > Thanks > Yan > From jiri at mellanox.com Wed Aug 5 07:56:47 2020 From: jiri at mellanox.com (Jiri Pirko) Date: Wed, 5 Aug 2020 09:56:47 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> References: <20200716083230.GA25316@joy-OptiPlex-7040> <20200717101258.65555978@x1.home> <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> Message-ID: <20200805075647.GB2177@nanopsycho> Wed, Aug 05, 2020 at 04:41:54AM CEST, jasowang at redhat.com wrote: > >On 2020/8/5 上午10:16, Yan Zhao wrote: >> On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote: >> > On 2020/8/5 上午12:35, Cornelia Huck wrote: >> > > [sorry about not chiming in earlier] >> > > >> > > On Wed, 29 Jul 2020 16:05:03 +0800 >> > > Yan Zhao wrote: >> > > >> > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: >> > > (...) >> > > >> > > > > Based on the feedback we've received, the previously proposed interface >> > > > > is not viable. I think there's agreement that the user needs to be >> > > > > able to parse and interpret the version information. Using json seems >> > > > > viable, but I don't know if it's the best option. Is there any >> > > > > precedent of markup strings returned via sysfs we could follow? >> > > I don't think encoding complex information in a sysfs file is a viable >> > > approach. Quoting Documentation/filesystems/sysfs.rst: >> > > >> > > "Attributes should be ASCII text files, preferably with only one value >> > > per file. It is noted that it may not be efficient to contain only one >> > > value per file, so it is socially acceptable to express an array of >> > > values of the same type. 
>> > > Mixing types, expressing multiple lines of data, and doing fancy >> > > formatting of data is heavily frowned upon." >> > > >> > > Even though this is an older file, I think these restrictions still >> > > apply. >> > >> > +1, that's another reason why devlink(netlink) is better. >> > >> hi Jason, >> do you have any materials or sample code about devlink, so we can have a good >> study of it? >> I found some kernel docs about it but my preliminary study didn't show me the >> advantage of devlink. > > >CC Jiri and Parav for a better answer for this. > >My understanding is that the following advantages are obvious (as I replied >in another thread): > >- existing users (NIC, crypto, SCSI, ib), mature and stable >- much better error reporting (ext_ack other than string or errno) >- namespace aware >- do not couple with kobject Jason, what is your use case? > >Thanks > > >> >> Thanks >> Yan >> > From jasowang at redhat.com Wed Aug 5 08:02:48 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 5 Aug 2020 16:02:48 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200805075647.GB2177@nanopsycho> References: <20200716083230.GA25316@joy-OptiPlex-7040> <20200717101258.65555978@x1.home> <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> Message-ID: On 2020/8/5 下午3:56, Jiri Pirko wrote: > Wed, Aug 05, 2020 at 04:41:54AM CEST, jasowang at redhat.com wrote: >> On 2020/8/5 上午10:16, Yan Zhao wrote: >>> On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote: >>>> On 2020/8/5 上午12:35, Cornelia Huck wrote: >>>>> [sorry about not chiming in earlier] >>>>> >>>>> On Wed, 29 Jul 2020 16:05:03 +0800 >>>>> Yan Zhao wrote: >>>>> >>>>>> On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: >>>>> (...) >>>>> >>>>>>> Based on the feedback we've received, the previously proposed interface >>>>>>> is not viable. I think there's agreement that the user needs to be >>>>>>> able to parse and interpret the version information. Using json seems >>>>>>> viable, but I don't know if it's the best option. Is there any >>>>>>> precedent of markup strings returned via sysfs we could follow? >>>>> I don't think encoding complex information in a sysfs file is a viable >>>>> approach. Quoting Documentation/filesystems/sysfs.rst: >>>>> >>>>> "Attributes should be ASCII text files, preferably with only one value >>>>> per file. It is noted that it may not be efficient to contain only one >>>>> value per file, so it is socially acceptable to express an array of >>>>> values of the same type. >>>>> Mixing types, expressing multiple lines of data, and doing fancy >>>>> formatting of data is heavily frowned upon." >>>>> >>>>> Even though this is an older file, I think these restrictions still >>>>> apply. >>>> +1, that's another reason why devlink(netlink) is better. >>>> >>> hi Jason, >>> do you have any materials or sample code about devlink, so we can have a good >>> study of it? >>> I found some kernel docs about it but my preliminary study didn't show me the >>> advantage of devlink. >> >> CC Jiri and Parav for a better answer for this. 
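As one concrete illustration of the kind of structured reporting devlink already does today, "devlink dev info" returns driver and firmware version information for a device; the device name and values below are purely illustrative:

$ devlink dev info pci/0000:82:00.0
pci/0000:82:00.0:
  driver mlx5_core
  versions:
      running:
        fw.version 16.26.1040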
>> >> My understanding is that the following advantages are obvious (as I replied >> in another thread): >> >> - existing users (NIC, crypto, SCSI, ib), mature and stable >> - much better error reporting (ext_ack other than string or errno) >> - namespace aware >> - do not couple with kobject > Jason, what is your use case? I think the use case is to report device compatibility for live migration. Yan proposed a simple sysfs based migration version first, but it looks not sufficient and something based on JSON is discussed. Yan, can you help to summarize the discussion so far for Jiri as a reference? Thanks > > > >> Thanks >> >> >>> Thanks >>> Yan >>> From dgilbert at redhat.com Wed Aug 5 09:44:23 2020 From: dgilbert at redhat.com (Dr. David Alan Gilbert) Date: Wed, 5 Aug 2020 10:44:23 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200804083708.GA30485@joy-OptiPlex-7040> References: <20200717101258.65555978@x1.home> <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200729131255.68730f68@x1.home> <20200730034104.GB32327@joy-OptiPlex-7040> <20200730112930.6f4c5762@x1.home> <20200804083708.GA30485@joy-OptiPlex-7040> Message-ID: <20200805094423.GB3004@work-vm> * Yan Zhao (yan.y.zhao at intel.com) wrote: > > > yes, include a device_api field is better. > > > for mdev, "device_type=vfio-mdev", is it right? > > > > No, vfio-mdev is not a device API, it's the driver that attaches to the > > mdev bus device to expose it through vfio. The device_api exposes the > > actual interface of the vfio device, it's also vfio-pci for typical > > mdev devices found on x86, but may be vfio-ccw, vfio-ap, etc... See > > VFIO_DEVICE_API_PCI_STRING and friends. > > > ok. got it. > > > > > > > device_id=8086591d > > > > > > > > Is device_id interpreted relative to device_type? How does this > > > > relate to mdev_type? If we have an mdev_type, doesn't that fully > > > > defined the software API? > > > > > > > it's parent pci id for mdev actually. > > > > If we need to specify the parent PCI ID then something is fundamentally > > wrong with the mdev_type. The mdev_type should define a unique, > > software compatible interface, regardless of the parent device IDs. If > > a i915-GVTg_V5_2 means different things based on the parent device IDs, > > then then different mdev_types should be reported for those parent > > devices. > > > hmm, then do we allow vendor specific fields? > or is it a must that a vendor specific field should have corresponding > vendor attribute? > > another thing is that the definition of mdev_type in GVT only corresponds > to vGPU computing ability currently, > e.g. i915-GVTg_V5_2, is 1/2 of a gen9 IGD, i915-GVTg_V4_2 is 1/2 of a > gen8 IGD. > It is too coarse-grained to live migration compatibility. Can you explain why that's too coarse? Is this because it's too specific (i.e. that a i915-GVTg_V4_2 could be migrated to a newer device?), or that it's too specific on the exact sizings (i.e. that there may be multiple different sizes of a gen9)? Dave > Do you think we need to update GVT's definition of mdev_type? > And is there any guide in mdev_type definition? > > > > > > > mdev_type=i915-GVTg_V5_2 > > > > > > > > And how are non-mdev devices represented? > > > > > > > non-mdev can opt to not include this field, or as you said below, a > > > vendor signature. 
> > > > > > > > > aggregator=1 > > > > > > pv_mode="none+ppgtt+context" > > > > > > > > These are meaningless vendor specific matches afaict. > > > > > > > yes, pv_mode and aggregator are vendor specific fields. > > > but they are important to decide whether two devices are compatible. > > > pv_mode means whether a vGPU supports guest paravirtualized api. > > > "none+ppgtt+context" means guest can not use pv, or use ppgtt mode pv or > > > use context mode pv. > > > > > > > > > interface_version=3 > > > > > > > > Not much granularity here, I prefer Sean's previous > > > > .[.bugfix] scheme. > > > > > > > yes, .[.bugfix] scheme may be better, but I'm not sure if > > > it works for a complicated scenario. > > > e.g for pv_mode, > > > (1) initially, pv_mode is not supported, so it's pv_mode=none, it's 0.0.0, > > > (2) then, pv_mode=ppgtt is supported, pv_mode="none+ppgtt", it's 0.1.0, > > > indicating pv_mode=none can migrate to pv_mode="none+ppgtt", but not vice versa. > > > (3) later, pv_mode=context is also supported, > > > pv_mode="none+ppgtt+context", so it's 0.2.0. > > > > > > But if later, pv_mode=ppgtt is removed. pv_mode="none+context", how to > > > name its version? "none+ppgtt" (0.1.0) is not compatible to > > > "none+context", but "none+ppgtt+context" (0.2.0) is compatible to > > > "none+context". > > > > If pv_mode=ppgtt is removed, then the compatible versions would be > > 0.0.0 or 1.0.0, ie. the major version would be incremented due to > > feature removal. > > > > > Maintain such scheme is painful to vendor driver. > > > > Migration compatibility is painful, there's no way around that. I > > think the version scheme is an attempt to push some of that low level > > burden on the vendor driver, otherwise the management tools need to > > work on an ever growing matrix of vendor specific features which is > > going to become unwieldy and is largely meaningless outside of the > > vendor driver. Instead, the vendor driver can make strategic decisions > > about where to continue to maintain a support burden and make explicit > > decisions to maintain or break compatibility. The version scheme is a > > simplification and abstraction of vendor driver features in order to > > create a small, logical compatibility matrix. Compromises necessarily > > need to be made for that to occur. > > > ok. got it. > > > > > > > COMPATIBLE: > > > > > > device_type=pci > > > > > > device_id=8086591d > > > > > > mdev_type=i915-GVTg_V5_{val1:int:1,2,4,8} > > > > > this mixed notation will be hard to parse so i would avoid that. > > > > > > > > Some background, Intel has been proposing aggregation as a solution to > > > > how we scale mdev devices when hardware exposes large numbers of > > > > assignable objects that can be composed in essentially arbitrary ways. > > > > So for instance, if we have a workqueue (wq), we might have an mdev > > > > type for 1wq, 2wq, 3wq,... Nwq. It's not really practical to expose a > > > > discrete mdev type for each of those, so they want to define a base > > > > type which is composable to other types via this aggregation. This is > > > > what this substitution and tagging is attempting to accomplish. So > > > > imagine this set of values for cases where it's not practical to unroll > > > > the values for N discrete types. > > > > > > > > > > aggregator={val1}/2 > > > > > > > > So the {val1} above would be substituted here, though an aggregation > > > > factor of 1/2 is a head scratcher... 
> > > > > > > > > > pv_mode={val2:string:"none+ppgtt","none+context","none+ppgtt+context"} > > > > > > > > I'm lost on this one though. I think maybe it's indicating that it's > > > > compatible with any of these, so do we need to list it? Couldn't this > > > > be handled by Sean's version proposal where the minor version > > > > represents feature compatibility? > > > yes, it's indicating that it's compatible with any of these. > > > Sean's version proposal may also work, but it would be painful for > > > vendor driver to maintain the versions when multiple similar features > > > are involved. > > > > This is something vendor drivers need to consider when adding and > > removing features. > > > > > > > > interface_version={val3:int:2,3} > > > > > > > > What does this turn into in a few years, 2,7,12,23,75,96,... > > > > > > > is a range better? > > > > I was really trying to point out that sparseness becomes an issue if > > the vendor driver is largely disconnected from how their feature > > addition and deprecation affects migration support. Thanks, > > > ok. we'll use the x.y.z scheme then. > > Thanks > Yan > -- Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK From yan.y.zhao at intel.com Wed Aug 5 09:33:38 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Wed, 5 Aug 2020 17:33:38 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200721005113.GA10502@joy-OptiPlex-7040> <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> Message-ID: <20200805093338.GC30485@joy-OptiPlex-7040> On Wed, Aug 05, 2020 at 04:02:48PM +0800, Jason Wang wrote: > > On 2020/8/5 下午3:56, Jiri Pirko wrote: > > Wed, Aug 05, 2020 at 04:41:54AM CEST, jasowang at redhat.com wrote: > > > On 2020/8/5 上午10:16, Yan Zhao wrote: > > > > On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote: > > > > > On 2020/8/5 上午12:35, Cornelia Huck wrote: > > > > > > [sorry about not chiming in earlier] > > > > > > > > > > > > On Wed, 29 Jul 2020 16:05:03 +0800 > > > > > > Yan Zhao wrote: > > > > > > > > > > > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: > > > > > > (...) > > > > > > > > > > > > > > Based on the feedback we've received, the previously proposed interface > > > > > > > > is not viable. I think there's agreement that the user needs to be > > > > > > > > able to parse and interpret the version information. Using json seems > > > > > > > > viable, but I don't know if it's the best option. Is there any > > > > > > > > precedent of markup strings returned via sysfs we could follow? > > > > > > I don't think encoding complex information in a sysfs file is a viable > > > > > > approach. Quoting Documentation/filesystems/sysfs.rst: > > > > > > > > > > > > "Attributes should be ASCII text files, preferably with only one value > > > > > > per file. It is noted that it may not be efficient to contain only one > > > > > > value per file, so it is socially acceptable to express an array of > > > > > > values of the same type. > > > > > > Mixing types, expressing multiple lines of data, and doing fancy > > > > > > formatting of data is heavily frowned upon." > > > > > > > > > > > > Even though this is an older file, I think these restrictions still > > > > > > apply. 
> > > > > +1, that's another reason why devlink(netlink) is better. > > > > > > > > > hi Jason, > > > > do you have any materials or sample code about devlink, so we can have a good > > > > study of it? > > > > I found some kernel docs about it but my preliminary study didn't show me the > > > > advantage of devlink. > > > > > > CC Jiri and Parav for a better answer for this. > > > > > > My understanding is that the following advantages are obvious (as I replied > > > in another thread): > > > > > > - existing users (NIC, crypto, SCSI, ib), mature and stable > > > - much better error reporting (ext_ack other than string or errno) > > > - namespace aware > > > - do not couple with kobject > > Jason, what is your use case? > > > I think the use case is to report device compatibility for live migration. > Yan proposed a simple sysfs based migration version first, but it looks not > sufficient and something based on JSON is discussed. > > Yan, can you help to summarize the discussion so far for Jiri as a > reference? > yes. we are currently defining a device live migration compatibility interface in order to let user space like openstack and libvirt know which two devices are live migration compatible. currently the devices include mdev (a kernel emulated virtual device) and physical devices (e.g. a VF of a PCI SRIOV device). the attributes we want user space to compare include the following. common attributes: device_api: vfio-pci, vfio-ccw... mdev_type: the mdev type of an mdev, or a similar signature for a physical device. It specifies a device's hardware capability, e.g. i915-GVTg_V5_4 means it's 1/4 of a gen9 Intel graphics device. software_version: the device driver's version, in a major.minor[.bugfix] scheme, where there is no compatibility across major versions, minor versions have forward compatibility (ex. 1 -> 2 is ok, 2 -> 1 is not), and the bugfix version number indicates some degree of internal improvement that is not visible to the user in terms of features or compatibility. vendor specific attributes: each vendor may define different attributes. device id: the device id of a physical device or an mdev's parent pci device; it could be equal to the pci id for pci devices. aggregator: used together with mdev_type, e.g. aggregator=2 together with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel graphics device. remote_url: for a local NVMe VF, it may be configured with the remote url of a remote storage, and all data is stored on the remote side specified by the remote url. ... Comparing those attributes in user space alone is not an easy job, as it can't simply assume an equal relationship between source and target attributes. e.g. a source device with mdev_type=i915-GVTg_V5_4,aggregator=2 (1/2 of gen9) could actually find a compatible device with mdev_type=i915-GVTg_V5_8,aggregator=4 (also 1/2 of gen9) if mdev_type i915-GVTg_V5_4 is not available on the target machine. So, in our current proposal, we want to create two sysfs attributes under a device sysfs node. /sys/<dev>/migration/self /sys/<dev>/migration/compatible #cat /sys/<dev>/migration/self device_type=vfio_pci mdev_type=i915-GVTg_V5_4 device_id=8086591d aggregator=2 software_version=1.0.0 #cat /sys/<dev>/migration/compatible device_type=vfio_pci mdev_type=i915-GVTg_V5_{val1:int:2,4,8} device_id=8086591d aggregator={val1}/2 software_version=1.0.0 The /sys/<dev>/migration/self attribute specifies a device's own attributes. The /sys/<dev>/migration/compatible attribute specifies the list of devices compatible with it.
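To make that concrete with a hypothetical target (the values are only illustrative): a target device whose migration/self reads

device_type=vfio_pci
mdev_type=i915-GVTg_V5_8
device_id=8086591d
aggregator=4
software_version=1.0.0

satisfies the compatible description above, since val1=8 is in the allowed set and aggregator equals 8/2=4, while a target reporting mdev_type=i915-GVTg_V5_8 with aggregator=2 would not match.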
as in the example, compatible devices could have device_type == vfio_pci && device_id == 8086591d && software_version == 1.0.0 && ( (mdev_type of i915-GVTg_V5_2 && aggregator==1) || (mdev_type of i915-GVTg_V5_4 && aggregator==2) || (mdev_type of i915-GVTg_V5_8 && aggregator=4) ) by comparing whether a target device is in compatible list of source device, the user space can know whether a two devices are live migration compatible. Additional notes: 1)software_version in the compatible list may not be necessary as it already has a major.minor.bugfix scheme. 2)for vendor attribute like remote_url, it may not be statically assigned and could be changed with a device interface. So, as Cornelia pointed that it's not good to use complex format in a sysfs attribute, we'd like to know whether there're other good ways to our use case, e.g. splitting a single attribute to multiple simple sysfs attributes as what Cornelia suggested or devlink that Jason has strongly recommended. Thanks Yan From jiri at mellanox.com Wed Aug 5 10:53:19 2020 From: jiri at mellanox.com (Jiri Pirko) Date: Wed, 5 Aug 2020 12:53:19 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200805093338.GC30485@joy-OptiPlex-7040> References: <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> Message-ID: <20200805105319.GF2177@nanopsycho> Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote: >On Wed, Aug 05, 2020 at 04:02:48PM +0800, Jason Wang wrote: >> >> On 2020/8/5 下午3:56, Jiri Pirko wrote: >> > Wed, Aug 05, 2020 at 04:41:54AM CEST, jasowang at redhat.com wrote: >> > > On 2020/8/5 上午10:16, Yan Zhao wrote: >> > > > On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote: >> > > > > On 2020/8/5 上午12:35, Cornelia Huck wrote: >> > > > > > [sorry about not chiming in earlier] >> > > > > > >> > > > > > On Wed, 29 Jul 2020 16:05:03 +0800 >> > > > > > Yan Zhao wrote: >> > > > > > >> > > > > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: >> > > > > > (...) >> > > > > > >> > > > > > > > Based on the feedback we've received, the previously proposed interface >> > > > > > > > is not viable. I think there's agreement that the user needs to be >> > > > > > > > able to parse and interpret the version information. Using json seems >> > > > > > > > viable, but I don't know if it's the best option. Is there any >> > > > > > > > precedent of markup strings returned via sysfs we could follow? >> > > > > > I don't think encoding complex information in a sysfs file is a viable >> > > > > > approach. Quoting Documentation/filesystems/sysfs.rst: >> > > > > > >> > > > > > "Attributes should be ASCII text files, preferably with only one value >> > > > > > per file. It is noted that it may not be efficient to contain only one >> > > > > > value per file, so it is socially acceptable to express an array of >> > > > > > values of the same type. >> > > > > > Mixing types, expressing multiple lines of data, and doing fancy >> > > > > > formatting of data is heavily frowned upon." >> > > > > > >> > > > > > Even though this is an older file, I think these restrictions still >> > > > > > apply. >> > > > > +1, that's another reason why devlink(netlink) is better. 
>> > > > > >> > > > hi Jason, >> > > > do you have any materials or sample code about devlink, so we can have a good >> > > > study of it? >> > > > I found some kernel docs about it but my preliminary study didn't show me the >> > > > advantage of devlink. >> > > >> > > CC Jiri and Parav for a better answer for this. >> > > >> > > My understanding is that the following advantages are obvious (as I replied >> > > in another thread): >> > > >> > > - existing users (NIC, crypto, SCSI, ib), mature and stable >> > > - much better error reporting (ext_ack other than string or errno) >> > > - namespace aware >> > > - do not couple with kobject >> > Jason, what is your use case? >> >> >> I think the use case is to report device compatibility for live migration. >> Yan proposed a simple sysfs based migration version first, but it looks not >> sufficient and something based on JSON is discussed. >> >> Yan, can you help to summarize the discussion so far for Jiri as a >> reference? >> >yes. >we are currently defining an device live migration compatibility >interface in order to let user space like openstack and libvirt knows >which two devices are live migration compatible. >currently the devices include mdev (a kernel emulated virtual device) >and physical devices (e.g. a VF of a PCI SRIOV device). > >the attributes we want user space to compare including >common attribues: > device_api: vfio-pci, vfio-ccw... > mdev_type: mdev type of mdev or similar signature for physical device > It specifies a device's hardware capability. e.g. > i915-GVTg_V5_4 means it's of 1/4 of a gen9 Intel graphics > device. > software_version: device driver's version. > in .[.bugfix] scheme, where there is no > compatibility across major versions, minor versions have > forward compatibility (ex. 1-> 2 is ok, 2 -> 1 is not) and > bugfix version number indicates some degree of internal > improvement that is not visible to the user in terms of > features or compatibility, > >vendor specific attributes: each vendor may define different attributes > device id : device id of a physical devices or mdev's parent pci device. > it could be equal to pci id for pci devices > aggregator: used together with mdev_type. e.g. aggregator=2 together > with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel > graphics device. > remote_url: for a local NVMe VF, it may be configured with a remote > url of a remote storage and all data is stored in the > remote side specified by the remote url. > ... > >Comparing those attributes by user space alone is not an easy job, as it >can't simply assume an equal relationship between source attributes and >target attributes. e.g. >for a source device of mdev_type=i915-GVTg_V5_4,aggregator=2, (1/2 of >gen9), it actually could find a compatible device of >mdev_type=i915-GVTg_V5_8,aggregator=4 (also 1/2 of gen9), >if mdev_type of i915-GVTg_V5_4 is not available in the target machine. > >So, in our current proposal, we want to create two sysfs attributes >under a device sysfs node. >/sys//migration/self >/sys//migration/compatible > >#cat /sys//migration/self >device_type=vfio_pci >mdev_type=i915-GVTg_V5_4 >device_id=8086591d >aggregator=2 >software_version=1.0.0 > >#cat /sys//migration/compatible >device_type=vfio_pci >mdev_type=i915-GVTg_V5_{val1:int:2,4,8} >device_id=8086591d >aggregator={val1}/2 >software_version=1.0.0 > >The /sys//migration/self specifies self attributes of >a device. >The /sys//migration/compatible specifies the list of >compatible devices of a device. 
as in the example, compatible devices >could have > device_type == vfio_pci && > device_id == 8086591d && > software_version == 1.0.0 && > ( > (mdev_type of i915-GVTg_V5_2 && aggregator==1) || > (mdev_type of i915-GVTg_V5_4 && aggregator==2) || > (mdev_type of i915-GVTg_V5_8 && aggregator=4) > ) > >by comparing whether a target device is in compatible list of source >device, the user space can know whether a two devices are live migration >compatible. > >Additional notes: >1)software_version in the compatible list may not be necessary as it >already has a major.minor.bugfix scheme. >2)for vendor attribute like remote_url, it may not be statically >assigned and could be changed with a device interface. > >So, as Cornelia pointed that it's not good to use complex format in >a sysfs attribute, we'd like to know whether there're other good ways to >our use case, e.g. splitting a single attribute to multiple simple sysfs >attributes as what Cornelia suggested or devlink that Jason has strongly >recommended. Hi Yan. Thanks for the explanation, I'm still fuzzy about the details. Anyway, I suggest you to check "devlink dev info" command we have implemented for multiple drivers. You can try netdevsim to test this. I think that the info you need to expose might be put there. Devlink creates instance per-device. Specific device driver calls into devlink core to create the instance. What device do you have? What driver is it handled by? > >Thanks >Yan > > > From smooney at redhat.com Wed Aug 5 11:35:01 2020 From: smooney at redhat.com (Sean Mooney) Date: Wed, 05 Aug 2020 12:35:01 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200805105319.GF2177@nanopsycho> References: <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> Message-ID: <4cf2824c803c96496e846c5b06767db305e9fb5a.camel@redhat.com> On Wed, 2020-08-05 at 12:53 +0200, Jiri Pirko wrote: > Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote: > > On Wed, Aug 05, 2020 at 04:02:48PM +0800, Jason Wang wrote: > > > > > > On 2020/8/5 下午3:56, Jiri Pirko wrote: > > > > Wed, Aug 05, 2020 at 04:41:54AM CEST, jasowang at redhat.com wrote: > > > > > On 2020/8/5 上午10:16, Yan Zhao wrote: > > > > > > On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote: > > > > > > > On 2020/8/5 上午12:35, Cornelia Huck wrote: > > > > > > > > [sorry about not chiming in earlier] > > > > > > > > > > > > > > > > On Wed, 29 Jul 2020 16:05:03 +0800 > > > > > > > > Yan Zhao wrote: > > > > > > > > > > > > > > > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: > > > > > > > > > > > > > > > > (...) > > > > > > > > > > > > > > > > > > Based on the feedback we've received, the previously proposed interface > > > > > > > > > > is not viable. I think there's agreement that the user needs to be > > > > > > > > > > able to parse and interpret the version information. Using json seems > > > > > > > > > > viable, but I don't know if it's the best option. Is there any > > > > > > > > > > precedent of markup strings returned via sysfs we could follow? > > > > > > > > > > > > > > > > I don't think encoding complex information in a sysfs file is a viable > > > > > > > > approach. 
Quoting Documentation/filesystems/sysfs.rst: > > > > > > > > > > > > > > > > "Attributes should be ASCII text files, preferably with only one value > > > > > > > > per file. It is noted that it may not be efficient to contain only one > > > > > > > > value per file, so it is socially acceptable to express an array of > > > > > > > > values of the same type. > > > > > > > > Mixing types, expressing multiple lines of data, and doing fancy > > > > > > > > formatting of data is heavily frowned upon." > > > > > > > > > > > > > > > > Even though this is an older file, I think these restrictions still > > > > > > > > apply. > > > > > > > > > > > > > > +1, that's another reason why devlink(netlink) is better. > > > > > > > > > > > > > > > > > > > hi Jason, > > > > > > do you have any materials or sample code about devlink, so we can have a good > > > > > > study of it? > > > > > > I found some kernel docs about it but my preliminary study didn't show me the > > > > > > advantage of devlink. > > > > > > > > > > CC Jiri and Parav for a better answer for this. > > > > > > > > > > My understanding is that the following advantages are obvious (as I replied > > > > > in another thread): > > > > > > > > > > - existing users (NIC, crypto, SCSI, ib), mature and stable > > > > > - much better error reporting (ext_ack other than string or errno) > > > > > - namespace aware > > > > > - do not couple with kobject > > > > > > > > Jason, what is your use case? > > > > > > > > > I think the use case is to report device compatibility for live migration. > > > Yan proposed a simple sysfs based migration version first, but it looks not > > > sufficient and something based on JSON is discussed. > > > > > > Yan, can you help to summarize the discussion so far for Jiri as a > > > reference? > > > > > > > yes. > > we are currently defining an device live migration compatibility > > interface in order to let user space like openstack and libvirt knows > > which two devices are live migration compatible. > > currently the devices include mdev (a kernel emulated virtual device) > > and physical devices (e.g. a VF of a PCI SRIOV device). > > > > the attributes we want user space to compare including > > common attribues: > > device_api: vfio-pci, vfio-ccw... > > mdev_type: mdev type of mdev or similar signature for physical device > > It specifies a device's hardware capability. e.g. > > i915-GVTg_V5_4 means it's of 1/4 of a gen9 Intel graphics > > device. by the way this nameing sceam works the opisite of how it would have expected i woudl have expected to i915-GVTg_V5 to be the same as i915-GVTg_V5_1 and i915-GVTg_V5_4 to use 4 times the amount of resouce as i915-GVTg_V5_1 not 1 quarter. i would much rather see i915-GVTg_V5_4 express as aggreataor:i915-GVTg_V5=4 e.g. that it is 4 of the basic i915-GVTg_V5 type the invertion of the relationship makes this much harder to resonabout IMO. if i915-GVTg_V5_8 and i915-GVTg_V5_4 are both actully claiming the same resouce and both can be used at the same time with your suggested nameing scemem i have have to fine the mdevtype with the largest value and store that then do math by devidign it by the suffix of the requested type every time i want to claim the resouce in our placement inventoies. if we represent it the way i suggest we dont if it i915-GVTg_V5_8 i know its using 8 of i915-GVTg_V5 it makes it significantly simpler. > > software_version: device driver's version. 
> > in .[.bugfix] scheme, where there is no > > compatibility across major versions, minor versions have > > forward compatibility (ex. 1-> 2 is ok, 2 -> 1 is not) and > > bugfix version number indicates some degree of internal > > improvement that is not visible to the user in terms of > > features or compatibility, > > > > vendor specific attributes: each vendor may define different attributes > > device id : device id of a physical devices or mdev's parent pci device. > > it could be equal to pci id for pci devices > > aggregator: used together with mdev_type. e.g. aggregator=2 together > > with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel > > graphics device. > > remote_url: for a local NVMe VF, it may be configured with a remote > > url of a remote storage and all data is stored in the > > remote side specified by the remote url. > > ... just a minor not that i find ^ much more simmple to understand then the current proposal with self and compatiable. if i have well defiend attibute that i can parse and understand that allow me to calulate the what is and is not compatible that is likely going to more useful as you wont have to keep maintianing a list of other compatible devices every time a new sku is released. in anycase thank for actully shareing ^ as it make it simpler to reson about what you have previously proposed. > > > > Comparing those attributes by user space alone is not an easy job, as it > > can't simply assume an equal relationship between source attributes and > > target attributes. e.g. > > for a source device of mdev_type=i915-GVTg_V5_4,aggregator=2, (1/2 of > > gen9), it actually could find a compatible device of > > mdev_type=i915-GVTg_V5_8,aggregator=4 (also 1/2 of gen9), > > if mdev_type of i915-GVTg_V5_4 is not available in the target machine. > > > > So, in our current proposal, we want to create two sysfs attributes > > under a device sysfs node. > > /sys//migration/self > > /sys//migration/compatible > > > > #cat /sys//migration/self > > device_type=vfio_pci > > mdev_type=i915-GVTg_V5_4 > > device_id=8086591d > > aggregator=2 > > software_version=1.0.0 > > > > #cat /sys//migration/compatible > > device_type=vfio_pci > > mdev_type=i915-GVTg_V5_{val1:int:2,4,8} > > device_id=8086591d > > aggregator={val1}/2 > > software_version=1.0.0 > > > > The /sys//migration/self specifies self attributes of > > a device. > > The /sys//migration/compatible specifies the list of > > compatible devices of a device. as in the example, compatible devices > > could have > > device_type == vfio_pci && > > device_id == 8086591d && > > software_version == 1.0.0 && > > ( > > (mdev_type of i915-GVTg_V5_2 && aggregator==1) || > > (mdev_type of i915-GVTg_V5_4 && aggregator==2) || > > (mdev_type of i915-GVTg_V5_8 && aggregator=4) > > ) > > > > by comparing whether a target device is in compatible list of source > > device, the user space can know whether a two devices are live migration > > compatible. > > > > Additional notes: > > 1)software_version in the compatible list may not be necessary as it > > already has a major.minor.bugfix scheme. > > 2)for vendor attribute like remote_url, it may not be statically > > assigned and could be changed with a device interface. > > > > So, as Cornelia pointed that it's not good to use complex format in > > a sysfs attribute, we'd like to know whether there're other good ways to > > our use case, e.g. 
splitting a single attribute to multiple simple sysfs > > attributes as what Cornelia suggested or devlink that Jason has strongly > > recommended. > > Hi Yan. > > Thanks for the explanation, I'm still fuzzy about the details. > Anyway, I suggest you to check "devlink dev info" command we have > implemented for multiple drivers. is devlink exposed as a filesytem we can read with just open? openstack will likely try to leverage libvirt to get this info but when we cant its much simpler to read sysfs then it is to take a a depenency on a commandline too and have to fork shell to execute it and parse the cli output. pyroute2 which we use in some openstack poject has basic python binding for devlink but im not sure how complete it is as i think its relitivly new addtion. if we need to take a dependcy we will but that would be a drawback fo devlink not that that is a large one just something to keep in mind. > You can try netdevsim to test this. > I think that the info you need to expose might be put there. > > Devlink creates instance per-device. Specific device driver calls into > devlink core to create the instance. What device do you have? What > driver is it handled by? > > > > > > Thanks > > Yan > > > > > > > > From whayutin at redhat.com Wed Aug 5 16:23:46 2020 From: whayutin at redhat.com (Wesley Hayutin) Date: Wed, 5 Aug 2020 10:23:46 -0600 Subject: [tripleo][ci] container pulls failing In-Reply-To: References: Message-ID: On Wed, Jul 29, 2020 at 4:48 PM Wesley Hayutin wrote: > > > On Wed, Jul 29, 2020 at 4:33 PM Alex Schultz wrote: > >> On Wed, Jul 29, 2020 at 7:13 AM Wesley Hayutin >> wrote: >> > >> > >> > >> > On Wed, Jul 29, 2020 at 2:25 AM Bogdan Dobrelya >> wrote: >> >> >> >> On 7/28/20 6:09 PM, Wesley Hayutin wrote: >> >> > >> >> > >> >> > On Tue, Jul 28, 2020 at 7:24 AM Emilien Macchi > >> > > wrote: >> >> > >> >> > >> >> > >> >> > On Tue, Jul 28, 2020 at 9:20 AM Alex Schultz < >> aschultz at redhat.com >> >> > > wrote: >> >> > >> >> > On Tue, Jul 28, 2020 at 7:13 AM Emilien Macchi >> >> > > wrote: >> >> > > >> >> > > >> >> > > >> >> > > On Mon, Jul 27, 2020 at 5:27 PM Wesley Hayutin >> >> > > wrote: >> >> > >> >> >> > >> FYI... >> >> > >> >> >> > >> If you find your jobs are failing with an error similar >> to >> >> > [1], you have been rate limited by docker.io < >> http://docker.io> >> >> > via the upstream mirror system and have hit [2]. I've been >> >> > discussing the issue w/ upstream infra, rdo-infra and a few >> CI >> >> > engineers. >> >> > >> >> >> > >> There are a few ways to mitigate the issue however I >> don't >> >> > see any of the options being completed very quickly so I'm >> >> > asking for your patience while this issue is socialized and >> >> > resolved. >> >> > >> >> >> > >> For full transparency we're considering the following >> options. >> >> > >> >> >> > >> 1. move off of docker.io to quay.io >> >> > >> >> > > >> >> > > >> >> > > quay.io also has API rate limit: >> >> > > https://docs.quay.io/issues/429.html >> >> > > >> >> > > Now I'm not sure about how many requests per seconds one >> can >> >> > do vs the other but this would need to be checked with the >> quay >> >> > team before changing anything. >> >> > > Also quay.io had its big downtimes as >> well, >> >> > SLA needs to be considered. >> >> > > >> >> > >> 2. local container builds for each job in master, >> possibly >> >> > ussuri >> >> > > >> >> > > >> >> > > Not convinced. 
>> >> > > You can look at CI logs: >> >> > > - pulling / updating / pushing container images from >> >> > docker.io to local registry takes ~10 >> min on >> >> > standalone (OVH) >> >> > > - building containers from scratch with updated repos and >> >> > pushing them to local registry takes ~29 min on standalone >> (OVH). >> >> > > >> >> > >> >> >> > >> 3. parent child jobs upstream where rpms and containers >> will >> >> > be build and host artifacts for the child jobs >> >> > > >> >> > > >> >> > > Yes, we need to investigate that. >> >> > > >> >> > >> >> >> > >> 4. remove some portion of the upstream jobs to lower the >> >> > impact we have on 3rd party infrastructure. >> >> > > >> >> > > >> >> > > I'm not sure I understand this one, maybe you can give an >> >> > example of what could be removed? >> >> > >> >> > We need to re-evaulate our use of scenarios (e.g. we have two >> >> > scenario010's both are non-voting). There's a reason we >> >> > historically >> >> > didn't want to add more jobs because of these types of >> resource >> >> > constraints. I think we've added new jobs recently and >> likely >> >> > need to >> >> > reduce what we run. Additionally we might want to look into >> reducing >> >> > what we run on stable branches as well. >> >> > >> >> > >> >> > Oh... removing jobs (I thought we would remove some steps of the >> jobs). >> >> > Yes big +1, this should be a continuous goal when working on CI, >> and >> >> > always evaluating what we need vs what we run now. >> >> > >> >> > We should look at: >> >> > 1) services deployed in scenarios that aren't worth testing (e.g. >> >> > deprecated or unused things) (and deprecate the unused things) >> >> > 2) jobs themselves (I don't have any example beside scenario010 >> but >> >> > I'm sure there are more). >> >> > -- >> >> > Emilien Macchi >> >> > >> >> > >> >> > Thanks Alex, Emilien >> >> > >> >> > +1 to reviewing the catalog and adjusting things on an ongoing basis. >> >> > >> >> > All.. it looks like the issues with docker.io >> were >> >> > more of a flare up than a change in docker.io >> policy >> >> > or infrastructure [2]. The flare up started on July 27 8am utc and >> >> > ended on July 27 17:00 utc, see screenshots. >> >> >> >> The numbers of image prepare workers and its exponential fallback >> >> intervals should be also adjusted. I've analysed the log snippet [0] >> for >> >> the connection reset counts by workers versus the times the rate >> >> limiting was triggered. See the details in the reported bug [1]. >> >> >> >> tl;dr -- for an example 5 sec interval 03:55:31,379 - 03:55:36,110: >> >> >> >> Conn Reset Counts by a Worker PID: >> >> 3 58412 >> >> 2 58413 >> >> 3 58415 >> >> 3 58417 >> >> >> >> which seems too much of (workers*reconnects) and triggers rate limiting >> >> immediately. >> >> >> >> [0] >> >> >> https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log >> >> >> >> [1] https://bugs.launchpad.net/tripleo/+bug/1889372 >> >> >> >> -- >> >> Best regards, >> >> Bogdan Dobrelya, >> >> Irc #bogdando >> >> >> > >> > FYI.. >> > >> > The issue w/ "too many requests" is back. Expect delays and failures >> in attempting to merge your patches upstream across all branches. The >> issue is being tracked as a critical issue. 
>> >> Working with the infra folks and we have identified the authorization >> header as causing issues when we're rediected from docker.io to >> cloudflare. I'll throw up a patch tomorrow to handle this case which >> should improve our usage of the cache. It needs some testing against >> other registries to ensure that we don't break authenticated fetching >> of resources. >> >> Thanks Alex! > FYI.. we have been revisited by the container pull issue, "too many requests". Alex has some fresh patches on it: https://review.opendev.org/#/q/status:open+project:openstack/tripleo-common+topic:bug/1889122 expect trouble in check and gate: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22429%20Client%20Error%3A%20Too%20Many%20Requests%20for%20url%3A%5C%22%20AND%20voting%3A1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From yumeng_bao at yahoo.com Wed Aug 5 17:06:03 2020 From: yumeng_bao at yahoo.com (yumeng bao) Date: Thu, 6 Aug 2020 01:06:03 +0800 Subject: [nova] If any spec freeze exception now? References: Message-ID: Hi gibi and all, I wanna mention the SRIOV SmartNIC Support Spec https://review.opendev.org/#/c/742785 This spec is proposed based on feedback from our PTG discussion, yet there are still open questions need to be nailed down. Since this spec involves nova neutron and cyborg, it will probably take a long time to get ideas from different aspects and reach an agreement. Can we keep this as an exception and keep review it to reach closer to an agreement? Hopefully we can reach an agreement in Victoria, and start to land in W. Xinran and I were trying to attend nova’s weekly meeting to discuss this spec, but the time too late for us. :( We will find if there is any other way to sync and response more actively to all your comments and feedback. And just to point out, nova operations support are still one of cyborg’s high priority goals in Victoria, we will keep focus on it and won’t sacrifice time of this goal. Regards, Yumeng -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Wed Aug 5 17:10:28 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 5 Aug 2020 12:10:28 -0500 Subject: [cinder] Change Volume Type Properties In-Reply-To: <24E5E9E3-6BF6-492C-BBBB-670DC070CF15@gmx.net> References: <24E5E9E3-6BF6-492C-BBBB-670DC070CF15@gmx.net> Message-ID: (updated subject with correct project name) On 8/5/20 9:33 AM, Marc Vorwerk wrote: > > Hi, > > I'm looking fora way to add the property /volume_backend_name/ to an > existing Volume Type which is in use. 
> > If I try to change this, I got the following error: > > root at control01rt:~# openstack volume type show test-type > > +--------------------+--------------------------------------+ > > | Field              | Value                                | > > +--------------------+--------------------------------------+ > > | access_project_ids | None                                 | > > | description        | None                                 | > > | id                 | 68febdad-e7b1-4d41-ba11-72d0e1a1cce0 | > > | is_public          | True                                 | > > | name               | test-type                            | > > | properties |                                      | > > | qos_specs_id       | None                         | > > +--------------------+--------------------------------------+ > > root at control01rt:~# openstack volume type set --property > volume_backend_name=ceph test-type > > Failed to set volume type property: Volume Type is currently in use. > (HTTP 400) (Request-ID: req-2b8f3829-5c16-42c3-ac57-01199688bd58) > > Command Failed: One or more of the operations failed > > root at control01rt:~# > > Problem what I see is, that there are instances/volumes which use this > volume type. > > Have anybody an idea, how I can add the /volume_backend_name/ property > to the existing Volume Type? > This is not allowed since the scheduler may have already scheduled these volumes to a different backend than the one you are now specifying in the extra specs. That would lead to a mismatch between the volumes and their volume type that isn't obvious. To get around this, you will need to create a new volume type with the volume_backend_name you want specified first. You can then retype your existing volumes to this new volume type. Assuming most or all of these volumes are already on that backend, the retype operation should just be a quick database update. If needed, you can then delete the original volume type that is no longer being used, then rename the new volume type to get back to using the same type name. This part isn't necessary, but you may need that if you've configured the old name as the default volume type in your cinder.conf file. Hope that helps. Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Aug 5 17:18:03 2020 From: smooney at redhat.com (Sean Mooney) Date: Wed, 05 Aug 2020 18:18:03 +0100 Subject: [nova] If any spec freeze exception now? In-Reply-To: References: Message-ID: <3c98b824f04d8fa6fd07addd956a097db1aa20ba.camel@redhat.com> On Thu, 2020-08-06 at 01:06 +0800, yumeng bao wrote: > Hi gibi and all, > > I wanna mention the SRIOV SmartNIC Support Spec https://review.opendev.org/#/c/742785 > > This spec is proposed based on feedback from our PTG discussion, yet there are still open questions need to be nailed > down. Since this spec involves nova neutron and cyborg, it will probably take a long time to get ideas from different > aspects and reach an agreement. Can we keep this as an exception and keep review it to reach closer to an agreement? > Hopefully we can reach an agreement in Victoria, and start to land in W. 
well you dont need to close it without an exception the way exception work we normlly give a dealin of 1 week to finalise the spec after its granted so basiclaly unless you think we can fully agreee all the outstanding items before thursday week and merge it then you should just retarget the spec to the backlog or W release and keep working on it rather then ask for an excption. exception are only for thing that we expect to merge in victoria including the code. > > Xinran and I were trying to attend nova’s weekly meeting to discuss this spec, but the time too late for us. :( We > will find if there is any other way to sync and response more actively to all your comments and feedback. > > And just to point out, nova operations support are still one of cyborg’s high priority goals in Victoria, we will > keep focus on it and won’t sacrifice time of this goal. > > > Regards, > Yumeng From balazs.gibizer at est.tech Wed Aug 5 17:31:44 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?q?Bal=E1zs?= Gibizer) Date: Wed, 05 Aug 2020 19:31:44 +0200 Subject: [nova] If any spec freeze exception now? In-Reply-To: <3c98b824f04d8fa6fd07addd956a097db1aa20ba.camel@redhat.com> References: <3c98b824f04d8fa6fd07addd956a097db1aa20ba.camel@redhat.com> Message-ID: On Wed, Aug 5, 2020 at 18:18, Sean Mooney wrote: > On Thu, 2020-08-06 at 01:06 +0800, yumeng bao wrote: >> Hi gibi and all, >> >> I wanna mention the SRIOV SmartNIC Support Spec >> https://review.opendev.org/#/c/742785 >> >> This spec is proposed based on feedback from our PTG discussion, >> yet there are still open questions need to be nailed >> down. Since this spec involves nova neutron and cyborg, it will >> probably take a long time to get ideas from different >> aspects and reach an agreement. Can we keep this as an exception >> and keep review it to reach closer to an agreement? >> Hopefully we can reach an agreement in Victoria, and start to land >> in W. > well you dont need to close it without an exception > the way exception work we normlly give a dealin of 1 week to finalise > the spec after its granted > so basiclaly unless you think we can fully agreee all the outstanding > items before thursday week and merge it > then you should just retarget the spec to the backlog or W release > and keep working on it rather then ask for an > excption. exception are only for thing that we expect to merge in > victoria including the code. Agree with Sean. No need for an exception to continue discussing the spec during the V cycle. Having the spec freeze only means that now we know that the SmartNIC spec is not going to be implemented in V. Cheers, gibi >> >> Xinran and I were trying to attend nova’s weekly meeting to >> discuss this spec, but the time too late for us. :( We >> will find if there is any other way to sync and response more >> actively to all your comments and feedback. >> >> And just to point out, nova operations support are still one of >> cyborg’s high priority goals in Victoria, we will >> keep focus on it and won’t sacrifice time of this goal. >> >> >> Regards, >> Yumeng > From gouthampravi at gmail.com Wed Aug 5 18:54:41 2020 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Wed, 5 Aug 2020 11:54:41 -0700 Subject: [manila] Doc-a-thon event coming up next Thursday (Aug 6th) In-Reply-To: References: Message-ID: Thank you so much for putting this together Victoria, and Vida! 
As a reminder, we will not be meeting on IRC tomorrow (6th August 2020), but instead will be in https://meetpad.opendev.org/ManilaV-ReleaseDocAThon You can get to the etherpad link for the meeting by going to etherpad.opendev.org instead of meetpad.opendev.org: https://etherpad.opendev.org/p/ManilaV-ReleaseDocAThon Please bring any documentation issues to that meeting Hoping to see you all there! On Mon, Aug 3, 2020 at 12:20 PM Victoria Martínez de la Cruz < victoria at vmartinezdelacruz.com> wrote: > Hi everybody, > > An update on this. We decided to take over the upstream meeting directly > and start *at* the slot of the Manila weekly meeting. We will join the > Jitsi bridge [0] at 3pm UTC time and start going through the list of bugs > we have in [1]. There is no finish time, you can join and leave the bridge > freely. We will also use IRC Freenode channel #openstack-manila if needed. > > If the time slot doesn't work for you (we are aware this is not a friendly > slot for EMEA/APAC), you can still go through the bug list in [1], claim a > bug and work on it. > > If things go well, we plan to do this again in a different slot so > everybody that wants to collaborate can do it. > > Looking forward to see you there, > > Cheers, > > V > > [0] https://meetpad.opendev.org/ManilaV-ReleaseDocAThon > [1] https://ethercalc.openstack.org/ur17jprbprxx > > On Fri, Jul 31, 2020 at 2:05 PM Victoria Martínez de la Cruz < > victoria at vmartinezdelacruz.com> wrote: > >> Hi folks, >> >> We will be organizing a doc-a-thon next Thursday, August 6th, with the >> main goal of improving our docs for the next release. We will be gathering >> on our Freenode channel #openstack-manila after our weekly meeting (3pm >> UTC) and also using a videoconference tool (exact details TBC) to go over a >> curated list of opened doc bugs we have here [0]. >> >> *Your* participation is truly valued, being you an already Manila >> contributor or if you are interested in contributing and you didn't know >> how, so looking forward to seeing you there :) >> >> Cheers, >> >> Victoria >> >> [0] https://ethercalc.openstack.org/ur17jprbprxx >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Wed Aug 5 19:45:28 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 5 Aug 2020 15:45:28 -0400 Subject: [tc] monthly meeting Message-ID: Hi everyone, Here’s the agenda for our monthly TC meeting. It will happen tomorrow (Thursday the 6th) at 1400 UTC in #openstack-tc and I will be your chair. If you can’t attend, please put your name in the “Apologies for Absence” section. https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting * ACTIVE INITIATIVES - Follow up on past action items - OpenStack User-facing APIs and CLIs (belmoreira) - W cycle goal selection start - Completion of retirement cleanup (gmann) https://etherpad.opendev.org/p/tc-retirement-cleanup Thank you, Mohammed -- Mohammed Naser VEXXHOST, Inc. From skaplons at redhat.com Wed Aug 5 20:26:01 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 5 Aug 2020 22:26:01 +0200 Subject: [neutron] Drivers team meetings 7.08.2020 and 14.08.2020 Message-ID: Hi, I have internal training this Friday and I will be on PTO next Friday. Because of that I will not be able to chair neutron drivers meetings those 2 weeks. Currently we don’t have any new RFES to discuss so lets cancel those meetings and focus on implementation of RFEs already accepted. 
See You on drivers meeting on Friday, 21.08.2020 — Slawek Kaplonski Principal software engineer Red Hat From nate.johnston at redhat.com Wed Aug 5 22:02:23 2020 From: nate.johnston at redhat.com (Nate Johnston) Date: Wed, 5 Aug 2020 18:02:23 -0400 Subject: [tc][ptl] Proposal: Distributed Project Leadership Message-ID: <20200805220223.2elo2nrauzr575al@firewall> The governing structure for OpenStack projects has long been for a Project Technical Lead (PTL) to be elected to serve as a singular focus for that project. While the PTL role varies significantly from project to project, the PTL has many responsibilities for managing the development and release process for a project as well as representing the project both internally and externally. There have been a number of projects that have expressed an interest in functioning in a mode that would not require a PTL, but would rather devolve the responsibilities of the PTL into various liaison roles, that could in turn be held by one or more contributors. This topic was discussed by the TC at the recent virtual PTG, and we now have a proposal to put forth to the community for comment. Jean-Phillipe Evrard and I worked up a more detailed proposal for everyone to comment on and review. We are calling this the 'distributed project leadership model'. Most importantly, this is an opt-in process where projects interested in pursuing a distributed project leadership model should opt in to it, but for projects satisfied with the status quo nothing would change. I encourage everyone who is interested to examine the proposal and comment: https://review.opendev.org/744995 Thank you, Nate Johnston From jasonanderson at uchicago.edu Wed Aug 5 23:18:28 2020 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Wed, 5 Aug 2020 23:18:28 +0000 Subject: [swift][ceph] Container ACLs don't seem to be respected on Ceph RGW In-Reply-To: <757BCAB6-CA22-439E-9C0C-BE4DEC7B7927@uchicago.edu> References: <757BCAB6-CA22-439E-9C0C-BE4DEC7B7927@uchicago.edu> Message-ID: <48C1BF75-211F-4F7C-ABCA-D59777C469A8@uchicago.edu> As an update, I think one of my problems was the dangling space after “_member_” in my ACL list, which was quite painful to discover. I think it was breaking the matching of my user, which had the role _member_ assigned. And, it does look like read ACLs must be of the form “.r:*”, despite the Ceph docs. With this in place, public read ACL works. I still can’t get write ACLs to work though, and from looking at the code[1] I’m not sure how it’s supposed to work. /Jason [1]: https://github.com/ceph/ceph/blob/f52fb99f011d9b124ed91f3d001d3551e9a10c8d/src/rgw/rgw_acl_swift.cc > On Aug 4, 2020, at 10:49 PM, Jason Anderson wrote: > > Hi all, > > Just scratching my head at this for a while and though I’d ask here in case it saves some time. I’m running a Ceph cluster on the Nautilus release and it’s running Swift via the rgw. I have Keystone authentication turned on. Everything works fine in the normal case of creating containers, uploading files, listing containers, etc. > > However, I notice that ACLs don’t seem to work. I am not overriding "rgw enforce swift acls”, so it is set to the default of true. I can’t seem to share a container or make it public. > > (Side note, confusingly, the Ceph implementation has a different syntax for public read/write containers, ‘*’ as opposed to ‘*:*’ for public write for example.) 
> > Here’s what I’m doing > > (as admin) > swift post —write-acl ‘*’ —read-acl ‘*’ public-container > swift stat public-container > Account: v1 > Container: public-container > Objects: 1 > Bytes: 5801 > Read ACL: * > Write ACL: * > Sync To: > Sync Key: > X-Timestamp: 1595883106.23179 > X-Container-Bytes-Used-Actual: 8192 > X-Storage-Policy: default-placement > X-Storage-Class: STANDARD > Last-Modified: Wed, 05 Aug 2020 03:42:11 GMT > X-Trans-Id: tx000000000000000662156-005f2a2bea-23478-default > X-Openstack-Request-Id: tx000000000000000662156-005f2a2bea-23478-default > Accept-Ranges: bytes > Content-Type: text/plain; charset=utf-8 > > (as non-admin) > swift upload public-container test.txt > Warning: failed to create container 'public-container': 409 Conflict: BucketAlreadyExists > Object HEAD failed: https://ceph.example.org:7480/swift/v1/public-container/README.md 403 Forbidden > > swift list public-container > Container GET failed: https://ceph.example.org:7480/swift/v1/public-container?format=json 403 Forbidden [first 60 chars of response] b'{"Code":"AccessDenied","BucketName”:”public-container","RequestId":"tx0' > Failed Transaction ID: tx000000000000000662162-005f2a2c2a-23478-default > > What am I missing? Thanks in advance! > > /Jason From jasonanderson at uchicago.edu Wed Aug 5 23:19:53 2020 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Wed, 5 Aug 2020 23:19:53 +0000 Subject: [swift][ceph] Container ACLs don't seem to be respected on Ceph RGW In-Reply-To: <48C1BF75-211F-4F7C-ABCA-D59777C469A8@uchicago.edu> References: <757BCAB6-CA22-439E-9C0C-BE4DEC7B7927@uchicago.edu> <48C1BF75-211F-4F7C-ABCA-D59777C469A8@uchicago.edu> Message-ID: On Aug 5, 2020, at 6:18 PM, Jason Anderson > wrote: As an update, I think one of my problems was the dangling space after “_member_” in my ACL list, which was quite painful to discover. I think it was breaking the matching of my user, which had the role _member_ assigned. Sorry, I meant in my Ceph configuration, which had this line in the rgw section: rgw keystone accepted roles = _member_ , Member, admin And, it does look like read ACLs must be of the form “.r:*”, despite the Ceph docs. With this in place, public read ACL works. I still can’t get write ACLs to work though, and from looking at the code[1] I’m not sure how it’s supposed to work. /Jason [1]: https://github.com/ceph/ceph/blob/f52fb99f011d9b124ed91f3d001d3551e9a10c8d/src/rgw/rgw_acl_swift.cc On Aug 4, 2020, at 10:49 PM, Jason Anderson > wrote: Hi all, Just scratching my head at this for a while and though I’d ask here in case it saves some time. I’m running a Ceph cluster on the Nautilus release and it’s running Swift via the rgw. I have Keystone authentication turned on. Everything works fine in the normal case of creating containers, uploading files, listing containers, etc. However, I notice that ACLs don’t seem to work. I am not overriding "rgw enforce swift acls”, so it is set to the default of true. I can’t seem to share a container or make it public. (Side note, confusingly, the Ceph implementation has a different syntax for public read/write containers, ‘*’ as opposed to ‘*:*’ for public write for example.) 
Here’s what I’m doing (as admin) swift post —write-acl ‘*’ —read-acl ‘*’ public-container swift stat public-container Account: v1 Container: public-container Objects: 1 Bytes: 5801 Read ACL: * Write ACL: * Sync To: Sync Key: X-Timestamp: 1595883106.23179 X-Container-Bytes-Used-Actual: 8192 X-Storage-Policy: default-placement X-Storage-Class: STANDARD Last-Modified: Wed, 05 Aug 2020 03:42:11 GMT X-Trans-Id: tx000000000000000662156-005f2a2bea-23478-default X-Openstack-Request-Id: tx000000000000000662156-005f2a2bea-23478-default Accept-Ranges: bytes Content-Type: text/plain; charset=utf-8 (as non-admin) swift upload public-container test.txt Warning: failed to create container 'public-container': 409 Conflict: BucketAlreadyExists Object HEAD failed: https://ceph.example.org:7480/swift/v1/public-container/README.md 403 Forbidden swift list public-container Container GET failed: https://ceph.example.org:7480/swift/v1/public-container?format=json 403 Forbidden [first 60 chars of response] b'{"Code":"AccessDenied","BucketName”:”public-container","RequestId":"tx0' Failed Transaction ID: tx000000000000000662162-005f2a2c2a-23478-default What am I missing? Thanks in advance! /Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From yumeng_bao at yahoo.com Thu Aug 6 02:11:19 2020 From: yumeng_bao at yahoo.com (yumeng bao) Date: Thu, 6 Aug 2020 10:11:19 +0800 Subject: [nova] If any spec freeze exception now? In-Reply-To: References: Message-ID: <681905FE-A165-42B2-9D67-0AF259F62E5A@yahoo.com> Ok. That make sense to me! Thanks gibi and Sean! Regards, Yumeng > On Aug 6, 2020, at 1:31 AM, Balázs Gibizer wrote: > >  > >> On Wed, Aug 5, 2020 at 18:18, Sean Mooney wrote: >>> On Thu, 2020-08-06 at 01:06 +0800, yumeng bao wrote: >>> Hi gibi and all, >>> I wanna mention the SRIOV SmartNIC Support Spec https://review.opendev.org/#/c/742785 >>> This spec is proposed based on feedback from our PTG discussion, yet there are still open questions need to be nailed >>> down. Since this spec involves nova neutron and cyborg, it will probably take a long time to get ideas from different >>> aspects and reach an agreement. Can we keep this as an exception and keep review it to reach closer to an agreement? >>> Hopefully we can reach an agreement in Victoria, and start to land in W. >> well you dont need to close it without an exception >> the way exception work we normlly give a dealin of 1 week to finalise the spec after its granted >> so basiclaly unless you think we can fully agreee all the outstanding items before thursday week and merge it >> then you should just retarget the spec to the backlog or W release and keep working on it rather then ask for an >> excption. exception are only for thing that we expect to merge in victoria including the code. > > Agree with Sean. No need for an exception to continue discussing the spec during the V cycle. Having the spec freeze only means that now we know that the SmartNIC spec is not going to be implemented in V. > > Cheers, > gibi > >>> Xinran and I were trying to attend nova’s weekly meeting to discuss this spec, but the time too late for us. :( We >>> will find if there is any other way to sync and response more actively to all your comments and feedback. >>> And just to point out, nova operations support are still one of cyborg’s high priority goals in Victoria, we will >>> keep focus on it and won’t sacrifice time of this goal. 
>>> Regards, >>> Yumeng > > From emiller at genesishosting.com Thu Aug 6 04:28:28 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 5 Aug 2020 23:28:28 -0500 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA0481447A@gmsxchsvr01.thecreation.com> > Do you need full host block devices to be provided to the instance? No - a thin-provisioned LV in LVM would be best. > The LVM imagebackend will just provision LVs on top of the provided VG so > there's no direct mapping to a full host block device with this approach. That's perfect! > Yeah that's a common pitfall when using LVM based ephemeral disks that > contain additional LVM PVs/VGs/LVs etc. You need to ensure that the host is > configured to not scan these LVs in order for their PVs/VGs/LVs etc to remain > hidden from the host: Thanks for the link! I will let everyone know how testing goes. Eric From emiller at genesishosting.com Thu Aug 6 04:30:01 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 5 Aug 2020 23:30:01 -0500 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA0481447B@gmsxchsvr01.thecreation.com> > That said there's no real alternative available at the moment. > well one alternitive to nova providing local lvm storage is to use > the cinder lvm driver but install it on all compute nodes then > use the cidner InstanceLocalityFilter to ensure the volume is alocated form > the host > the vm is on. > https://docs.openstack.org/cinder/latest/configuration/block- > storage/scheduler-filters.html#instancelocalityfilter > on drawback to this is that if the if the vm is moved i think you would need to > also migrate the cinder volume > seperatly afterwards. I wasn't aware of the InstanceLocalityFilter, so thank you for mentioning it! Eric From emiller at genesishosting.com Thu Aug 6 04:39:50 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 5 Aug 2020 23:39:50 -0500 Subject: [cinder][nova] Local storage in compute node In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> <4f025d444406898903dabf3049ed021822cce19b.camel@redhat.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA0481447C@gmsxchsvr01.thecreation.com> From: Donny Davis [mailto:donny at fortnebula.com] Sent: Wednesday, August 05, 2020 8:23 AM > If you have any other questions I am happy to help where I can - I have been working with all nvme stuff for the last couple years and have gotten something into prod for about 1 year with it (maybe a little longer).  
> From what I can tell, getting max performance from nvme for an instance is a non-trivial task because it's just so much faster than the rest of the stack and careful considerations must be taken to get the most out of it.  > I am curious to see where you take this Eric Thanks for the response! We also use Ceph with NVMe SSDs, with many NVMe namespaces with one OSD per namespace, to fully utilize the SSDs. You are right - they are so fast that they are literally faster than any application can use. They are great for multi-tenant environments, though, where it's usually better to have more hardware than people can utilize. My first test is to try using the Libvirt "images_type=lvm" method to see how well it works. I will report back... Eric From emiller at genesishosting.com Thu Aug 6 04:53:49 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Wed, 5 Aug 2020 23:53:49 -0500 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <4f025d444406898903dabf3049ed021822cce19b.camel@redhat.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> <4f025d444406898903dabf3049ed021822cce19b.camel@redhat.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA0481447D@gmsxchsvr01.thecreation.com> > From: Sean Mooney [mailto:smooney at redhat.com] > Sent: Wednesday, August 05, 2020 8:01 AM > yes that works well with the default flat/qcow file format > i assume there was a reason this was not the starting point. > the nova lvm backend i think does not supprot thin provisioning > so fi you did the same thing creating the volume group on the nvme deivce > you would technically get better write performance after the vm is booted > but > the vm spwan is slower since we cant take advantage of thin providioning > and > each root disk need to be copided form the cahced image. I wasn't aware that the nova LVM backend ([libvirt]/images_type = lvm) didn't support thin provisioned LV's. However, I do see that the "sparse_logical_volumes" parameter indicates it has been deprecated: https://docs.openstack.org/nova/rocky/configuration/config.html#libvirt.sparse_logical_volumes That would definitely be a downer. > so just monting the nova data directory on an nvme driver or a raid of nvme > drives > works well and is simple to do. Maybe we should consider doing this instead. I'll test with the Nova LVM backend first. > so there are trade off with both appoches. > generally i recommend using local sotrage e.g. the vm root disk or ephemeral > disk for fast scratchpad space > to work on data bug persitie all relevent data permently via cinder volumes. > that requires you to understand which block > devices a local and which are remote but it give you the best of both worlds. Our use case simply requires high-speed non-redundant storage for self-replicating applications like Couchbase, Cassandra, MongoDB, etc. or very inexpensive VMs that are backed-up often and can withstand the downtime when restoring from backup. That will be one more requirement (or rather a very nice to have), is to be able to create images (backups) of the local storage onto object storage, so hopefully "openstack server backup create" works like it does with rbd-backed Nova-managed persistent storage. I will let you know what I find out! 
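For reference, the compute-side settings I plan to start testing with look
roughly like this (the volume group name is just a placeholder I picked, and
I still need to confirm the thin-provisioning behavior mentioned above):

[libvirt]
images_type = lvm
# "nova-local" is a placeholder for a VG pre-created on the NVMe device
images_volume_group = nova-local

If that doesn't pan out, the fallback is the simpler approach mentioned
earlier of mounting the NVMe device (or a RAID of them) at the Nova instances
directory and keeping the default qcow/flat image backend.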
Thanks everyone! Eric From marino.mrc at gmail.com Thu Aug 6 07:32:21 2020 From: marino.mrc at gmail.com (Marco Marino) Date: Thu, 6 Aug 2020 09:32:21 +0200 Subject: [tripleo] Deploy overcloud without provisioning Message-ID: Hi, I'm trying to deploy an overcloud using tripleo with pre provisioned nodes. My configuration is quite simple: - 1 controller and 1 compute nodes on which I already installed CentOS 8.2 - Both nodes have a dedicated idrac interface with an ip in 192.168.199.0/24. Please note that this interface is not visible with "ip a" or "ifconfig". It's a dedicated IDRAC interface - Both nodes have a NIC configured in the subnet 192.168.199.0/24 (192.168.199.200 and 192.168.199.201) - Undercloud uses 192.168.199.0/24 as pxe/provisioning network (but I don't need provisioning) Question: should I import nodes with "openstack overcloud node import nodes.yaml" even if I don't need the provisioning step? Furthermore, on the undercloud I created one file: /home/stack/templates/node-info.yaml with the following content parameter_defaults: OvercloudControllerFlavor: control OvercloudComputeFlavor: compute ControllerCount: 1 ComputeCount: 1 Question: How can I specify that "node X with ip Y should be used as a controller and node Z with ip K should be used as a compute"?? Should I set the property with the following command? openstack baremetal node set --property capabilities='profile:control' controller1 Thank you, Marco -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Thu Aug 6 07:46:01 2020 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 6 Aug 2020 08:46:01 +0100 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: On Wed, 5 Aug 2020 at 16:16, Michael Johnson wrote: > Looking at that error, it appears that the lb-mgmt-net is not setup > correctly. The Octavia controller containers are not able to reach the > amphora instances on the lb-mgmt-net subnet. > > I don't know how kolla is setup to connect the containers to the neutron > lb-mgmt-net network. Maybe the above documents will help with that. > Right now it's up to the operator to configure that. The kolla documentation doesn't prescribe any particular setup. We're working on automating it in Victoria. > Michael > > On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard wrote: > >> >> >> On Tue, 4 Aug 2020 at 16:58, Monika Samal >> wrote: >> >>> Hello Guys, >>> >>> With Michaels help I was able to solve the problem but now there is >>> another error I was able to create my network on vlan but still error >>> persist. PFB the logs: >>> >>> http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ >>> >>> Kindly help >>> >>> regards, >>> Monika >>> ------------------------------ >>> *From:* Michael Johnson >>> *Sent:* Monday, August 3, 2020 9:10 PM >>> *To:* Fabian Zimmermann >>> *Cc:* Monika Samal ; openstack-discuss < >>> openstack-discuss at lists.openstack.org> >>> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >>> balancer >>> >>> Yeah, it looks like nova is failing to boot the instance. >>> >>> Check this setting in your octavia.conf files: >>> https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id >>> >>> Also, if kolla-ansible didn't set both of these values correctly, please >>> open bug reports for kolla-ansible. These all should have been configured >>> by the deployment tool. 
>>> >>> >> I wasn't following this thread due to no [kolla] tag, but here are the >> recently added docs for Octavia in kolla [1]. Note >> the octavia_service_auth_project variable which was added to migrate from >> the admin project to the service project for octavia resources. We're >> lacking proper automation for the flavor, image etc, but it is being worked >> on in Victoria [2]. >> >> [1] >> https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html >> [2] https://review.opendev.org/740180 >> >> Michael >>> >>> On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann >>> wrote: >>> >>> Seems like the flavor is missing or empty '' - check for typos and >>> enable debug. >>> >>> Check if the nova req contains valid information/flavor. >>> >>> Fabian >>> >>> Monika Samal schrieb am Mo., 3. Aug. 2020, >>> 15:46: >>> >>> It's registered >>> >>> Get Outlook for Android >>> ------------------------------ >>> *From:* Fabian Zimmermann >>> *Sent:* Monday, August 3, 2020 7:08:21 PM >>> *To:* Monika Samal ; openstack-discuss < >>> openstack-discuss at lists.openstack.org> >>> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >>> balancer >>> >>> Did you check the (nova) flavor you use in octavia. >>> >>> Fabian >>> >>> Monika Samal schrieb am Mo., 3. Aug. 2020, >>> 10:53: >>> >>> After Michael suggestion I was able to create load balancer but there is >>> error in status. >>> >>> >>> >>> PFB the error link: >>> >>> http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ >>> ------------------------------ >>> *From:* Monika Samal >>> *Sent:* Monday, August 3, 2020 2:08 PM >>> *To:* Michael Johnson >>> *Cc:* Fabian Zimmermann ; Amy Marrich < >>> amy at demarco.com>; openstack-discuss < >>> openstack-discuss at lists.openstack.org>; community at lists.openstack.org < >>> community at lists.openstack.org> >>> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >>> balancer >>> >>> Thanks a ton Michael for helping me out >>> ------------------------------ >>> *From:* Michael Johnson >>> *Sent:* Friday, July 31, 2020 3:57 AM >>> *To:* Monika Samal >>> *Cc:* Fabian Zimmermann ; Amy Marrich < >>> amy at demarco.com>; openstack-discuss < >>> openstack-discuss at lists.openstack.org>; community at lists.openstack.org < >>> community at lists.openstack.org> >>> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >>> balancer >>> >>> Just to close the loop on this, the octavia.conf file had >>> "project_name = admin" instead of "project_name = service" in the >>> [service_auth] section. This was causing the keystone errors when >>> Octavia was communicating with neutron. >>> >>> I don't know if that is a bug in kolla-ansible or was just a local >>> configuration issue. >>> >>> Michael >>> >>> On Thu, Jul 30, 2020 at 1:39 PM Monika Samal >>> wrote: >>> > >>> > Hello Fabian,, >>> > >>> > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ >>> > >>> > Regards, >>> > Monika >>> > ________________________________ >>> > From: Fabian Zimmermann >>> > Sent: Friday, July 31, 2020 1:57 AM >>> > To: Monika Samal >>> > Cc: Michael Johnson ; Amy Marrich < >>> amy at demarco.com>; openstack-discuss < >>> openstack-discuss at lists.openstack.org>; community at lists.openstack.org < >>> community at lists.openstack.org> >>> > Subject: Re: [openstack-community] Octavia :; Unable to create load >>> balancer >>> > >>> > Hi, >>> > >>> > just to debug, could you replace the auth_type password with >>> v3password? 
>>> > >>> > And do a curl against your :5000 and :35357 urls and paste the output. >>> > >>> > Fabian >>> > >>> > Monika Samal schrieb am Do., 30. Juli >>> 2020, 22:15: >>> > >>> > Hello Fabian, >>> > >>> > http://paste.openstack.org/show/796477/ >>> > >>> > Thanks, >>> > Monika >>> > ________________________________ >>> > From: Fabian Zimmermann >>> > Sent: Friday, July 31, 2020 1:38 AM >>> > To: Monika Samal >>> > Cc: Michael Johnson ; Amy Marrich < >>> amy at demarco.com>; openstack-discuss < >>> openstack-discuss at lists.openstack.org>; community at lists.openstack.org < >>> community at lists.openstack.org> >>> > Subject: Re: [openstack-community] Octavia :; Unable to create load >>> balancer >>> > >>> > The sections should be >>> > >>> > service_auth >>> > keystone_authtoken >>> > >>> > if i read the docs correctly. Maybe you can just paste your config >>> (remove/change passwords) to paste.openstack.org and post the link? >>> > >>> > Fabian >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From e0ne at e0ne.info Thu Aug 6 07:46:25 2020 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Thu, 6 Aug 2020 10:46:25 +0300 Subject: [horizon] Victoria virtual mid-cycle poll In-Reply-To: References: Message-ID: Hi everybody, According to our poll [2] we'll have a one-hour mid-cycle poll today at 13.00 UTC. I'll share a Zoom link before the meeting today. We're going to discuss current release priorities [3] and our future plans. [2] https://doodle.com/poll/dkmsai49v4zzpca2 [3] https://etherpad.opendev.org/p/horizon-release-priorities Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ On Thu, Jul 30, 2020 at 1:00 PM Ivan Kolodyazhny wrote: > Hi team, > > If something can go wrong, it will definitely go wrong. > It means that I did a mistake in my original mail and sent you > completely wrong dates:(. > > Horizon Virtual mid-cycle is supposed to be next week Aug 5-7. I'm > planning to have a single one-hour session. > In case, if we've got a lot of participants and topic to discuss, we can > schedule one more session a week or two weeks later. > > Here is a correct poll: https://doodle.com/poll/dkmsai49v4zzpca2 > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ > > > On Wed, Jul 22, 2020 at 10:26 AM Ivan Kolodyazhny wrote: > >> Hi team, >> >> As discussed at Horizon's Virtual PTG [1], we'll have a virtual mid-cycle >> meeting around Victoria-2 milestone. >> >> We'll discuss Horizon current cycle development priorities and the future >> of Horizon with modern JS frameworks. >> >> Please indicate your availability to meet for the first session, which >> will be held during the week of July 27-31: >> >> https://doodle.com/poll/3neps94amcreaw8q >> >> Please respond before 12:00 UTC on Tuesday 4 August. >> >> [1] https://etherpad.opendev.org/p/horizon-v-ptg >> >> Regards, >> Ivan Kolodyazhny, >> http://blog.e0ne.info/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From e0ne at e0ne.info Thu Aug 6 07:48:39 2020 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Thu, 6 Aug 2020 10:48:39 +0300 Subject: [horizon] patternfly? 
In-Reply-To: <20200628140127.GA502608@straylight.m.ringlet.net> References: <20200623005343.rkgtee524s5tl7kx@yuggoth.org> <115da5a2-0bf1-4ec0-8ba6-0b3d1f3b9ab7@debian.org> <7406ea49-37ed-da56-24c5-786c342e632e@catalyst.net.nz> <20200628140127.GA502608@straylight.m.ringlet.net> Message-ID: Hi, We can discuss Horizon v next today during our mid-cycle call: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016346.html Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ On Sun, Jun 28, 2020 at 5:03 PM Peter Pentchev wrote: > On Wed, Jun 24, 2020 at 02:07:06PM +1200, Adrian Turjak wrote: > > On 24/06/20 1:05 am, Thomas Goirand wrote: > > > Anyone dismissing how huge of a problem this is, isn't doing serious > > > programming, for serious production use. That person would just be > doing > > > script kiddy work in the playground. Yes, it works, yes, it's shiny and > > > all. The upstream code may even be super nice, well written and all. > But > > > it's *NOT* serious to put such JS bundling approach in production. > > And yet people are running huge projects in production like this just > > fine. So clearly people are finding sane practices around it that give > > them enough security to feel safe that don't involve packaging each npm > > requirement as an OS package. How exactly are all the huge powerhouses > > doing it then when most of the internet's front end is giant js bundles > > built from npm dependencies? How does gitlab do it for their omnibus? > > From a cursory glance it did seem like they did use npm, and had a rake > > job to compile the js. Gitlab most definitely isn't "script kiddy work". > > > > I'm mostly a python dev, so I don't deal with npm often. When it comes > > to python though, other than underlying OS packages for python/pip > > itself, I use pip for installing my versions (in a venv or container). > > I've had too many painful cases of weird OS package versions, and I > > dislike the idea of relying on the OS when there is a perfectly valid > > and working package management system for my application requirements. I > > can audit the versions installed against known CVEs, and because I > > control the requirements, I can ensure I'm never using out of date > > libraries. > > > > Javascript and npm is only different because the sheer number of > > dependencies. Which is terrifying, don't get me wrong, but you can lock > > versions, you can audit them against CVEs, you can be warned if they are > > out of date. How other than by sheer scale is it really worse than pip > > if you follow some standards and a consistent process? > > What Thomas is trying to say, and I think other people in this thread > also agreed with, is that it's not "only" because of the sheer number of > dependencies. My personal opinion is that the Javascript ecosystem is > currently where Perl/CPAN was 25 years ago, Python was between 15 and 20 > years ago, and Ruby was 10-15 years ago: quite popular, attracting many > people who "just want to write a couple of lines of code to solve this > simple task", and, as a very logical consequence, full of small > libraries that various people developed to fix their own itches and just > released out into the wild without very much thought of long-term > maintenance. 
Now, this has several consequences (most of them have been > pointed out already): > > - there are many (not all, but many) developers who do not even try to > keep their own libraries backwards-compatible > > - there are many (not all, but many) developers who, once they have > written a piece of code that uses three libraries from other people, > do not really bother to follow the development of those libraries and > try to make their own piece of code compatible with their new versions > (this holds even more if there are not three, but fifteen libraries > from other people; it can be a bit hard to keep up with them all if > their authors do not care about API stability) > > - there are many libraries that lock the versions of their dependencies, > thus bringing back what was once known as "DLL hell", over and over > and over again (and yes, this happens in other languages, too) > > - there are many, many, *many* libraries that solve the same problems > over and over again in subtly different ways, either because their > authors were not aware of the other implementations or because said > other implementations could not exactly scratch the author's itch and > it was easier to write their own instead of spend some more time > trying to adapt the other one and propose changes to its author > (and, yes, I myself have been guilty of this in C, Perl, and Python > projects in the past; NIH is a very, very easy slope to slide down > along) > > I *think* that, with time, many Javascript developers will realize that > this situation is unsustainable in the long term, and, one by one, they > will start doing what C/C++, Perl, Python, and Ruby people have been > doing for some time now: > > - start thinking about backwards compatibility, think really hard before > making an incompatible change and, if they really have to, use > something like semantic versioning (not necessarily exactly semver, > but something similar) to signal the API breakage > > - once the authors of the libraries they depend on start doing this, > start encoding loose version requirements (not strictly pinned), such > as "dep >= 1.2.1, dep < 3". This is already done in many Python > packages, and OpenStack's upper-constraints machinery is a wonderful > example of how this can be maintained in a conservative manner that > virtually guarantees that the end result will work. > > - start wondering whether it is really worth it to maintain their own > pet implementation instead of extending a more-widely-used one, thus > eventually having the community settle on a well-known set of > more-or-less comprehensive and very widely tested packages for most > tasks. Once this happens, the authors of these widely-used libraries > absolutely *have* to keep some degree of backwards compatibility and > some kind of reasonable versioning scheme to signal changes. > > So, I'm kind of optimistic and I believe that, with time, the Javascript > ecosystem will become better. Unfortunately, this process has taken many > years for the other languages I've mentioned, and is not really fully > complete in any of them: any module repository has its share of > mostly-maintained reimplementations of various shapes and sizes of the > wheel. 
So I guess the point of all this was mostly to explain the > problem (once again) more than propose any short-term solutions :/ > > G'luck, > Peter > > -- > Peter Pentchev roam at ringlet.net roam at debian.org pp at storpool.com > PGP key: http://people.FreeBSD.org/~roam/roam.key.asc > Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramishra at redhat.com Thu Aug 6 07:51:53 2020 From: ramishra at redhat.com (Rabi Mishra) Date: Thu, 6 Aug 2020 13:21:53 +0530 Subject: [tripleo] Deploy overcloud without provisioning In-Reply-To: References: Message-ID: On Thu, Aug 6, 2020 at 1:07 PM Marco Marino wrote: > Hi, > I'm trying to deploy an overcloud using tripleo with pre provisioned > nodes. My configuration is quite simple: > - 1 controller and 1 compute nodes on which I already installed CentOS 8.2 > - Both nodes have a dedicated idrac interface with an ip in > 192.168.199.0/24. Please note that this interface is not visible with "ip > a" or "ifconfig". It's a dedicated IDRAC interface > - Both nodes have a NIC configured in the subnet 192.168.199.0/24 > (192.168.199.200 and 192.168.199.201) > - Undercloud uses 192.168.199.0/24 as pxe/provisioning network (but I > don't need provisioning) > > Question: should I import nodes with "openstack overcloud node import > nodes.yaml" even if I don't need the provisioning step? > > Furthermore, on the undercloud I created one file: > /home/stack/templates/node-info.yaml with the following content > > parameter_defaults: > OvercloudControllerFlavor: control > OvercloudComputeFlavor: compute > ControllerCount: 1 > ComputeCount: 1 > > Question: How can I specify that "node X with ip Y should be used as a > controller and node Z with ip K should be used as a compute"?? > With pre-provisioned nodes (DeployedServer), you would need to specify HostnameMap and DeployedServerPortMap parameters that would map the pre-provisioned hosts and ctlplane ips. Please check documentation[1] for more details. [1] https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/deployed_server.html#deployed-server-with-config-download > Should I set the property with the following command? > openstack baremetal node set --property capabilities='profile:control' > controller1 > > Thank you, > Marco > > -- Regards, Rabi Mishra -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Thu Aug 6 07:57:54 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Thu, 6 Aug 2020 02:57:54 -0500 Subject: [cinder][nova] Local storage in compute node References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA0481447F@gmsxchsvr01.thecreation.com> > No - a thin-provisioned LV in LVM would be best. From testing, it looks like thick-provisioned is the only choice at this stage. That's fine. > I will let everyone know how testing goes. So far, everything is working perfectly with Nova using LVM. It was a quick configuration and it did exactly what I expected, which is always nice. :) As far as performance goes, it is decent, but not stellar. Of course, I'm comparing crazy fast native NVMe storage in RAID 0 across 4 x Micron 9300 SSDs (using md as the underlying physical volume in LVM) to virtualized storage. 
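For context, that kind of local backing store is typically assembled along these lines (device names and the volume group name are only examples, not taken from this setup):

# RAID 0 across the four NVMe drives, used as the LVM physical volume
mdadm --create /dev/md0 --level=0 --raid-devices=4 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
pvcreate /dev/md0
# volume group that nova.conf's images_volume_group would point at
vgcreate nova-vg /dev/md0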
Some numbers from fio, just to get an idea for how good/bad the IOPS will be:

Configuration:
32 core EPYC 7502P with 512GiB of RAM - CentOS 7 latest updates - Kolla Ansible (Stein) deployment
32 vCPU VM with 64GiB of RAM
32 x 10GiB test files (I'm using file tests, not raw device tests, so not optimal, but easiest when the VM root disk is the test disk)
iodepth=10
numjobs=32
time=30 (seconds)

The VM was deployed using a qcow2 image, then deployed as a raw image, to see the difference in performance.  There was none, which makes sense, since I'm pretty sure the qcow2 image was decompressed and stored in the LVM logical volume - so both tests were measuring the same thing.

Bare metal (random 4KiB reads):
8066MiB/sec
154.34 microsecond avg latency
2.065 million IOPS

VM qcow2 (random 4KiB reads):
589MiB/sec
2122.10 microsecond avg latency
151k IOPS

Bare metal (random 4KiB writes):
4940MiB/sec
252.44 microsecond avg latency
1.265 million IOPS

VM qcow2 (random 4KiB writes):
589MiB/sec
2119.16 microsecond avg latency
151k IOPS

Since the read and write VM results are nearly identical, my assumption is that the emulation layer is the bottleneck.  CPUs in the VM were all at 55% utilization (all kernel usage).  The qemu process on the bare metal machine indicated 1600% (or so) CPU utilization.

Below are runs with sequential 1MiB block tests:

Bare metal (sequential 1MiB reads):
13.3GiB/sec
23446.43 microsecond avg latency
13.7k IOPS

VM qcow2 (sequential 1MiB reads):
8378MiB/sec
38164.52 microsecond avg latency
8377 IOPS

Bare metal (sequential 1MiB writes):
8098MiB/sec
39488.00 microsecond avg latency
8097 IOPS

VM qcow2 (sequential 1MiB writes):
8087MiB/sec
39534.96 microsecond avg latency
8087 IOPS

Amazing that a VM can move 8GiB/sec to/from storage. :)  However, IOPS limits are a bit disappointing when compared to bare metal (but this is relative since 151k IOPS is quite a bit!).  Not sure if additional "iothreads" in QEMU would help, but that is not set in the Libvirt XML file, and I don't see any way to use Nova to set it.

The Libvirt XML for the disk appears as:
Any suggestions for improvement? I "think" that the "images_type = flat" option in nova.conf indicates that images are stored in the /var/lib/nova/instances/* directories? If so, that might be an option, but since we're using Kolla, that directory (or rather /var/lib/nova) is currently a docker volume. So, it might be necessary to mount the NVMe storage at its respective /var/lib/docker/volumes/nova_compute/_data/instances directory. Not sure if the "flat" option will be any faster, especially since Docker would be another layer to go through. Any opinions? Thanks! Eric From marino.mrc at gmail.com Thu Aug 6 08:05:21 2020 From: marino.mrc at gmail.com (Marco Marino) Date: Thu, 6 Aug 2020 10:05:21 +0200 Subject: [tripleo] Deploy overcloud without provisioning In-Reply-To: References: Message-ID: Thank you Rabi, but node import is mandatory or not? Bare Metal part is useful for power management and I'd like to maintain this feature. Marco Il giorno gio 6 ago 2020 alle ore 09:52 Rabi Mishra ha scritto: > > > On Thu, Aug 6, 2020 at 1:07 PM Marco Marino wrote: > >> Hi, >> I'm trying to deploy an overcloud using tripleo with pre provisioned >> nodes. My configuration is quite simple: >> - 1 controller and 1 compute nodes on which I already installed CentOS 8.2 >> - Both nodes have a dedicated idrac interface with an ip in >> 192.168.199.0/24. Please note that this interface is not visible with >> "ip a" or "ifconfig". It's a dedicated IDRAC interface >> - Both nodes have a NIC configured in the subnet 192.168.199.0/24 >> (192.168.199.200 and 192.168.199.201) >> - Undercloud uses 192.168.199.0/24 as pxe/provisioning network (but I >> don't need provisioning) >> >> Question: should I import nodes with "openstack overcloud node import >> nodes.yaml" even if I don't need the provisioning step? >> >> Furthermore, on the undercloud I created one file: >> /home/stack/templates/node-info.yaml with the following content >> >> parameter_defaults: >> OvercloudControllerFlavor: control >> OvercloudComputeFlavor: compute >> ControllerCount: 1 >> ComputeCount: 1 >> >> > Question: How can I specify that "node X with ip Y should be used as a >> controller and node Z with ip K should be used as a compute"?? >> > > With pre-provisioned nodes (DeployedServer), you would need to specify > HostnameMap and DeployedServerPortMap parameters that would map the > pre-provisioned hosts and ctlplane ips. > > Please check documentation[1] for more details. > > [1] > https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/deployed_server.html#deployed-server-with-config-download > > >> Should I set the property with the following command? >> openstack baremetal node set --property capabilities='profile:control' >> controller1 >> >> Thank you, >> Marco >> >> > > -- > Regards, > Rabi Mishra > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramishra at redhat.com Thu Aug 6 08:23:32 2020 From: ramishra at redhat.com (Rabi Mishra) Date: Thu, 6 Aug 2020 13:53:32 +0530 Subject: [tripleo] Deploy overcloud without provisioning In-Reply-To: References: Message-ID: On Thu, Aug 6, 2020 at 1:35 PM Marco Marino wrote: > Thank you Rabi, > but node import is mandatory or not? Bare Metal part is useful for power > management and I'd like to maintain this feature. > > AFAIK, the intent of using pre-provisioned nodes is to create an overcloud without power management control, among other things. Node import is not required when Tripleo(Ironic) is not doing the provisioning. 
I don't know if there are ways for Ironic to do power management of pre-provisioned nodes. Someone else may have a better answer. > Marco > > Il giorno gio 6 ago 2020 alle ore 09:52 Rabi Mishra > ha scritto: > >> >> >> On Thu, Aug 6, 2020 at 1:07 PM Marco Marino wrote: >> >>> Hi, >>> I'm trying to deploy an overcloud using tripleo with pre provisioned >>> nodes. My configuration is quite simple: >>> - 1 controller and 1 compute nodes on which I already installed CentOS >>> 8.2 >>> - Both nodes have a dedicated idrac interface with an ip in >>> 192.168.199.0/24. Please note that this interface is not visible with >>> "ip a" or "ifconfig". It's a dedicated IDRAC interface >>> - Both nodes have a NIC configured in the subnet 192.168.199.0/24 >>> (192.168.199.200 and 192.168.199.201) >>> - Undercloud uses 192.168.199.0/24 as pxe/provisioning network (but I >>> don't need provisioning) >>> >>> Question: should I import nodes with "openstack overcloud node import >>> nodes.yaml" even if I don't need the provisioning step? >>> >>> Furthermore, on the undercloud I created one file: >>> /home/stack/templates/node-info.yaml with the following content >>> >>> parameter_defaults: >>> OvercloudControllerFlavor: control >>> OvercloudComputeFlavor: compute >>> ControllerCount: 1 >>> ComputeCount: 1 >>> >>> >> Question: How can I specify that "node X with ip Y should be used as a >>> controller and node Z with ip K should be used as a compute"?? >>> >> >> With pre-provisioned nodes (DeployedServer), you would need to specify >> HostnameMap and DeployedServerPortMap parameters that would map the >> pre-provisioned hosts and ctlplane ips. >> >> Please check documentation[1] for more details. >> >> [1] >> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/deployed_server.html#deployed-server-with-config-download >> >> >>> Should I set the property with the following command? >>> openstack baremetal node set --property capabilities='profile:control' >>> controller1 >>> >>> Thank you, >>> Marco >>> >>> >> >> -- >> Regards, >> Rabi Mishra >> >> -- Regards, Rabi Mishra -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Thu Aug 6 10:02:26 2020 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Thu, 6 Aug 2020 12:02:26 +0200 Subject: [ops] [keystone] "Roles are not immutable" Message-ID: After updating to Train from rocky (on stein we just performed the db-sync), we tried the new "keystone-status update check" command which says that the admin role is not immutable [*]. As far as I understand this is something that was done to prevent deleting/modifying the default roles (that could cause major problems). But how am I supposed to fix this? The "--immutable" option for the "openstack role set" command, documented at: https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/role-v3.html is not available in Train. Thanks, Massimo [*] +-------------------------------------------+ | Check: Check default roles are immutable | | Result: Failure | | Details: Roles are not immutable: admin | +-------------------------------------------+ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Thu Aug 6 10:13:36 2020 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Thu, 6 Aug 2020 12:13:36 +0200 Subject: [ironic] [infra] bifrost-integration-tinyipa-opensuse-15 broken Message-ID: Hi folks, Our openSUSE CI job has been broken for a few days [1]. 
It fails on the early bindep stage with [2] File './suse/x86_64/libJudy1-1.0.5-lp151.2.2.x86_64.rpm' not found on medium ' https://mirror.mtl01.inap.opendev.org/opensuse/distribution/leap/15.1/repo/oss/&apos ; I've raised it on #openstack-infra, but I'm not sure if there has been any follow up. Help is appreciated Dmitry [1] https://zuul.openstack.org/builds?job_name=bifrost-integration-tinyipa-opensuse-15 [2] https://zuul.openstack.org/build/f4c7d174d171482394d1d0754c863ae1/console -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu Aug 6 10:15:26 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 06 Aug 2020 11:15:26 +0100 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA0481447D@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> <4f025d444406898903dabf3049ed021822cce19b.camel@redhat.com> <046E9C0290DD9149B106B72FC9156BEA0481447D@gmsxchsvr01.thecreation.com> Message-ID: <5136cf19c506c5fa8b0293a0b5f4f15cb714ce3b.camel@redhat.com> On Wed, 2020-08-05 at 23:53 -0500, Eric K. Miller wrote: > > From: Sean Mooney [mailto:smooney at redhat.com] > > Sent: Wednesday, August 05, 2020 8:01 AM > > yes that works well with the default flat/qcow file format > > i assume there was a reason this was not the starting point. > > the nova lvm backend i think does not supprot thin provisioning > > so fi you did the same thing creating the volume group on the nvme deivce > > you would technically get better write performance after the vm is booted > > but > > the vm spwan is slower since we cant take advantage of thin providioning > > and > > each root disk need to be copided form the cahced image. > > I wasn't aware that the nova LVM backend ([libvirt]/images_type = lvm) didn't support thin provisioned LV's. However, > I do see that the "sparse_logical_volumes" parameter indicates it has been deprecated: > https://docs.openstack.org/nova/rocky/configuration/config.html#libvirt.sparse_logical_volumes > > That would definitely be a downer. > > > so just monting the nova data directory on an nvme driver or a raid of nvme > > drives > > works well and is simple to do. > > Maybe we should consider doing this instead. I'll test with the Nova LVM backend first. > > > so there are trade off with both appoches. > > generally i recommend using local sotrage e.g. the vm root disk or ephemeral > > disk for fast scratchpad space > > to work on data bug persitie all relevent data permently via cinder volumes. > > that requires you to understand which block > > devices a local and which are remote but it give you the best of both worlds. > > Our use case simply requires high-speed non-redundant storage for self-replicating applications like Couchbase, > Cassandra, MongoDB, etc. or very inexpensive VMs that are backed-up often and can withstand the downtime when > restoring from backup. > > That will be one more requirement (or rather a very nice to have), is to be able to create images (backups) of the > local storage onto object storage, so hopefully "openstack server backup create" works like it does with rbd-backed > Nova-managed persistent storage. 
it wil snapshot the root disk if you use addtional ephmeeral disks i do not think they are included but if you create the vms wit a singel root disk that is big enaough for your needs and use swift as your glance backend then yes. it will store the backups in object storage and rotate up to N backups per instance. > > I will let you know what I find out! > > Thanks everyone! > > Eric From emiller at genesishosting.com Thu Aug 6 10:26:59 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Thu, 6 Aug 2020 05:26:59 -0500 Subject: [cinder][nova] Local storage in compute node In-Reply-To: <5136cf19c506c5fa8b0293a0b5f4f15cb714ce3b.camel@redhat.com> References: <046E9C0290DD9149B106B72FC9156BEA04814477@gmsxchsvr01.thecreation.com> <046E9C0290DD9149B106B72FC9156BEA04814478@gmsxchsvr01.thecreation.com> <20200805111934.77lesgmmdiqeo27m@lyarwood.usersys.redhat.com> <7b7f6e277f77423ae6502d81c6d778fd4249b99d.camel@redhat.com> <92839697a08966dc17cd5c4c181bb32e2d197f93.camel@redhat.com> <4f025d444406898903dabf3049ed021822cce19b.camel@redhat.com> <046E9C0290DD9149B106B72FC9156BEA0481447D@gmsxchsvr01.thecreation.com> <5136cf19c506c5fa8b0293a0b5f4f15cb714ce3b.camel@redhat.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04814483@gmsxchsvr01.thecreation.com> > it wil snapshot the root disk > if you use addtional ephmeeral disks i do not think they are included > but if you create the vms wit a singel root disk that is big enaough for your > needs and use swift as your glance backend > then yes. it will store the backups in object storage and rotate up to N > backups per instance. Thanks Sean! I tested a VM with a single root disk (no ephemeral disks) and it worked as expected (how you described). From e0ne at e0ne.info Thu Aug 6 12:03:53 2020 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Thu, 6 Aug 2020 15:03:53 +0300 Subject: [horizon] Victoria virtual mid-cycle poll In-Reply-To: References: Message-ID: Here is Zoom connection details: Topic: Horizon Virtual Mid-Cycle Time: Aug 6, 2020 01:00 PM Universal Time UTC Join Zoom Meeting https://zoom.us/j/94173501669?pwd=c3JuNnpJMnBvNzgzdVJ5NDRhMnlhQT09 Meeting ID: 941 7350 1669 Passcode: 710495 One tap mobile +16468769923,,94173501669#,,,,,,0#,,710495# US (New York) +16699006833,,94173501669#,,,,,,0#,,710495# US (San Jose) Dial by your location +1 646 876 9923 US (New York) +1 669 900 6833 US (San Jose) +1 253 215 8782 US (Tacoma) +1 301 715 8592 US (Germantown) +1 312 626 6799 US (Chicago) +1 346 248 7799 US (Houston) +1 408 638 0968 US (San Jose) Meeting ID: 941 7350 1669 Passcode: 710495 Find your local number: https://zoom.us/u/ah3SiLk1q Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ On Thu, Aug 6, 2020 at 10:46 AM Ivan Kolodyazhny wrote: > Hi everybody, > > According to our poll [2] we'll have a one-hour mid-cycle poll today at > 13.00 UTC. I'll share a Zoom link before the meeting today. > > We're going to discuss current release priorities [3] and our future plans. > > > [2] https://doodle.com/poll/dkmsai49v4zzpca2 > [3] https://etherpad.opendev.org/p/horizon-release-priorities > > Regards, > Ivan Kolodyazhny, > http://blog.e0ne.info/ > > > On Thu, Jul 30, 2020 at 1:00 PM Ivan Kolodyazhny wrote: > >> Hi team, >> >> If something can go wrong, it will definitely go wrong. >> It means that I did a mistake in my original mail and sent you >> completely wrong dates:(. >> >> Horizon Virtual mid-cycle is supposed to be next week Aug 5-7. I'm >> planning to have a single one-hour session. 
>> In case, if we've got a lot of participants and topic to discuss, we can >> schedule one more session a week or two weeks later. >> >> Here is a correct poll: https://doodle.com/poll/dkmsai49v4zzpca2 >> >> Regards, >> Ivan Kolodyazhny, >> http://blog.e0ne.info/ >> >> >> On Wed, Jul 22, 2020 at 10:26 AM Ivan Kolodyazhny wrote: >> >>> Hi team, >>> >>> As discussed at Horizon's Virtual PTG [1], we'll have a virtual >>> mid-cycle meeting around Victoria-2 milestone. >>> >>> We'll discuss Horizon current cycle development priorities and the >>> future of Horizon with modern JS frameworks. >>> >>> Please indicate your availability to meet for the first session, which >>> will be held during the week of July 27-31: >>> >>> https://doodle.com/poll/3neps94amcreaw8q >>> >>> Please respond before 12:00 UTC on Tuesday 4 August. >>> >>> [1] https://etherpad.opendev.org/p/horizon-v-ptg >>> >>> Regards, >>> Ivan Kolodyazhny, >>> http://blog.e0ne.info/ >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.morin at gmail.com Thu Aug 6 14:04:21 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Thu, 6 Aug 2020 14:04:21 +0000 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <88c24f3a-7d29-aa39-ed12-803279cc90c1@openstack.org> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <88c24f3a-7d29-aa39-ed12-803279cc90c1@openstack.org> Message-ID: <20200806140421.GN31915@sync> Hey all, Thanks for your replies. About the fact that nova already implement this, I will try again on my side, but maybe it was not yet implemented in newton (I only tried nova on newton version). Thank you for bringing that to me. About the healhcheck already done on nova side (and also on neutron). As far as I understand, it's done using a specific rabbit queue, which can work while others queues are not working. The purpose of adding ping endpoint here is to be able to ping in all topics, not only those used for healthcheck reports. Also, as mentionned by Thierry, what we need is a way to externally do pings toward neutron agents and nova computes. The patch itself is not going to add any load on rabbit. It really depends on the way the operator will use it. On my side, I built a small external oslo.messaging script which I can use to do such pings. Cheers, -- Arnaud Morin On 03.08.20 - 12:15, Thierry Carrez wrote: > Ken Giusti wrote: > > On Mon, Jul 27, 2020 at 1:18 PM Dan Smith > > wrote: > > > The primary concern was about something other than nova sitting on our > > > bus making calls to our internal services. I imagine that the proposal > > > to bake it into oslo.messaging is for the same purpose, and I'd probably > > > have the same concern. At the time I think we agreed that if we were > > > going to support direct-to-service health checks, they should be teensy > > > HTTP servers with oslo healthchecks middleware. Further loading down > > > rabbit with those pings doesn't seem like the best plan to > > > me. Especially since Nova (compute) services already check in over RPC > > > periodically and the success of that is discoverable en masse through > > > the API. > > > > While initially in favor of this feature Dan's concern has me > > reconsidering this. > > > > Now I believe that if the purpose of this feature is to check the > > operational health of a service _using_ oslo.messaging, then I'm against > > it.   
A naked ping to a generic service point in an application doesn't > > prove the operating health of that application beyond its connection to > > rabbit. > > While I understand the need to further avoid loading down Rabbit, I like the > universality of this solution, solving a real operational issue. > > Obviously that creates a trade-off (further loading rabbit to get more > operational insights), but nobody forces you to run those ping calls, they > would be opt-in. So the proposed code in itself does not weigh down Rabbit, > or make anything sit on the bus. > > > Connectivity monitoring between an application and rabbit is done using > > the keepalive connection heartbeat mechanism built into the rabbit > > protocol, which O.M. supports today. > > I'll let Arnaud answer, but I suspect the operational need is code-external > checking of the rabbit->agent chain, not code-internal checking of the > agent->rabbit chain. The heartbeat mechanism is used by the agent to keep > the Rabbit connection alive, ensuring it works in most of the cases. The > check described above is to catch the corner cases where it still doesn't. > > -- > Thierry Carrez (ttx) > From marino.mrc at gmail.com Thu Aug 6 14:06:08 2020 From: marino.mrc at gmail.com (Marco Marino) Date: Thu, 6 Aug 2020 16:06:08 +0200 Subject: [tripleo] Overcloud without provisioning error: os-net-config command not found Message-ID: Hi, I'm trying to deploy an overcloud using pre-provisioned nodes. I have only 2 nodes, 1 compute and 1 controller and here is the command I'm using: openstack overcloud deploy --templates --disable-validations -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml -e /home/stack/templates/node-info.yaml -e /home/stack/templates/ctlplane-assignments.yaml -e /home/stack/templates/hostname-map.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -n /home/stack/os-deploy-custom-config/network_data.yaml --overcloud-ssh-user stack --overcloud-ssh-key /home/stack/.ssh/id_rsa Here is the content of custom files: (undercloud) [stack at undercloud ~]$ cat templates/node-info.yaml parameter_defaults: OvercloudControllerFlavor: control OvercloudComputeFlavor: compute ControllerCount: 1 ComputeCount: 1 (undercloud) [stack at undercloud ~]$ cat templates/ctlplane-assignments.yaml resource_registry: OS::TripleO::DeployedServer::ControlPlanePort: /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-neutron-port.yaml parameter_defaults: DeployedServerPortMap: controller-0-ctlplane: fixed_ips: - ip_address: 192.168.199.200 subnets: - cidr: 192.168.199.0/24 network: tags: 192.168.199.0/24 compute-0-ctlplane: fixed_ips: - ip_address: 192.168.199.210 subnets: - cidr: 192.168.199.0/24 network: tags: 192.168.199.0/24 (undercloud) [stack at undercloud ~]$ cat templates/hostname-map.yaml parameter_defaults: HostnameMap: overcloud-controller-0: controller-0 overcloud-novacompute-0: compute-0 http://paste.openstack.org/show/796634/ <-- Here is the complete output for overcloud deploy command. It seems that the error is /var/lib/tripleo-config/scripts/run_os_net_config.sh: line 59: os-net-config: command not found" os-net-config is provided by "delorean-component-tripleo" repository. 
So my question is: should I pre install Openstack repositories on pre-provisioned nodes in addition to operating system installation and network configuration? Thank you, Marco -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnaud.morin at gmail.com Thu Aug 6 14:11:32 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Thu, 6 Aug 2020 14:11:32 +0000 Subject: [largescale-sig] RPC ping In-Reply-To: References: <20200727095744.GK31915@sync> Message-ID: <20200806141132.GO31915@sync> Hi Mohammed, 1 - That's something we would also like, but it's beyond the patch I propose. I need this patch not only for kubernetes, but also for monitoring my legagy openstack agents running outside of k8s. 2 - Yes, latest version of rabbitmq is better on that point, but we still see some weird issue (I will ask the community about it in another topic). 3 - Thanks for this operator, we'll take a look! By saying 1 rabbit per service, I understand 1 server, not 1 cluster, right? That sounds risky if you lose the server. I suppose you dont do that for the database? 4 - Nice, how to you monitor those consumptions? Using rabbit management API? Cheers, -- Arnaud Morin On 03.08.20 - 10:21, Mohammed Naser wrote: > I have a few operational suggestions on how I think we could do this best: > > 1. I think exposing a healthcheck endpoint that _actually_ runs the > ping and responds with a 200 OK makes a lot more sense in terms of > being able to run it inside something like Kubernetes, you end up with > a "who makes the ping and who responds to it" type of scenario which > can be tricky though I'm sure we can figure that out > 2. I've found that newer releases of RabbitMQ really help with those > un-usable queues after a split, I haven't had any issues at all with > newer releases, so that could be something to help your life be a lot > easier. > 3. You mentioned you're moving towards Kubernetes, we're doing the > same and building an operator: > https://opendev.org/vexxhost/openstack-operator -- Because the > operator manages the whole thing and Kubernetes does it's thing too, > we started moving towards 1 (single) rabbitmq per service, which > reaaaaaaally helped a lot in stabilizing things. Oslo messaging is a > lot better at recovering when a single service IP is pointing towards > it because it doesn't do weird things like have threads trying to > connect to other Rabbit ports. Just a thought. > 4. In terms of telemetry and making sure you avoid that issue, we > track the consumption rates of queues inside OpenStack. OpenStack > consumption rate should be constant and never growing, anytime it > grows, we instantly detect that something is fishy. However, the > other issue comes in that when you restart any openstack service, it > 'forgets' all it's existing queues and then you have a set of building > up queues until they automatically expire which happens around 30 > minutes-ish, so it makes that alarm of "things are not being consumed" > a little noisy if you're restarting services > > Sorry for the wall of super unorganized text, all over the place here > but thought I'd chime in with my 2 cents :) > > On Mon, Jul 27, 2020 at 6:04 AM Arnaud Morin wrote: > > > > Hey all, > > > > TLDR: I propose a change to oslo_messaging to allow doing a ping over RPC, > > this is useful to monitor liveness of agents. > > > > > > Few weeks ago, I proposed a patch to oslo_messaging [1], which is adding a > > ping endpoint to RPC dispatcher. 
> > It means that every openstack service which is using oslo_messaging RPC > > endpoints (almosts all OpenStack services and agents - e.g. neutron > > server + agents, nova + computes, etc.) will then be able to answer to a > > specific "ping" call over RPC. > > > > I decided to propose this patch in my company mainly for 2 reasons: > > 1 - we are struggling monitoring our nova compute and neutron agents in a > > correct way: > > > > 1.1 - sometimes our agents are disconnected from RPC, but the python process > > is still running. > > 1.2 - sometimes the agent is still connected, but the queue / binding on > > rabbit cluster is not working anymore (after a rabbit split for > > example). This one is very hard to debug, because the agent is still > > reporting health correctly on neutron server, but it's not able to > > receive messages anymore. > > > > > > 2 - we are trying to monitor agents running in k8s pods: > > when running a python agent (neutron l3-agent for example) in a k8s pod, we > > wanted to find a way to monitor if it is still live of not. > > > > > > Adding a RPC ping endpoint could help us solve both these issues. > > Note that we still need an external mechanism (out of OpenStack) to do this > > ping. > > We also think it could be nice for other OpenStackers, and especially > > large scale ops. > > > > Feel free to comment. > > > > > > [1] https://review.opendev.org/#/c/735385/ > > > > > > -- > > Arnaud Morin > > > > > > > -- > Mohammed Naser > VEXXHOST, Inc. From sgolovat at redhat.com Thu Aug 6 14:16:56 2020 From: sgolovat at redhat.com (Sergii Golovatiuk) Date: Thu, 6 Aug 2020 16:16:56 +0200 Subject: [tripleo][ci] Make tripleo-ci-centos-8-containerized-undercloud-upgrades voting again Message-ID: Hi, tripleo-ci-centos-8-containerized-undercloud-upgrades has been improved significantly in terms of stability [1]. To improve CI coverage for upgrades I propose to make it voting. That will help to make upgrades more stable and catch bugs as early as possible. To keep it stable, Upgrade team is going to add it to their own triage process and dedicate the engineer to fix it if it's red for 2-3 days in a row. [1] https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-containerized-undercloud-upgrades&project=openstack/tripleo-common -- Sergii Golovatiuk Senior Software Developer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Thu Aug 6 14:31:24 2020 From: aschultz at redhat.com (Alex Schultz) Date: Thu, 6 Aug 2020 08:31:24 -0600 Subject: [tripleo] Overcloud without provisioning error: os-net-config command not found In-Reply-To: References: Message-ID: On Thu, Aug 6, 2020 at 8:13 AM Marco Marino wrote: > > Hi, I'm trying to deploy an overcloud using pre-provisioned nodes. 
I have only 2 nodes, 1 compute and 1 controller and here is the command I'm using: > > > openstack overcloud deploy --templates --disable-validations -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml -e /home/stack/templates/node-info.yaml -e /home/stack/templates/ctlplane-assignments.yaml -e /home/stack/templates/hostname-map.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -n /home/stack/os-deploy-custom-config/network_data.yaml --overcloud-ssh-user stack --overcloud-ssh-key /home/stack/.ssh/id_rsa > > Here is the content of custom files: > > (undercloud) [stack at undercloud ~]$ cat templates/node-info.yaml > parameter_defaults: > OvercloudControllerFlavor: control > OvercloudComputeFlavor: compute > ControllerCount: 1 > ComputeCount: 1 > > (undercloud) [stack at undercloud ~]$ cat templates/ctlplane-assignments.yaml > resource_registry: > OS::TripleO::DeployedServer::ControlPlanePort: /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-neutron-port.yaml > > parameter_defaults: > DeployedServerPortMap: > controller-0-ctlplane: > fixed_ips: > - ip_address: 192.168.199.200 > subnets: > - cidr: 192.168.199.0/24 > network: > tags: > 192.168.199.0/24 > compute-0-ctlplane: > fixed_ips: > - ip_address: 192.168.199.210 > subnets: > - cidr: 192.168.199.0/24 > network: > tags: > 192.168.199.0/24 > > > (undercloud) [stack at undercloud ~]$ cat templates/hostname-map.yaml > parameter_defaults: > HostnameMap: > overcloud-controller-0: controller-0 > overcloud-novacompute-0: compute-0 > > > http://paste.openstack.org/show/796634/ <-- Here is the complete output for overcloud deploy command. > It seems that the error is > /var/lib/tripleo-config/scripts/run_os_net_config.sh: line 59: os-net-config: command not found" > > os-net-config is provided by "delorean-component-tripleo" repository. So my question is: should I pre install Openstack repositories on pre-provisioned nodes in addition to operating system installation and network configuration? > Yes per the documentation: https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/deployed_server.html#package-repositories > Thank you, > Marco > > > > From arnaud.morin at gmail.com Thu Aug 6 14:40:16 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Thu, 6 Aug 2020 14:40:16 +0000 Subject: [nova][neutron][oslo][ops] rabbit bindings issue Message-ID: <20200806144016.GP31915@sync> Hey all, I would like to ask the community about a rabbit issue we have from time to time. In our current architecture, we have a cluster of rabbits (3 nodes) for all our OpenStack services (mostly nova and neutron). When one node of this cluster is down, the cluster continue working (we use pause_minority strategy). But, sometimes, the third server is not able to recover automatically and need a manual intervention. After this intervention, we restart the rabbitmq-server process, which is then able to join the cluster back. At this time, the cluster looks ok, everything is fine. BUT, nothing works. Neutron and nova agents are not able to report back to servers. They appear dead. Servers seems not being able to consume messages. The exchanges, queues, bindings seems good in rabbit. 
What we see is that removing bindings (using rabbitmqadmin delete binding or the web interface) and recreate them again (using the same routing key) brings the service back up and running. Doing this for all queues is really painful. Our next plan is to automate it, but is there anyone in the community already saw this kind of issues? Our bug looks like the one described in [1]. Someone recommands to create an Alternate Exchange. Is there anyone already tried that? FYI, we are running rabbit 3.8.2 (with OpenStack Stein). We had the same kind of issues using older version of rabbit. Thanks for your help. [1] https://groups.google.com/forum/#!newtopic/rabbitmq-users/rabbitmq-users/zFhmpHF2aWk -- Arnaud Morin From bdobreli at redhat.com Thu Aug 6 15:02:34 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Thu, 6 Aug 2020 17:02:34 +0200 Subject: [tripleo][ci] Make tripleo-ci-centos-8-containerized-undercloud-upgrades voting again In-Reply-To: References: Message-ID: +1 On 8/6/20 4:16 PM, Sergii Golovatiuk wrote: > Hi, > > tripleo-ci-centos-8-containerized-undercloud-upgrades has been improved > significantly in terms of stability [1]. To improve CI coverage for > upgrades I propose to make it voting. That will help to make upgrades > more stable and catch bugs as early as possible. To keep it stable, > Upgrade team is going to add it to their own triage process and > dedicate the engineer to fix it if it's red for 2-3 days in a row. > > [1] > https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-containerized-undercloud-upgrades&project=openstack/tripleo-common > > -- > SergiiGolovatiuk > > Senior Software Developer > > Red Hat > > > -- Best regards, Bogdan Dobrelya, Irc #bogdando From marios at redhat.com Thu Aug 6 15:51:07 2020 From: marios at redhat.com (Marios Andreou) Date: Thu, 6 Aug 2020 18:51:07 +0300 Subject: [tripleo][ci] Make tripleo-ci-centos-8-containerized-undercloud-upgrades voting again In-Reply-To: References: Message-ID: On Thu, Aug 6, 2020 at 5:19 PM Sergii Golovatiuk wrote: > Hi, > > tripleo-ci-centos-8-containerized-undercloud-upgrades has been improved > significantly in terms of stability [1]. To improve CI coverage for > upgrades I propose to make it voting. That will help to make upgrades more > stable and catch bugs as early as possible. To keep it stable, Upgrade team > is going to add it to their own triage process and dedicate the engineer to > fix it if it's red for 2-3 days in a row. > > o/ as discussed on irc, IMO we should make it voting "until we can't". Your triage process about 2/3 days sounds reasonable but it will be seen in practice how well that works. Which is the original reason there is push-back against master voting upgrades jobs - i.e. whilst developing for the cycle the upgrades jobs might break until all new features are merged and you can accommodate for them in the upgrade. So they break and this blocks master gates. They were made non voting during a time that they were broken often and for long periods. let's make them voting "until we can't". > [1] > https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-containerized-undercloud-upgrades&project=openstack/tripleo-common > > -- > Sergii Golovatiuk > > Senior Software Developer > > Red Hat > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zaitcev at redhat.com Thu Aug 6 16:41:25 2020 From: zaitcev at redhat.com (Pete Zaitcev) Date: Thu, 6 Aug 2020 11:41:25 -0500 Subject: [TripleO] "bundle install" in puppet-tripleo Message-ID: <20200806114125.0f0961a9@suzdal.zaitcev.lan> Hello: Due to some circumstances, I started looking at running unit tests in puppet-tripleo. The official document[1] tells me to start by running "bundle install". This results in: [zaitcev at suzdal puppet-tripleo-c744015]$ bundle install Fetching https://git.openstack.org/openstack/puppet-openstack_spec_helper Fetching gem metadata from https://rubygems.org/........ Resolving dependencies........ Fetching rake 13.0.1 ....... Installing netaddr 1.5.1 Using pathspec 0.2.1 <--------------------- in black, not green Fetching pry 0.12.2 ....... Installing webmock 3.8.3 Using puppet-openstack_spec_helper 17.0.0 from https://git.openstack.org/openstack/puppet-openstack_spec_helper (at master at 273d24f) Updating files in vendor/cache Could not find pathspec-0.2.1.gem for installation [zaitcev at suzdal puppet-tripleo-c744015]$ Anyone got an idea what the above means? -- Pete [1] https://docs.openstack.org/puppet-openstack-guide/latest/contributor/testing.html From caifti at gmail.com Thu Aug 6 07:00:44 2020 From: caifti at gmail.com (Doina Cristina Duma) Date: Thu, 6 Aug 2020 09:00:44 +0200 Subject: [TC] [PTG] Victoria vPTG Summary of Conversations and Action Items In-Reply-To: References: Message-ID: Hello everyone, On Tue, Aug 4, 2020 at 2:14 PM Belmiro Moreira < moreira.belmiro.email.lists at gmail.com> wrote: > Hi everyone, > the problem described in the "OpenStack User-facing APIs" is something > that we face daily in our deployment. Different CLIs for different > operations. > same for us, really frustrating, going around and see what is missing (what options) > I'm really interested in driving this action item. > I totally support your proposal! Cristina > > Belmiro > > On Fri, Jun 12, 2020 at 9:38 PM Kendall Nelson > wrote: > >> Hello Everyone! >> >> I hope you all had a productive and enjoyable PTG! While it’s still >> reasonably fresh, I wanted to take a moment to summarize discussions and >> actions that came out of TC discussions. >> >> If there is a particular action item you are interested in taking, please >> reply on this thread! >> >> For the long version, check out the etherpad from the PTG[1]. >> >> Tuesday >> >> ====== >> >> Ussuri Retrospective >> >> ---------------------------- >> >> As usual we accomplished a lot. Some of the things we accomplished were >> around enumerating operating systems per release (again), removing python2 >> support, and adding the ideas repository. Towards the end of the release, >> we had a lot of discussions around what to do with leaderless projects, the >> role of PTLs, and what to do with projects that were missing PTL candidates >> for the next release. We discussed office hours, their history and reason >> for existence, and clarified how we can strengthen communication amongst >> ourselves, the projects, and the larger community. >> >> TC Onboarding >> >> -------------------- >> >> It was brought up that those elected most recently (and even new members >> the election before) felt like there wasn’t enough onboarding into the TC. >> Through discussion about what we can do to better support returning members >> is to better document the daily, weekly and monthly tasks TC members are >> supposed to be doing. 
Kendall Nelson proposed a patch to start adding more >> detail to a guide for TC members already[2]. It was also proposed that we >> have a sort of mentorship or shadow program for people interested in >> joining the TC or new TC members by more experienced TC members. The >> discussion about the shadow/mentorship program is to be continued. >> >> TC/UC Merge >> >> ------------------ >> >> Thierry gave an update on the merge of the committees. The simplified >> version is that the current proposal is that UC members are picked from TC >> members, the UC operates within the TC, and that we are already setup for >> this given the number of TC members that have AUC status. None of this >> requires a by-laws change. One next step that has already begun is the >> merging of the openstack-users ML into openstack-discuss ML. Other next >> steps are to decide when to do the actual transition (disbanding the >> separate UC, probably at the next election?) and when to setup AUC’s to be >> defined as extra-ATC’s to be included in the electorate for elections. For >> more detail, check out the openstack-discuss ML thread[3]. >> >> Wednesday >> >> ========= >> >> Help Wanted List >> >> ----------------------- >> >> We settled on a format for the job postings and have several on the list. >> We talked about how often we want to look through, update or add to it. The >> proposal is to do this yearly. We need to continue pushing on the board to >> dedicate contributors at their companies to work on these items, and get >> them to understand that it's an investment that will take longer than a >> year in a lot of cases; interns are great, but not enough. >> >> TC Position on Foundation Member Community Contributions >> >> >> ---------------------------------------------------------------------------------- >> >> The discussion started with a state of things today - the expectations of >> platinum members, the benefits the members get being on the board and why >> they should donate contributor resources for these benefits, etc. A variety >> of proposals were made: either enforce or remove the minimum contribution >> level, give gold members the chance to have increased visibility (perhaps >> giving them some of the platinum member advantages) if they supplement >> their monetary contributions with contributor contributions, etc. The >> #ACTION that was decided was for Mohammed to take these ideas to the board >> and see what they think. >> >> OpenStack User-facing APIs >> >> -------------------------------------- >> >> Users are confused about the state of the user facing API’s; they’ve been >> told to use the OpenStackClient(OSC) but upon use, they discover that there >> are features missing that exist in the python-*clients. Partial >> implementation in the OSC is worse than if the service only used their >> specific CLI. Members of the OpenStackSDK joined discussions and explained >> that many of the barriers that projects used to have behind implementing >> certain commands have been resolved. The proposal is to create a pop up >> team and that they start with fully migrating Nova, documenting the process >> and collecting any other unresolved blocking issues with the hope that one >> day we can set the migration of the remaining projects as a community goal. >> Supplementally, a new idea was proposed- enforcing new functionality to >> services is only added to the SDK (and optionally the OSC) and not the >> project’s specific CLI to stop increasing the disparity between the two. 
>> The #ACTION here is to start the pop up team, if you are interested, please >> reply! Additionally, if you disagree with this kind of enforcement, please >> contact the TC as soon as possible and explain your concerns. >> >> PTL Role in OpenStack today & Leaderless Projects >> >> --------------------------------------------------------------------- >> >> This was a veeeeeeeerrrry long conversation that went in circles a few >> times. The very short version is that we, the TC, are willing to let >> project teams decide for themselves if they want to have a more >> deconstructed kind of PTL role by breaking it into someone responsible for >> releases and someone responsible for security issues. This new format also >> comes with setting the expectation that for things like project updates and >> signing up for PTG time, if someone on the team doesn’t actively take that >> on, the default assumption is that the project won’t participate. The >> #ACTION we need someone to take on is to write a resolution about how this >> will work and how it can be done. Ideally, this would be done before the >> next technical election, so that teams can choose it at that point. If you >> are interested in taking on the writing of this resolution, please speak up! >> >> Cross Project Work >> >> ------------------------- >> >> -Pop Up Teams- >> >> The two teams we have right now are Encryption and Secure Consistent >> Policy Groups. Both are making slow progress and will continue. >> >> >> >> -Reducing Community Goals Per Cycle- >> >> Historically we have had two goals per cycle, but for smaller teams this >> can be a HUGE lift. The #ACTION is to clearly outline the documentation for >> the goal proposal and selection process to clarify that selecting only one >> goal is fine. No one has claimed this action item yet. >> >> -Victoria Goal Finalization- >> >> Currently, we have three proposals and one accepted goal. If we are going >> to select a second goal, it needs to be done ASAP as Victoria development >> has already begun. All TC members should review the last proposal >> requesting selection[4]. >> >> -Wallaby Cycle Goal Discussion Kick Off- >> >> Firstly, there is a #ACTION that one or two TC members are needed to >> guide the W goal selection. If you are interested, please reply to this >> thread! There were a few proposed goals for VIctoria that didn’t make it >> that could be the starting point for W discussions, in particular, the >> rootwrap goal which would be good for operators. The OpenStackCLI might be >> another goal to propose for Wallaby. >> >> Detecting Unmaintained Projects Early >> >> --------------------------------------------------- >> >> The TC liaisons program had been created a few releases ago, but the >> initial load on TC members was large. We discussed bringing this program >> back and making the project health checks happen twice a release, either >> the start or end of the release and once in the middle. TC liaisons will >> look at previously proposed releases, release activity of the team, the >> state of tempest plugins, if regular meetings are happening, if there are >> patches in progress and how busy the project’s IRC channel is to make a >> determination. Since more than one liaison will be assigned to each >> project, those liaisons can divvy up the work how they see fit. The other >> aspect that still needs to be decided is where the health checks will be >> recorded- in a wiki? In a meeting and meeting logs? That decision is still >> to be continued. 
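As a rough sketch of how a liaison might gauge one of these signals (patches merged recently), the public Gerrit REST API on review.opendev.org can be queried directly. The project name below is only an example and the 90-day window is an assumption, not an agreed threshold.

    # Counts changes merged in the last 90 days for one project.
    # Gerrit prefixes JSON responses with ")]}'" to defeat XSSI, so the
    # first line is stripped before parsing. Results are capped by the
    # server's query limit, which is fine for a coarse health signal.
    import json
    import requests

    project = "openstack/cloudkitty"
    resp = requests.get(
        "https://review.opendev.org/changes/",
        params={"q": f"project:{project} status:merged -age:90d"},
        timeout=30,
    )
    changes = json.loads(resp.text.split("\n", 1)[1])
    print(f"{project}: {len(changes)} changes merged in the last 90 days")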
The current #ACTION currently unassigned is that we need >> to assign liaisons for the Victoria cycle and decide when to do the first >> health check. >> >> Friday >> >> ===== >> >> Reducing Systems and Friction to Drive Change >> >> ---------------------------------------------------------------- >> >> This was another conversation that went in circles a bit before realizing >> that we should make a list of the more specific problems we want to address >> and then brainstorm solutions for them. The list we created (including >> things already being worked) are as follows: >> >> - >> >> TC separate from UC (solution in progress) >> - >> >> Stable releases being approved by a separate team (solution in >> progress) >> - >> >> Making repository creation faster (especially for established project >> teams) >> - >> >> Create a process blueprint for project team mergers >> - >> >> Requirements Team being one person >> - >> >> Stable Team >> - >> >> Consolidate the agent experience >> - >> >> Figure out how to improve project <--> openstack client/sdk >> interaction. >> >> If you feel compelled to pick one of these things up and start proposing >> solutions or add to the list, please do! >> >> Monitoring in OpenStack (Ceilometer + Telemetry + Gnocchi State) >> >> >> ----------------------------------------------------------------------------------------- >> >> This conversation is also ongoing, but essentially we talked about the >> state of things right now- largely they are not well maintained and there >> is added complexity with Ceilometers being partially dependent on Gnocchi. >> There are a couple of ideas to look into like using oslo.metrics for the >> interface between all the tools or using Ceilometer without Gnocchi if we >> can clean up those dependencies. No specific action items here, just please >> share your thoughts if you have them. >> >> Ideas Repo Next Steps >> >> ------------------------------- >> >> Out of the Ussuri retrospective, it was brought up that we probably >> needed to talk a little more about what we wanted for this repo. >> Essentially we just want it to be a place to collect ideas into without >> worrying about the how. It should be a place to document ideas we have had >> (old and new) and keep all the discussion in one place as opposed to >> historic email threads, meetings logs, other IRC logs, etc. We decided it >> would be good to periodically go through this repo, likely as a forum >> session at a summit to see if there is any updating that could happen or >> promotion of ideas to community goals, etc. >> >> ‘tc:approved-release’ Tag >> >> --------------------------------- >> >> This topic was proposed by the Manila team from a discussion they had >> earlier in the week. We talked about the history of the tag and how usage >> of tags has evolved. At this point, the proposal is to remove the tag as >> anything in the releases repo is essentially tc-approved. Ghanshyam has >> volunteered to document this and do the removal. The board also needs to be >> notified of this and to look at projects.yaml in the governance repo as the >> source of truth for TC approved projects. The unassigned #ACTION item is to >> review remaining tags and see if there are others that need to be >> modified/removed/added to drive common behavior across OpenSack >> components. >> >> Board Proposals >> >> ---------------------- >> >> This was a pretty quick summary of all discussions we had that had any >> impact on the board and largely decided who would mention them. 
>> >> >> >> Session Feedback >> >> ------------------------ >> >> This was also a pretty quick topic compared to many of the others, we >> talked about how things went across all our discussions (largely we called >> the PTG a success) logistically. We tried to make good use of the raising >> hands feature which mostly worked, but it lacks context and its possible >> that the conversation has moved on by the time it’s your turn (if you even >> remember what you want to say). >> >> OpenStack 2.0: k8s Native >> >> ----------------------------------- >> >> This topic was brought up at the end of our time so we didn’t have time >> to discuss it really. Basically Mohammed wanted to start the conversation >> about adding k8s as a base service[5] and what we would do if a project >> proposed required k8s. Adding services that work with k8s could open a door >> to new innovation in OpenStack. Obviously this topic will need to be >> discussed further as we barely got started before we had to wrap things up. >> >> >> So. >> >> >> The tldr; >> >> >> Here are the #ACTION items we need owners for: >> >> - >> >> Start the User Facing API Pop Up Team >> - >> >> Write a resolution about how the deconstructed PTL roles will work >> - >> >> Update Goal Selection docs to explain that one or more goals is fine; >> it doesn’t have to be more than one >> - >> >> Two volunteers to start the W goal selection process >> - >> >> Assign two TC liaisons per project >> - >> >> Review Tags to make sure they are still good for driving common >> behavior across all openstack projects >> >> >> Here are the things EVERYONE needs to do: >> >> - >> >> Review last goal proposal so that we can decide to accept or reject >> it for the V release[4] >> - >> >> Add systems that are barriers to progress in openstack to the >> Reducing Systems and Friction list >> - >> >> Continue conversations you find important >> >> >> >> Thanks everyone for your hard work and great conversations :) >> >> Enjoy the attached (photoshopped) team photo :) >> >> -Kendall (diablo_rojo) >> >> >> >> [1] TC PTG Etherpad: https://etherpad.opendev.org/p/tc-victoria-ptg >> >> [2] TC Guide Patch: https://review.opendev.org/#/c/732983/ >> >> [3] UC TC Merge Thread: >> http://lists.openstack.org/pipermail/openstack-discuss/2020-May/014736.html >> >> >> [4] Proposed V Goal: https://review.opendev.org/#/c/731213/ >> >> [5] Base Service Description: >> https://governance.openstack.org/tc/reference/base-services.html >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From monika.samal at outlook.com Thu Aug 6 13:38:30 2020 From: monika.samal at outlook.com (Monika Samal) Date: Thu, 6 Aug 2020 13:38:30 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , Message-ID: Thanks for responding ? ________________________________ From: Mark Goddard Sent: Thursday, August 6, 2020 1:16 PM To: Michael Johnson Cc: Monika Samal ; Fabian Zimmermann ; openstack-discuss Subject: Re: [openstack-community] Octavia :; Unable to create load balancer On Wed, 5 Aug 2020 at 16:16, Michael Johnson > wrote: Looking at that error, it appears that the lb-mgmt-net is not setup correctly. The Octavia controller containers are not able to reach the amphora instances on the lb-mgmt-net subnet. I don't know how kolla is setup to connect the containers to the neutron lb-mgmt-net network. Maybe the above documents will help with that. Right now it's up to the operator to configure that. 
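For illustration only, a minimal sketch of the kind of manual setup this implies. The names, CIDR, port and flavor sizing are placeholders rather than anything kolla-ansible generates, and the controller hosts additionally need an interface (for example a VLAN) on this network so the worker and health-manager containers can actually reach the amphorae.

    openstack network create lb-mgmt-net
    openstack subnet create --network lb-mgmt-net \
        --subnet-range 172.16.0.0/22 lb-mgmt-subnet
    openstack security group create lb-mgmt-sec-grp
    # allow the controllers to reach the amphora agent API
    openstack security group rule create --protocol tcp \
        --dst-port 9443 lb-mgmt-sec-grp
    openstack flavor create --vcpus 1 --ram 1024 --disk 2 --private amphora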
The kolla documentation doesn't prescribe any particular setup. We're working on automating it in Victoria. Michael On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard > wrote: On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: Hello Guys, With Michaels help I was able to solve the problem but now there is another error I was able to create my network on vlan but still error persist. PFB the logs: http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ Kindly help regards, Monika ________________________________ From: Michael Johnson > Sent: Monday, August 3, 2020 9:10 PM To: Fabian Zimmermann > Cc: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Yeah, it looks like nova is failing to boot the instance. Check this setting in your octavia.conf files: https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id Also, if kolla-ansible didn't set both of these values correctly, please open bug reports for kolla-ansible. These all should have been configured by the deployment tool. I wasn't following this thread due to no [kolla] tag, but here are the recently added docs for Octavia in kolla [1]. Note the octavia_service_auth_project variable which was added to migrate from the admin project to the service project for octavia resources. We're lacking proper automation for the flavor, image etc, but it is being worked on in Victoria [2]. [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html [2] https://review.opendev.org/740180 Michael On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: Seems like the flavor is missing or empty '' - check for typos and enable debug. Check if the nova req contains valid information/flavor. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 15:46: It's registered Get Outlook for Android ________________________________ From: Fabian Zimmermann > Sent: Monday, August 3, 2020 7:08:21 PM To: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Did you check the (nova) flavor you use in octavia. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 10:53: After Michael suggestion I was able to create load balancer but there is error in status. [X] PFB the error link: http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ ________________________________ From: Monika Samal > Sent: Monday, August 3, 2020 2:08 PM To: Michael Johnson > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson > Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. 
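For reference, a hedged sketch of the octavia.conf sections touched in this thread; every ID, name and URL below is a placeholder, and the configuration rendered by kolla-ansible remains the authoritative source.

    [service_auth]
    auth_url = http://<internal-vip>:5000
    auth_type = password
    project_domain_name = Default
    user_domain_name = Default
    # must be the service project, not "admin" -- see the keystone errors above
    project_name = service
    username = octavia
    password = <octavia-password>

    [controller_worker]
    amp_image_tag = amphora
    amp_flavor_id = <amphora-flavor-id>
    amp_boot_network_list = <lb-mgmt-net-id>
    amp_secgroup_list = <lb-mgmt-sec-grp-id>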
Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal > schrieb am Do., 30. Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? > > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: From lijie at unitedstack.com Thu Aug 6 16:59:52 2020 From: lijie at unitedstack.com (=?utf-8?B?UmFtYm8=?=) Date: Fri, 7 Aug 2020 00:59:52 +0800 Subject: [cinder] Could you help me review the reimage feature? Message-ID: Hi,all:         I have a spec which is support volume backed server rebuild[0].This spec was accepted in Stein, but some of the work did not finish, so repropose it for Victoria.  I sincerely wish this spec will approved in Victoria, so I make an exception for this, and the Nova team will approved this if the cinder reimage question is solved this week[1].  This spec is depend on the cinder reimage api [2], and the reimage api has a question. We just need to know if cinder are ok with the change in polling to event like the volume extend. More clearly, Cinder reimage should add a new 'volume-reimage' external event like the volume extend, so that nova can wait for cinder to complete the reimage[3].        The Cinder code is[4], if you have some ideas, you can comments on it.Thank you very much! Ref: [0]:https://blueprints.launchpad.net/nova/+spec/volume-backed-server-rebuild [1]:http://eavesdrop.openstack.org/irclogs/%23openstack-meeting-3/%23openstack-meeting-3.2020-08-06.log.html#t2020-08-06T16:18:22-2 [2]:https://blueprints.launchpad.net/cinder/+spec/add-volume-re-image-api [3]:https://review.opendev.org/#/c/454287/ [4]:https://review.opendev.org/#/c/606346/ Best Regards Rambo -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Thu Aug 6 17:08:29 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 6 Aug 2020 17:08:29 +0000 Subject: [ironic] [infra] bifrost-integration-tinyipa-opensuse-15 broken In-Reply-To: References: Message-ID: <20200806170829.rcotrtmneyeyktbn@yuggoth.org> On 2020-08-06 12:13:36 +0200 (+0200), Dmitry Tantsur wrote: > Our openSUSE CI job has been broken for a few days [1]. 
It fails on the > early bindep stage with [2] > > percent="-1" rate="-1"/> > rate="-1" done="0"/> > File > './suse/x86_64/libJudy1-1.0.5-lp151.2.2.x86_64.rpm' not found on > medium ' > https://mirror.mtl01.inap.opendev.org/opensuse/distribution/leap/15.1/repo/oss/&apos > ; > > I've raised it on #openstack-infra, but I'm not sure if there has been any > follow up. [...] Yes, we discussed it at some length immediately after you mentioned it: http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2020-08-05.log.html#t2020-08-05T15:43:03 In short, the packages are in /opensuse/distribution/leap/15.1/repo/oss/x86_64/ not /opensuse/distribution/leap/15.1/repo/oss/suse/x86_64/ and the INDEX.gz files seem to point to the correct location for them. It's not clear to us why zypper is looking in the latter path; help from someone with more familiarity with openSUSE and zypper would be much appreciated. Our mirrors match the official mirrors in this regard, and our base jobs configure only the first part of the repository path: It's not clear to any of us what's adding the "/suse" to the URLs zypper is requesting. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rosmaita.fossdev at gmail.com Thu Aug 6 21:00:28 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 6 Aug 2020 17:00:28 -0400 Subject: [cinder] Could you help me review the reimage feature? In-Reply-To: References: Message-ID: <36500a46-7fcc-4fa0-fc09-d7235a833c9f@gmail.com> On 8/6/20 12:59 PM, Rambo wrote: > Hi,all: >         I have a spec which is support volume backed server > rebuild[0].This spec was accepted in Stein, but some of the work did not > finish, so repropose it for Victoria.  I sincerely wish this spec will > approved in Victoria, so I make an exception for this, and the Nova team > will approved this *if the cinder reimage question is solved this > week*[1].  This spec is depend on the cinder reimage api [2], and the > reimage api has a question. We just need to know if cinder are ok with > the change in polling to event like the volume extend. More clearly, > Cinder reimage should add a new 'volume-reimage' external event like the > volume extend, so that nova can wait for cinder to complete the reimage[3]. >        The Cinder code is[4], if you have some ideas, you can comments > on it.Thank you very much! The Cinder team is not going to approve this proposal this week, but we encourage you to continue working on it for Wallaby. The spec was approved for Stein and then re-targeted to Train. Until July 30, the last activity on the patch was April 1, 2019, so this has not been on the Cinder team's radar at all this development cycle. Because the spec is outdated, it should be proposed for Wallaby so the current Cinder team can review it and assess how it fits into the current project plans. I've already penciled you in for next week's midcycle so we can discuss this in more depth. But I am against making a snap decision in the next two days. 
cheers, brian > > > Ref: > > [0]:https://blueprints.launchpad.net/nova/+spec/volume-backed-server-rebuild > > [1]:http://eavesdrop.openstack.org/irclogs/%23openstack-meeting-3/%23openstack-meeting-3.2020-08-06.log.html#t2020-08-06T16:18:22-2 > > [2]:https://blueprints.launchpad.net/cinder/+spec/add-volume-re-image-api > [3]:https://review.opendev.org/#/c/454287/ > [4]:https://review.opendev.org/#/c/606346/ > Best Regards > Rambo From cboylan at sapwetik.org Thu Aug 6 21:44:54 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 06 Aug 2020 14:44:54 -0700 Subject: [ironic] [infra] bifrost-integration-tinyipa-opensuse-15 broken In-Reply-To: <20200806170829.rcotrtmneyeyktbn@yuggoth.org> References: <20200806170829.rcotrtmneyeyktbn@yuggoth.org> Message-ID: <4e21ac6a-e287-4d3b-b0e6-c581c631e992@www.fastmail.com> On Thu, Aug 6, 2020, at 10:08 AM, Jeremy Stanley wrote: > On 2020-08-06 12:13:36 +0200 (+0200), Dmitry Tantsur wrote: > > Our openSUSE CI job has been broken for a few days [1]. It fails on the > > early bindep stage with [2] > > > > > percent="-1" rate="-1"/> > > > rate="-1" done="0"/> > > File > > './suse/x86_64/libJudy1-1.0.5-lp151.2.2.x86_64.rpm' not found on > > medium ' > > https://mirror.mtl01.inap.opendev.org/opensuse/distribution/leap/15.1/repo/oss/&apos > > ; > > > > I've raised it on #openstack-infra, but I'm not sure if there has been any > > follow up. > [...] > > Yes, we discussed it at some length immediately after you mentioned > it: > > http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2020-08-05.log.html#t2020-08-05T15:43:03 > > In short, the packages are in > /opensuse/distribution/leap/15.1/repo/oss/x86_64/ not > /opensuse/distribution/leap/15.1/repo/oss/suse/x86_64/ and the > INDEX.gz files seem to point to the correct location for them. It's > not clear to us why zypper is looking in the latter path; help from > someone with more familiarity with openSUSE and zypper would be much > appreciated. Our mirrors match the official mirrors in this regard, > and our base jobs configure only the first part of the repository > path: > > https://opendev.org/zuul/zuul-jobs/src/commit/1ba95015acc977dea8269889235434d052c736e2/roles/configure-mirrors/tasks/mirror/Suse.yaml#L3 > > > It's not clear to any of us what's adding the "/suse" to the URLs > zypper is requesting. https://review.opendev.org/745225 has landed and seems to fix this issue. Our hunch is that the type of the repo changed upstream of us which we then mirrored. Once this happened our repo configs were no longer correct. Zypper man pages and docs say repos should have their type auto-detected anyway so we've dropped the type specification entirely. This fixed things in testing. If anyone understands this better that info would be appreciated, but I expect the ironic jobs to be happier now too. Clark From kennelson11 at gmail.com Fri Aug 7 00:00:13 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Thu, 6 Aug 2020 17:00:13 -0700 Subject: [TC] New Office Hour Plans Message-ID: Hello! After taking a look at the poll results, Mohammed and I have two proposed plans for office hours: Plan A: Two office hours instead of three. This gives us slightly more coverage than one office hour without overextending ourselves to cover three office hours. 
Mohammed and I were thinking that one of the reasons why three office hours wasn't working was that it was kind of a big time commitment and TC members could easily rationalize not going to ones later in the week if they had already attended one earlier in the week. The two times that enable most TC members to attend at least one, if not both, would be Monday @14:00 UTC (TC members available: Belmiro, Rico, Kristi, Jay, Mohammed, myself, and Nate + non members) and Wednesday @ 15:00 UTC (TC members available: Rico, Kristi, Jay, Mohammed, myself, Ghanshyam, and Nate + non members). Plan B: Move to a single office hour on Wednesday @ 15:00 UTC (TC members available: Rico, Kristi, Jay, Mohammed, myself, Ghanshyam, and Nate + non members). Having only one office hour gives it more weight and importance and that should hopefully encourage more attendance from both community members and TC members alike. I guess Plan C is to go ahead with Plan A and then if we don't see activity during the Monday time slot, to reduce down to one office hour and go with Plan B. Please check out the patches Mohammed posted [1][2] and vote on what you'd prefer! -Kendall (diablo_rojo) [1] Dual Office Hour: https://review.opendev.org/#/c/745201/ [2] Single Office Hour: https://review.opendev.org/#/c/745200/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From monika.samal at outlook.com Thu Aug 6 23:11:52 2020 From: monika.samal at outlook.com (Monika Samal) Date: Thu, 6 Aug 2020 23:11:52 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , Message-ID: I tried following above document still facing same Octavia connection error with amphora image. Regards, Monika ________________________________ From: Mark Goddard Sent: Thursday, August 6, 2020 1:16:01 PM To: Michael Johnson Cc: Monika Samal ; Fabian Zimmermann ; openstack-discuss Subject: Re: [openstack-community] Octavia :; Unable to create load balancer On Wed, 5 Aug 2020 at 16:16, Michael Johnson > wrote: Looking at that error, it appears that the lb-mgmt-net is not setup correctly. The Octavia controller containers are not able to reach the amphora instances on the lb-mgmt-net subnet. I don't know how kolla is setup to connect the containers to the neutron lb-mgmt-net network. Maybe the above documents will help with that. Right now it's up to the operator to configure that. The kolla documentation doesn't prescribe any particular setup. We're working on automating it in Victoria. Michael On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard > wrote: On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: Hello Guys, With Michaels help I was able to solve the problem but now there is another error I was able to create my network on vlan but still error persist. PFB the logs: http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ Kindly help regards, Monika ________________________________ From: Michael Johnson > Sent: Monday, August 3, 2020 9:10 PM To: Fabian Zimmermann > Cc: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Yeah, it looks like nova is failing to boot the instance. Check this setting in your octavia.conf files: https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id Also, if kolla-ansible didn't set both of these values correctly, please open bug reports for kolla-ansible. These all should have been configured by the deployment tool. 
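Since the connection error persists, a rough way to check whether the lb-mgmt-net wiring is the problem, assuming host networking for the Octavia containers and using placeholder addresses:

    # Find the amphora's lb-mgmt-net address, then test from a controller.
    openstack server list --all-projects
    ping -c 3 <amphora-lb-mgmt-ip>
    # The amphora agent listens on TCP 9443; a TLS/certificate error here
    # still proves reachability, while a timeout points at the network wiring.
    curl -vk https://<amphora-lb-mgmt-ip>:9443/
    # Heartbeats travel the other way, from the amphora to UDP 5555 on the
    # health manager; [health_manager] bind_ip / controller_ip_port_list in
    # octavia.conf must be reachable from the amphorae.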
I wasn't following this thread due to no [kolla] tag, but here are the recently added docs for Octavia in kolla [1]. Note the octavia_service_auth_project variable which was added to migrate from the admin project to the service project for octavia resources. We're lacking proper automation for the flavor, image etc, but it is being worked on in Victoria [2]. [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html [2] https://review.opendev.org/740180 Michael On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: Seems like the flavor is missing or empty '' - check for typos and enable debug. Check if the nova req contains valid information/flavor. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 15:46: It's registered Get Outlook for Android ________________________________ From: Fabian Zimmermann > Sent: Monday, August 3, 2020 7:08:21 PM To: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Did you check the (nova) flavor you use in octavia. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 10:53: After Michael suggestion I was able to create load balancer but there is error in status. [X] PFB the error link: http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ ________________________________ From: Monika Samal > Sent: Monday, August 3, 2020 2:08 PM To: Michael Johnson > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson > Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal > schrieb am Do., 30. Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? 
> > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at gmail.com Fri Aug 7 00:29:11 2020 From: berndbausch at gmail.com (Bernd Bausch) Date: Fri, 7 Aug 2020 09:29:11 +0900 Subject: [openstack-community] [infra] Problem with ask.openstack.org Message-ID: <6d128727-27b5-ff0d-6798-fbcf72998012@gmail.com> While ask.openstack.org is not necessarily loved by many, it continues to be used, and there are still people who answer questions. Recently, one of its features ceased working. I am talking about the "responses" page that lists all responses to questions that I have answered or commented on. This makes it very hard to follow up on such questions; I don't have a tool to see if somebody anwered my question or is the person who asked a question has provided updates. Is there anybody who can fix this? I know that some people would like to do away with ask.openstack.org entirely, since the software is bug-ridden and nobody manages the site. My personal opinion is that the current situation is worse than no "ask" site at all, since people might ask questions, get partial answers and no follow-up. This can create a negative view of the OpenStack community. In short, either fix it or remove it. Unfortunately I don't have the means to do either. Bernd. From laurentfdumont at gmail.com Fri Aug 7 00:36:26 2020 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Thu, 6 Aug 2020 20:36:26 -0400 Subject: [openstack-community] [infra] Problem with ask.openstack.org In-Reply-To: <6d128727-27b5-ff0d-6798-fbcf72998012@gmail.com> References: <6d128727-27b5-ff0d-6798-fbcf72998012@gmail.com> Message-ID: It's definitely a tough sell. I'm not sure if it's worse to not have a community driven "a-la-Stackoverflow" style or one that is not super in shape. I would rather see go into archive mode only if it's too much of an Operational burden to keep running :( On Thu, Aug 6, 2020 at 8:33 PM Bernd Bausch wrote: > While ask.openstack.org is not necessarily loved by many, it continues > to be used, and there are still people who answer questions. > > Recently, one of its features ceased working. I am talking about the > "responses" page that lists all responses to questions that I have > answered or commented on. This makes it very hard to follow up on such > questions; I don't have a tool to see if somebody anwered my question or > is the person who asked a question has provided updates. > > Is there anybody who can fix this? > > I know that some people would like to do away with ask.openstack.org > entirely, since the software is bug-ridden and nobody manages the site. > My personal opinion is that the current situation is worse than no "ask" > site at all, since people might ask questions, get partial answers and > no follow-up. This can create a negative view of the OpenStack community. > > In short, either fix it or remove it. Unfortunately I don't have the > means to do either. > > Bernd. > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cboylan at sapwetik.org Fri Aug 7 00:41:36 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 06 Aug 2020 17:41:36 -0700 Subject: [openstack-community] [infra] Problem with ask.openstack.org In-Reply-To: <6d128727-27b5-ff0d-6798-fbcf72998012@gmail.com> References: <6d128727-27b5-ff0d-6798-fbcf72998012@gmail.com> Message-ID: <9b6967b9-5cb7-47f2-bacd-87d3304a3428@www.fastmail.com> On Thu, Aug 6, 2020, at 5:29 PM, Bernd Bausch wrote: > While ask.openstack.org is not necessarily loved by many, it continues > to be used, and there are still people who answer questions. > > Recently, one of its features ceased working. I am talking about the > "responses" page that lists all responses to questions that I have > answered or commented on. This makes it very hard to follow up on such > questions; I don't have a tool to see if somebody anwered my question or > is the person who asked a question has provided updates. > > Is there anybody who can fix this? > > I know that some people would like to do away with ask.openstack.org > entirely, since the software is bug-ridden and nobody manages the site. > My personal opinion is that the current situation is worse than no "ask" > site at all, since people might ask questions, get partial answers and > no follow-up. This can create a negative view of the OpenStack community. > > In short, either fix it or remove it. Unfortunately I don't have the > means to do either. I'm not able to debug the issue at this moment, but did want to point out that all of our config management is collaboratively managed in Git repos code reviewed in Gerrit. This means that if you know what the problem is you absolutely can fix it. Or if you'd prefer to turn off the service you can write a change for that as well. The biggest gap is in identifying the issue without access to server logs. Depending on the issue figuring out what is going on may require access. Relevant bits of code: https://opendev.org/opendev/system-config/src/branch/master/manifests/site.pp#L525-L538 https://opendev.org/opendev/system-config/src/branch/master/modules/openstack_project/manifests/ask.pp https://opendev.org/opendev/puppet-askbot Finally, we also expose server and service statistics via cacti and graphite. These can be useful for checking service health: http://cacti.openstack.org/cacti/graph_view.php https://grafana.opendev.org/?orgId=1 Clark From yasufum.o at gmail.com Fri Aug 7 08:37:16 2020 From: yasufum.o at gmail.com (Yasufumi Ogawa) Date: Fri, 7 Aug 2020 17:37:16 +0900 Subject: [tacker] PTL on vacation Message-ID: <9f29f573-69f5-d6d2-20d9-5c5ef7d775f4@gmail.com> I will be on vacation from 10th to 17th Aug. I would like to skip the next IRC meeting because many of tacker members are also on vacation next week. Thanks, Yasufumi From berndbausch at gmail.com Fri Aug 7 09:09:31 2020 From: berndbausch at gmail.com (Bernd Bausch) Date: Fri, 7 Aug 2020 18:09:31 +0900 Subject: [openstack-community] [infra] Problem with ask.openstack.org In-Reply-To: References: <6d128727-27b5-ff0d-6798-fbcf72998012@gmail.com> Message-ID: <21ed671d-63d2-53e1-1ce0-31b977515be6@gmail.com> Sending people to Stackoverflow directly is a good option IMO. This suggestion was made before. Of course, I would lose my 7700 karma points, but I can stomach it :) On 8/7/2020 9:36 AM, Laurent Dumont wrote: > It's definitely a tough sell. I'm not sure if it's worse to not have a > community driven "a-la-Stackoverflow" style or one that is not super > in shape. 
> > I would rather see go into archive mode only if it's too much of an > Operational burden to keep running :( > > On Thu, Aug 6, 2020 at 8:33 PM Bernd Bausch > wrote: > > While ask.openstack.org is not > necessarily loved by many, it continues > to be used, and there are still people who answer questions. > > Recently, one of its features ceased working. I am talking about the > "responses" page that lists all responses to questions that I have > answered or commented on. This makes it very hard to follow up on > such > questions; I don't have a tool to see if somebody anwered my > question or > is the person who asked a question has provided updates. > > Is there anybody who can fix this? > > I know that some people would like to do away with > ask.openstack.org > entirely, since the software is bug-ridden and nobody manages the > site. > My personal opinion is that the current situation is worse than no > "ask" > site at all, since people might ask questions, get partial answers > and > no follow-up. This can create a negative view of the OpenStack > community. > > In short, either fix it or remove it. Unfortunately I don't have the > means to do either. > > Bernd. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Fri Aug 7 09:18:48 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Fri, 7 Aug 2020 11:18:48 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: Message-ID: Thanks a lot for starting this discussion, I am also quite concerned about this. At StackHPC we started looking into CloudKitty a year ago, when the community was still fairly active. There was an IRC meeting every month or so throughout 2019. Patches were getting merged. Unfortunately in 2020 activity stopped abruptly. There hasn't been any IRC meeting since early December and no patch has been merged since the end of March. I have submitted straightforward stable backports of bug fixes which have not received any answer. I am well aware of the difficulty of keeping up with open-source project maintenance when work deadlines are always taking priority. If the existing core team would be willing to grant +2 votes to more people, I would be happy to participate in the maintenance of the project. We've now deployed CloudKitty for several of our customers and have to maintain a stable fork anyway. We would rather maintain upstream directly! Pierre Riteau (priteau) On Tue, 4 Aug 2020 at 23:22, Kendall Nelson wrote: > > I think the majority of 'maintenance' activities at the moment for Cloudkitty are the reviewing of open patches in gerrit [1] and triaging bugs that are reported in Launchpad[2] as they come in. When things come up on this mailing list that have the cloudkitty tag in the subject line (like this email), weighing in on them would also be helpful. > > If you need help getting setup with gerrit, I am happy to assist anyway I can :) > > -Kendall Nelson (diablo_rojo) > > [1] https://review.opendev.org/#/q/project:openstack/cloudkitty+OR+project:openstack/python-cloudkittyclient+OR+project:openstack/cloudkitty-dashboard > [2] https://launchpad.net/cloudkitty > > > On Tue, Aug 4, 2020 at 6:21 AM Rafael Weingärtner wrote: >> >> I am not sure how the projects/communities here in OpenStack are maintained and conducted, but I could for sure help. >> I am a committer and PMC for some Apache projects; therefore, I am a bit familiar with some processes in OpenSource communities. 
>> >> On Tue, Aug 4, 2020 at 5:11 AM Mark Goddard wrote: >>> >>> On Thu, 30 Jul 2020 at 14:43, Rafael Weingärtner >>> wrote: >>> > >>> > We are working on it. So far we have 3 open proposals there, but we do not have enough karma to move things along. >>> > Besides these 3 open proposals, we do have more ongoing extensions that have not yet been proposed to the community. >>> >>> It's good to hear you want to help improve cloudkitty, however it >>> sounds like what is required is help with maintaining the project. Is >>> that something you could be involved with? >>> Mark >>> >>> > >>> > On Thu, Jul 30, 2020 at 10:22 AM Sean McGinnis wrote: >>> >> >>> >> Posting here to raise awareness, and start discussion about next steps. >>> >> >>> >> It appears there is no one working on Cloudkitty anymore. No patches >>> >> have been merged for several months now, including simple bot proposed >>> >> patches. It would appear no one is maintaining this project anymore. >>> >> >>> >> I know there is a need out there for this type of functionality, so >>> >> maybe this will raise awareness and get some attention to it. But >>> >> barring that, I am wondering if we should start the process to retire >>> >> this project. >>> >> >>> >> From a Victoria release perspective, it is milestone-2 week, so we >>> >> should make a decision if any of the Cloudkitty deliverables should be >>> >> included in this release or not. We can certainly force releases of >>> >> whatever is the latest, but I think that is a bit risky since these >>> >> repos have never merged the job template change for victoria and >>> >> therefore are not even testing with Python 3.8. That is an official >>> >> runtime for Victoria, so we run the risk of having issues with the code >>> >> if someone runs under 3.8 but we have not tested to make sure there are >>> >> no problems doing so. >>> >> >>> >> I am hoping this at least starts the discussion. I will not propose any >>> >> release patches to remove anything until we have had a chance to discuss >>> >> the situation. >>> >> >>> >> Sean >>> >> >>> >> >>> > >>> > >>> > -- >>> > Rafael Weingärtner >> >> >> >> -- >> Rafael Weingärtner From skaplons at redhat.com Fri Aug 7 10:19:38 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 7 Aug 2020 12:19:38 +0200 Subject: [neutron][OVS firewall] Multicast non-IGMP traffic is allowed by default, not in iptables FW (LP#1889631) In-Reply-To: References: Message-ID: <38E3A820-FD9D-4A2B-B989-4735092D304F@redhat.com> Hi, > On 4 Aug 2020, at 19:05, Rodolfo Alonso Hernandez wrote: > > Hello all: > > First of all, the link: https://bugs.launchpad.net/neutron/+bug/1889631 > > To sum up the bug: in iptables FW, the non-IGMP multicast traffic from 224.0.0.x was blocked; this is not happening in OVS FW. > > That was discussed today in the Neutron meeting today [1]. We face two possible situations here: > - If we block this traffic now, some deployments using the OVS FW will experience an unexpected network blockage. I would be for this option but left stable branches not touched. Additionally we should of course add release note with info that this behaviour changed now and also we can add upgrade check which will write warning about that if any of the agents in the DB is using “openvswitch” firewall driver. I don’t think we can do anything more to warn users about such change. > - Deployments migrating from iptables to OVS FW, now won't be able to explicitly allow this traffic (or block it by default). 
This also breaks the current API, because some rules won't have any effect (those ones allowing this traffic). This is current issue, right? If we would fix it as You proposed above, then behaviour between both drivers would be the same. Am I understanding correct? > > A possible solution is to add a new knob in the FW configuration; this config option will allow to block or not this traffic by default. Remember that the FW can only create permissive rules, not blocking ones. I don’t like to add yet another config knob for that. And also as I think Akihiro mentioned it’s not good practice to change API behaviour depending on config options. This wouldn’t be discoverable in API. > > Any feedback is welcome! > > Regards. > > [1]http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-08-04-14.00.log.html#l-136 > > — Slawek Kaplonski Principal software engineer Red Hat From ralonsoh at redhat.com Fri Aug 7 10:30:53 2020 From: ralonsoh at redhat.com (Rodolfo Alonso Hernandez) Date: Fri, 7 Aug 2020 11:30:53 +0100 Subject: [neutron][OVS firewall] Multicast non-IGMP traffic is allowed by default, not in iptables FW (LP#1889631) In-Reply-To: <38E3A820-FD9D-4A2B-B989-4735092D304F@redhat.com> References: <38E3A820-FD9D-4A2B-B989-4735092D304F@redhat.com> Message-ID: Hi Slawek: I agree with Akihiro and you: - This should be fixed to match both FW behaviour, but only in master. - Of course, a "big" release note to make this public. - Not to add a knob that changes the API behaviour. I'll wait for more feedback. Although I'll be on PTO, I'll check and reply to the mail. Thank you and regards. On Fri, Aug 7, 2020 at 11:19 AM Slawek Kaplonski wrote: > Hi, > > > On 4 Aug 2020, at 19:05, Rodolfo Alonso Hernandez > wrote: > > > > Hello all: > > > > First of all, the link: https://bugs.launchpad.net/neutron/+bug/1889631 > > > > To sum up the bug: in iptables FW, the non-IGMP multicast traffic from > 224.0.0.x was blocked; this is not happening in OVS FW. > > > > That was discussed today in the Neutron meeting today [1]. We face two > possible situations here: > > - If we block this traffic now, some deployments using the OVS FW will > experience an unexpected network blockage. > > I would be for this option but left stable branches not touched. > Additionally we should of course add release note with info that this > behaviour changed now and also we can add upgrade check which will write > warning about that if any of the agents in the DB is using “openvswitch” > firewall driver. > I don’t think we can do anything more to warn users about such change. > > > - Deployments migrating from iptables to OVS FW, now won't be able to > explicitly allow this traffic (or block it by default). This also breaks > the current API, because some rules won't have any effect (those ones > allowing this traffic). > > This is current issue, right? If we would fix it as You proposed above, > then behaviour between both drivers would be the same. Am I understanding > correct? > > > > > A possible solution is to add a new knob in the FW configuration; this > config option will allow to block or not this traffic by default. Remember > that the FW can only create permissive rules, not blocking ones. > > I don’t like to add yet another config knob for that. And also as I think > Akihiro mentioned it’s not good practice to change API behaviour depending > on config options. This wouldn’t be discoverable in API. > > > > > Any feedback is welcome! > > > > Regards. 
> > > > [1] > http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-08-04-14.00.log.html#l-136 > > > > > > — > Slawek Kaplonski > Principal software engineer > Red Hat > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Aug 7 12:45:15 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 7 Aug 2020 12:45:15 +0000 Subject: [openstack-community] [infra] Problem with ask.openstack.org In-Reply-To: <21ed671d-63d2-53e1-1ce0-31b977515be6@gmail.com> References: <6d128727-27b5-ff0d-6798-fbcf72998012@gmail.com> <21ed671d-63d2-53e1-1ce0-31b977515be6@gmail.com> Message-ID: <20200807124515.7k5xhijkj6mi4lec@yuggoth.org> On 2020-08-07 18:09:31 +0900 (+0900), Bernd Bausch wrote: > Sending people to Stackoverflow directly is a good option IMO. [...] Yes, ask.openstack.org was originally created for two reasons: 1. We could not keep up with the constant spam load on forums.openstack.org, but when we wanted to shut it down we kept hearing that many OpenStack users needed us to provide a Web forum because they wouldn't/couldn't use E-mail. 2. When we approached Stackexchange/Stackoverflow about getting a site like Ubuntu had, they said OpenStack was not popular enough software to warrant that. OSF originally contracted the author of Askbot to assist in maintaining the ask.openstack.org site, but his interests eventually moved on to other endeavors and the site has sat unmaintained (except for an occasional reboot by community infrastructure team sysadmins) for a number of years now. At this point it's a liability, and unless folks are interested in getting it back into a well-managed state I think we probably have no choice but to phase it out. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Fri Aug 7 13:33:18 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 07 Aug 2020 08:33:18 -0500 Subject: [TC] New Office Hour Plans In-Reply-To: References: Message-ID: <173c920361f.e63c2eb4109191.7004381406663804589@ghanshyammann.com> ---- On Thu, 06 Aug 2020 19:00:13 -0500 Kendall Nelson wrote ---- > Hello! > After taking a look at the poll results, Mohammed and I have two proposed plans for office hours: > Plan A: Two office hours instead of three. This gives us slightly more coverage than one office hour without overextending ourselves to cover three office hours. Mohammed and I were thinking that one of the reasons why three office hours wasn't working was that it was kind of a big time commitment and TC members could easily rationalize not going to ones later in the week if they had already attended one earlier in the week. The two times that enable most TC members to attend at least one, if not both, would be Monday @14:00 UTC (TC members available: Belmiro, Rico, Kristi, Jay, Mohammed, myself, and Nate + non members) and Wednesday @ 15:00 UTC (TC members available: Rico, Kristi, Jay, Mohammed, myself, Ghanshyam, and Nate + non members). > Plan B: Move to a single office hour on Wednesday @ 15:00 UTC (TC members available: Rico, Kristi, Jay, Mohammed, myself, Ghanshyam, and Nate + non members). Having only one office hour gives it more weight and importance and that should hopefully encourage more attendance from both community members and TC members alike. Thanks Kendal for following up on office hours plan. 
The idea of having multiple office hours was to cover TC availability across different TZ. Even Asia TZ office hours might not have many TC members available but still, someone to address/ack the issues and bring it to TC when most of the members are available. There was no expectation for all TC members to be present in all three office hours so all office hours being inactive might be due to some other reason not due to *many office hours*. Thursday office hour was most TC available one which is not the case anymore. I MO, we should consider the 'covering most of TZ (as much we can) for TC-availability' so in first option we can move either of the office hour in different TZ. I still in favor of moving to weekly TC meeting (in alternate TZ or so) than office hours but I am ok to give office hours a another try with new time. -gmann > I guess Plan C is to go ahead with Plan A and then if we don't see activity during the Monday time slot, to reduce down to one office hour and go with Plan B. > Please check out the patches Mohammed posted [1][2] and vote on what you'd prefer! > -Kendall (diablo_rojo) > [1] Dual Office Hour: https://review.opendev.org/#/c/745201/[2] Single Office Hour: https://review.opendev.org/#/c/745200/ From gmann at ghanshyammann.com Fri Aug 7 14:10:53 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 07 Aug 2020 09:10:53 -0500 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: Message-ID: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Thanks, Pierre for helping with this. ttx has reached out to PTL (Justin Ferrieu (jferrieu) ) but I am not sure if he got any response back. Can you also send email to PTL as well as the current core team to add you in the core list for project maintenance? Please note that, migration of CI/CD to ubuntu work might break the cloudkitty gate if patches are not merged on time. I am still working on few repos though. -gmann ---- On Fri, 07 Aug 2020 04:18:48 -0500 Pierre Riteau wrote ---- > Thanks a lot for starting this discussion, I am also quite concerned about this. > > At StackHPC we started looking into CloudKitty a year ago, when the > community was still fairly active. There was an IRC meeting every > month or so throughout 2019. Patches were getting merged. > > Unfortunately in 2020 activity stopped abruptly. There hasn't been any > IRC meeting since early December and no patch has been merged since > the end of March. I have submitted straightforward stable backports of > bug fixes which have not received any answer. > > I am well aware of the difficulty of keeping up with open-source > project maintenance when work deadlines are always taking priority. If > the existing core team would be willing to grant +2 votes to more > people, I would be happy to participate in the maintenance of the > project. We've now deployed CloudKitty for several of our customers > and have to maintain a stable fork anyway. We would rather maintain > upstream directly! > > Pierre Riteau (priteau) > > > On Tue, 4 Aug 2020 at 23:22, Kendall Nelson wrote: > > > > I think the majority of 'maintenance' activities at the moment for Cloudkitty are the reviewing of open patches in gerrit [1] and triaging bugs that are reported in Launchpad[2] as they come in. When things come up on this mailing list that have the cloudkitty tag in the subject line (like this email), weighing in on them would also be helpful. 
> > > > If you need help getting setup with gerrit, I am happy to assist anyway I can :) > > > > -Kendall Nelson (diablo_rojo) > > > > [1] https://review.opendev.org/#/q/project:openstack/cloudkitty+OR+project:openstack/python-cloudkittyclient+OR+project:openstack/cloudkitty-dashboard > > [2] https://launchpad.net/cloudkitty > > > > > > On Tue, Aug 4, 2020 at 6:21 AM Rafael Weingärtner wrote: > >> > >> I am not sure how the projects/communities here in OpenStack are maintained and conducted, but I could for sure help. > >> I am a committer and PMC for some Apache projects; therefore, I am a bit familiar with some processes in OpenSource communities. > >> > >> On Tue, Aug 4, 2020 at 5:11 AM Mark Goddard wrote: > >>> > >>> On Thu, 30 Jul 2020 at 14:43, Rafael Weingärtner > >>> wrote: > >>> > > >>> > We are working on it. So far we have 3 open proposals there, but we do not have enough karma to move things along. > >>> > Besides these 3 open proposals, we do have more ongoing extensions that have not yet been proposed to the community. > >>> > >>> It's good to hear you want to help improve cloudkitty, however it > >>> sounds like what is required is help with maintaining the project. Is > >>> that something you could be involved with? > >>> Mark > >>> > >>> > > >>> > On Thu, Jul 30, 2020 at 10:22 AM Sean McGinnis wrote: > >>> >> > >>> >> Posting here to raise awareness, and start discussion about next steps. > >>> >> > >>> >> It appears there is no one working on Cloudkitty anymore. No patches > >>> >> have been merged for several months now, including simple bot proposed > >>> >> patches. It would appear no one is maintaining this project anymore. > >>> >> > >>> >> I know there is a need out there for this type of functionality, so > >>> >> maybe this will raise awareness and get some attention to it. But > >>> >> barring that, I am wondering if we should start the process to retire > >>> >> this project. > >>> >> > >>> >> From a Victoria release perspective, it is milestone-2 week, so we > >>> >> should make a decision if any of the Cloudkitty deliverables should be > >>> >> included in this release or not. We can certainly force releases of > >>> >> whatever is the latest, but I think that is a bit risky since these > >>> >> repos have never merged the job template change for victoria and > >>> >> therefore are not even testing with Python 3.8. That is an official > >>> >> runtime for Victoria, so we run the risk of having issues with the code > >>> >> if someone runs under 3.8 but we have not tested to make sure there are > >>> >> no problems doing so. > >>> >> > >>> >> I am hoping this at least starts the discussion. I will not propose any > >>> >> release patches to remove anything until we have had a chance to discuss > >>> >> the situation. > >>> >> > >>> >> Sean > >>> >> > >>> >> > >>> > > >>> > > >>> > -- > >>> > Rafael Weingärtner > >> > >> > >> > >> -- > >> Rafael Weingärtner > > From mark at stackhpc.com Fri Aug 7 14:11:12 2020 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 7 Aug 2020 15:11:12 +0100 Subject: [kolla] Kolla klub break Message-ID: Hi, We agreed in Wednesday's IRC meeting to take a short summer break from the klub. Let's meet again on 10th September. Thanks to everyone who has taken part in these meetings so far, we've had some really great discussions. As always, if anyone has ideas for topics, please add them to the Google doc. Looking forward to some more great sessions in September. 
https://docs.google.com/document/d/1EwQs2GXF-EvJZamEx9vQAOSDB5tCjsDCJyHQN5_4_Sw/edit# Thanks, Mark From balazs.gibizer at est.tech Fri Aug 7 15:26:53 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?q?Bal=E1zs?= Gibizer) Date: Fri, 07 Aug 2020 17:26:53 +0200 Subject: [nova] Nova PTL is on PTO until 24th of Aug Message-ID: Hi, I will be on vacation during the next two weeks. Cheers, gibi From pierre at stackhpc.com Fri Aug 7 16:10:45 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Fri, 7 Aug 2020 18:10:45 +0200 Subject: Helping out with CloudKitty maintenance Message-ID: Hello, Following the discussion about the state of CloudKitty [1], I would like to volunteer my help with maintaining the project, as no one of the core team appears to be active at the moment. I have been working with CloudKitty for about a year and have used both the Gnocchi and Monasca collectors. Being a core reviewer on two other OpenStack projects, I am familiar with the process of maintaining OpenStack code. Would it be possible to get core reviewer privileges to help? I would initially focus on keeping CI green and making sure bug fixes are merged and backported. Thanks in advance, Pierre Riteau (priteau) [1] http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016384.html From rafaelweingartner at gmail.com Fri Aug 7 16:21:53 2020 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Fri, 7 Aug 2020 13:21:53 -0300 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: Message-ID: I see. Thanks for the heads up. I will try to dedicate some time every week for these tasks. On Tue, Aug 4, 2020 at 6:22 PM Kendall Nelson wrote: > I think the majority of 'maintenance' activities at the moment for > Cloudkitty are the reviewing of open patches in gerrit [1] and triaging > bugs that are reported in Launchpad[2] as they come in. When things come up > on this mailing list that have the cloudkitty tag in the subject line (like > this email), weighing in on them would also be helpful. > > If you need help getting setup with gerrit, I am happy to assist anyway I > can :) > > -Kendall Nelson (diablo_rojo) > > [1] > https://review.opendev.org/#/q/project:openstack/cloudkitty+OR+project:openstack/python-cloudkittyclient+OR+project:openstack/cloudkitty-dashboard > [2] https://launchpad.net/cloudkitty > > > On Tue, Aug 4, 2020 at 6:21 AM Rafael Weingärtner < > rafaelweingartner at gmail.com> wrote: > >> I am not sure how the projects/communities here in OpenStack are >> maintained and conducted, but I could for sure help. >> I am a committer and PMC for some Apache projects; therefore, I am a bit >> familiar with some processes in OpenSource communities. >> >> On Tue, Aug 4, 2020 at 5:11 AM Mark Goddard wrote: >> >>> On Thu, 30 Jul 2020 at 14:43, Rafael Weingärtner >>> wrote: >>> > >>> > We are working on it. So far we have 3 open proposals there, but we do >>> not have enough karma to move things along. >>> > Besides these 3 open proposals, we do have more ongoing extensions >>> that have not yet been proposed to the community. >>> >>> It's good to hear you want to help improve cloudkitty, however it >>> sounds like what is required is help with maintaining the project. Is >>> that something you could be involved with? >>> Mark >>> >>> > >>> > On Thu, Jul 30, 2020 at 10:22 AM Sean McGinnis >>> wrote: >>> >> >>> >> Posting here to raise awareness, and start discussion about next >>> steps. >>> >> >>> >> It appears there is no one working on Cloudkitty anymore. 
No patches >>> >> have been merged for several months now, including simple bot proposed >>> >> patches. It would appear no one is maintaining this project anymore. >>> >> >>> >> I know there is a need out there for this type of functionality, so >>> >> maybe this will raise awareness and get some attention to it. But >>> >> barring that, I am wondering if we should start the process to retire >>> >> this project. >>> >> >>> >> From a Victoria release perspective, it is milestone-2 week, so we >>> >> should make a decision if any of the Cloudkitty deliverables should be >>> >> included in this release or not. We can certainly force releases of >>> >> whatever is the latest, but I think that is a bit risky since these >>> >> repos have never merged the job template change for victoria and >>> >> therefore are not even testing with Python 3.8. That is an official >>> >> runtime for Victoria, so we run the risk of having issues with the >>> code >>> >> if someone runs under 3.8 but we have not tested to make sure there >>> are >>> >> no problems doing so. >>> >> >>> >> I am hoping this at least starts the discussion. I will not propose >>> any >>> >> release patches to remove anything until we have had a chance to >>> discuss >>> >> the situation. >>> >> >>> >> Sean >>> >> >>> >> >>> > >>> > >>> > -- >>> > Rafael Weingärtner >>> >> >> >> -- >> Rafael Weingärtner >> > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.ramirez at opencloud.es Fri Aug 7 16:30:30 2020 From: luis.ramirez at opencloud.es (Luis Ramirez) Date: Fri, 7 Aug 2020 18:30:30 +0200 Subject: Helping out with CloudKitty maintenance In-Reply-To: References: Message-ID: Hi, +1. We need to move fwd to keep it active. I’m also working on a charm for CloudKitty. Br Luis Rmz El El vie, 7 ago 2020 a las 18:16, Pierre Riteau escribió: > Hello, > > Following the discussion about the state of CloudKitty [1], I would > like to volunteer my help with maintaining the project, as no one of > the core team appears to be active at the moment. I have been working > with CloudKitty for about a year and have used both the Gnocchi and > Monasca collectors. Being a core reviewer on two other OpenStack > projects, I am familiar with the process of maintaining OpenStack > code. > > Would it be possible to get core reviewer privileges to help? I would > initially focus on keeping CI green and making sure bug fixes are > merged and backported. > > Thanks in advance, > Pierre Riteau (priteau) > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016384.html > > -- Br, Luis Rmz Blockchain, DevOps & Open Source Cloud Solutions Architect ---------------------------------------- Founder & CEO OpenCloud.es luis.ramirez at opencloud.es Skype ID: d.overload Hangouts: luis.ramirez at opencloud.es +34 911 950 123 / +39 392 1289553 / +49 152 26917722 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Fri Aug 7 17:12:25 2020 From: jungleboyj at gmail.com (Jay Bryant) Date: Fri, 7 Aug 2020 12:12:25 -0500 Subject: [TC] New Office Hour Plans In-Reply-To: <173c920361f.e63c2eb4109191.7004381406663804589@ghanshyammann.com> References: <173c920361f.e63c2eb4109191.7004381406663804589@ghanshyammann.com> Message-ID: <976bf811-536b-faff-cb30-dbab1ac6d83a@gmail.com> On 8/7/2020 8:33 AM, Ghanshyam Mann wrote: > ---- On Thu, 06 Aug 2020 19:00:13 -0500 Kendall Nelson wrote ---- > > Hello! 
> > After taking a look at the poll results, Mohammed and I have two proposed plans for office hours: > > Plan A: Two office hours instead of three. This gives us slightly more coverage than one office hour without overextending ourselves to cover three office hours. Mohammed and I were thinking that one of the reasons why three office hours wasn't working was that it was kind of a big time commitment and TC members could easily rationalize not going to ones later in the week if they had already attended one earlier in the week. The two times that enable most TC members to attend at least one, if not both, would be Monday @14:00 UTC (TC members available: Belmiro, Rico, Kristi, Jay, Mohammed, myself, and Nate + non members) and Wednesday @ 15:00 UTC (TC members available: Rico, Kristi, Jay, Mohammed, myself, Ghanshyam, and Nate + non members). > > Plan B: Move to a single office hour on Wednesday @ 15:00 UTC (TC members available: Rico, Kristi, Jay, Mohammed, myself, Ghanshyam, and Nate + non members). Having only one office hour gives it more weight and importance and that should hopefully encourage more attendance from both community members and TC members alike. > > Thanks Kendal for following up on office hours plan. > > The idea of having multiple office hours was to cover TC availability across different TZ. Even Asia TZ office > hours might not have many TC members available but still, someone to address/ack the issues and bring it to > TC when most of the members are available. There was no expectation for all TC members to be present in all > three office hours so all office hours being inactive might be due to some other reason not due to *many office hours*. > > Thursday office hour was most TC available one which is not the case anymore. > > I MO, we should consider the 'covering most of TZ (as much we can) for TC-availability' so in first option we can move > either of the office hour in different TZ. I think that Gmann makes a good point here.  If we are going to have multiple office hours one should be in an AP timezone.  Was there a second time where the most people were in the AP timeframe were available? So, reduce to two office hours and try to cover both sides of the world? Jay > > I still in favor of moving to weekly TC meeting (in alternate TZ or so) than office hours but I am ok to give office hours a > another try with new time. > > -gmann > > > I guess Plan C is to go ahead with Plan A and then if we don't see activity during the Monday time slot, to reduce down to one office hour and go with Plan B. > > Please check out the patches Mohammed posted [1][2] and vote on what you'd prefer! > > -Kendall (diablo_rojo) > > [1] Dual Office Hour: https://review.opendev.org/#/c/745201/[2] Single Office Hour: https://review.opendev.org/#/c/745200/ > From sean.mcginnis at gmx.com Fri Aug 7 19:56:38 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 7 Aug 2020 14:56:38 -0500 Subject: [all] Proposed Wallaby cycle schedule Message-ID: <2e56de68-c416-e3ea-f3da-caaf9399287d@gmx.com> Hey everyone, The Victoria cycle is going by fast, and it's already time to start planning some of the early things for the Wallaby release. One of the first steps for that is actually deciding on the release schedule. Typically we have done this based on when the next Summit event was planned to take place. Due to several reasons, we don't have a date yet for the first 2021 event. 
The current thinking is it will likely take place in May (nothing is set, just an educated guess, so please don't use that for any other planning). So for the sake of figuring out the release schedule, we are targeting a release date in early May. Hopefully this will then align well with event plans. I have a proposed release schedule up for review here: https://review.opendev.org/#/c/744729/ For ease of viewing (until the job logs are garbage collected), you can see the rendered schedule here: https://0e6b8aeca433e85b429b-46fd243db6dc394538bd0555f339eba5.ssl.cf1.rackcdn.com/744729/3/check/openstack-tox-docs/4f76901/docs/wallaby/schedule.html There are always outside conflicts, but I think this has aligned mostly well with major holidays. But please feel free to comment on the patch if you see any major issues that we may have not considered. Thanks! Sean From mnaser at vexxhost.com Fri Aug 7 20:22:08 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 7 Aug 2020 16:22:08 -0400 Subject: [TC] New Office Hour Plans In-Reply-To: <173c920361f.e63c2eb4109191.7004381406663804589@ghanshyammann.com> References: <173c920361f.e63c2eb4109191.7004381406663804589@ghanshyammann.com> Message-ID: On Fri, Aug 7, 2020 at 9:37 AM Ghanshyam Mann wrote: > > ---- On Thu, 06 Aug 2020 19:00:13 -0500 Kendall Nelson wrote ---- > > Hello! > > After taking a look at the poll results, Mohammed and I have two proposed plans for office hours: > > Plan A: Two office hours instead of three. This gives us slightly more coverage than one office hour without overextending ourselves to cover three office hours. Mohammed and I were thinking that one of the reasons why three office hours wasn't working was that it was kind of a big time commitment and TC members could easily rationalize not going to ones later in the week if they had already attended one earlier in the week. The two times that enable most TC members to attend at least one, if not both, would be Monday @14:00 UTC (TC members available: Belmiro, Rico, Kristi, Jay, Mohammed, myself, and Nate + non members) and Wednesday @ 15:00 UTC (TC members available: Rico, Kristi, Jay, Mohammed, myself, Ghanshyam, and Nate + non members). > > Plan B: Move to a single office hour on Wednesday @ 15:00 UTC (TC members available: Rico, Kristi, Jay, Mohammed, myself, Ghanshyam, and Nate + non members). Having only one office hour gives it more weight and importance and that should hopefully encourage more attendance from both community members and TC members alike. > > Thanks Kendal for following up on office hours plan. > > The idea of having multiple office hours was to cover TC availability across different TZ. Even Asia TZ office > hours might not have many TC members available but still, someone to address/ack the issues and bring it to > TC when most of the members are available. There was no expectation for all TC members to be present in all > three office hours so all office hours being inactive might be due to some other reason not due to *many office hours*. I think if we limit the number of times, then more people can likely show up because it's a smaller commitment. The 2 office hours except Thursday are pretty much non-existant at that point > Thursday office hour was most TC available one which is not the case anymore. The Wednesday was actually the time where we had 10 people mention they'd be available, 9 of them being TC members. 
I'm hoping that is the most successful time line > I MO, we should consider the 'covering most of TZ (as much we can) for TC-availability' so in first option we can move > either of the office hour in different TZ. > > I still in favor of moving to weekly TC meeting (in alternate TZ or so) than office hours but I am ok to give office hours a > another try with new time. I think given the commitment I see here, I am confident Wednesday should be successful: https://doodle.com/poll/q27t8pucq7b8xbme > -gmann > > > I guess Plan C is to go ahead with Plan A and then if we don't see activity during the Monday time slot, to reduce down to one office hour and go with Plan B. > > Please check out the patches Mohammed posted [1][2] and vote on what you'd prefer! > > -Kendall (diablo_rojo) > > [1] Dual Office Hour: https://review.opendev.org/#/c/745201/[2] Single Office Hour: https://review.opendev.org/#/c/745200/ > -- Mohammed Naser VEXXHOST, Inc. From its-openstack at zohocorp.com Fri Aug 7 07:21:32 2020 From: its-openstack at zohocorp.com (its-openstack at zohocorp.com) Date: Fri, 07 Aug 2020 12:51:32 +0530 Subject: Openstack-Train VCPU issue in Hyper-V Message-ID: <173c7cbda12.e31c17678315.6864041805036536996@zohocorp.com> Dear Team,    We are using Openstack-Train in our organization.We have created windows server 2016 Std R2 instances with this flavor m5.xlarge ( RAM - 65536 , Disk - 500 , VCPUs - 16 ).Once Hyper-V future enabled in this instances VCPU count is automatically reduced to 1 core after restart.Even we have enabled nested virtualisation in openstack compute server.Herewith attached screenshot for your references.Please help us to short out this issue. #cat /sys/module/kvm_intel/parameters/nested Y Regards, Sysadmin. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Before Hyper-V.bmp Type: application/octet-stream Size: 2306502 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: After Hyper-V .png Type: image/png Size: 47337 bytes Desc: not available URL: From cohuck at redhat.com Fri Aug 7 11:59:42 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Fri, 7 Aug 2020 13:59:42 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <4cf2824c803c96496e846c5b06767db305e9fb5a.camel@redhat.com> References: <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <4cf2824c803c96496e846c5b06767db305e9fb5a.camel@redhat.com> Message-ID: <20200807135942.5d56a202.cohuck@redhat.com> On Wed, 05 Aug 2020 12:35:01 +0100 Sean Mooney wrote: > On Wed, 2020-08-05 at 12:53 +0200, Jiri Pirko wrote: > > Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote: (...) > > > software_version: device driver's version. > > > in .[.bugfix] scheme, where there is no > > > compatibility across major versions, minor versions have > > > forward compatibility (ex. 
1-> 2 is ok, 2 -> 1 is not) and > > > bugfix version number indicates some degree of internal > > > improvement that is not visible to the user in terms of > > > features or compatibility, > > > > > > vendor specific attributes: each vendor may define different attributes > > > device id : device id of a physical devices or mdev's parent pci device. > > > it could be equal to pci id for pci devices > > > aggregator: used together with mdev_type. e.g. aggregator=2 together > > > with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel > > > graphics device. > > > remote_url: for a local NVMe VF, it may be configured with a remote > > > url of a remote storage and all data is stored in the > > > remote side specified by the remote url. > > > ... > just a minor not that i find ^ much more simmple to understand then > the current proposal with self and compatiable. > if i have well defiend attibute that i can parse and understand that allow > me to calulate the what is and is not compatible that is likely going to > more useful as you wont have to keep maintianing a list of other compatible > devices every time a new sku is released. > > in anycase thank for actully shareing ^ as it make it simpler to reson about what > you have previously proposed. So, what would be the most helpful format? A 'software_version' field that follows the conventions outlined above, and other (possibly optional) fields that have to match? (...) > > Thanks for the explanation, I'm still fuzzy about the details. > > Anyway, I suggest you to check "devlink dev info" command we have > > implemented for multiple drivers. > > is devlink exposed as a filesytem we can read with just open? > openstack will likely try to leverage libvirt to get this info but when we > cant its much simpler to read sysfs then it is to take a a depenency on a commandline > too and have to fork shell to execute it and parse the cli output. > pyroute2 which we use in some openstack poject has basic python binding for devlink but im not > sure how complete it is as i think its relitivly new addtion. if we need to take a dependcy > we will but that would be a drawback fo devlink not that that is a large one just something > to keep in mind. A devlinkfs, maybe? At least for reading information (IIUC, "devlink dev info" is only about information retrieval, right?) From sean.mcginnis at gmx.com Fri Aug 7 21:09:00 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 7 Aug 2020 16:09:00 -0500 Subject: [sigs][vendors] Proposal to create Hardware Vendor SIG Message-ID: <5d4928c2-8e14-82a7-c06b-dd8df4de44fb@gmx.com> Hey everyone, OpenStack is a community for creating open source software, but by the nature of infrastructure management, there is a strong intersection with hardware, and therefore hardware vendors. Nova, Cinder, Neutron, Ironic, and many others need to interact with hardware and support vendor specific drivers for this interaction. There is currently a spectrum where some of this hardware interaction is done as openly as possible, while others develop their integration "glue" behind closed doors. Part of this is the disconnect between the fully open development of OpenStack, and the traditionally proprietary development of many products. For those that do want to do their vendor-specific development as openly as possible - and hopefully attract the involvement of customers, partners, and others in the community - there hasn't been a great venue for this so far. 
Some vendors that have done open development have even had difficulty finding a place in the community and ended up deciding to just develop behind closed doors. I would like to try to change this, so I am proposing the creation of the Hardware Vendor SIG as a place where we can collaborate on vendor specific things, and encourage development to happen in the open. This would be a place for any vendors and interested parties to work together to fix bugs, implement features, and overall improve the quality of anything that helps provide that glue to bridge the gap between our open source services and vendor hardware. This would include servers, storage, networking, and really anything else that plays a role in being able to set up an OpenStack cloud. This is a call out to any others interested in participating. If you are interested in this effort, and if you have any existing code (whether hosted on OpenDev, hosted on GitHub, or hosted on your own platform) that you think would be a good fit for this, please add your contact information and any relevant details here: https://etherpad.opendev.org/p/HardwareVendorSIG Also, please feel free to show your support by voting on the proposal to create this SIG here: https://review.opendev.org/#/c/745185/ Thanks! Sean From zigo at debian.org Fri Aug 7 21:19:45 2020 From: zigo at debian.org (Thomas Goirand) Date: Fri, 7 Aug 2020 23:19:45 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: On 8/7/20 4:10 PM, Ghanshyam Mann wrote: > Thanks, Pierre for helping with this. > > ttx has reached out to PTL (Justin Ferrieu (jferrieu) ) > but I am not sure if he got any response back. The end of the very good maintenance of Cloudkitty matched the date when objectif libre was sold to Linkbynet. Maybe the new owner don't care enough? This is very disappointing as I've been using it for some time already, and that I was satisfied by it (ie: it does the job...), and especially that latest releases are able to scale correctly. I very much would love if Pierre Riteau was successful in taking over. Good luck Pierre! I'll try to help whenever I can and if I'm not too busy. Cheers, Thomas Goirand (zigo) From Arkady.Kanevsky at dell.com Sat Aug 8 02:48:01 2020 From: Arkady.Kanevsky at dell.com (Kanevsky, Arkady) Date: Sat, 8 Aug 2020 02:48:01 +0000 Subject: [sigs][vendors] Proposal to create Hardware Vendor SIG In-Reply-To: <5d4928c2-8e14-82a7-c06b-dd8df4de44fb@gmx.com> References: <5d4928c2-8e14-82a7-c06b-dd8df4de44fb@gmx.com> Message-ID: Great idea. Long time overdue. Great place for many out-of-tree repos. Thanks Arkady -----Original Message----- From: Sean McGinnis Sent: Friday, August 7, 2020 4:09 PM To: openstack-discuss Subject: [sigs][vendors] Proposal to create Hardware Vendor SIG [EXTERNAL EMAIL] Hey everyone, OpenStack is a community for creating open source software, but by the nature of infrastructure management, there is a strong intersection with hardware, and therefore hardware vendors. Nova, Cinder, Neutron, Ironic, and many others need to interact with hardware and support vendor specific drivers for this interaction. There is currently a spectrum where some of this hardware interaction is done as openly as possible, while others develop their integration "glue" behind closed doors. 
Part of this is the disconnect between the fully open development of OpenStack, and the traditionally proprietary development of many products. For those that do want to do their vendor-specific development as openly as possible - and hopefully attract the involvement of customers, partners, and others in the community - there hasn't been a great venue for this so far. Some vendors that have done open development have even had difficulty finding a place in the community and ended up deciding to just develop behind closed doors. I would like to try to change this, so I am proposing the creation of the Hardware Vendor SIG as a place where we can collaborate on vendor specific things, and encourage development to happen in the open. This would be a place for any vendors and interested parties to work together to fix bugs, implement features, and overall improve the quality of anything that helps provide that glue to bridge the gap between our open source services and vendor hardware. This would include servers, storage, networking, and really anything else that plays a role in being able to set up an OpenStack cloud. This is a call out to any others interested in participating. If you are interested in this effort, and if you have any existing code (whether hosted on OpenDev, hosted on GitHub, or hosted on your own platform) that you think would be a good fit for this, please add your contact information and any relevant details here: https://etherpad.opendev.org/p/HardwareVendorSIG Also, please feel free to show your support by voting on the proposal to create this SIG here: https://review.opendev.org/#/c/745185/ Thanks! Sean From dev.faz at gmail.com Sat Aug 8 04:30:09 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Sat, 8 Aug 2020 06:30:09 +0200 Subject: [nova][neutron][oslo][ops] rabbit bindings issue In-Reply-To: <20200806144016.GP31915@sync> References: <20200806144016.GP31915@sync> Message-ID: Hi, we also have this issue. Our solution was (up to now) to delete the queues with a script or even reset the complete cluster. We just upgraded rabbitmq to the latest version - without luck. Anyone else seeing this issue? Fabian Arnaud Morin schrieb am Do., 6. Aug. 2020, 16:47: > Hey all, > > I would like to ask the community about a rabbit issue we have from time > to time. > > In our current architecture, we have a cluster of rabbits (3 nodes) for > all our OpenStack services (mostly nova and neutron). > > When one node of this cluster is down, the cluster continue working (we > use pause_minority strategy). > But, sometimes, the third server is not able to recover automatically > and need a manual intervention. > After this intervention, we restart the rabbitmq-server process, which > is then able to join the cluster back. > > At this time, the cluster looks ok, everything is fine. > BUT, nothing works. > Neutron and nova agents are not able to report back to servers. > They appear dead. > Servers seems not being able to consume messages. > The exchanges, queues, bindings seems good in rabbit. > > What we see is that removing bindings (using rabbitmqadmin delete > binding or the web interface) and recreate them again (using the same > routing key) brings the service back up and running. > > Doing this for all queues is really painful. Our next plan is to > automate it, but is there anyone in the community already saw this kind > of issues? > > Our bug looks like the one described in [1]. > Someone recommands to create an Alternate Exchange. > Is there anyone already tried that? 
> > FYI, we are running rabbit 3.8.2 (with OpenStack Stein). > We had the same kind of issues using older version of rabbit. > > Thanks for your help. > > [1] > https://groups.google.com/forum/#!newtopic/rabbitmq-users/rabbitmq-users/zFhmpHF2aWk > > -- > Arnaud Morin > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bhartendu at gmail.com Sat Aug 8 07:00:51 2020 From: bhartendu at gmail.com (Bhartendu) Date: Sat, 8 Aug 2020 12:30:51 +0530 Subject: Openstack VMmachine internet connectivity issue Message-ID: Hi All, I need help or any pointer to solve Openstack VM machine internet connectivity issue. I am successfully able to create VM machines on Openstack cloud. Facing following Openstack VM machine internet connectivity issues: 1) There is no internet connectivity from Openstack VM machine. 2) ping is successful till Openstack machine (192.168.0.166), but Gateway ip (192.168.0.1) not reachable from Openstack VM machine. 3) No website reachable from Openstack VM machine. Openstack is installed on Virtualbox VM machine (CentOS 8.2). Connectivity information: Internet<=====>Router(192.168.0.1)<=====>Oracle VM machine (CentOS 8.2; 192.168.0.166)<=====>Openstack VM Machine (192.168.0.174) Any help or trigger is much appreciated. ----------------------------------------------- Openstack VM Machine (192.168.0.174) Logs ----------------------------------------------- $ ip route default via 192.168.0.1 dev eth0 169.254.169.254 via 192.168.0.171 dev eth0 192.168.0.0/24 dev eth0 scope link src 192.168.0.174 $ ping 192.168.0.166 PING 192.168.0.166 (192.168.0.166): 56 data bytes 64 bytes from 192.168.0.166: seq=0 ttl=64 time=2.657 ms 64 bytes from 192.168.0.166: seq=1 ttl=64 time=1.196 ms 64 bytes from 192.168.0.166: seq=2 ttl=64 time=1.312 ms 64 bytes from 192.168.0.166: seq=3 ttl=64 time=0.875 ms 64 bytes from 192.168.0.166: seq=4 ttl=64 time=0.782 ms ^C --- 192.168.0.166 ping statistics --- 5 packets transmitted, 5 packets received, 0% packet loss round-trip min/avg/max = 0.782/1.364/2.657 ms $ ping 192.168.0.1 PING 192.168.0.1 (192.168.0.1): 56 data bytes ^C --- 192.168.0.1 ping statistics --- 3 packets transmitted, 0 packets received, 100% packet loss $ ping google.com ping: bad address 'google.com' $ $ sudo cat /etc/resolv.conf nameserver 192.168.0.1 nameserver 192.168.0.166 nameserver 8.8.8.8 $ $ ip route default via 192.168.0.1 dev eth0 169.254.169.254 via 192.168.0.171 dev eth0 192.168.0.0/24 dev eth0 scope link src 192.168.0.174 $ $ ifconfig eth0 Link encap:Ethernet HWaddr FA:16:3E:17:F2:F9 inet addr:192.168.0.174 Bcast:192.168.0.255 Mask:255.255.255.0 inet6 addr: fe80::f816:3eff:fe17:f2f9/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:572 errors:0 dropped:0 overruns:0 frame:0 TX packets:571 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:51697 (50.4 KiB) TX bytes:46506 (45.4 KiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:35 errors:0 dropped:0 overruns:0 frame:0 TX packets:35 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:3764 (3.6 KiB) TX bytes:3764 (3.6 KiB) $ ----------------------------------------------------- Oracle VM machine (CentOS 8.2; 192.168.0.166) Logs ----------------------------------------------------- [root at openstack ~]# ovs-vsctl show 6718c4ce-6a58-463c-95e4-20a34edbe041 Manager "ptcp:6640:127.0.0.1" is_connected: true Bridge br-int fail_mode: 
secure datapath_type: system Port br-int Interface br-int type: internal Port "tap50a66e3f-00" Interface "tap50a66e3f-00" Port "patch-br-int-to-provnet-27804412-47c0-497d-8f2e-8c0bf8b04df1" Interface "patch-br-int-to-provnet-27804412-47c0-497d-8f2e-8c0bf8b04df1" type: patch options: {peer="patch-provnet-27804412-47c0-497d-8f2e-8c0bf8b04df1-to-br-int"} Port "tap6ea42aaf-80" Interface "tap6ea42aaf-80" Port "tap743cdf36-c8" Interface "tap743cdf36-c8" Port "tap2a93518d-90" Interface "tap2a93518d-90" Bridge br-ex fail_mode: standalone Port "patch-provnet-27804412-47c0-497d-8f2e-8c0bf8b04df1-to-br-int" Interface "patch-provnet-27804412-47c0-497d-8f2e-8c0bf8b04df1-to-br-int" type: patch options: {peer="patch-br-int-to-provnet-27804412-47c0-497d-8f2e-8c0bf8b04df1"} Port br-ex Interface br-ex type: internal Port "enp0s3" Interface "enp0s3" ovs_version: "2.12.0" [root at openstack ~]# ip a s enp0s3 2: enp0s3: mtu 1500 qdisc fq_codel master ovs-system state UP group default qlen 1000 link/ether 08:00:27:cd:fc:4f brd ff:ff:ff:ff:ff:ff [root at openstack ~]# ip a s br-ex 13: br-ex: mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether 08:00:27:cd:fc:4f brd ff:ff:ff:ff:ff:ff inet 192.168.0.166/24 brd 192.168.0.255 scope global br-ex valid_lft forever preferred_lft forever inet6 fe80::f057:6bff:fe69:1f47/64 scope link valid_lft forever preferred_lft forever [root at openstack ~]# [root at openstack ~]# cat /etc/resolv.conf # Generated by NetworkManager search example.com nameserver 192.168.0.1 nameserver 8.8.8.8 [root at openstack ~]# [root at openstack ~]# ip route default via 192.168.0.1 dev br-ex 169.254.0.0/16 dev br-ex scope link metric 1013 192.168.0.0/24 dev br-ex proto kernel scope link src 192.168.0.166 [root at openstack ~]# [root at openstack ~]# ifconfig br-ex: flags=4163 mtu 1500 inet 192.168.0.166 netmask 255.255.255.0 broadcast 192.168.0.255 inet6 fe80::f057:6bff:fe69:1f47 prefixlen 64 scopeid 0x20 ether 08:00:27:cd:fc:4f txqueuelen 1000 (Ethernet) RX packets 2781 bytes 505203 (493.3 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 1441 bytes 159474 (155.7 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 enp0s3: flags=4163 mtu 1500 ether 08:00:27:cd:fc:4f txqueuelen 1000 (Ethernet) RX packets 94856 bytes 111711956 (106.5 MiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 33859 bytes 15114156 (14.4 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 lo: flags=73 mtu 65536 inet 127.0.0.1 netmask 255.0.0.0 inet6 ::1 prefixlen 128 scopeid 0x10 loop txqueuelen 1000 (Local Loopback) RX packets 1741807 bytes 441610166 (421.1 MiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 1741807 bytes 441610166 (421.1 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 tap2a93518d-90: flags=4163 mtu 1500 inet6 fe80::681d:e6ff:fe1d:963f prefixlen 64 scopeid 0x20 ether 6a:1d:e6:1d:96:3f txqueuelen 1000 (Ethernet) RX packets 55 bytes 6540 (6.3 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 2950 bytes 916693 (895.2 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 tap50a66e3f-00: flags=4163 mtu 1442 ether fe:16:3e:06:df:bf txqueuelen 1000 (Ethernet) RX packets 689 bytes 59074 (57.6 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 703 bytes 58933 (57.5 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 tap6ea42aaf-80: flags=4163 mtu 1500 inet6 fe80::745b:6aff:fe1c:d270 prefixlen 64 scopeid 0x20 ether 76:5b:6a:1c:d2:70 txqueuelen 1000 (Ethernet) RX packets 63 bytes 7332 (7.1 KiB) RX errors 0 dropped 0 overruns 
0 frame 0 TX packets 126 bytes 10692 (10.4 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 tap743cdf36-c8: flags=4163 mtu 1500 ether fe:16:3e:17:f2:f9 txqueuelen 1000 (Ethernet) RX packets 1005 bytes 91770 (89.6 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 3555 bytes 902987 (881.8 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 [root at openstack ~]# Thanks & Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Sat Aug 8 07:36:40 2020 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Sat, 8 Aug 2020 09:36:40 +0200 Subject: [nova][neutron][oslo][ops] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> Message-ID: We also see the issue. When it happens stopping and restarting the rabbit cluster usually helps. I thought the problem was because of a wrong setting in the openstack services conf files: I missed these settings (that I am now going to add): [oslo_messaging_rabbit] rabbit_ha_queues = true amqp_durable_queues = true Cheers, Massimo On Sat, Aug 8, 2020 at 6:34 AM Fabian Zimmermann wrote: > Hi, > > we also have this issue. > > Our solution was (up to now) to delete the queues with a script or even > reset the complete cluster. > > We just upgraded rabbitmq to the latest version - without luck. > > Anyone else seeing this issue? > > Fabian > > > > Arnaud Morin schrieb am Do., 6. Aug. 2020, 16:47: > >> Hey all, >> >> I would like to ask the community about a rabbit issue we have from time >> to time. >> >> In our current architecture, we have a cluster of rabbits (3 nodes) for >> all our OpenStack services (mostly nova and neutron). >> >> When one node of this cluster is down, the cluster continue working (we >> use pause_minority strategy). >> But, sometimes, the third server is not able to recover automatically >> and need a manual intervention. >> After this intervention, we restart the rabbitmq-server process, which >> is then able to join the cluster back. >> >> At this time, the cluster looks ok, everything is fine. >> BUT, nothing works. >> Neutron and nova agents are not able to report back to servers. >> They appear dead. >> Servers seems not being able to consume messages. >> The exchanges, queues, bindings seems good in rabbit. >> >> What we see is that removing bindings (using rabbitmqadmin delete >> binding or the web interface) and recreate them again (using the same >> routing key) brings the service back up and running. >> >> Doing this for all queues is really painful. Our next plan is to >> automate it, but is there anyone in the community already saw this kind >> of issues? >> >> Our bug looks like the one described in [1]. >> Someone recommands to create an Alternate Exchange. >> Is there anyone already tried that? >> >> FYI, we are running rabbit 3.8.2 (with OpenStack Stein). >> We had the same kind of issues using older version of rabbit. >> >> Thanks for your help. >> >> [1] >> https://groups.google.com/forum/#!newtopic/rabbitmq-users/rabbitmq-users/zFhmpHF2aWk >> >> -- >> Arnaud Morin >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
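As a side note on those two options: below is a minimal sketch of how rabbit_ha_queues / amqp_durable_queues usually fit together with a RabbitMQ mirroring policy. The "/" vhost and the policy name "ha-all" are placeholders, adjust them to your deployment. The oslo.messaging flags only change how the services declare their queues, so classic queue mirroring still needs a matching RabbitMQ policy, and queues that already exist keep their old properties until they are re-declared (service restart, possibly after deleting stale queues). None of this is guaranteed to fix the lost-bindings problem discussed here, it is just the configuration being referred to.

# In each service config (nova.conf, neutron.conf, ...):
[oslo_messaging_rabbit]
rabbit_ha_queues = true
amqp_durable_queues = true

# On one RabbitMQ node, mirror everything except the built-in amq.* names:
rabbitmqctl set_policy -p / ha-all '^(?!amq\.).*' '{"ha-mode":"all","ha-sync-mode":"automatic"}'
rabbitmqctl list_policies -p /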
URL: From dev.faz at gmail.com Sat Aug 8 13:06:36 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Sat, 8 Aug 2020 15:06:36 +0200 Subject: [nova][neutron][oslo][ops] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> Message-ID: Hi, dont know if durable queues help, but should be enabled by rabbitmq policy which (alone) doesnt seem to fix this (we have this active) Fabian Massimo Sgaravatto schrieb am Sa., 8. Aug. 2020, 09:36: > We also see the issue. When it happens stopping and restarting the rabbit > cluster usually helps. > > I thought the problem was because of a wrong setting in the openstack > services conf files: I missed these settings (that I am now going to add): > > [oslo_messaging_rabbit] > rabbit_ha_queues = true > amqp_durable_queues = true > > Cheers, Massimo > > > On Sat, Aug 8, 2020 at 6:34 AM Fabian Zimmermann > wrote: > >> Hi, >> >> we also have this issue. >> >> Our solution was (up to now) to delete the queues with a script or even >> reset the complete cluster. >> >> We just upgraded rabbitmq to the latest version - without luck. >> >> Anyone else seeing this issue? >> >> Fabian >> >> >> >> Arnaud Morin schrieb am Do., 6. Aug. 2020, >> 16:47: >> >>> Hey all, >>> >>> I would like to ask the community about a rabbit issue we have from time >>> to time. >>> >>> In our current architecture, we have a cluster of rabbits (3 nodes) for >>> all our OpenStack services (mostly nova and neutron). >>> >>> When one node of this cluster is down, the cluster continue working (we >>> use pause_minority strategy). >>> But, sometimes, the third server is not able to recover automatically >>> and need a manual intervention. >>> After this intervention, we restart the rabbitmq-server process, which >>> is then able to join the cluster back. >>> >>> At this time, the cluster looks ok, everything is fine. >>> BUT, nothing works. >>> Neutron and nova agents are not able to report back to servers. >>> They appear dead. >>> Servers seems not being able to consume messages. >>> The exchanges, queues, bindings seems good in rabbit. >>> >>> What we see is that removing bindings (using rabbitmqadmin delete >>> binding or the web interface) and recreate them again (using the same >>> routing key) brings the service back up and running. >>> >>> Doing this for all queues is really painful. Our next plan is to >>> automate it, but is there anyone in the community already saw this kind >>> of issues? >>> >>> Our bug looks like the one described in [1]. >>> Someone recommands to create an Alternate Exchange. >>> Is there anyone already tried that? >>> >>> FYI, we are running rabbit 3.8.2 (with OpenStack Stein). >>> We had the same kind of issues using older version of rabbit. >>> >>> Thanks for your help. >>> >>> [1] >>> https://groups.google.com/forum/#!newtopic/rabbitmq-users/rabbitmq-users/zFhmpHF2aWk >>> >>> -- >>> Arnaud Morin >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwm2012 at gmail.com Sun Aug 9 15:54:47 2020 From: pwm2012 at gmail.com (pwm) Date: Sun, 9 Aug 2020 23:54:47 +0800 Subject: DNS server instead of /etc/hosts file on Infra Server Message-ID: Hi, Anyone interested in replacing the /etc/hosts file entry with a DNS server on the openstack-ansible deployment? Thank you -------------- next part -------------- An HTML attachment was scrubbed... 
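Since automating the rebind step keeps coming up, here is a rough sketch of the manual recovery described above, done with rabbitmqadmin against the default vhost. EXCHANGE, QUEUE and ROUTING_KEY are placeholders, take the real names from the listing; this is a recovery aid only and does not touch the root cause.

# See which bindings currently exist (and which ones are missing):
rabbitmqadmin list bindings source destination destination_type routing_key

# Drop and re-create a suspect binding with the same routing key.
# For bindings without extra arguments, properties_key is normally just the routing key.
rabbitmqadmin delete binding source=EXCHANGE destination=QUEUE destination_type=queue properties_key=ROUTING_KEY
rabbitmqadmin declare binding source=EXCHANGE destination=QUEUE destination_type=queue routing_key=ROUTING_KEY

Looping that over the queues reported by "rabbitmqadmin list queues" is essentially the automation mentioned earlier in the thread.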
URL: From monika.samal at outlook.com Sun Aug 9 11:59:20 2020 From: monika.samal at outlook.com (Monika Samal) Date: Sun, 9 Aug 2020 11:59:20 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , , Message-ID: ________________________________ From: Monika Samal Sent: Friday, August 7, 2020 4:41:52 AM To: Mark Goddard ; Michael Johnson Cc: Fabian Zimmermann ; openstack-discuss Subject: Re: [openstack-community] Octavia :; Unable to create load balancer I tried following above document still facing same Octavia connection error with amphora image. Regards, Monika ________________________________ From: Mark Goddard Sent: Thursday, August 6, 2020 1:16:01 PM To: Michael Johnson Cc: Monika Samal ; Fabian Zimmermann ; openstack-discuss Subject: Re: [openstack-community] Octavia :; Unable to create load balancer On Wed, 5 Aug 2020 at 16:16, Michael Johnson > wrote: Looking at that error, it appears that the lb-mgmt-net is not setup correctly. The Octavia controller containers are not able to reach the amphora instances on the lb-mgmt-net subnet. I don't know how kolla is setup to connect the containers to the neutron lb-mgmt-net network. Maybe the above documents will help with that. Right now it's up to the operator to configure that. The kolla documentation doesn't prescribe any particular setup. We're working on automating it in Victoria. Michael On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard > wrote: On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: Hello Guys, With Michaels help I was able to solve the problem but now there is another error I was able to create my network on vlan but still error persist. PFB the logs: http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ Kindly help regards, Monika ________________________________ From: Michael Johnson > Sent: Monday, August 3, 2020 9:10 PM To: Fabian Zimmermann > Cc: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Yeah, it looks like nova is failing to boot the instance. Check this setting in your octavia.conf files: https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id Also, if kolla-ansible didn't set both of these values correctly, please open bug reports for kolla-ansible. These all should have been configured by the deployment tool. I wasn't following this thread due to no [kolla] tag, but here are the recently added docs for Octavia in kolla [1]. Note the octavia_service_auth_project variable which was added to migrate from the admin project to the service project for octavia resources. We're lacking proper automation for the flavor, image etc, but it is being worked on in Victoria [2]. [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html [2] https://review.opendev.org/740180 Michael On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: Seems like the flavor is missing or empty '' - check for typos and enable debug. Check if the nova req contains valid information/flavor. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 15:46: It's registered Get Outlook for Android ________________________________ From: Fabian Zimmermann > Sent: Monday, August 3, 2020 7:08:21 PM To: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Did you check the (nova) flavor you use in octavia. Fabian Monika Samal > schrieb am Mo., 3. Aug. 
2020, 10:53: After Michael suggestion I was able to create load balancer but there is error in status. [X] PFB the error link: http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ ________________________________ From: Monika Samal > Sent: Monday, August 3, 2020 2:08 PM To: Michael Johnson > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson > Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal > schrieb am Do., 30. Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? > > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: From monika.samal at outlook.com Sun Aug 9 12:02:01 2020 From: monika.samal at outlook.com (Monika Samal) Date: Sun, 9 Aug 2020 12:02:01 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , , , Message-ID: Hi All, Below is the error am getting, i tried configuring network issue as well still finding it difficult to resolve. Below is my log...if somebody can help me resolving it..it would be great help since its very urgent... 
http://paste.openstack.org/show/TsagcQX2ZKd6rhhsYcYd/ Regards, Monika ________________________________ From: Monika Samal Sent: Sunday, 9 August, 2020, 5:29 pm To: Mark Goddard; Michael Johnson; openstack-discuss Cc: Fabian Zimmermann Subject: Re: [openstack-community] Octavia :; Unable to create load balancer ________________________________ From: Monika Samal Sent: Friday, August 7, 2020 4:41:52 AM To: Mark Goddard ; Michael Johnson Cc: Fabian Zimmermann ; openstack-discuss Subject: Re: [openstack-community] Octavia :; Unable to create load balancer I tried following above document still facing same Octavia connection error with amphora image. Regards, Monika ________________________________ From: Mark Goddard Sent: Thursday, August 6, 2020 1:16:01 PM To: Michael Johnson Cc: Monika Samal ; Fabian Zimmermann ; openstack-discuss Subject: Re: [openstack-community] Octavia :; Unable to create load balancer On Wed, 5 Aug 2020 at 16:16, Michael Johnson > wrote: Looking at that error, it appears that the lb-mgmt-net is not setup correctly. The Octavia controller containers are not able to reach the amphora instances on the lb-mgmt-net subnet. I don't know how kolla is setup to connect the containers to the neutron lb-mgmt-net network. Maybe the above documents will help with that. Right now it's up to the operator to configure that. The kolla documentation doesn't prescribe any particular setup. We're working on automating it in Victoria. Michael On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard > wrote: On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: Hello Guys, With Michaels help I was able to solve the problem but now there is another error I was able to create my network on vlan but still error persist. PFB the logs: http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ Kindly help regards, Monika ________________________________ From: Michael Johnson > Sent: Monday, August 3, 2020 9:10 PM To: Fabian Zimmermann > Cc: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Yeah, it looks like nova is failing to boot the instance. Check this setting in your octavia.conf files: https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id Also, if kolla-ansible didn't set both of these values correctly, please open bug reports for kolla-ansible. These all should have been configured by the deployment tool. I wasn't following this thread due to no [kolla] tag, but here are the recently added docs for Octavia in kolla [1]. Note the octavia_service_auth_project variable which was added to migrate from the admin project to the service project for octavia resources. We're lacking proper automation for the flavor, image etc, but it is being worked on in Victoria [2]. [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html [2] https://review.opendev.org/740180 Michael On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: Seems like the flavor is missing or empty '' - check for typos and enable debug. Check if the nova req contains valid information/flavor. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 15:46: It's registered Get Outlook for Android ________________________________ From: Fabian Zimmermann > Sent: Monday, August 3, 2020 7:08:21 PM To: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Did you check the (nova) flavor you use in octavia. Fabian Monika Samal > schrieb am Mo., 3. Aug. 
2020, 10:53: After Michael suggestion I was able to create load balancer but there is error in status. [X] PFB the error link: http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ ________________________________ From: Monika Samal > Sent: Monday, August 3, 2020 2:08 PM To: Michael Johnson > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson > Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal > schrieb am Do., 30. Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? > > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Mon Aug 10 05:44:29 2020 From: johnsomor at gmail.com (Michael Johnson) Date: Sun, 9 Aug 2020 22:44:29 -0700 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: That looks like there is still a kolla networking issue where the amphora are not able to reach the controller processes. Please fix the lb-mgmt-net such that it can reach the amphora and the controller containers. This should be setup via the deployment tool, kolla in this case. Michael On Sun, Aug 9, 2020 at 5:02 AM Monika Samal wrote: > Hi All, > > Below is the error am getting, i tried configuring network issue as well > still finding it difficult to resolve. > > Below is my log...if somebody can help me resolving it..it would be great > help since its very urgent... 
> > http://paste.openstack.org/show/TsagcQX2ZKd6rhhsYcYd/ > > Regards, > Monika > ------------------------------ > *From:* Monika Samal > *Sent:* Sunday, 9 August, 2020, 5:29 pm > *To:* Mark Goddard; Michael Johnson; openstack-discuss > *Cc:* Fabian Zimmermann > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > ------------------------------ > *From:* Monika Samal > *Sent:* Friday, August 7, 2020 4:41:52 AM > *To:* Mark Goddard ; Michael Johnson < > johnsomor at gmail.com> > *Cc:* Fabian Zimmermann ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > I tried following above document still facing same Octavia connection > error with amphora image. > > Regards, > Monika > ------------------------------ > *From:* Mark Goddard > *Sent:* Thursday, August 6, 2020 1:16:01 PM > *To:* Michael Johnson > *Cc:* Monika Samal ; Fabian Zimmermann < > dev.faz at gmail.com>; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > On Wed, 5 Aug 2020 at 16:16, Michael Johnson wrote: > > Looking at that error, it appears that the lb-mgmt-net is not setup > correctly. The Octavia controller containers are not able to reach the > amphora instances on the lb-mgmt-net subnet. > > I don't know how kolla is setup to connect the containers to the neutron > lb-mgmt-net network. Maybe the above documents will help with that. > > > Right now it's up to the operator to configure that. The kolla > documentation doesn't prescribe any particular setup. We're working on > automating it in Victoria. > > > Michael > > On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard wrote: > > > > On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: > > Hello Guys, > > With Michaels help I was able to solve the problem but now there is > another error I was able to create my network on vlan but still error > persist. PFB the logs: > > http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ > > Kindly help > > regards, > Monika > ------------------------------ > *From:* Michael Johnson > *Sent:* Monday, August 3, 2020 9:10 PM > *To:* Fabian Zimmermann > *Cc:* Monika Samal ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Yeah, it looks like nova is failing to boot the instance. > > Check this setting in your octavia.conf files: > https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id > > Also, if kolla-ansible didn't set both of these values correctly, please > open bug reports for kolla-ansible. These all should have been configured > by the deployment tool. > > > I wasn't following this thread due to no [kolla] tag, but here are the > recently added docs for Octavia in kolla [1]. Note > the octavia_service_auth_project variable which was added to migrate from > the admin project to the service project for octavia resources. We're > lacking proper automation for the flavor, image etc, but it is being worked > on in Victoria [2]. > > [1] > https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html > [2] https://review.opendev.org/740180 > > Michael > > On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: > > Seems like the flavor is missing or empty '' - check for typos and enable > debug. 
> > Check if the nova req contains valid information/flavor. > > Fabian > > Monika Samal schrieb am Mo., 3. Aug. 2020, > 15:46: > > It's registered > > Get Outlook for Android > ------------------------------ > *From:* Fabian Zimmermann > *Sent:* Monday, August 3, 2020 7:08:21 PM > *To:* Monika Samal ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Did you check the (nova) flavor you use in octavia. > > Fabian > > Monika Samal schrieb am Mo., 3. Aug. 2020, > 10:53: > > After Michael suggestion I was able to create load balancer but there is > error in status. > > > > PFB the error link: > > http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ > ------------------------------ > *From:* Monika Samal > *Sent:* Monday, August 3, 2020 2:08 PM > *To:* Michael Johnson > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Thanks a ton Michael for helping me out > ------------------------------ > *From:* Michael Johnson > *Sent:* Friday, July 31, 2020 3:57 AM > *To:* Monika Samal > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Just to close the loop on this, the octavia.conf file had > "project_name = admin" instead of "project_name = service" in the > [service_auth] section. This was causing the keystone errors when > Octavia was communicating with neutron. > > I don't know if that is a bug in kolla-ansible or was just a local > configuration issue. > > Michael > > On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > > > Hello Fabian,, > > > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > > > Regards, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > Hi, > > > > just to debug, could you replace the auth_type password with v3password? > > > > And do a curl against your :5000 and :35357 urls and paste the output. > > > > Fabian > > > > Monika Samal schrieb am Do., 30. Juli 2020, > 22:15: > > > > Hello Fabian, > > > > http://paste.openstack.org/show/796477/ > > > > Thanks, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > The sections should be > > > > service_auth > > keystone_authtoken > > > > if i read the docs correctly. Maybe you can just paste your config > (remove/change passwords) to paste.openstack.org and post the link? > > > > Fabian > > > -------------- next part -------------- An HTML attachment was scrubbed... 
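To verify that reachability in practice, a throw-away instance on the management network is usually the quickest test. A sketch only: image, flavor and security group below are placeholders for whatever exists in the deployment, while lb-mgmt-net is the network name used in this thread:

# boot a small test instance directly on the Octavia management network
openstack server create --image cirros --flavor m1.tiny \
  --network lb-mgmt-net --security-group <sg allowing ICMP/SSH> lb-mgmt-test

# from the host running the Octavia containers, try to reach it
ping <IP the test instance got on lb-mgmt-net>

The path has to work in both directions: the workers talk to the amphorae on TCP 9443 (amphora REST API) and the amphorae report back to the health manager on UDP 5555 by default.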
URL: From dev.faz at gmail.com Mon Aug 10 05:49:36 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Mon, 10 Aug 2020 07:49:36 +0200 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: Hi, to test your connection you can create an instance im the octavia network and try to ping/ssh from your controller (dont forget a suitable security group) Fabian Michael Johnson schrieb am Mo., 10. Aug. 2020, 07:44: > > That looks like there is still a kolla networking issue where the amphora > are not able to reach the controller processes. Please fix the lb-mgmt-net > such that it can reach the amphora and the controller containers. This > should be setup via the deployment tool, kolla in this case. > > Michael > > On Sun, Aug 9, 2020 at 5:02 AM Monika Samal > wrote: > >> Hi All, >> >> Below is the error am getting, i tried configuring network issue as well >> still finding it difficult to resolve. >> >> Below is my log...if somebody can help me resolving it..it would be great >> help since its very urgent... >> >> http://paste.openstack.org/show/TsagcQX2ZKd6rhhsYcYd/ >> >> Regards, >> Monika >> ------------------------------ >> *From:* Monika Samal >> *Sent:* Sunday, 9 August, 2020, 5:29 pm >> *To:* Mark Goddard; Michael Johnson; openstack-discuss >> *Cc:* Fabian Zimmermann >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> ------------------------------ >> *From:* Monika Samal >> *Sent:* Friday, August 7, 2020 4:41:52 AM >> *To:* Mark Goddard ; Michael Johnson < >> johnsomor at gmail.com> >> *Cc:* Fabian Zimmermann ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> I tried following above document still facing same Octavia connection >> error with amphora image. >> >> Regards, >> Monika >> ------------------------------ >> *From:* Mark Goddard >> *Sent:* Thursday, August 6, 2020 1:16:01 PM >> *To:* Michael Johnson >> *Cc:* Monika Samal ; Fabian Zimmermann < >> dev.faz at gmail.com>; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> >> >> On Wed, 5 Aug 2020 at 16:16, Michael Johnson wrote: >> >> Looking at that error, it appears that the lb-mgmt-net is not setup >> correctly. The Octavia controller containers are not able to reach the >> amphora instances on the lb-mgmt-net subnet. >> >> I don't know how kolla is setup to connect the containers to the neutron >> lb-mgmt-net network. Maybe the above documents will help with that. >> >> >> Right now it's up to the operator to configure that. The kolla >> documentation doesn't prescribe any particular setup. We're working on >> automating it in Victoria. >> >> >> Michael >> >> On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard wrote: >> >> >> >> On Tue, 4 Aug 2020 at 16:58, Monika Samal >> wrote: >> >> Hello Guys, >> >> With Michaels help I was able to solve the problem but now there is >> another error I was able to create my network on vlan but still error >> persist. 
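For the "enable debug" part, it is the standard oslo.config switch in octavia.conf plus a restart of the Octavia services; the log path below is the usual kolla location and may differ in other setups:

[DEFAULT]
debug = True

# then follow the worker while retrying the load balancer create
tail -f /var/log/kolla/octavia/octavia-worker.log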
PFB the logs: >> >> http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ >> >> Kindly help >> >> regards, >> Monika >> ------------------------------ >> *From:* Michael Johnson >> *Sent:* Monday, August 3, 2020 9:10 PM >> *To:* Fabian Zimmermann >> *Cc:* Monika Samal ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Yeah, it looks like nova is failing to boot the instance. >> >> Check this setting in your octavia.conf files: >> https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id >> >> Also, if kolla-ansible didn't set both of these values correctly, please >> open bug reports for kolla-ansible. These all should have been configured >> by the deployment tool. >> >> >> I wasn't following this thread due to no [kolla] tag, but here are the >> recently added docs for Octavia in kolla [1]. Note >> the octavia_service_auth_project variable which was added to migrate from >> the admin project to the service project for octavia resources. We're >> lacking proper automation for the flavor, image etc, but it is being worked >> on in Victoria [2]. >> >> [1] >> https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html >> [2] https://review.opendev.org/740180 >> >> Michael >> >> On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann >> wrote: >> >> Seems like the flavor is missing or empty '' - check for typos and enable >> debug. >> >> Check if the nova req contains valid information/flavor. >> >> Fabian >> >> Monika Samal schrieb am Mo., 3. Aug. 2020, >> 15:46: >> >> It's registered >> >> Get Outlook for Android >> ------------------------------ >> *From:* Fabian Zimmermann >> *Sent:* Monday, August 3, 2020 7:08:21 PM >> *To:* Monika Samal ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Did you check the (nova) flavor you use in octavia. >> >> Fabian >> >> Monika Samal schrieb am Mo., 3. Aug. 2020, >> 10:53: >> >> After Michael suggestion I was able to create load balancer but there is >> error in status. >> >> >> >> PFB the error link: >> >> http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ >> ------------------------------ >> *From:* Monika Samal >> *Sent:* Monday, August 3, 2020 2:08 PM >> *To:* Michael Johnson >> *Cc:* Fabian Zimmermann ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Thanks a ton Michael for helping me out >> ------------------------------ >> *From:* Michael Johnson >> *Sent:* Friday, July 31, 2020 3:57 AM >> *To:* Monika Samal >> *Cc:* Fabian Zimmermann ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Just to close the loop on this, the octavia.conf file had >> "project_name = admin" instead of "project_name = service" in the >> [service_auth] section. This was causing the keystone errors when >> Octavia was communicating with neutron. >> >> I don't know if that is a bug in kolla-ansible or was just a local >> configuration issue. 
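For anyone else hitting this, the corrected section ends up looking roughly as follows; apart from project_name = service every value is an illustrative placeholder for your own deployment:

[service_auth]
auth_url = http://<internal keystone endpoint>:5000
auth_type = password
username = octavia
password = <octavia service password>
user_domain_name = Default
project_domain_name = Default
# this was set to "admin" before, which caused the keystone errors described above
project_name = service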
>> >> Michael >> >> On Thu, Jul 30, 2020 at 1:39 PM Monika Samal >> wrote: >> > >> > Hello Fabian,, >> > >> > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ >> > >> > Regards, >> > Monika >> > ________________________________ >> > From: Fabian Zimmermann >> > Sent: Friday, July 31, 2020 1:57 AM >> > To: Monika Samal >> > Cc: Michael Johnson ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> > Subject: Re: [openstack-community] Octavia :; Unable to create load >> balancer >> > >> > Hi, >> > >> > just to debug, could you replace the auth_type password with v3password? >> > >> > And do a curl against your :5000 and :35357 urls and paste the output. >> > >> > Fabian >> > >> > Monika Samal schrieb am Do., 30. Juli 2020, >> 22:15: >> > >> > Hello Fabian, >> > >> > http://paste.openstack.org/show/796477/ >> > >> > Thanks, >> > Monika >> > ________________________________ >> > From: Fabian Zimmermann >> > Sent: Friday, July 31, 2020 1:38 AM >> > To: Monika Samal >> > Cc: Michael Johnson ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> > Subject: Re: [openstack-community] Octavia :; Unable to create load >> balancer >> > >> > The sections should be >> > >> > service_auth >> > keystone_authtoken >> > >> > if i read the docs correctly. Maybe you can just paste your config >> (remove/change passwords) to paste.openstack.org and post the link? >> > >> > Fabian >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From marino.mrc at gmail.com Mon Aug 10 07:49:28 2020 From: marino.mrc at gmail.com (Marco Marino) Date: Mon, 10 Aug 2020 09:49:28 +0200 Subject: [ironic][tripleo][ussuri] Problem with bare metal provisioning and old RAID controllers In-Reply-To: References: Message-ID: Hi, I'm sorry if I reopen this thread, but I cannot find a solution at the moment. Please, can someone give me some hint on how to detect megaraid controllers with IPA? I think this could be useful for many users. PS: I can do any test, I have a 6 servers test environment (5 nodes + undercloud) with megaraid controllers (Poweredge R620) Thank you Il giorno mar 4 ago 2020 alle ore 12:57 Marco Marino ha scritto: > Here is what I did: > # /usr/lib/dracut/skipcpio > /home/stack/images/ironic-python-agent.initramfs | zcat | cpio -ivd | pax -r > # mount dd-megaraid_sas-07.710.50.00-1.el8_2.elrepo.iso /mnt/ > # rpm2cpio > /mnt/rpms/x86_64/kmod-megaraid_sas-07.710.50.00-1.el8_2.elrepo.x86_64.rpm | > pax -r > # find . 
2>/dev/null | cpio --quiet -c -o | gzip -8 > > /home/stack/images/ironic-python-agent.initramfs > # chown stack: /home/stack/images/ironic-python-agent.initramfs > (undercloud) [stack at undercloud ~]$ openstack overcloud image upload > --update-existing --image-path /home/stack/images/ > > At this point I checked that agent.ramdisk in /var/lib/ironic/httpboot has > an update timestamp > > Then > (undercloud) [stack at undercloud ~]$ openstack overcloud node introspect > --provide controller2 > /usr/lib64/python3.6/importlib/_bootstrap.py:219: ImportWarning: can't > resolve package from __spec__ or __package__, falling back on __name__ and > __path__ > return f(*args, **kwds) > > PLAY [Baremetal Introspection for multiple Ironic Nodes] > *********************** > 2020-08-04 12:32:26.684368 | ecf4bbd2-e605-20dd-3da9-000000000008 | > TASK | Check for required inputs > 2020-08-04 12:32:26.739797 | ecf4bbd2-e605-20dd-3da9-000000000008 | > SKIPPED | Check for required inputs | localhost | item=node_uuids > 2020-08-04 12:32:26.746684 | ecf4bbd2-e605-20dd-3da9-00000000000a | > TASK | Set node_uuids_intro fact > [WARNING]: Failure using method (v2_playbook_on_task_start) in callback > plugin > ( 0x7f1b0f9bce80>): maximum recursion depth exceeded while calling a Python > object > 2020-08-04 12:32:26.828985 | ecf4bbd2-e605-20dd-3da9-00000000000a | > OK | Set node_uuids_intro fact | localhost > 2020-08-04 12:32:26.834281 | ecf4bbd2-e605-20dd-3da9-00000000000c | > TASK | Notice > 2020-08-04 12:32:26.911106 | ecf4bbd2-e605-20dd-3da9-00000000000c | > SKIPPED | Notice | localhost > 2020-08-04 12:32:26.916344 | ecf4bbd2-e605-20dd-3da9-00000000000e | > TASK | Set concurrency fact > 2020-08-04 12:32:26.994087 | ecf4bbd2-e605-20dd-3da9-00000000000e | > OK | Set concurrency fact | localhost > 2020-08-04 12:32:27.005932 | ecf4bbd2-e605-20dd-3da9-000000000010 | > TASK | Check if validation enabled > 2020-08-04 12:32:27.116425 | ecf4bbd2-e605-20dd-3da9-000000000010 | > SKIPPED | Check if validation enabled | localhost > 2020-08-04 12:32:27.129120 | ecf4bbd2-e605-20dd-3da9-000000000011 | > TASK | Run Validations > 2020-08-04 12:32:27.239850 | ecf4bbd2-e605-20dd-3da9-000000000011 | > SKIPPED | Run Validations | localhost > 2020-08-04 12:32:27.251796 | ecf4bbd2-e605-20dd-3da9-000000000012 | > TASK | Fail if validations are disabled > 2020-08-04 12:32:27.362050 | ecf4bbd2-e605-20dd-3da9-000000000012 | > SKIPPED | Fail if validations are disabled | localhost > 2020-08-04 12:32:27.373947 | ecf4bbd2-e605-20dd-3da9-000000000014 | > TASK | Start baremetal introspection > > > 2020-08-04 12:48:19.944028 | ecf4bbd2-e605-20dd-3da9-000000000014 | > CHANGED | Start baremetal introspection | localhost > 2020-08-04 12:48:19.966517 | ecf4bbd2-e605-20dd-3da9-000000000015 | > TASK | Nodes that passed introspection > 2020-08-04 12:48:20.130913 | ecf4bbd2-e605-20dd-3da9-000000000015 | > OK | Nodes that passed introspection | localhost | result={ > "changed": false, > "msg": " 00c5e81b-1e5d-442b-b64f-597a604051f7" > } > 2020-08-04 12:48:20.142919 | ecf4bbd2-e605-20dd-3da9-000000000016 | > TASK | Nodes that failed introspection > 2020-08-04 12:48:20.305004 | ecf4bbd2-e605-20dd-3da9-000000000016 | > OK | Nodes that failed introspection | localhost | result={ > "changed": false, > "failed_when_result": false, > "msg": " All nodes completed introspection successfully!" 
> } > 2020-08-04 12:48:20.316860 | ecf4bbd2-e605-20dd-3da9-000000000017 | > TASK | Node introspection failed and no results are provided > 2020-08-04 12:48:20.427675 | ecf4bbd2-e605-20dd-3da9-000000000017 | > SKIPPED | Node introspection failed and no results are provided | localhost > > PLAY RECAP > ********************************************************************* > localhost : ok=5 changed=1 unreachable=0 > failed=0 skipped=6 rescued=0 ignored=0 > [WARNING]: Failure using method (v2_playbook_on_stats) in callback plugin > ( 0x7f1b0f9bce80>): _output() missing 1 required positional argument: 'color' > Successfully introspected nodes: ['controller2'] > Exception occured while running the command > Traceback (most recent call last): > File "/usr/lib/python3.6/site-packages/ansible_runner/runner_config.py", > line 340, in prepare_command > cmdline_args = self.loader.load_file('args', string_types, > encoding=None) > File "/usr/lib/python3.6/site-packages/ansible_runner/loader.py", line > 164, in load_file > contents = parsed_data = self.get_contents(path) > File "/usr/lib/python3.6/site-packages/ansible_runner/loader.py", line > 98, in get_contents > raise ConfigurationError('specified path does not exist %s' % path) > ansible_runner.exceptions.ConfigurationError: specified path does not > exist /tmp/tripleop89yr8i8/args > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line > 34, in run > super(Command, self).run(parsed_args) > File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line > 41, in run > return super(Command, self).run(parsed_args) > File "/usr/lib/python3.6/site-packages/cliff/command.py", line 187, in > run > return_code = self.take_action(parsed_args) or 0 > File > "/usr/lib/python3.6/site-packages/tripleoclient/v2/overcloud_node.py", line > 210, in take_action > node_uuids=parsed_args.node_uuids, > File > "/usr/lib/python3.6/site-packages/tripleoclient/workflows/baremetal.py", > line 134, in provide > 'node_uuids': node_uuids > File "/usr/lib/python3.6/site-packages/tripleoclient/utils.py", line > 659, in run_ansible_playbook > runner_config.prepare() > File "/usr/lib/python3.6/site-packages/ansible_runner/runner_config.py", > line 174, in prepare > self.prepare_command() > File "/usr/lib/python3.6/site-packages/ansible_runner/runner_config.py", > line 346, in prepare_command > self.command = self.generate_ansible_command() > File "/usr/lib/python3.6/site-packages/ansible_runner/runner_config.py", > line 415, in generate_ansible_command > v = 'v' * self.verbosity > TypeError: can't multiply sequence by non-int of type 'ClientManager' > can't multiply sequence by non-int of type 'ClientManager' > (undercloud) [stack at undercloud ~]$ > > > and > (undercloud) [stack at undercloud ~]$ openstack baremetal node show > controller2 > .... 
> | properties | {'local_gb': '0', 'cpus': '24', 'cpu_arch': > 'x86_64', 'memory_mb': '32768', 'capabilities': > 'cpu_vt:true,cpu_aes:true,cpu_hugepages:true,cpu_hugepages_1g:true,cpu_txt:true'} > > > It seems that megaraid driver is correctly inserted in ramdisk: > # lsinitrd /var/lib/ironic/httpboot/agent.ramdisk | grep megaraid > /bin/lsinitrd: line 276: warning: command substitution: ignored null byte > in input > -rw-r--r-- 1 root root 50 Apr 28 21:55 > etc/depmod.d/kmod-megaraid_sas.conf > drwxr-xr-x 2 root root 0 Aug 4 12:13 > usr/lib/modules/4.18.0-193.6.3.el8_2.x86_64/kernel/drivers/scsi/megaraid > -rw-r--r-- 1 root root 68240 Aug 4 12:13 > usr/lib/modules/4.18.0-193.6.3.el8_2.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz > drwxr-xr-x 2 root root 0 Apr 28 21:55 > usr/lib/modules/4.18.0-193.el8.x86_64/extra/megaraid_sas > -rw-r--r-- 1 root root 309505 Apr 28 21:55 > usr/lib/modules/4.18.0-193.el8.x86_64/extra/megaraid_sas/megaraid_sas.ko > drwxr-xr-x 2 root root 0 Apr 28 21:55 > usr/share/doc/kmod-megaraid_sas-07.710.50.00 > -rw-r--r-- 1 root root 18092 Apr 28 21:55 > usr/share/doc/kmod-megaraid_sas-07.710.50.00/GPL-v2.0.txt > -rw-r--r-- 1 root root 1152 Apr 28 21:55 > usr/share/doc/kmod-megaraid_sas-07.710.50.00/greylist.txt > > If the solution is to use a Centos7 ramdisk, please can you give me some > hint? I have no idea on how to build a new ramdisk from scratch > Thank you > > > > > > > > > Il giorno mar 4 ago 2020 alle ore 12:33 Dmitry Tantsur < > dtantsur at redhat.com> ha scritto: > >> Hi, >> >> On Tue, Aug 4, 2020 at 11:58 AM Marco Marino >> wrote: >> >>> Hi, I'm trying to install openstack Ussuri on Centos 8 hardware using >>> tripleo. I'm using a relatively old hardware (dell PowerEdge R620) with old >>> RAID controllers, deprecated in RHEL8/Centos8. Here is some basic >>> information: >>> # lspci | grep -i raid >>> 00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset >>> SATA RAID Controller (rev 05) >>> 02:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS 2008 [Falcon] >>> (rev 03) >>> >>> I'm able to manually install centos 8 using DUD driver from here -> >>> https://elrepo.org/linux/dud/el8/x86_64/dd-megaraid_sas-07.710.50.00-1.el8_2.elrepo.iso >>> (basically I add inst.dd and I use an usb pendrive with iso). >>> Is there a way to do bare metal provisioning using openstack on this >>> kind of server? 
At the moment, when I launch "openstack overcloud node >>> introspect --provide controller1" it doesn't recognize disks (local_gb = 0 >>> in properties) and in inspector logs I see: >>> Jun 22 11:12:42 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:42.261 1543 DEBUG root [-] Still waiting for the root >>> device to appear, attempt 1 of 10 wait_for_disks >>> /usr/lib/python3.6/site-packages/ironic_python_agent/hardware.py:652 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.299 1543 DEBUG oslo_concurrency.processutils [-] >>> Running cmd (subprocess): udevadm settle execute >>> /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.357 1543 DEBUG oslo_concurrency.processutils [-] CMD >>> "udevadm settle" returned: 0 in 0.058s execute >>> /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.392 1543 DEBUG ironic_lib.utils [-] Execution >>> completed, command line is "udevadm settle" execute >>> /usr/lib/python3.6/site-packages/ironic_lib/utils.py:101 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.426 1543 DEBUG ironic_lib.utils [-] Command stdout is: >>> "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.460 1543 DEBUG ironic_lib.utils [-] Command stderr is: >>> "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:104 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.496 1543 WARNING root [-] Path /dev/disk/by-path is >>> inaccessible, /dev/disk/by-path/* version of block device name is >>> unavailable Cause: [Errno 2] No such file or directory: >>> '/dev/disk/by-path': FileNotFoundError: [Errno 2] No such file or >>> directory: '/dev/disk/by-path' >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.549 1543 DEBUG oslo_concurrency.processutils [-] >>> Running cmd (subprocess): lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE execute >>> /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:372 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.647 1543 DEBUG oslo_concurrency.processutils [-] CMD >>> "lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE" returned: 0 in 0.097s execute >>> /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:409 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.683 1543 DEBUG ironic_lib.utils [-] Execution >>> completed, command line is "lsblk -Pbia -oKNAME,MODEL,SIZE,ROTA,TYPE" >>> execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:101 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.719 1543 DEBUG ironic_lib.utils [-] Command stdout is: >>> "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:103 >>> Jun 22 11:12:45 localhost.localdomain ironic-python-agent[1543]: >>> 2018-06-22 11:12:45.755 1543 DEBUG ironic_lib.utils [-] Command stderr is: >>> "" execute /usr/lib/python3.6/site-packages/ironic_lib/utils.py:104 >>> >>> Is there a way to solve the issue? For example, can I modify ramdisk and >>> include DUD driver? 
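If patching the existing ramdisk keeps getting messy, building a fresh one with ironic-python-agent-builder is another route (the extra elrepo driver package would still have to be injected afterwards, or added through a custom element). A minimal sketch; the output prefix is arbitrary and the exact options vary between versions, so check ironic-python-agent-builder --help:

# on the undercloud
pip install ironic-python-agent-builder

# build a CentOS-based IPA kernel/ramdisk pair; a specific release can
# normally be selected with --release (e.g. 7 or 8)
ironic-python-agent-builder -o /home/stack/images/ironic-python-agent centos

# then upload the refreshed images as usual
openstack overcloud image upload --update-existing --image-path /home/stack/images/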
I tried this guide: >>> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html/partner_integration/overcloud_images#initrd_modifying_the_initial_ramdisks >>> >>> but I don't know how to include an ISO instead of an rpm packet as >>> described in the example. >>> >> >> Indeed, I don't think you can use ISO as it is, you'll need to figure out >> what is inside. If it's an RPM (as I assume), you'll need to extract it and >> install into the ramdisk. >> >> If nothing helps, you can try building a ramdisk with CentOS 7, the >> (very) recent versions of ironic-python-agent-builder allow using Python 3 >> on CentOS 7. >> >> Dmitry >> >> >>> Thank you, >>> Marco >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Mon Aug 10 08:07:04 2020 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 10 Aug 2020 09:07:04 +0100 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: On Mon, 10 Aug 2020 at 06:44, Michael Johnson wrote: > > That looks like there is still a kolla networking issue where the amphora > are not able to reach the controller processes. Please fix the lb-mgmt-net > such that it can reach the amphora and the controller containers. This > should be setup via the deployment tool, kolla in this case. > As mentioned before, Kolla doesn't currently do this - it is up to the user. We're improving the integration in the Victoria cycle. > Michael > > On Sun, Aug 9, 2020 at 5:02 AM Monika Samal > wrote: > >> Hi All, >> >> Below is the error am getting, i tried configuring network issue as well >> still finding it difficult to resolve. >> >> Below is my log...if somebody can help me resolving it..it would be great >> help since its very urgent... >> >> http://paste.openstack.org/show/TsagcQX2ZKd6rhhsYcYd/ >> >> Regards, >> Monika >> ------------------------------ >> *From:* Monika Samal >> *Sent:* Sunday, 9 August, 2020, 5:29 pm >> *To:* Mark Goddard; Michael Johnson; openstack-discuss >> *Cc:* Fabian Zimmermann >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> ------------------------------ >> *From:* Monika Samal >> *Sent:* Friday, August 7, 2020 4:41:52 AM >> *To:* Mark Goddard ; Michael Johnson < >> johnsomor at gmail.com> >> *Cc:* Fabian Zimmermann ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> I tried following above document still facing same Octavia connection >> error with amphora image. >> >> Regards, >> Monika >> ------------------------------ >> *From:* Mark Goddard >> *Sent:* Thursday, August 6, 2020 1:16:01 PM >> *To:* Michael Johnson >> *Cc:* Monika Samal ; Fabian Zimmermann < >> dev.faz at gmail.com>; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> >> >> On Wed, 5 Aug 2020 at 16:16, Michael Johnson wrote: >> >> Looking at that error, it appears that the lb-mgmt-net is not setup >> correctly. The Octavia controller containers are not able to reach the >> amphora instances on the lb-mgmt-net subnet. >> >> I don't know how kolla is setup to connect the containers to the neutron >> lb-mgmt-net network. Maybe the above documents will help with that. >> >> >> Right now it's up to the operator to configure that. The kolla >> documentation doesn't prescribe any particular setup. 
We're working on >> automating it in Victoria. >> >> >> Michael >> >> On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard wrote: >> >> >> >> On Tue, 4 Aug 2020 at 16:58, Monika Samal >> wrote: >> >> Hello Guys, >> >> With Michaels help I was able to solve the problem but now there is >> another error I was able to create my network on vlan but still error >> persist. PFB the logs: >> >> http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ >> >> Kindly help >> >> regards, >> Monika >> ------------------------------ >> *From:* Michael Johnson >> *Sent:* Monday, August 3, 2020 9:10 PM >> *To:* Fabian Zimmermann >> *Cc:* Monika Samal ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Yeah, it looks like nova is failing to boot the instance. >> >> Check this setting in your octavia.conf files: >> https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id >> >> Also, if kolla-ansible didn't set both of these values correctly, please >> open bug reports for kolla-ansible. These all should have been configured >> by the deployment tool. >> >> >> I wasn't following this thread due to no [kolla] tag, but here are the >> recently added docs for Octavia in kolla [1]. Note >> the octavia_service_auth_project variable which was added to migrate from >> the admin project to the service project for octavia resources. We're >> lacking proper automation for the flavor, image etc, but it is being worked >> on in Victoria [2]. >> >> [1] >> https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html >> [2] https://review.opendev.org/740180 >> >> Michael >> >> On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann >> wrote: >> >> Seems like the flavor is missing or empty '' - check for typos and enable >> debug. >> >> Check if the nova req contains valid information/flavor. >> >> Fabian >> >> Monika Samal schrieb am Mo., 3. Aug. 2020, >> 15:46: >> >> It's registered >> >> Get Outlook for Android >> ------------------------------ >> *From:* Fabian Zimmermann >> *Sent:* Monday, August 3, 2020 7:08:21 PM >> *To:* Monika Samal ; openstack-discuss < >> openstack-discuss at lists.openstack.org> >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Did you check the (nova) flavor you use in octavia. >> >> Fabian >> >> Monika Samal schrieb am Mo., 3. Aug. 2020, >> 10:53: >> >> After Michael suggestion I was able to create load balancer but there is >> error in status. >> >> >> >> PFB the error link: >> >> http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ >> ------------------------------ >> *From:* Monika Samal >> *Sent:* Monday, August 3, 2020 2:08 PM >> *To:* Michael Johnson >> *Cc:* Fabian Zimmermann ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Thanks a ton Michael for helping me out >> ------------------------------ >> *From:* Michael Johnson >> *Sent:* Friday, July 31, 2020 3:57 AM >> *To:* Monika Samal >> *Cc:* Fabian Zimmermann ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> *Subject:* Re: [openstack-community] Octavia :; Unable to create load >> balancer >> >> Just to close the loop on this, the octavia.conf file had >> "project_name = admin" instead of "project_name = service" in the >> [service_auth] section. 
This was causing the keystone errors when >> Octavia was communicating with neutron. >> >> I don't know if that is a bug in kolla-ansible or was just a local >> configuration issue. >> >> Michael >> >> On Thu, Jul 30, 2020 at 1:39 PM Monika Samal >> wrote: >> > >> > Hello Fabian,, >> > >> > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ >> > >> > Regards, >> > Monika >> > ________________________________ >> > From: Fabian Zimmermann >> > Sent: Friday, July 31, 2020 1:57 AM >> > To: Monika Samal >> > Cc: Michael Johnson ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> > Subject: Re: [openstack-community] Octavia :; Unable to create load >> balancer >> > >> > Hi, >> > >> > just to debug, could you replace the auth_type password with v3password? >> > >> > And do a curl against your :5000 and :35357 urls and paste the output. >> > >> > Fabian >> > >> > Monika Samal schrieb am Do., 30. Juli 2020, >> 22:15: >> > >> > Hello Fabian, >> > >> > http://paste.openstack.org/show/796477/ >> > >> > Thanks, >> > Monika >> > ________________________________ >> > From: Fabian Zimmermann >> > Sent: Friday, July 31, 2020 1:38 AM >> > To: Monika Samal >> > Cc: Michael Johnson ; Amy Marrich ; >> openstack-discuss ; >> community at lists.openstack.org >> > Subject: Re: [openstack-community] Octavia :; Unable to create load >> balancer >> > >> > The sections should be >> > >> > service_auth >> > keystone_authtoken >> > >> > if i read the docs correctly. Maybe you can just paste your config >> (remove/change passwords) to paste.openstack.org and post the link? >> > >> > Fabian >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From moreira.belmiro.email.lists at gmail.com Mon Aug 10 08:13:24 2020 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Mon, 10 Aug 2020 10:13:24 +0200 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients Message-ID: Hi, during the last PTG the TC discussed the problem of supporting different clients (OpenStack Client - OSC vs python-*clients) [1]. Currently, we don't have feature parity between the OSC and the python-*clients. Different OpenStack projects invest in different clients. This can be a huge problem for users/ops. Depending on the projects deployed in their infrastructures, they need to use different clients for different tasks. It's confusing because of the partial implementation in the OSC. There was also the proposal to enforce new functionality only in the SDK (and optionally the OSC) and not the project’s specific clients to stop increasing the disparity between the two. We would like to understand first the problems and missing pieces that projects are facing to move into OSC and help to overcome them. Let us know. Belmiro, on behalf of the TC [1] http://lists.openstack.org/pipermail/openstack-discuss/2020-June/015418.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From dikonoor at in.ibm.com Mon Aug 10 08:16:57 2020 From: dikonoor at in.ibm.com (Divya K Konoor) Date: Mon, 10 Aug 2020 13:46:57 +0530 Subject: [openstack-community] Keystone and DBNonExistent Errors In-Reply-To: References: Message-ID: Hi, I am using OpenStack Keystone Stein and run into the below error often where Keystone public process(listening to 5000) is running inside Apache httpd runs into the below. This problem is resolved with a restart of httpd service. Has anyone run into a similar issue ? 
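One detail worth pinning down first: the traceback further below fails with sqlite3.OperationalError, which suggests that particular keystone process was not talking to MariaDB at all when the error occurred. Confirming what each vhost actually loads is cheap (default path shown; adjust for your packaging):

# the connection string keystone is supposed to use
grep '^connection' /etc/keystone/keystone.conf
# expected something along the lines of (placeholder credentials/host):
# connection = mysql+pymysql://keystone:<password>@<db host>/keystone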
This is seen soon after httpd is restarted and does not happen all the time. My environment has MariaDB backend. This problem is not limited to the assignment table and is seen across all other tables in Keystone. MariaDB service is functional and all the tables are in place. [Fri Aug 07 08:20:59.936087 2020] [:info] [pid 1420287] mod_wsgi (pid=1420287, process='keystone-public', application=''): Loading WSGI script '/usr/bin/keystone-wsgi-public'. [Fri Aug 07 08:20:59.936089 2020] [:info] [pid 1420288] mod_wsgi (pid=1420288, process='keystone-admin', application=''): Loading WSGI script '/usr/bin/keystone-wsgi-admin'. [Fri Aug 07 08:20:59.943431 2020] [ssl:info] [pid 1420290] [client 1.2.3.95:35762] AH01964: Connection to child 1 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:00.009317 2020] [ssl:info] [pid 1420291] [client 1.2.3.113:60132] AH01964: Connection to child 2 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:01.243594 2020] [ssl:info] [pid 1420289] [client 1.2.3.50:53996] AH01964: Connection to child 0 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:01.386329 2020] [ssl:info] [pid 1420293] [client x.x.x.x:38645] AH01964: Connection to child 4 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:01.824041 2020] [ssl:info] [pid 1420349] [client 1.2.3.101:42974] AH01964: Connection to child 5 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:02.949166 2020] [ssl:info] [pid 1420378] [client 1.2.3.50:54014] AH01964: Connection to child 9 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:02.949172 2020] [ssl:info] [pid 1420379] [client 1.2.3.80:46924] AH01964: Connection to child 10 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:03.286057 2020] [:info] [pid 1420287] mod_wsgi (pid=1420287): Create interpreter '1.2.3.50:5000|'. [Fri Aug 07 08:21:03.287286 2020] [:info] [pid 1420287] [remote 1.2.3.95:156] mod_wsgi (pid=1420287, process='keystone-public', application='1.2.3.50:5000|'): Loading WSGI script '/usr/bin/keystone-wsgi-public'. [Fri Aug 07 08:21:04.675059 2020] [ssl:info] [pid 1420436] [client 1.2.3.50:54032] AH01964: Connection to child 12 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:04.705975 2020] [ssl:info] [pid 1420437] [client 1.2.3.107:59554] AH01964: Connection to child 13 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:06.960940 2020] [ssl:info] [pid 1420438] [client 1.2.3.80:46970] AH01964: Connection to child 14 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:07.661670 2020] [ssl:info] [pid 1420349] [client 1.2.3.50:54124] AH01964: Connection to child 5 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:07.683383 2020] [ssl:info] [pid 1420292] [client x.x.x.x:30065] AH01964: Connection to child 3 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:08.442956 2020] [:error] [pid 1420287] [remote x.x.x.x:144] mod_wsgi (pid=1420287): Exception occurred processing WSGI script '/usr/bin/keystone-wsgi-public'. 
[Fri Aug 07 08:21:08.443002 2020] [:error] [pid 1420287] [remote x.x.x.x:144] Traceback (most recent call last): [Fri Aug 07 08:21:08.443017 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 2309, in __call__ [Fri Aug 07 08:21:08.443509 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.wsgi_app(environ, start_response) [Fri Aug 07 08:21:08.443525 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/werkzeug/contrib/fixers.py", line 152, in __call__ [Fri Aug 07 08:21:08.443630 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.app(environ, start_response) [Fri Aug 07 08:21:08.443644 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__ [Fri Aug 07 08:21:08.443746 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = self.call_func(req, *args, **kw) [Fri Aug 07 08:21:08.443756 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func [Fri Aug 07 08:21:08.443773 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.func(req, *args, **kwargs) [Fri Aug 07 08:21:08.443781 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/oslo_middleware/base.py", line 131, in __call__ [Fri Aug 07 08:21:08.443844 2020] [:error] [pid 1420287] [remote x.x.x.x:144] response = req.get_response(self.application) .. ... .... .. [Fri Aug 07 08:21:08.450055 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1216, in creator [Fri Aug 07 08:21:08.450071 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return fn(*arg, **kw) [Fri Aug 07 08:21:08.450080 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/core.py", line 128, in get_roles_for_user_and_project [Fri Aug 07 08:21:08.450975 2020] [:error] [pid 1420287] [remote x.x.x.x:144] user_id=user_id, project_id=project_id, effective=True) [Fri Aug 07 08:21:08.450985 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/common/manager.py", line 116, in wrapped [Fri Aug 07 08:21:08.451001 2020] [:error] [pid 1420287] [remote x.x.x.x:144] __ret_val = __f(*args, **kwargs) [Fri Aug 07 08:21:08.451009 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/core.py", line 999, in list_role_assignments [Fri Aug 07 08:21:08.451025 2020] [:error] [pid 1420287] [remote x.x.x.x:144] strip_domain_roles) [Fri Aug 07 08:21:08.451033 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/core.py", line 845, in _list_effective_role_assignments [Fri Aug 07 08:21:08.451049 2020] [:error] [pid 1420287] [remote x.x.x.x:144] domain_id=domain_id, inherited=inherited) [Fri Aug 07 08:21:08.451057 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/core.py", line 780, in list_role_assignments_for_actor [Fri Aug 07 08:21:08.451072 2020] [:error] [pid 1420287] [remote x.x.x.x:144] group_ids=group_ids, inherited_to_projects=False) [Fri Aug 07 08:21:08.451081 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/backends/sql.py", line 248, in list_role_assignments [Fri Aug 07 08:21:08.451599 2020] [:error] [pid 1420287] 
[remote x.x.x.x:144] return [denormalize_role(ref) for ref in query.all()] [Fri Aug 07 08:21:08.451609 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925, in all [Fri Aug 07 08:21:08.451632 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return list(self) [Fri Aug 07 08:21:08.451641 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 3081, in __iter__ [Fri Aug 07 08:21:08.451656 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self._execute_and_instances(context) [Fri Aug 07 08:21:08.451665 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 3106, in _execute_and_instances [Fri Aug 07 08:21:08.451683 2020] [:error] [pid 1420287] [remote x.x.x.x:144] result = conn.execute(querycontext.statement, self._params) [Fri Aug 07 08:21:08.451691 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 980, in execute [Fri Aug 07 08:21:08.451711 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return meth(self, multiparams, params) [Fri Aug 07 08:21:08.451720 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 273, in _execute_on_connection [Fri Aug 07 08:21:08.451736 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return connection._execute_clauseelement(self, multiparams, params) [Fri Aug 07 08:21:08.451745 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1099, in _execute_clauseelement [Fri Aug 07 08:21:08.451762 2020] [:error] [pid 1420287] [remote x.x.x.x:144] distilled_params, [Fri Aug 07 08:21:08.451771 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1240, in _execute_context [Fri Aug 07 08:21:08.451786 2020] [:error] [pid 1420287] [remote x.x.x.x:144] e, statement, parameters, cursor, context [Fri Aug 07 08:21:08.451795 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1456, in _handle_dbapi_exception [Fri Aug 07 08:21:08.451810 2020] [:error] [pid 1420287] [remote x.x.x.x:144] util.raise_from_cause(newraise, exc_info) [Fri Aug 07 08:21:08.451818 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause [Fri Aug 07 08:21:08.451834 2020] [:error] [pid 1420287] [remote x.x.x.x:144] reraise(type(exception), exception, tb=exc_tb, cause=cause) [Fri Aug 07 08:21:08.451843 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context [Fri Aug 07 08:21:08.451858 2020] [:error] [pid 1420287] [remote x.x.x.x:144] cursor, statement, parameters, context [Fri Aug 07 08:21:08.451866 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute [Fri Aug 07 08:21:08.451882 2020] [:error] [pid 1420287] [remote x.x.x.x:144] cursor.execute(statement, parameters) [Fri Aug 07 08:21:08.451923 2020] [:error] [pid 1420287] [remote x.x.x.x:144] DBNonExistentTable: (sqlite3.OperationalError) no such table: assignment [SQL: u'SELECT assignment.type AS assignment_type, 
assignment.actor_id AS assignment_actor_id, assignment.target_id AS assignment_target_id, assignment.role_id AS assignment_role_id, assignment.inherited AS assignment_inherited \\nFROM assignment \\nWHERE assignment.actor_id IN (?) AND assignment.target_id IN (?) AND assignment.type IN (?) AND assignment.inherited = 0'] [parameters: ('15c2fe91e053af57a997c568c117c908d59c138f996bdc19ae97e9f16df12345', '12345978536e45ab8a279e2b0fa4f947', 'UserProject')] (Background on this error at: http://sqlalche.me/e/e3q8) Regards, Divya -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Mon Aug 10 08:26:24 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Mon, 10 Aug 2020 10:26:24 +0200 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: Message-ID: On Mon, Aug 10, 2020 at 10:19 AM Belmiro Moreira < moreira.belmiro.email.lists at gmail.com> wrote: > Hi, > during the last PTG the TC discussed the problem of supporting different > clients (OpenStack Client - OSC vs python-*clients) [1]. > Currently, we don't have feature parity between the OSC and the > python-*clients. > Is it true of any client? I guess some are just OSC plugins 100%. Do we know which clients have this disparity? Personally, I encountered this with Glance the most and Cinder to some extent (but I believe over the course of action Cinder got all features I wanted from it in the OSC). -yoctozepto -------------- next part -------------- An HTML attachment was scrubbed... URL: From ltoscano at redhat.com Mon Aug 10 08:37:22 2020 From: ltoscano at redhat.com (Luigi Toscano) Date: Mon, 10 Aug 2020 10:37:22 +0200 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: Message-ID: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> On Monday, 10 August 2020 10:26:24 CEST Radosław Piliszek wrote: > On Mon, Aug 10, 2020 at 10:19 AM Belmiro Moreira < > > moreira.belmiro.email.lists at gmail.com> wrote: > > Hi, > > during the last PTG the TC discussed the problem of supporting different > > clients (OpenStack Client - OSC vs python-*clients) [1]. > > Currently, we don't have feature parity between the OSC and the > > python-*clients. > > Is it true of any client? I guess some are just OSC plugins 100%. > Do we know which clients have this disparity? > Personally, I encountered this with Glance the most and Cinder to some > extent (but I believe over the course of action Cinder got all features I > wanted from it in the OSC). As far as I know there is still a huge problem with microversion handling which impacts some cinder features. It has been discussed in the past and still present. -- Luigi From thierry at openstack.org Mon Aug 10 10:01:37 2020 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 10 Aug 2020 12:01:37 +0200 Subject: [largescale-sig] Next meeting: August 12, 16utc Message-ID: <6e7a4e43-08f4-3030-2eb0-9311f27d9647@openstack.org> Hi everyone, In order to accommodate US members, the Large Scale SIG recently decided to rotate between an EU-APAC-friendly time and an US-EU-friendly time. 
Our next meeting will be the first US-EU meeting, on Wednesday, August 12 at 16 UTC[1] in the #openstack-meeting-3 channel on IRC: https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200812T16 Feel free to add topics to our agenda at: https://etherpad.openstack.org/p/large-scale-sig-meeting A reminder of the TODOs we had from last meeting, in case you have time to make progress on them: - amorin to add some meat to the wiki page before we push the Nova doc patch further - all to describe briefly how you solved metrics/billing in your deployment in https://etherpad.openstack.org/p/large-scale-sig-documentation Talk to you all on Wednesday, -- Thierry Carrez From emilien at redhat.com Mon Aug 10 12:29:22 2020 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 10 Aug 2020 08:29:22 -0400 Subject: [puppet][congress] Retiring puppet-congress In-Reply-To: References: Message-ID: On Sat, Jun 20, 2020 at 12:44 PM Takashi Kajinami wrote: > Hello, > > > As you know, Congress project has been retired already[1], > so we will retire its puppet module, puppet-congress in > openstack puppet project as well. > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014292.html > > Because congress was directly retired instead of getting migrated > to x namespace, we'll follow the same way about puppet-congress retirement > and won't create x/puppet-congress. > > Thank you for the contribution made for the project ! > Please let us know if you have any concerns about this retirement. > +2 -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From ildiko.vancsa at gmail.com Mon Aug 10 12:31:38 2020 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Mon, 10 Aug 2020 14:31:38 +0200 Subject: [upstream-institute] Virtual training sign-up and planning Message-ID: Hi mentors, I’m reaching out to you as the next Open Infrastructure Summit is approaching quickly so it is time to start planning for the next OpenStack Upstream Institute. As the next event will be virtual we will need to re-think the training format and experience to make sure our audience gets the most out of it. I created a new entry on our training occasions wiki page here: https://wiki.openstack.org/wiki/OpenStack_Upstream_Institute_Occasions#Virtual_Training.2C_2020 Please __sign up on the wiki__ if you would like to participate in the preparations and running the virtual training. As it is still vacation season I think we can target the last week of August or first week of September to have the first prep meeting and can collect ideas here or discuss them on the #openstack-upstream-institute IRC channel on Freenode in the meantime. Please let me know if you have any questions or need any help with signing up on the wiki. 
Thanks and Best Regards, Ildikó From monika.samal at outlook.com Mon Aug 10 07:32:06 2020 From: monika.samal at outlook.com (Monika Samal) Date: Mon, 10 Aug 2020 07:32:06 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , Message-ID: Sure, I am trying and will confirm shortly Get Outlook for Android ________________________________ From: Fabian Zimmermann Sent: Monday, August 10, 2020 11:19:36 AM To: Michael Johnson Cc: Monika Samal ; openstack-discuss ; Mark Goddard Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Hi, to test your connection you can create an instance im the octavia network and try to ping/ssh from your controller (dont forget a suitable security group) Fabian Michael Johnson > schrieb am Mo., 10. Aug. 2020, 07:44: That looks like there is still a kolla networking issue where the amphora are not able to reach the controller processes. Please fix the lb-mgmt-net such that it can reach the amphora and the controller containers. This should be setup via the deployment tool, kolla in this case. Michael On Sun, Aug 9, 2020 at 5:02 AM Monika Samal > wrote: Hi All, Below is the error am getting, i tried configuring network issue as well still finding it difficult to resolve. Below is my log...if somebody can help me resolving it..it would be great help since its very urgent... http://paste.openstack.org/show/TsagcQX2ZKd6rhhsYcYd/ Regards, Monika ________________________________ From: Monika Samal > Sent: Sunday, 9 August, 2020, 5:29 pm To: Mark Goddard; Michael Johnson; openstack-discuss Cc: Fabian Zimmermann Subject: Re: [openstack-community] Octavia :; Unable to create load balancer ________________________________ From: Monika Samal > Sent: Friday, August 7, 2020 4:41:52 AM To: Mark Goddard >; Michael Johnson > Cc: Fabian Zimmermann >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer I tried following above document still facing same Octavia connection error with amphora image. Regards, Monika ________________________________ From: Mark Goddard > Sent: Thursday, August 6, 2020 1:16:01 PM To: Michael Johnson > Cc: Monika Samal >; Fabian Zimmermann >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer On Wed, 5 Aug 2020 at 16:16, Michael Johnson > wrote: Looking at that error, it appears that the lb-mgmt-net is not setup correctly. The Octavia controller containers are not able to reach the amphora instances on the lb-mgmt-net subnet. I don't know how kolla is setup to connect the containers to the neutron lb-mgmt-net network. Maybe the above documents will help with that. Right now it's up to the operator to configure that. The kolla documentation doesn't prescribe any particular setup. We're working on automating it in Victoria. Michael On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard > wrote: On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: Hello Guys, With Michaels help I was able to solve the problem but now there is another error I was able to create my network on vlan but still error persist. PFB the logs: http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ Kindly help regards, Monika ________________________________ From: Michael Johnson > Sent: Monday, August 3, 2020 9:10 PM To: Fabian Zimmermann > Cc: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Yeah, it looks like nova is failing to boot the instance. 
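When nova fails the amphora boot, the fault is recorded on the instance itself and can be read back directly; a quick look, assuming the amphorae are owned by the Octavia service project so an admin view is needed (names below are defaults/placeholders):

# list amphora instances across projects
openstack server list --all-projects --name amphora

# show the failure reason recorded by nova
openstack server show <amphora server id> -c status -c fault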
Check this setting in your octavia.conf files: https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id Also, if kolla-ansible didn't set both of these values correctly, please open bug reports for kolla-ansible. These all should have been configured by the deployment tool. I wasn't following this thread due to no [kolla] tag, but here are the recently added docs for Octavia in kolla [1]. Note the octavia_service_auth_project variable which was added to migrate from the admin project to the service project for octavia resources. We're lacking proper automation for the flavor, image etc, but it is being worked on in Victoria [2]. [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html [2] https://review.opendev.org/740180 Michael On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: Seems like the flavor is missing or empty '' - check for typos and enable debug. Check if the nova req contains valid information/flavor. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 15:46: It's registered Get Outlook for Android ________________________________ From: Fabian Zimmermann > Sent: Monday, August 3, 2020 7:08:21 PM To: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Did you check the (nova) flavor you use in octavia. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 10:53: After Michael suggestion I was able to create load balancer but there is error in status. [X] PFB the error link: http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ ________________________________ From: Monika Samal > Sent: Monday, August 3, 2020 2:08 PM To: Michael Johnson > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson > Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal > schrieb am Do., 30. 
Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? > > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: From dikonoor at in.ibm.com Mon Aug 10 07:54:21 2020 From: dikonoor at in.ibm.com (Divya K Konoor) Date: Mon, 10 Aug 2020 13:24:21 +0530 Subject: [openstack-community] Keystone and DBNonExistent Errors In-Reply-To: References: Message-ID: Hi, I am using OpenStack Keystone Stein and run into the below error often where Keystone public process(listening to 5000) is running inside Apache httpd runs into the below. This problem is resolved with a restart of httpd service. Has anyone run into a similar issue ? This is seen soon after httpd is restarted and does not happen all the time. My environment has MariaDB backend. This problem is not limited to the assignment table and is seen across all other tables in Keystone. MariaDB service is functional and all the tables are in place. [Fri Aug 07 08:20:59.936087 2020] [:info] [pid 1420287] mod_wsgi (pid=1420287, process='keystone-public', application=''): Loading WSGI script '/usr/bin/keystone-wsgi-public'. [Fri Aug 07 08:20:59.936089 2020] [:info] [pid 1420288] mod_wsgi (pid=1420288, process='keystone-admin', application=''): Loading WSGI script '/usr/bin/keystone-wsgi-admin'. [Fri Aug 07 08:20:59.943431 2020] [ssl:info] [pid 1420290] [client 1.2.3.95:35762] AH01964: Connection to child 1 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:00.009317 2020] [ssl:info] [pid 1420291] [client 1.2.3.113:60132] AH01964: Connection to child 2 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:01.243594 2020] [ssl:info] [pid 1420289] [client 1.2.3.50:53996] AH01964: Connection to child 0 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:01.386329 2020] [ssl:info] [pid 1420293] [client x.x.x.x:38645] AH01964: Connection to child 4 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:01.824041 2020] [ssl:info] [pid 1420349] [client 1.2.3.101:42974] AH01964: Connection to child 5 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:02.949166 2020] [ssl:info] [pid 1420378] [client 1.2.3.50:54014] AH01964: Connection to child 9 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:02.949172 2020] [ssl:info] [pid 1420379] [client 1.2.3.80:46924] AH01964: Connection to child 10 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:03.286057 2020] [:info] [pid 1420287] mod_wsgi (pid=1420287): Create interpreter '1.2.3.50:5000|'. [Fri Aug 07 08:21:03.287286 2020] [:info] [pid 1420287] [remote 1.2.3.95:156] mod_wsgi (pid=1420287, process='keystone-public', application='1.2.3.50:5000|'): Loading WSGI script '/usr/bin/keystone-wsgi-public'. 
[Fri Aug 07 08:21:04.675059 2020] [ssl:info] [pid 1420436] [client 1.2.3.50:54032] AH01964: Connection to child 12 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:04.705975 2020] [ssl:info] [pid 1420437] [client 1.2.3.107:59554] AH01964: Connection to child 13 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:06.960940 2020] [ssl:info] [pid 1420438] [client 1.2.3.80:46970] AH01964: Connection to child 14 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:07.661670 2020] [ssl:info] [pid 1420349] [client 1.2.3.50:54124] AH01964: Connection to child 5 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:07.683383 2020] [ssl:info] [pid 1420292] [client x.x.x.x:30065] AH01964: Connection to child 3 established (server 1.2.3.50:5000) [Fri Aug 07 08:21:08.442956 2020] [:error] [pid 1420287] [remote x.x.x.x:144] mod_wsgi (pid=1420287): Exception occurred processing WSGI script '/usr/bin/keystone-wsgi-public'. [Fri Aug 07 08:21:08.443002 2020] [:error] [pid 1420287] [remote x.x.x.x:144] Traceback (most recent call last): [Fri Aug 07 08:21:08.443017 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 2309, in __call__ [Fri Aug 07 08:21:08.443509 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.wsgi_app(environ, start_response) [Fri Aug 07 08:21:08.443525 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/werkzeug/contrib/fixers.py", line 152, in __call__ [Fri Aug 07 08:21:08.443630 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.app(environ, start_response) [Fri Aug 07 08:21:08.443644 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__ [Fri Aug 07 08:21:08.443746 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = self.call_func(req, *args, **kw) [Fri Aug 07 08:21:08.443756 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func [Fri Aug 07 08:21:08.443773 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.func(req, *args, **kwargs) [Fri Aug 07 08:21:08.443781 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/oslo_middleware/base.py", line 131, in __call__ [Fri Aug 07 08:21:08.443844 2020] [:error] [pid 1420287] [remote x.x.x.x:144] response = req.get_response(self.application) [Fri Aug 07 08:21:08.443859 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1314, in send [Fri Aug 07 08:21:08.444194 2020] [:error] [pid 1420287] [remote x.x.x.x:144] application, catch_exc_info=False) [Fri Aug 07 08:21:08.444203 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1278, in call_application [Fri Aug 07 08:21:08.444220 2020] [:error] [pid 1420287] [remote x.x.x.x:144] app_iter = application(self.environ, start_response) [Fri Aug 07 08:21:08.444229 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 143, in __call__ [Fri Aug 07 08:21:08.444245 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return resp(environ, start_response) [Fri Aug 07 08:21:08.444253 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__ [Fri Aug 07 08:21:08.444268 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = self.call_func(req, *args, **kw) [Fri Aug 
07 08:21:08.444276 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func [Fri Aug 07 08:21:08.444292 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.func(req, *args, **kwargs) [Fri Aug 07 08:21:08.444300 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/oslo_middleware/base.py", line 131, in __call__ [Fri Aug 07 08:21:08.444315 2020] [:error] [pid 1420287] [remote x.x.x.x:144] response = req.get_response(self.application) [Fri Aug 07 08:21:08.444323 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1314, in send [Fri Aug 07 08:21:08.444338 2020] [:error] [pid 1420287] [remote x.x.x.x:144] application, catch_exc_info=False) [Fri Aug 07 08:21:08.444346 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1278, in call_application [Fri Aug 07 08:21:08.444361 2020] [:error] [pid 1420287] [remote x.x.x.x:144] app_iter = application(self.environ, start_response) [Fri Aug 07 08:21:08.444370 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__ [Fri Aug 07 08:21:08.444385 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = self.call_func(req, *args, **kw) [Fri Aug 07 08:21:08.444393 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func [Fri Aug 07 08:21:08.444408 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.func(req, *args, **kwargs) [Fri Aug 07 08:21:08.444416 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/osprofiler/web.py", line 112, in __call__ [Fri Aug 07 08:21:08.444476 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return request.get_response(self.application) [Fri Aug 07 08:21:08.444485 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1314, in send [Fri Aug 07 08:21:08.444501 2020] [:error] [pid 1420287] [remote x.x.x.x:144] application, catch_exc_info=False) [Fri Aug 07 08:21:08.444509 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1278, in call_application [Fri Aug 07 08:21:08.444524 2020] [:error] [pid 1420287] [remote x.x.x.x:144] app_iter = application(self.environ, start_response) [Fri Aug 07 08:21:08.444533 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__ [Fri Aug 07 08:21:08.444547 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = self.call_func(req, *args, **kw) [Fri Aug 07 08:21:08.444556 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func [Fri Aug 07 08:21:08.444571 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.func(req, *args, **kwargs) [Fri Aug 07 08:21:08.444587 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/oslo_middleware/request_id.py", line 58, in __call__ [Fri Aug 07 08:21:08.444636 2020] [:error] [pid 1420287] [remote x.x.x.x:144] response = req.get_response(self.application) [Fri Aug 07 08:21:08.444645 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1314, in send [Fri Aug 07 08:21:08.444660 2020] [:error] [pid 
1420287] [remote x.x.x.x:144] application, catch_exc_info=False) [Fri Aug 07 08:21:08.444669 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1278, in call_application [Fri Aug 07 08:21:08.444684 2020] [:error] [pid 1420287] [remote x.x.x.x:144] app_iter = application(self.environ, start_response) [Fri Aug 07 08:21:08.444698 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/server/flask/request_processing/middleware/url_normalize.py", line 38, in __call__ [Fri Aug 07 08:21:08.444750 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.app(environ, start_response) [Fri Aug 07 08:21:08.444759 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__ [Fri Aug 07 08:21:08.444774 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = self.call_func(req, *args, **kw) [Fri Aug 07 08:21:08.444783 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func [Fri Aug 07 08:21:08.444797 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.func(req, *args, **kwargs) [Fri Aug 07 08:21:08.444810 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystonemiddleware/auth_token/__init__.py", line 333, in __call__ [Fri Aug 07 08:21:08.444828 2020] [:error] [pid 1420287] [remote x.x.x.x:144] response = req.get_response(self._app) [Fri Aug 07 08:21:08.444836 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1314, in send [Fri Aug 07 08:21:08.444851 2020] [:error] [pid 1420287] [remote x.x.x.x:144] application, catch_exc_info=False) [Fri Aug 07 08:21:08.444859 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1278, in call_application [Fri Aug 07 08:21:08.444874 2020] [:error] [pid 1420287] [remote x.x.x.x:144] app_iter = application(self.environ, start_response) [Fri Aug 07 08:21:08.444883 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 129, in __call__ [Fri Aug 07 08:21:08.444897 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = self.call_func(req, *args, **kw) [Fri Aug 07 08:21:08.444906 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/dec.py", line 193, in call_func [Fri Aug 07 08:21:08.444920 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.func(req, *args, **kwargs) [Fri Aug 07 08:21:08.444929 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/oslo_middleware/base.py", line 131, in __call__ [Fri Aug 07 08:21:08.444944 2020] [:error] [pid 1420287] [remote x.x.x.x:144] response = req.get_response(self.application) [Fri Aug 07 08:21:08.444952 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1314, in send [Fri Aug 07 08:21:08.444967 2020] [:error] [pid 1420287] [remote x.x.x.x:144] application, catch_exc_info=False) [Fri Aug 07 08:21:08.444975 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/webob/request.py", line 1278, in call_application [Fri Aug 07 08:21:08.444990 2020] [:error] [pid 1420287] [remote x.x.x.x:144] app_iter = application(self.environ, start_response) [Fri Aug 07 08:21:08.444998 2020] [:error] [pid 
1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/werkzeug/wsgi.py", line 826, in __call__ [Fri Aug 07 08:21:08.445279 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return app(environ, start_response) [Fri Aug 07 08:21:08.445288 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 2295, in wsgi_app [Fri Aug 07 08:21:08.445304 2020] [:error] [pid 1420287] [remote x.x.x.x:144] response = self.handle_exception(e) [Fri Aug 07 08:21:08.445316 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445490 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445500 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445516 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445524 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445539 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445547 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445562 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445570 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445585 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445593 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445608 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445616 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445630 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445639 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445654 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445662 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445676 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445685 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445699 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445708 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445722 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 
08:21:08.445731 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445745 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445758 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445772 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445780 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445795 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445803 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445818 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445826 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445841 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445849 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445863 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445871 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445886 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445894 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445908 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445917 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445931 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445939 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445954 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445962 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445976 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.445984 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.445999 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446007 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446021 2020] [:error] [pid 1420287] [remote x.x.x.x:144] 
return original_handler(e) [Fri Aug 07 08:21:08.446030 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446044 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446052 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446067 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446075 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446090 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446098 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446113 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446121 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446135 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446143 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446158 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446166 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 1741, in handle_exception [Fri Aug 07 08:21:08.446181 2020] [:error] [pid 1420287] [remote x.x.x.x:144] reraise(exc_type, exc_value, tb) [Fri Aug 07 08:21:08.446190 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 266, in error_router [Fri Aug 07 08:21:08.446204 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.handle_error(e) [Fri Aug 07 08:21:08.446212 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 2292, in wsgi_app [Fri Aug 07 08:21:08.446227 2020] [:error] [pid 1420287] [remote x.x.x.x:144] response = self.full_dispatch_request() [Fri Aug 07 08:21:08.446236 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 1815, in full_dispatch_request [Fri Aug 07 08:21:08.446250 2020] [:error] [pid 1420287] [remote x.x.x.x:144] rv = self.handle_user_exception(e) [Fri Aug 07 08:21:08.446259 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446273 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446282 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446296 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446304 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446318 2020] [:error] 
[pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446327 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446341 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446349 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446363 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446372 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446386 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446395 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446409 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446417 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446432 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446440 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446454 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446463 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446477 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446485 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446500 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446508 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446522 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446531 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446545 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446553 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446568 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446576 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446590 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446599 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 
07 08:21:08.446613 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446621 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446636 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446644 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446658 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446667 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446681 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446689 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446704 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446712 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446727 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446735 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446749 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446757 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446772 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446780 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446795 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446803 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446817 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446826 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446840 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446848 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446863 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446871 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446885 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446893 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", 
line 269, in error_router [Fri Aug 07 08:21:08.446908 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446916 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 269, in error_router [Fri Aug 07 08:21:08.446930 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return original_handler(e) [Fri Aug 07 08:21:08.446938 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 1718, in handle_user_exception [Fri Aug 07 08:21:08.446953 2020] [:error] [pid 1420287] [remote x.x.x.x:144] reraise(exc_type, exc_value, tb) [Fri Aug 07 08:21:08.446962 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 266, in error_router [Fri Aug 07 08:21:08.446976 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.handle_error(e) [Fri Aug 07 08:21:08.446984 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 1813, in full_dispatch_request [Fri Aug 07 08:21:08.446999 2020] [:error] [pid 1420287] [remote x.x.x.x:144] rv = self.dispatch_request() [Fri Aug 07 08:21:08.447007 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/app.py", line 1799, in dispatch_request [Fri Aug 07 08:21:08.447022 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.view_functions[rule.endpoint](**req.view_args) [Fri Aug 07 08:21:08.447031 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 458, in wrapper [Fri Aug 07 08:21:08.447046 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = resource(*args, **kwargs) [Fri Aug 07 08:21:08.447055 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask/views.py", line 88, in view [Fri Aug 07 08:21:08.447119 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self.dispatch_request(*args, **kwargs) [Fri Aug 07 08:21:08.447128 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 573, in dispatch_request [Fri Aug 07 08:21:08.447144 2020] [:error] [pid 1420287] [remote x.x.x.x:144] resp = meth(*args, **kwargs) [Fri Aug 07 08:21:08.447152 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/server/flask/common.py", line 1060, in wrapper [Fri Aug 07 08:21:08.447392 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return f(*args, **kwargs) [Fri Aug 07 08:21:08.447406 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/api/auth.py", line 312, in post [Fri Aug 07 08:21:08.447551 2020] [:error] [pid 1420287] [remote x.x.x.x:144] token = authentication.authenticate_for_token(auth_data) [Fri Aug 07 08:21:08.447561 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/api/_shared/authentication.py", line 229, in authenticate_for_token [Fri Aug 07 08:21:08.447652 2020] [:error] [pid 1420287] [remote x.x.x.x:144] app_cred_id=app_cred_id, parent_audit_id=token_audit_id) [Fri Aug 07 08:21:08.447662 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/common/manager.py", line 116, in wrapped [Fri Aug 07 08:21:08.447679 2020] [:error] [pid 1420287] [remote x.x.x.x:144] __ret_val = __f(*args, **kwargs) [Fri 
Aug 07 08:21:08.447687 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/token/provider.py", line 252, in issue_token [Fri Aug 07 08:21:08.447706 2020] [:error] [pid 1420287] [remote x.x.x.x:144] token.mint(token_id, issued_at) [Fri Aug 07 08:21:08.447714 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/models/token_model.py", line 563, in mint [Fri Aug 07 08:21:08.448498 2020] [:error] [pid 1420287] [remote x.x.x.x:144] self._validate_project_scope() [Fri Aug 07 08:21:08.448508 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/models/token_model.py", line 512, in _validate_project_scope [Fri Aug 07 08:21:08.448525 2020] [:error] [pid 1420287] [remote x.x.x.x:144] if self.project_scoped and not self.roles: [Fri Aug 07 08:21:08.448533 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/models/token_model.py", line 438, in roles [Fri Aug 07 08:21:08.448549 2020] [:error] [pid 1420287] [remote x.x.x.x:144] roles = self._get_project_roles() [Fri Aug 07 08:21:08.448557 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/models/token_model.py", line 400, in _get_project_roles [Fri Aug 07 08:21:08.448573 2020] [:error] [pid 1420287] [remote x.x.x.x:144] self.user_id, self.project_id [Fri Aug 07 08:21:08.448581 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/common/manager.py", line 116, in wrapped [Fri Aug 07 08:21:08.448597 2020] [:error] [pid 1420287] [remote x.x.x.x:144] __ret_val = __f(*args, **kwargs) [Fri Aug 07 08:21:08.448605 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1220, in decorate [Fri Aug 07 08:21:08.449478 2020] [:error] [pid 1420287] [remote x.x.x.x:144] should_cache_fn) [Fri Aug 07 08:21:08.449488 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 825, in get_or_create [Fri Aug 07 08:21:08.449504 2020] [:error] [pid 1420287] [remote x.x.x.x:144] async_creator) as value: [Fri Aug 07 08:21:08.449512 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/dogpile/lock.py", line 154, in __enter__ [Fri Aug 07 08:21:08.449967 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self._enter() [Fri Aug 07 08:21:08.449977 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/dogpile/lock.py", line 94, in _enter [Fri Aug 07 08:21:08.449995 2020] [:error] [pid 1420287] [remote x.x.x.x:144] generated = self._enter_create(createdtime) [Fri Aug 07 08:21:08.450004 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/dogpile/lock.py", line 145, in _enter_create [Fri Aug 07 08:21:08.450020 2020] [:error] [pid 1420287] [remote x.x.x.x:144] created = self.creator() [Fri Aug 07 08:21:08.450029 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 792, in gen_value [Fri Aug 07 08:21:08.450046 2020] [:error] [pid 1420287] [remote x.x.x.x:144] created_value = creator() [Fri Aug 07 08:21:08.450055 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/dogpile/cache/region.py", line 1216, in creator [Fri Aug 07 08:21:08.450071 2020] [:error] [pid 1420287] [remote x.x.x.x:144] 
return fn(*arg, **kw) [Fri Aug 07 08:21:08.450080 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/core.py", line 128, in get_roles_for_user_and_project [Fri Aug 07 08:21:08.450975 2020] [:error] [pid 1420287] [remote x.x.x.x:144] user_id=user_id, project_id=project_id, effective=True) [Fri Aug 07 08:21:08.450985 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/common/manager.py", line 116, in wrapped [Fri Aug 07 08:21:08.451001 2020] [:error] [pid 1420287] [remote x.x.x.x:144] __ret_val = __f(*args, **kwargs) [Fri Aug 07 08:21:08.451009 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/core.py", line 999, in list_role_assignments [Fri Aug 07 08:21:08.451025 2020] [:error] [pid 1420287] [remote x.x.x.x:144] strip_domain_roles) [Fri Aug 07 08:21:08.451033 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/core.py", line 845, in _list_effective_role_assignments [Fri Aug 07 08:21:08.451049 2020] [:error] [pid 1420287] [remote x.x.x.x:144] domain_id=domain_id, inherited=inherited) [Fri Aug 07 08:21:08.451057 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/core.py", line 780, in list_role_assignments_for_actor [Fri Aug 07 08:21:08.451072 2020] [:error] [pid 1420287] [remote x.x.x.x:144] group_ids=group_ids, inherited_to_projects=False) [Fri Aug 07 08:21:08.451081 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib/python2.7/site-packages/keystone/assignment/backends/sql.py", line 248, in list_role_assignments [Fri Aug 07 08:21:08.451599 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return [denormalize_role(ref) for ref in query.all()] [Fri Aug 07 08:21:08.451609 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925, in all [Fri Aug 07 08:21:08.451632 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return list(self) [Fri Aug 07 08:21:08.451641 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 3081, in __iter__ [Fri Aug 07 08:21:08.451656 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return self._execute_and_instances(context) [Fri Aug 07 08:21:08.451665 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 3106, in _execute_and_instances [Fri Aug 07 08:21:08.451683 2020] [:error] [pid 1420287] [remote x.x.x.x:144] result = conn.execute(querycontext.statement, self._params) [Fri Aug 07 08:21:08.451691 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 980, in execute [Fri Aug 07 08:21:08.451711 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return meth(self, multiparams, params) [Fri Aug 07 08:21:08.451720 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 273, in _execute_on_connection [Fri Aug 07 08:21:08.451736 2020] [:error] [pid 1420287] [remote x.x.x.x:144] return connection._execute_clauseelement(self, multiparams, params) [Fri Aug 07 08:21:08.451745 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1099, in _execute_clauseelement [Fri Aug 07 
08:21:08.451762 2020] [:error] [pid 1420287] [remote x.x.x.x:144] distilled_params, [Fri Aug 07 08:21:08.451771 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1240, in _execute_context [Fri Aug 07 08:21:08.451786 2020] [:error] [pid 1420287] [remote x.x.x.x:144] e, statement, parameters, cursor, context [Fri Aug 07 08:21:08.451795 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1456, in _handle_dbapi_exception [Fri Aug 07 08:21:08.451810 2020] [:error] [pid 1420287] [remote x.x.x.x:144] util.raise_from_cause(newraise, exc_info) [Fri Aug 07 08:21:08.451818 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 296, in raise_from_cause [Fri Aug 07 08:21:08.451834 2020] [:error] [pid 1420287] [remote x.x.x.x:144] reraise(type(exception), exception, tb=exc_tb, cause=cause) [Fri Aug 07 08:21:08.451843 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context [Fri Aug 07 08:21:08.451858 2020] [:error] [pid 1420287] [remote x.x.x.x:144] cursor, statement, parameters, context [Fri Aug 07 08:21:08.451866 2020] [:error] [pid 1420287] [remote x.x.x.x:144] File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute [Fri Aug 07 08:21:08.451882 2020] [:error] [pid 1420287] [remote x.x.x.x:144] cursor.execute(statement, parameters) [Fri Aug 07 08:21:08.451923 2020] [:error] [pid 1420287] [remote x.x.x.x:144] DBNonExistentTable: (sqlite3.OperationalError) no such table: assignment [SQL: u'SELECT assignment.type AS assignment_type, assignment.actor_id AS assignment_actor_id, assignment.target_id AS assignment_target_id, assignment.role_id AS assignment_role_id, assignment.inherited AS assignment_inherited \\nFROM assignment \\nWHERE assignment.actor_id IN (?) AND assignment.target_id IN (?) AND assignment.type IN (?) AND assignment.inherited = 0'] [parameters: ('15c2fe91e053af57a997c568c117c908d59c138f996bdc19ae97e9f16df12345', '12345978536e45ab8a279e2b0fa4f947', 'UserProject')] (Background on this error at: http://sqlalche.me/e/e3q8) Regards, Divya -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From yan.y.zhao at intel.com Mon Aug 10 07:46:31 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Mon, 10 Aug 2020 15:46:31 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200805105319.GF2177@nanopsycho> References: <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> Message-ID: <20200810074631.GA29059@joy-OptiPlex-7040> On Wed, Aug 05, 2020 at 12:53:19PM +0200, Jiri Pirko wrote: > Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote: > >On Wed, Aug 05, 2020 at 04:02:48PM +0800, Jason Wang wrote: > >> > >> On 2020/8/5 下午3:56, Jiri Pirko wrote: > >> > Wed, Aug 05, 2020 at 04:41:54AM CEST, jasowang at redhat.com wrote: > >> > > On 2020/8/5 上午10:16, Yan Zhao wrote: > >> > > > On Wed, Aug 05, 2020 at 10:22:15AM +0800, Jason Wang wrote: > >> > > > > On 2020/8/5 上午12:35, Cornelia Huck wrote: > >> > > > > > [sorry about not chiming in earlier] > >> > > > > > > >> > > > > > On Wed, 29 Jul 2020 16:05:03 +0800 > >> > > > > > Yan Zhao wrote: > >> > > > > > > >> > > > > > > On Mon, Jul 27, 2020 at 04:23:21PM -0600, Alex Williamson wrote: > >> > > > > > (...) > >> > > > > > > >> > > > > > > > Based on the feedback we've received, the previously proposed interface > >> > > > > > > > is not viable. I think there's agreement that the user needs to be > >> > > > > > > > able to parse and interpret the version information. Using json seems > >> > > > > > > > viable, but I don't know if it's the best option. Is there any > >> > > > > > > > precedent of markup strings returned via sysfs we could follow? > >> > > > > > I don't think encoding complex information in a sysfs file is a viable > >> > > > > > approach. Quoting Documentation/filesystems/sysfs.rst: > >> > > > > > > >> > > > > > "Attributes should be ASCII text files, preferably with only one value > >> > > > > > per file. It is noted that it may not be efficient to contain only one > >> > > > > > value per file, so it is socially acceptable to express an array of > >> > > > > > values of the same type. > >> > > > > > Mixing types, expressing multiple lines of data, and doing fancy > >> > > > > > formatting of data is heavily frowned upon." > >> > > > > > > >> > > > > > Even though this is an older file, I think these restrictions still > >> > > > > > apply. > >> > > > > +1, that's another reason why devlink(netlink) is better. > >> > > > > > >> > > > hi Jason, > >> > > > do you have any materials or sample code about devlink, so we can have a good > >> > > > study of it? > >> > > > I found some kernel docs about it but my preliminary study didn't show me the > >> > > > advantage of devlink. > >> > > > >> > > CC Jiri and Parav for a better answer for this. > >> > > > >> > > My understanding is that the following advantages are obvious (as I replied > >> > > in another thread): > >> > > > >> > > - existing users (NIC, crypto, SCSI, ib), mature and stable > >> > > - much better error reporting (ext_ack other than string or errno) > >> > > - namespace aware > >> > > - do not couple with kobject > >> > Jason, what is your use case? > >> > >> > >> I think the use case is to report device compatibility for live migration. 
> >> Yan proposed a simple sysfs based migration version first, but it looks not
> >> sufficient and something based on JSON is discussed.
> >>
> >> Yan, can you help to summarize the discussion so far for Jiri as a
> >> reference?
> >>
> >yes.
> >we are currently defining a device live migration compatibility
> >interface in order to let user space like openstack and libvirt know
> >which two devices are live migration compatible.
> >currently the devices include mdev (a kernel emulated virtual device)
> >and physical devices (e.g. a VF of a PCI SRIOV device).
> >
> >the attributes we want user space to compare include
> >common attributes:
> >  device_api: vfio-pci, vfio-ccw...
> >  mdev_type: mdev type of mdev or similar signature for physical device.
> >             It specifies a device's hardware capability. e.g.
> >             i915-GVTg_V5_4 means it's 1/4 of a gen9 Intel graphics
> >             device.
> >  software_version: device driver's version.
> >             in <major>.<minor>[.bugfix] scheme, where there is no
> >             compatibility across major versions, minor versions have
> >             forward compatibility (ex. 1 -> 2 is ok, 2 -> 1 is not) and
> >             bugfix version number indicates some degree of internal
> >             improvement that is not visible to the user in terms of
> >             features or compatibility,
> >
> >vendor specific attributes: each vendor may define different attributes
> >  device id: device id of a physical device or mdev's parent pci device.
> >             it could be equal to pci id for pci devices
> >  aggregator: used together with mdev_type. e.g. aggregator=2 together
> >             with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel
> >             graphics device.
> >  remote_url: for a local NVMe VF, it may be configured with a remote
> >             url of a remote storage and all data is stored in the
> >             remote side specified by the remote url.
> >  ...
> >
> >Comparing those attributes by user space alone is not an easy job, as it
> >can't simply assume an equal relationship between source attributes and
> >target attributes. e.g.
> >for a source device of mdev_type=i915-GVTg_V5_4,aggregator=2 (1/2 of
> >gen9), it actually could find a compatible device of
> >mdev_type=i915-GVTg_V5_8,aggregator=4 (also 1/2 of gen9),
> >if mdev_type of i915-GVTg_V5_4 is not available in the target machine.
> >
> >So, in our current proposal, we want to create two sysfs attributes
> >under a device sysfs node.
> >/sys/<device>/migration/self
> >/sys/<device>/migration/compatible
> >
> >#cat /sys/<device>/migration/self
> >device_type=vfio_pci
> >mdev_type=i915-GVTg_V5_4
> >device_id=8086591d
> >aggregator=2
> >software_version=1.0.0
> >
> >#cat /sys/<device>/migration/compatible
> >device_type=vfio_pci
> >mdev_type=i915-GVTg_V5_{val1:int:2,4,8}
> >device_id=8086591d
> >aggregator={val1}/2
> >software_version=1.0.0
> >
> >The /sys/<device>/migration/self specifies self attributes of
> >a device.
> >The /sys/<device>/migration/compatible specifies the list of
> >compatible devices of a device. As in the example, compatible devices
> >could have
> >  device_type == vfio_pci &&
> >  device_id == 8086591d &&
> >  software_version == 1.0.0 &&
> >  (
> >   (mdev_type of i915-GVTg_V5_2 && aggregator==1) ||
> >   (mdev_type of i915-GVTg_V5_4 && aggregator==2) ||
> >   (mdev_type of i915-GVTg_V5_8 && aggregator==4)
> >  )
> >
> >by comparing whether a target device is in the compatible list of the source
> >device, user space can know whether two devices are live migration
> >compatible.
> >
> >Additional notes:
> >1) software_version in the compatible list may not be necessary as it
> >already has a major.minor.bugfix scheme.
> >2)for vendor attribute like remote_url, it may not be statically > >assigned and could be changed with a device interface. > > > >So, as Cornelia pointed that it's not good to use complex format in > >a sysfs attribute, we'd like to know whether there're other good ways to > >our use case, e.g. splitting a single attribute to multiple simple sysfs > >attributes as what Cornelia suggested or devlink that Jason has strongly > >recommended. > > Hi Yan. > Hi Jiri, > Thanks for the explanation, I'm still fuzzy about the details. > Anyway, I suggest you to check "devlink dev info" command we have > implemented for multiple drivers. You can try netdevsim to test this. > I think that the info you need to expose might be put there. do you mean drivers/net/netdevsim/ ? > > Devlink creates instance per-device. Specific device driver calls into > devlink core to create the instance. What device do you have? What the devlink core is net/core/devlink.c ? > driver is it handled by? It looks that the devlink is for network device specific, and in devlink.h, it says include/uapi/linux/devlink.h - Network physical device Netlink interface, I feel like it's not very appropriate for a GPU driver to use this interface. Is that right? Thanks Yan From monika.samal at outlook.com Mon Aug 10 11:12:31 2020 From: monika.samal at outlook.com (Monika Samal) Date: Mon, 10 Aug 2020 11:12:31 +0000 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: , Message-ID: Hey Fabian, I tried creating and testing instance with my available subnet created for loadbalancer , I am not able to ping it. Please find below ip a output for controller and deployment node: Controller Node: 30.0.0.14 [cid:18b1d2b8-1adf-45f1-9dd6-b185223e060e] Deployment Node: 30.0.0.11 [cid:20b35bae-677f-462c-8c38-7fafdb058219] [cid:66fb7ccf-ba53-4662-a0b8-b129da885f1a] ________________________________ From: Fabian Zimmermann Sent: Monday, August 10, 2020 11:19 AM To: Michael Johnson Cc: Monika Samal ; openstack-discuss ; Mark Goddard Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Hi, to test your connection you can create an instance im the octavia network and try to ping/ssh from your controller (dont forget a suitable security group) Fabian Michael Johnson > schrieb am Mo., 10. Aug. 2020, 07:44: That looks like there is still a kolla networking issue where the amphora are not able to reach the controller processes. Please fix the lb-mgmt-net such that it can reach the amphora and the controller containers. This should be setup via the deployment tool, kolla in this case. Michael On Sun, Aug 9, 2020 at 5:02 AM Monika Samal > wrote: Hi All, Below is the error am getting, i tried configuring network issue as well still finding it difficult to resolve. Below is my log...if somebody can help me resolving it..it would be great help since its very urgent... 
http://paste.openstack.org/show/TsagcQX2ZKd6rhhsYcYd/ Regards, Monika ________________________________ From: Monika Samal > Sent: Sunday, 9 August, 2020, 5:29 pm To: Mark Goddard; Michael Johnson; openstack-discuss Cc: Fabian Zimmermann Subject: Re: [openstack-community] Octavia :; Unable to create load balancer ________________________________ From: Monika Samal > Sent: Friday, August 7, 2020 4:41:52 AM To: Mark Goddard >; Michael Johnson > Cc: Fabian Zimmermann >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer I tried following above document still facing same Octavia connection error with amphora image. Regards, Monika ________________________________ From: Mark Goddard > Sent: Thursday, August 6, 2020 1:16:01 PM To: Michael Johnson > Cc: Monika Samal >; Fabian Zimmermann >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer On Wed, 5 Aug 2020 at 16:16, Michael Johnson > wrote: Looking at that error, it appears that the lb-mgmt-net is not setup correctly. The Octavia controller containers are not able to reach the amphora instances on the lb-mgmt-net subnet. I don't know how kolla is setup to connect the containers to the neutron lb-mgmt-net network. Maybe the above documents will help with that. Right now it's up to the operator to configure that. The kolla documentation doesn't prescribe any particular setup. We're working on automating it in Victoria. Michael On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard > wrote: On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: Hello Guys, With Michaels help I was able to solve the problem but now there is another error I was able to create my network on vlan but still error persist. PFB the logs: http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ Kindly help regards, Monika ________________________________ From: Michael Johnson > Sent: Monday, August 3, 2020 9:10 PM To: Fabian Zimmermann > Cc: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Yeah, it looks like nova is failing to boot the instance. Check this setting in your octavia.conf files: https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id Also, if kolla-ansible didn't set both of these values correctly, please open bug reports for kolla-ansible. These all should have been configured by the deployment tool. I wasn't following this thread due to no [kolla] tag, but here are the recently added docs for Octavia in kolla [1]. Note the octavia_service_auth_project variable which was added to migrate from the admin project to the service project for octavia resources. We're lacking proper automation for the flavor, image etc, but it is being worked on in Victoria [2]. [1] https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html [2] https://review.opendev.org/740180 Michael On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: Seems like the flavor is missing or empty '' - check for typos and enable debug. Check if the nova req contains valid information/flavor. Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 15:46: It's registered Get Outlook for Android ________________________________ From: Fabian Zimmermann > Sent: Monday, August 3, 2020 7:08:21 PM To: Monika Samal >; openstack-discuss > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Did you check the (nova) flavor you use in octavia. 
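For example, something like the following (the flavor ID is whatever amp_flavor_id in your octavia.conf points to, so adjust it to your deployment):

  openstack flavor show <amp_flavor_id>
  openstack flavor list --all

If that flavor does not exist, or amp_flavor_id is empty, nova will reject the boot request for the amphora.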
Fabian Monika Samal > schrieb am Mo., 3. Aug. 2020, 10:53: After Michael suggestion I was able to create load balancer but there is error in status. [X] PFB the error link: http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ ________________________________ From: Monika Samal > Sent: Monday, August 3, 2020 2:08 PM To: Michael Johnson > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Thanks a ton Michael for helping me out ________________________________ From: Michael Johnson > Sent: Friday, July 31, 2020 3:57 AM To: Monika Samal > Cc: Fabian Zimmermann >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer Just to close the loop on this, the octavia.conf file had "project_name = admin" instead of "project_name = service" in the [service_auth] section. This was causing the keystone errors when Octavia was communicating with neutron. I don't know if that is a bug in kolla-ansible or was just a local configuration issue. Michael On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > Hello Fabian,, > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > Regards, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > Hi, > > just to debug, could you replace the auth_type password with v3password? > > And do a curl against your :5000 and :35357 urls and paste the output. > > Fabian > > Monika Samal > schrieb am Do., 30. Juli 2020, 22:15: > > Hello Fabian, > > http://paste.openstack.org/show/796477/ > > Thanks, > Monika > ________________________________ > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > To: Monika Samal > > Cc: Michael Johnson >; Amy Marrich >; openstack-discuss >; community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load balancer > > The sections should be > > service_auth > keystone_authtoken > > if i read the docs correctly. Maybe you can just paste your config (remove/change passwords) to paste.openstack.org and post the link? > > Fabian -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 97899 bytes Desc: image.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 22490 bytes Desc: image.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 24398 bytes Desc: image.png URL: From juliaashleykreger at gmail.com Mon Aug 10 16:17:41 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 10 Aug 2020 09:17:41 -0700 Subject: [Ironic] User Survey question Message-ID: Greetings awesome Artificial Intelligences and fellow humanoid carbon units! This week I need to submit the question for the 2021 user survey. We discussed this some during our weekly IRC meeting today.[0] Presently, the question is: "Ironic: What would you find most useful if it was part of ironic?" 
I'd like to propose we collect more data in order to enable us to make informed decisions for features and maintenance work moving forward. While this is long term thinking, I'm wondering if operators would be interested in collecting and submitting some basic data or using a tool, to submit anonymous usage data so we can gain insight into hardware types in use, numbers of machines, which interfaces are used, etc. So I'm thinking something along the lines of: "Ironic: Would you be willing to submit anonymous usage statistics (Number of nodes, conductors, which drivers are in use, etc) if such a tool existed? Yes/No/Not Applicable" Thoughts? Feelings? Concerns? Other ideas? -Julia [0]: http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-08-10-15.00.log.html From dev.faz at gmail.com Mon Aug 10 16:57:57 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Mon, 10 Aug 2020 18:57:57 +0200 Subject: [openstack-community] Octavia :; Unable to create load balancer In-Reply-To: References: Message-ID: Hi, Check if the vlan of eth1 is reachable by the compute nodes on the nic connected to br-ex. Is the lb-net created with the correct vlan-id so the traffic is able to flow from the nic to br-ex to the instance? Proof this with tcpdump (-e) Fabian connected to the Monika Samal schrieb am Mo., 10. Aug. 2020, 13:12: > Hey Fabian, > > I tried creating and testing instance with my available subnet created for > loadbalancer , I am not able to ping it. > > Please find below ip a output for controller and deployment node: > > Controller Node: 30.0.0.14 > > Deployment Node: 30.0.0.11 > > > > ------------------------------ > *From:* Fabian Zimmermann > *Sent:* Monday, August 10, 2020 11:19 AM > *To:* Michael Johnson > *Cc:* Monika Samal ; openstack-discuss < > openstack-discuss at lists.openstack.org>; Mark Goddard > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Hi, > > to test your connection you can create an instance im the octavia network > and try to ping/ssh from your controller (dont forget a suitable security > group) > > Fabian > > Michael Johnson schrieb am Mo., 10. Aug. 2020, > 07:44: > > > That looks like there is still a kolla networking issue where the amphora > are not able to reach the controller processes. Please fix the lb-mgmt-net > such that it can reach the amphora and the controller containers. This > should be setup via the deployment tool, kolla in this case. > > Michael > > On Sun, Aug 9, 2020 at 5:02 AM Monika Samal > wrote: > > Hi All, > > Below is the error am getting, i tried configuring network issue as well > still finding it difficult to resolve. > > Below is my log...if somebody can help me resolving it..it would be great > help since its very urgent... 
> > http://paste.openstack.org/show/TsagcQX2ZKd6rhhsYcYd/ > > Regards, > Monika > ------------------------------ > *From:* Monika Samal > *Sent:* Sunday, 9 August, 2020, 5:29 pm > *To:* Mark Goddard; Michael Johnson; openstack-discuss > *Cc:* Fabian Zimmermann > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > ------------------------------ > *From:* Monika Samal > *Sent:* Friday, August 7, 2020 4:41:52 AM > *To:* Mark Goddard ; Michael Johnson < > johnsomor at gmail.com> > *Cc:* Fabian Zimmermann ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > I tried following above document still facing same Octavia connection > error with amphora image. > > Regards, > Monika > ------------------------------ > *From:* Mark Goddard > *Sent:* Thursday, August 6, 2020 1:16:01 PM > *To:* Michael Johnson > *Cc:* Monika Samal ; Fabian Zimmermann < > dev.faz at gmail.com>; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > On Wed, 5 Aug 2020 at 16:16, Michael Johnson wrote: > > Looking at that error, it appears that the lb-mgmt-net is not setup > correctly. The Octavia controller containers are not able to reach the > amphora instances on the lb-mgmt-net subnet. > > I don't know how kolla is setup to connect the containers to the neutron > lb-mgmt-net network. Maybe the above documents will help with that. > > > Right now it's up to the operator to configure that. The kolla > documentation doesn't prescribe any particular setup. We're working on > automating it in Victoria. > > > Michael > > On Wed, Aug 5, 2020 at 12:53 AM Mark Goddard wrote: > > > > On Tue, 4 Aug 2020 at 16:58, Monika Samal > wrote: > > Hello Guys, > > With Michaels help I was able to solve the problem but now there is > another error I was able to create my network on vlan but still error > persist. PFB the logs: > > http://paste.openstack.org/show/fEixSudZ6lzscxYxsG1z/ > > Kindly help > > regards, > Monika > ------------------------------ > *From:* Michael Johnson > *Sent:* Monday, August 3, 2020 9:10 PM > *To:* Fabian Zimmermann > *Cc:* Monika Samal ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Yeah, it looks like nova is failing to boot the instance. > > Check this setting in your octavia.conf files: > https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id > > Also, if kolla-ansible didn't set both of these values correctly, please > open bug reports for kolla-ansible. These all should have been configured > by the deployment tool. > > > I wasn't following this thread due to no [kolla] tag, but here are the > recently added docs for Octavia in kolla [1]. Note > the octavia_service_auth_project variable which was added to migrate from > the admin project to the service project for octavia resources. We're > lacking proper automation for the flavor, image etc, but it is being worked > on in Victoria [2]. > > [1] > https://docs.openstack.org/kolla-ansible/latest/reference/networking/octavia.html > [2] https://review.opendev.org/740180 > > Michael > > On Mon, Aug 3, 2020 at 7:53 AM Fabian Zimmermann > wrote: > > Seems like the flavor is missing or empty '' - check for typos and enable > debug. 
> > Check if the nova req contains valid information/flavor. > > Fabian > > Monika Samal schrieb am Mo., 3. Aug. 2020, > 15:46: > > It's registered > > Get Outlook for Android > ------------------------------ > *From:* Fabian Zimmermann > *Sent:* Monday, August 3, 2020 7:08:21 PM > *To:* Monika Samal ; openstack-discuss < > openstack-discuss at lists.openstack.org> > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Did you check the (nova) flavor you use in octavia. > > Fabian > > Monika Samal schrieb am Mo., 3. Aug. 2020, > 10:53: > > After Michael suggestion I was able to create load balancer but there is > error in status. > > > > PFB the error link: > > http://paste.openstack.org/show/meNZCeuOlFkfjj189noN/ > ------------------------------ > *From:* Monika Samal > *Sent:* Monday, August 3, 2020 2:08 PM > *To:* Michael Johnson > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Thanks a ton Michael for helping me out > ------------------------------ > *From:* Michael Johnson > *Sent:* Friday, July 31, 2020 3:57 AM > *To:* Monika Samal > *Cc:* Fabian Zimmermann ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > *Subject:* Re: [openstack-community] Octavia :; Unable to create load > balancer > > Just to close the loop on this, the octavia.conf file had > "project_name = admin" instead of "project_name = service" in the > [service_auth] section. This was causing the keystone errors when > Octavia was communicating with neutron. > > I don't know if that is a bug in kolla-ansible or was just a local > configuration issue. > > Michael > > On Thu, Jul 30, 2020 at 1:39 PM Monika Samal > wrote: > > > > Hello Fabian,, > > > > http://paste.openstack.org/show/QxKv2Ai697qulp9UWTjY/ > > > > Regards, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:57 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > Hi, > > > > just to debug, could you replace the auth_type password with v3password? > > > > And do a curl against your :5000 and :35357 urls and paste the output. > > > > Fabian > > > > Monika Samal schrieb am Do., 30. Juli 2020, > 22:15: > > > > Hello Fabian, > > > > http://paste.openstack.org/show/796477/ > > > > Thanks, > > Monika > > ________________________________ > > From: Fabian Zimmermann > > Sent: Friday, July 31, 2020 1:38 AM > > To: Monika Samal > > Cc: Michael Johnson ; Amy Marrich ; > openstack-discuss ; > community at lists.openstack.org > > Subject: Re: [openstack-community] Octavia :; Unable to create load > balancer > > > > The sections should be > > > > service_auth > > keystone_authtoken > > > > if i read the docs correctly. Maybe you can just paste your config > (remove/change passwords) to paste.openstack.org and post the link? > > > > Fabian > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 97899 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 22490 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 24398 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 24398 bytes Desc: not available URL: From opensrloo at gmail.com Mon Aug 10 17:28:46 2020 From: opensrloo at gmail.com (Ruby Loo) Date: Mon, 10 Aug 2020 13:28:46 -0400 Subject: [Ironic] User Survey question In-Reply-To: References: Message-ID: Hi Julia, Please remind me, are we allowed one question? I was wondering what prevents us from having this tool and then announcing/asking folks to provide the information. Or is the idea that if no one says 'yes', it would be a waste of time to provide such a tool? My concern is that if this is the only question we are allowed to ask, we might not get that much useful information. What about pain-points wrt ironic? Could we ask that? --ruby On Mon, Aug 10, 2020 at 12:23 PM Julia Kreger wrote: > Greetings awesome Artificial Intelligences and fellow humanoid carbon > units! > > This week I need to submit the question for the 2021 user survey. We > discussed this some during our weekly IRC meeting today.[0] > > Presently, the question is: > > "Ironic: What would you find most useful if it was part of ironic?" > > I'd like to propose we collect more data in order to enable us to make > informed decisions for features and maintenance work moving forward. > While this is long term thinking, I'm wondering if operators would be > interested in collecting and submitting some basic data or using a > tool, to submit anonymous usage data so we can gain insight into > hardware types in use, numbers of machines, which interfaces are used, > etc. > > So I'm thinking something along the lines of: > > "Ironic: Would you be willing to submit anonymous usage statistics > (Number of nodes, conductors, which drivers are in use, etc) if such a > tool existed? Yes/No/Not Applicable" > > Thoughts? Feelings? Concerns? Other ideas? > > -Julia > > > [0]: > http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-08-10-15.00.log.html > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allison at openstack.org Mon Aug 10 17:30:48 2020 From: allison at openstack.org (Allison Price) Date: Mon, 10 Aug 2020 12:30:48 -0500 Subject: [Ironic] User Survey question In-Reply-To: References: Message-ID: <6E7AE88E-3E6A-4791-AFDB-600DB8E2AC40@openstack.org> Coming solely from the User Survey POV, each project and SIG is allowed up to 2 questions. We create that limit to ensure that the survey does not get too terribly long. If the Ironic team would like to add one question, we can. Thanks! Allison > On Aug 10, 2020, at 12:28 PM, Ruby Loo wrote: > > Hi Julia, > > Please remind me, are we allowed one question? > > I was wondering what prevents us from having this tool and then announcing/asking folks to provide the information. Or is the idea that if no one says 'yes', it would be a waste of time to provide such a tool? My concern is that if this is the only question we are allowed to ask, we might not get that much useful information. > > What about pain-points wrt ironic? Could we ask that? > > --ruby > > On Mon, Aug 10, 2020 at 12:23 PM Julia Kreger > wrote: > Greetings awesome Artificial Intelligences and fellow humanoid carbon units! > > This week I need to submit the question for the 2021 user survey. 
We > discussed this some during our weekly IRC meeting today.[0] > > Presently, the question is: > > "Ironic: What would you find most useful if it was part of ironic?" > > I'd like to propose we collect more data in order to enable us to make > informed decisions for features and maintenance work moving forward. > While this is long term thinking, I'm wondering if operators would be > interested in collecting and submitting some basic data or using a > tool, to submit anonymous usage data so we can gain insight into > hardware types in use, numbers of machines, which interfaces are used, > etc. > > So I'm thinking something along the lines of: > > "Ironic: Would you be willing to submit anonymous usage statistics > (Number of nodes, conductors, which drivers are in use, etc) if such a > tool existed? Yes/No/Not Applicable" > > Thoughts? Feelings? Concerns? Other ideas? > > -Julia > > > [0]: http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-08-10-15.00.log.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Mon Aug 10 19:13:21 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 10 Aug 2020 12:13:21 -0700 Subject: [all] Virtual PTG October 2020 Dates & Registration Message-ID: Hello Everyone! I'm sure you all have been anxiously awaiting the announcement of the dates for the next virtual PTG! After polling the community[1] and balancing the pros and cons, we have decided the PTG will take place the week after the Open Infrastructure Summit[2][3] from October 26th to October 30th, 2020. PTG registration is now open[4]. Like last time, it is free, but we will again be using it to communicate details about the event (schedules, passwords, etc), so please register! Later this week we will send out info about signing up teams. Also, the same as last time, we will have an ethercalc signup and a survey to gather some other data about your team. -the Kendalls (diablo_rojo & wendallkaters) [1] ML Poll for PTG Dates: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016098.html [2] Summit Site: https://www.openstack.org/summit/2020/ [3] Summit Registration: https://openinfrasummit2020.eventbrite.com [4] PTG Registration: https://october2020ptg.eventbrite.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.bell at cern.ch Mon Aug 10 20:37:27 2020 From: tim.bell at cern.ch (timbell) Date: Mon, 10 Aug 2020 22:37:27 +0200 Subject: [Ironic] User Survey question In-Reply-To: References: Message-ID: Ceph has done something like this with good results - https://docs.ceph.com/docs/master/mgr/telemetry/ I think the things that have helped this to be successful are - easy way to see what you would send - option (not the default) to provide more details such as company and contact I think many OpenStack projects could benefit from this sort of approach for - capacity growth - rate of upgrade - support some of the user survey activities by automatically collecting data rather than asking for responses in a manual survey Would it be possible to consider an Oslo module so the infrastructure could be common and then we make it anonymous opt-in ? Tim > On 10 Aug 2020, at 18:17, Julia Kreger wrote: > > Greetings awesome Artificial Intelligences and fellow humanoid carbon units! > > This week I need to submit the question for the 2021 user survey. 
We > discussed this some during our weekly IRC meeting today.[0] > > Presently, the question is: > > "Ironic: What would you find most useful if it was part of ironic?" > > I'd like to propose we collect more data in order to enable us to make > informed decisions for features and maintenance work moving forward. > While this is long term thinking, I'm wondering if operators would be > interested in collecting and submitting some basic data or using a > tool, to submit anonymous usage data so we can gain insight into > hardware types in use, numbers of machines, which interfaces are used, > etc. > > So I'm thinking something along the lines of: > > "Ironic: Would you be willing to submit anonymous usage statistics > (Number of nodes, conductors, which drivers are in use, etc) if such a > tool existed? Yes/No/Not Applicable" > > Thoughts? Feelings? Concerns? Other ideas? > > -Julia > > > [0]: http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-08-10-15.00.log.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Mon Aug 10 20:48:09 2020 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 10 Aug 2020 16:48:09 -0400 Subject: [tripleo][ansible] the current action plugins use patterns are suboptimal? In-Reply-To: <385dc8d7-198f-64ce-908f-49ab823ed229@redhat.com> References: <6feb1d83-5cc8-1916-90a7-1a6b54593310@redhat.com> <385dc8d7-198f-64ce-908f-49ab823ed229@redhat.com> Message-ID: On Tue, Aug 4, 2020 at 5:42 AM Bogdan Dobrelya wrote: (...) > I can understand that ansible should not be fixed for some composition > tasks what require iterations and have complex logic for its "unit of > work". And such ones also should be unit tested indeed. What I do not > fully understand though is then what abandoning paunch for its action > plugin had bought for us in the end? > > Paunch was self-contained and had no external dependencies on > fast-changing ansible frameworks. There was also no need for paunch to > handle the ansible-specific execution strategies and nuances, like "what > if that action plugin is called in async or in the check-mode?" Unit > tests exited in paunch as well. It was easy to backport changes within a > single code base. > > So, looking back retrospectively, was rewriting paunch as an action > plugin a simplification of the deployment framework? Please reply to > yourself honestly. It does pretty same things but differently and added > external framework. It is now also self-contained action plugin, since > traditional tasks cannot be used in loops for this goal because of > performance reasons. > I asked myself the same questions several times and to me the major driver around removing paunch was to move the container logic into Ansible modules which would be community supported vs paunch-runner code. The Ansible role version has also brought more plugability with other components of the framework (Ansible strategies, debugging, etc) but I don't think it was the major reason why we wrote it. The simplification aspect was mostly about removing the CLI which I don't think was needed at the end; and also the unique name thing which was a mistake and caused us many troubles as you know. To summarize, action plugins may be a good solution indeed, but perhaps > we should go back and use paunch instead of ansible? Same applies for > *some* other tasks? That would also provide a balance, for action > plugins, tasks and common sense. 
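For reference, the Ceph telemetry workflow Tim points to above comes down to a couple of mgr commands (exact flags and defaults vary a little between Ceph releases, so treat this as a sketch):

  # Enable the module if it is not already on, then preview before opting in.
  ceph mgr module enable telemetry
  ceph telemetry show   # prints exactly what would be reported
  ceph telemetry on     # explicit opt-in; nothing is sent until this is run

The preview step is what implements the "easy way to see what you would send" point made above.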
> Sagi is currently working on replacing the podman_containers action plugin by a module that will be able to process multiple containers at the same time, so we'll have less tasks and potentially faster operations at scale. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Mon Aug 10 20:53:26 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 10 Aug 2020 20:53:26 +0000 Subject: [Ironic] User Survey question In-Reply-To: References: Message-ID: <20200810205325.42rr7i4qr4jhdow3@yuggoth.org> On 2020-08-10 22:37:27 +0200 (+0200), timbell wrote: > Ceph has done something like this with good results - > https://docs.ceph.com/docs/master/mgr/telemetry/ > > I think the things that have helped this to be successful are > > - easy way to see what you would send > - option (not the default) to provide more details such as company > and contact [...] Other prior art which springs to mind: Debian has provided a popcon tool for ages, as an opt-in means of periodically providing feedback on what packages are seeing use in their distro. It's current incarnation can submit reports via SMTP or HTTP protocols for added flexibility. https://popcon.debian.org/ OpenBSD takes a low-effort approach and suggests a command in their install guide which the admin can run to send a copy of dmesg output to the project so they can keep track of what sorts of hardware is running their operating system out in the wild. https://www.openbsd.org/faq/faq4.html#SendDmesg -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From Arkady.Kanevsky at dell.com Mon Aug 10 21:06:19 2020 From: Arkady.Kanevsky at dell.com (Kanevsky, Arkady) Date: Mon, 10 Aug 2020 21:06:19 +0000 Subject: [Ironic] User Survey question In-Reply-To: <6E7AE88E-3E6A-4791-AFDB-600DB8E2AC40@openstack.org> References: <6E7AE88E-3E6A-4791-AFDB-600DB8E2AC40@openstack.org> Message-ID: Do we know what % of current deployments use Ironic? I recall several years back it was 25%. But do not recall seeing latest info. Then Julia question on size and which components of Ironic are being used. Maybe if we treat "N/A" answers as they do not use Ironic it first into a single question. I do love open ended question where users can ask for improvements/extensions. Thanks, Arkady From: Allison Price Sent: Monday, August 10, 2020 12:31 PM To: Ruby Loo Cc: Julia Kreger; openstack-discuss Subject: Re: [Ironic] User Survey question [EXTERNAL EMAIL] Coming solely from the User Survey POV, each project and SIG is allowed up to 2 questions. We create that limit to ensure that the survey does not get too terribly long. If the Ironic team would like to add one question, we can. Thanks! Allison On Aug 10, 2020, at 12:28 PM, Ruby Loo > wrote: Hi Julia, Please remind me, are we allowed one question? I was wondering what prevents us from having this tool and then announcing/asking folks to provide the information. Or is the idea that if no one says 'yes', it would be a waste of time to provide such a tool? My concern is that if this is the only question we are allowed to ask, we might not get that much useful information. What about pain-points wrt ironic? Could we ask that? --ruby On Mon, Aug 10, 2020 at 12:23 PM Julia Kreger > wrote: Greetings awesome Artificial Intelligences and fellow humanoid carbon units! 
This week I need to submit the question for the 2021 user survey. We discussed this some during our weekly IRC meeting today.[0] Presently, the question is: "Ironic: What would you find most useful if it was part of ironic?" I'd like to propose we collect more data in order to enable us to make informed decisions for features and maintenance work moving forward. While this is long term thinking, I'm wondering if operators would be interested in collecting and submitting some basic data or using a tool, to submit anonymous usage data so we can gain insight into hardware types in use, numbers of machines, which interfaces are used, etc. So I'm thinking something along the lines of: "Ironic: Would you be willing to submit anonymous usage statistics (Number of nodes, conductors, which drivers are in use, etc) if such a tool existed? Yes/No/Not Applicable" Thoughts? Feelings? Concerns? Other ideas? -Julia [0]: http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-08-10-15.00.log.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From allison at openstack.org Mon Aug 10 21:13:31 2020 From: allison at openstack.org (Allison Price) Date: Mon, 10 Aug 2020 16:13:31 -0500 Subject: [Ironic] User Survey question In-Reply-To: References: <6E7AE88E-3E6A-4791-AFDB-600DB8E2AC40@openstack.org> Message-ID: <1DFB0CD9-008D-4937-AD70-F3150BEC823F@openstack.org> As of right now for the 2020 version, we are sitting at 19% and for 2019, it was the same. > On Aug 10, 2020, at 4:06 PM, Kanevsky, Arkady wrote: > > Do we know what % of current deployments use Ironic? > I recall several years back it was 25%. But do not recall seeing latest info. > Then Julia question on size and which components of Ironic are being used. > Maybe if we treat “N/A” answers as they do not use Ironic it first into a single question. > I do love open ended question where users can ask for improvements/extensions. > Thanks, > Arkady > > From: Allison Price > > Sent: Monday, August 10, 2020 12:31 PM > To: Ruby Loo > Cc: Julia Kreger; openstack-discuss > Subject: Re: [Ironic] User Survey question > > [EXTERNAL EMAIL] > > Coming solely from the User Survey POV, each project and SIG is allowed up to 2 questions. We create that limit to ensure that the survey does not get too terribly long. > > If the Ironic team would like to add one question, we can. > > Thanks! > Allison > > > > > On Aug 10, 2020, at 12:28 PM, Ruby Loo > wrote: > > Hi Julia, > > Please remind me, are we allowed one question? > > I was wondering what prevents us from having this tool and then announcing/asking folks to provide the information. Or is the idea that if no one says 'yes', it would be a waste of time to provide such a tool? My concern is that if this is the only question we are allowed to ask, we might not get that much useful information. > > What about pain-points wrt ironic? Could we ask that? > > --ruby > > On Mon, Aug 10, 2020 at 12:23 PM Julia Kreger > wrote: > Greetings awesome Artificial Intelligences and fellow humanoid carbon units! > > This week I need to submit the question for the 2021 user survey. We > discussed this some during our weekly IRC meeting today.[0] > > Presently, the question is: > > "Ironic: What would you find most useful if it was part of ironic?" > > I'd like to propose we collect more data in order to enable us to make > informed decisions for features and maintenance work moving forward. 
> While this is long term thinking, I'm wondering if operators would be > interested in collecting and submitting some basic data or using a > tool, to submit anonymous usage data so we can gain insight into > hardware types in use, numbers of machines, which interfaces are used, > etc. > > So I'm thinking something along the lines of: > > "Ironic: Would you be willing to submit anonymous usage statistics > (Number of nodes, conductors, which drivers are in use, etc) if such a > tool existed? Yes/No/Not Applicable" > > Thoughts? Feelings? Concerns? Other ideas? > > -Julia > > > [0]: http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-08-10-15.00.log.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Mon Aug 10 22:07:29 2020 From: miguel at mlavalle.com (Miguel Lavalle) Date: Mon, 10 Aug 2020 17:07:29 -0500 Subject: [neutron] bug deputy report August 3 - 9 Message-ID: Critical ====== https://bugs.launchpad.net/neutron/+bug/1890445 [ovn] Tempest test test_update_router_admin_state failing very often. NEEDS OWNER https://bugs.launchpad.net/neutron/+bug/1890493 Periodic job neutron-ovn-tempest-ovs-master-fedora is failing 100% of times. SEEMS TO NEED AN OWNER High ==== https://bugs.launchpad.net/neutron/+bug/1890269 Fullstack test neutron.tests.fullstack.test_logging.TestLogging is failing on Ubuntu Focal. Proposed fix: https://review.opendev.org/#/c/734304/ https://bugs.launchpad.net/neutron/+bug/1890297 CI grenade jobs failing. Proposed fix: https://review.opendev.org/#/c/744753/1. Fix released https://bugs.launchpad.net/neutron/+bug/1890400 Default gateway in HA router namespace not set if using Keepalived 1.x. Awaiting patch from Slawek https://bugs.launchpad.net/neutron/+bug/1890353 support pyroute2 0.5.13. Awaiting patch from Rodolfo https://bugs.launchpad.net/neutron/+bug/1890432 Create subnet is failing under high load with OVN. WIP fix: https://review.opendev.org/#/c/745330/ Medium ====== https://bugs.launchpad.net/neutron/+bug/1890539 failed to create port with security group of other tenant. Proposed fix: https://review.opendev.org/#/c/745089 -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Mon Aug 10 23:38:30 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 10 Aug 2020 16:38:30 -0700 Subject: [Ironic] User Survey question In-Reply-To: References: <6E7AE88E-3E6A-4791-AFDB-600DB8E2AC40@openstack.org> Message-ID: For me, it is less about aggregate usage of the respondent users, and more about collecting the actual utilization statistics for running deployments so we can make informed decisions. For example, If we know there is huge IPMI console usage, then we can know how to prioritize the same for redfish, or potentially not. I do also like the open-ended question response nature, but I've not found the data very valuable except to help tailor some of the outward facing communications for purposes of project updates. Largely because many users are not "zoomed in" at the level most contributors or even where everyday individual project users are. They are zoomed out looking at the whole. If OSF is willing for us to have two questions, I'm all for making use of it. I always thought we would only be permitted one. -Julia On Mon, Aug 10, 2020 at 2:06 PM Kanevsky, Arkady wrote: > > Do we know what % of current deployments use Ironic? > > I recall several years back it was 25%. But do not recall seeing latest info. 
> > Then Julia question on size and which components of Ironic are being used. > > Maybe if we treat “N/A” answers as they do not use Ironic it first into a single question. > > I do love open ended question where users can ask for improvements/extensions. > > Thanks, > > Arkady > > > > From: Allison Price > Sent: Monday, August 10, 2020 12:31 PM > To: Ruby Loo > Cc: Julia Kreger; openstack-discuss > Subject: Re: [Ironic] User Survey question > > > > [EXTERNAL EMAIL] > > Coming solely from the User Survey POV, each project and SIG is allowed up to 2 questions. We create that limit to ensure that the survey does not get too terribly long. > > > > If the Ironic team would like to add one question, we can. > > > > Thanks! > > Allison > > > > > > > > On Aug 10, 2020, at 12:28 PM, Ruby Loo wrote: > > > > Hi Julia, > > > > Please remind me, are we allowed one question? > > > > I was wondering what prevents us from having this tool and then announcing/asking folks to provide the information. Or is the idea that if no one says 'yes', it would be a waste of time to provide such a tool? My concern is that if this is the only question we are allowed to ask, we might not get that much useful information. > > > > What about pain-points wrt ironic? Could we ask that? > > > > --ruby > > > > On Mon, Aug 10, 2020 at 12:23 PM Julia Kreger wrote: > > Greetings awesome Artificial Intelligences and fellow humanoid carbon units! > > This week I need to submit the question for the 2021 user survey. We > discussed this some during our weekly IRC meeting today.[0] > > Presently, the question is: > > "Ironic: What would you find most useful if it was part of ironic?" > > I'd like to propose we collect more data in order to enable us to make > informed decisions for features and maintenance work moving forward. > While this is long term thinking, I'm wondering if operators would be > interested in collecting and submitting some basic data or using a > tool, to submit anonymous usage data so we can gain insight into > hardware types in use, numbers of machines, which interfaces are used, > etc. > > So I'm thinking something along the lines of: > > "Ironic: Would you be willing to submit anonymous usage statistics > (Number of nodes, conductors, which drivers are in use, etc) if such a > tool existed? Yes/No/Not Applicable" > > Thoughts? Feelings? Concerns? Other ideas? > > -Julia > > > [0]: http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-08-10-15.00.log.html > > From jimmy at openstack.org Tue Aug 11 00:43:26 2020 From: jimmy at openstack.org (Jimmy McArthur) Date: Mon, 10 Aug 2020 19:43:26 -0500 Subject: [Ironic] User Survey question In-Reply-To: References: Message-ID: <10446121-C615-4345-A761-58F33E6C1DD8@getmailspring.com> I think originally it was one, but the survey grew enough that 2 questions were needed by a lot of projects. Definitely feel free to add another question. And I agree - having a free-text question is cool, but it doesn't help for tracking trends, by and large. Sometimes we're even able to fold two questions into one... so if you can give us an idea of exactly the data you want and how you would format the questions in a perfect world, we might be able to get more out of it. For instance, a follow up question like "Other, please explain" still only counts as the one question :) So if you want to do some second level dependency stuff, we might be able to extract more data out of it. 
Cheers, Jimmy On Aug 10 2020, at 6:38 pm, Julia Kreger wrote: > For me, it is less about aggregate usage of the respondent users, and > more about collecting the actual utilization statistics for running > deployments so we can make informed decisions. For example, If we know > there is huge IPMI console usage, then we can know how to prioritize > the same for redfish, or potentially not. > > I do also like the open-ended question response nature, but I've not > found the data very valuable except to help tailor some of the outward > facing communications for purposes of project updates. Largely because > many users are not "zoomed in" at the level most contributors or even > where everyday individual project users are. They are zoomed out > looking at the whole. > > If OSF is willing for us to have two questions, I'm all for making use > of it. I always thought we would only be permitted one. > > -Julia > On Mon, Aug 10, 2020 at 2:06 PM Kanevsky, Arkady > wrote: > > > > Do we know what % of current deployments use Ironic? > > > > I recall several years back it was 25%. But do not recall seeing latest info. > > > > Then Julia question on size and which components of Ironic are being used. > > > > Maybe if we treat “N/A” answers as they do not use Ironic it first into a single question. > > > > I do love open ended question where users can ask for improvements/extensions. > > > > Thanks, > > > > Arkady > > > > > > > > From: Allison Price > > Sent: Monday, August 10, 2020 12:31 PM > > To: Ruby Loo > > Cc: Julia Kreger; openstack-discuss > > Subject: Re: [Ironic] User Survey question > > > > > > > > [EXTERNAL EMAIL] > > > > Coming solely from the User Survey POV, each project and SIG is allowed up to 2 questions. We create that limit to ensure that the survey does not get too terribly long. > > > > > > > > If the Ironic team would like to add one question, we can. > > > > > > > > Thanks! > > > > Allison > > > > > > > > > > > > > > > > On Aug 10, 2020, at 12:28 PM, Ruby Loo wrote: > > > > > > > > Hi Julia, > > > > > > > > Please remind me, are we allowed one question? > > > > > > > > I was wondering what prevents us from having this tool and then announcing/asking folks to provide the information. Or is the idea that if no one says 'yes', it would be a waste of time to provide such a tool? My concern is that if this is the only question we are allowed to ask, we might not get that much useful information. > > > > > > > > What about pain-points wrt ironic? Could we ask that? > > > > > > > > --ruby > > > > > > > > On Mon, Aug 10, 2020 at 12:23 PM Julia Kreger wrote: > > > > Greetings awesome Artificial Intelligences and fellow humanoid carbon units! > > > > This week I need to submit the question for the 2021 user survey. We > > discussed this some during our weekly IRC meeting today.[0] > > > > Presently, the question is: > > > > "Ironic: What would you find most useful if it was part of ironic?" > > > > I'd like to propose we collect more data in order to enable us to make > > informed decisions for features and maintenance work moving forward. > > While this is long term thinking, I'm wondering if operators would be > > interested in collecting and submitting some basic data or using a > > tool, to submit anonymous usage data so we can gain insight into > > hardware types in use, numbers of machines, which interfaces are used, > > etc. 
> > > > So I'm thinking something along the lines of: > > > > "Ironic: Would you be willing to submit anonymous usage statistics > > (Number of nodes, conductors, which drivers are in use, etc) if such a > > tool existed? Yes/No/Not Applicable" > > > > Thoughts? Feelings? Concerns? Other ideas? > > > > -Julia > > > > > > [0]: http://eavesdrop.openstack.org/meetings/ironic/2020/ironic.2020-08-10-15.00.log.html > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Tue Aug 11 10:09:31 2020 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 11 Aug 2020 12:09:31 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Thomas Goirand wrote: > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: >> Thanks, Pierre for helping with this. >> >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) ) >> but I am not sure if he got any response back. No response so far, but they may all be in company summer vacation. > The end of the very good maintenance of Cloudkitty matched the date when > objectif libre was sold to Linkbynet. Maybe the new owner don't care enough? > > This is very disappointing as I've been using it for some time already, > and that I was satisfied by it (ie: it does the job...), and especially > that latest releases are able to scale correctly. > > I very much would love if Pierre Riteau was successful in taking over. > Good luck Pierre! I'll try to help whenever I can and if I'm not too busy. Given the volunteers (Pierre, Rafael, Luis) I would support the TC using its unholy powers to add extra core reviewers to cloudkitty. If the current PTL comes back, I'm sure they will appreciate the help, and can always fix/revert things before Victoria release. -- Thierry Carrez (ttx) From thierry at openstack.org Tue Aug 11 10:13:56 2020 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 11 Aug 2020 12:13:56 +0200 Subject: [sigs][vendors] Proposal to create Hardware Vendor SIG In-Reply-To: References: <5d4928c2-8e14-82a7-c06b-dd8df4de44fb@gmx.com> Message-ID: Kanevsky, Arkady wrote: > Great idea. Long time overdue. > Great place for many out-of-tree repos. +1, great idea. -- Thierry From thierry at openstack.org Tue Aug 11 10:24:00 2020 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 11 Aug 2020 12:24:00 +0200 Subject: [nova][neutron][oslo][ops] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> Message-ID: <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> If you can reproduce it with current versions, I would suggest to file an issue on https://github.com/rabbitmq/rabbitmq-server/issues/ The behavior you describe seems to match https://github.com/rabbitmq/rabbitmq-server/issues/1873 but the maintainers seem to think it's been fixed by a number of somewhat-related changes in 3.7.13, because nobody reported issues anymore :) Fabian Zimmermann wrote: > Hi, > > dont know if durable queues help, but should be enabled by rabbitmq > policy which (alone) doesnt seem to fix this (we have this active) > >  Fabian > > Massimo Sgaravatto > schrieb am Sa., 8. Aug. 2020, 09:36: > > We also see the issue.  When it happens stopping and restarting the > rabbit cluster usually helps. 
> > I thought the problem was because of a wrong setting in the > openstack services conf files: I missed these settings (that I am > now going to add): > > [oslo_messaging_rabbit] > rabbit_ha_queues = true > amqp_durable_queues = true > > Cheers, Massimo > > > On Sat, Aug 8, 2020 at 6:34 AM Fabian Zimmermann > wrote: > > Hi, > > we also have this issue. > > Our solution was (up to now) to delete the queues with a script > or even reset the complete cluster. > > We just upgraded rabbitmq to the latest version - without luck. > > Anyone else seeing this issue? > >  Fabian > > > > Arnaud Morin > schrieb am Do., 6. Aug. 2020, > 16:47: > > Hey all, > > I would like to ask the community about a rabbit issue we > have from time > to time. > > In our current architecture, we have a cluster of rabbits (3 > nodes) for > all our OpenStack services (mostly nova and neutron). > > When one node of this cluster is down, the cluster continue > working (we > use pause_minority strategy). > But, sometimes, the third server is not able to recover > automatically > and need a manual intervention. > After this intervention, we restart the rabbitmq-server > process, which > is then able to join the cluster back. > > At this time, the cluster looks ok, everything is fine. > BUT, nothing works. > Neutron and nova agents are not able to report back to servers. > They appear dead. > Servers seems not being able to consume messages. > The exchanges, queues, bindings seems good in rabbit. > > What we see is that removing bindings (using rabbitmqadmin > delete > binding or the web interface) and recreate them again (using > the same > routing key) brings the service back up and running. > > Doing this for all queues is really painful. Our next plan is to > automate it, but is there anyone in the community already > saw this kind > of issues? > > Our bug looks like the one described in [1]. > Someone recommands to create an Alternate Exchange. > Is there anyone already tried that? > > FYI, we are running rabbit 3.8.2 (with OpenStack Stein). > We had the same kind of issues using older version of rabbit. > > Thanks for your help. > > [1] > https://groups.google.com/forum/#!newtopic/rabbitmq-users/rabbitmq-users/zFhmpHF2aWk > > -- > Arnaud Morin > > -- Thierry Carrez (ttx) From arnaud.morin at gmail.com Tue Aug 11 10:28:43 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Tue, 11 Aug 2020 10:28:43 +0000 Subject: [nova][neutron][oslo][ops] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> Message-ID: <20200811102843.GS31915@sync> Thanks for those tips, I will check both values asap. About the complete reset of the cluster, this is also what we use to do, but this has some downside, such as the need to restart all agents, services, etc) Cheers, -- Arnaud Morin On 08.08.20 - 15:06, Fabian Zimmermann wrote: > Hi, > > dont know if durable queues help, but should be enabled by rabbitmq policy > which (alone) doesnt seem to fix this (we have this active) > > Fabian > > Massimo Sgaravatto schrieb am Sa., 8. Aug. > 2020, 09:36: > > > We also see the issue. When it happens stopping and restarting the rabbit > > cluster usually helps. 
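For anyone wanting to try both suggestions from this thread together, a sketch of what they look like in practice. The policy name, pattern and vhost below are illustrative, so check them against your own cluster before applying anything:

  # Mirror ("HA") all non-amq queues in the given vhost via a RabbitMQ policy.
  rabbitmqctl set_policy -p / ha-all '^(?!amq\.).*' '{"ha-mode":"all"}' --apply-to queues

  # Plus the oslo.messaging options quoted above, set in each service's
  # configuration (nova.conf, neutron.conf, ...) followed by a service restart:
  #   [oslo_messaging_rabbit]
  #   rabbit_ha_queues = true
  #   amqp_durable_queues = true

Note that, as Fabian points out above, the policy alone has not been enough to make the stale-binding problem go away.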
> > > > I thought the problem was because of a wrong setting in the openstack > > services conf files: I missed these settings (that I am now going to add): > > > > [oslo_messaging_rabbit] > > rabbit_ha_queues = true > > amqp_durable_queues = true > > > > Cheers, Massimo > > > > > > On Sat, Aug 8, 2020 at 6:34 AM Fabian Zimmermann > > wrote: > > > >> Hi, > >> > >> we also have this issue. > >> > >> Our solution was (up to now) to delete the queues with a script or even > >> reset the complete cluster. > >> > >> We just upgraded rabbitmq to the latest version - without luck. > >> > >> Anyone else seeing this issue? > >> > >> Fabian > >> > >> > >> > >> Arnaud Morin schrieb am Do., 6. Aug. 2020, > >> 16:47: > >> > >>> Hey all, > >>> > >>> I would like to ask the community about a rabbit issue we have from time > >>> to time. > >>> > >>> In our current architecture, we have a cluster of rabbits (3 nodes) for > >>> all our OpenStack services (mostly nova and neutron). > >>> > >>> When one node of this cluster is down, the cluster continue working (we > >>> use pause_minority strategy). > >>> But, sometimes, the third server is not able to recover automatically > >>> and need a manual intervention. > >>> After this intervention, we restart the rabbitmq-server process, which > >>> is then able to join the cluster back. > >>> > >>> At this time, the cluster looks ok, everything is fine. > >>> BUT, nothing works. > >>> Neutron and nova agents are not able to report back to servers. > >>> They appear dead. > >>> Servers seems not being able to consume messages. > >>> The exchanges, queues, bindings seems good in rabbit. > >>> > >>> What we see is that removing bindings (using rabbitmqadmin delete > >>> binding or the web interface) and recreate them again (using the same > >>> routing key) brings the service back up and running. > >>> > >>> Doing this for all queues is really painful. Our next plan is to > >>> automate it, but is there anyone in the community already saw this kind > >>> of issues? > >>> > >>> Our bug looks like the one described in [1]. > >>> Someone recommands to create an Alternate Exchange. > >>> Is there anyone already tried that? > >>> > >>> FYI, we are running rabbit 3.8.2 (with OpenStack Stein). > >>> We had the same kind of issues using older version of rabbit. > >>> > >>> Thanks for your help. > >>> > >>> [1] > >>> https://groups.google.com/forum/#!newtopic/rabbitmq-users/rabbitmq-users/zFhmpHF2aWk > >>> > >>> -- > >>> Arnaud Morin > >>> > >>> > >>> From arnaud.morin at gmail.com Tue Aug 11 10:33:15 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Tue, 11 Aug 2020 10:33:15 +0000 Subject: [largescale-sig] Next meeting: August 12, 16utc In-Reply-To: <6e7a4e43-08f4-3030-2eb0-9311f27d9647@openstack.org> References: <6e7a4e43-08f4-3030-2eb0-9311f27d9647@openstack.org> Message-ID: <20200811103315.GT31915@sync> Hi Thierry and all, Thank you for bringing that up. I am off this week and will not be able to attend. Also, my TODO is still on TODO :( Cheers -- Arnaud Morin On 10.08.20 - 12:01, Thierry Carrez wrote: > Hi everyone, > > In order to accommodate US members, the Large Scale SIG recently decided to > rotate between an EU-APAC-friendly time and an US-EU-friendly time. 
> > Our next meeting will be the first US-EU meeting, on Wednesday, August 12 at > 16 UTC[1] in the #openstack-meeting-3 channel on IRC: > > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200812T16 > > Feel free to add topics to our agenda at: > > https://etherpad.openstack.org/p/large-scale-sig-meeting > > A reminder of the TODOs we had from last meeting, in case you have time to > make progress on them: > > - amorin to add some meat to the wiki page before we push the Nova doc patch > further > - all to describe briefly how you solved metrics/billing in your deployment > in https://etherpad.openstack.org/p/large-scale-sig-documentation > > Talk to you all on Wednesday, > > -- > Thierry Carrez > From radoslaw.piliszek at gmail.com Tue Aug 11 11:05:24 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 11 Aug 2020 13:05:24 +0200 Subject: [all] [dev&ops] Early warning about new "stable" ansible releases Message-ID: Hiya Folks, Ansible 2.8.14 and 2.9.12 change the default mode, that created files will get, from 0666 (with umask; which would usually produce 0644) to 0600. [1] Kolla-Ansible got hit by it, and Zuul relies on Ansible so might pick it up at some point, possibly causing some little havoc for all of us. [1] https://github.com/ansible/ansible/issues/71200 -yoctozepto -------------- next part -------------- An HTML attachment was scrubbed... URL: From kotobi at dkrz.de Tue Aug 11 11:11:02 2020 From: kotobi at dkrz.de (Amjad Kotobi) Date: Tue, 11 Aug 2020 13:11:02 +0200 Subject: [horizon][dashboard] Disable admin and identity dashboard panel for user role Message-ID: Hi, I’m trying to customise view level of dashboard to users with “User” role in keystone, by that I meant to disable “admin” + “identity” panels for users, but when I’m adding “DISABLED = True” to admin panel, it will disable panel for admin and user roles. Is there any way to disable “admin” & “identity” panels only for user role? Installed openstack-dashboard openstack-dashboard-16.2.0-1.el7 Thanks Amjad -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5223 bytes Desc: not available URL: From moreira.belmiro.email.lists at gmail.com Tue Aug 11 12:22:40 2020 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Tue, 11 Aug 2020 14:22:40 +0200 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: Message-ID: Hi Radosław, no, it's not true for every project. There are projects that have completely migrated to OSC (for example, Keystone). Other projects still have discrepancies (for example, Nova, Glance). Belmiro On Mon, Aug 10, 2020 at 10:26 AM Radosław Piliszek < radoslaw.piliszek at gmail.com> wrote: > On Mon, Aug 10, 2020 at 10:19 AM Belmiro Moreira < > moreira.belmiro.email.lists at gmail.com> wrote: > >> Hi, >> during the last PTG the TC discussed the problem of supporting different >> clients (OpenStack Client - OSC vs python-*clients) [1]. >> Currently, we don't have feature parity between the OSC and the >> python-*clients. >> > > Is it true of any client? I guess some are just OSC plugins 100%. > Do we know which clients have this disparity? > Personally, I encountered this with Glance the most and Cinder to some > extent (but I believe over the course of action Cinder got all features I > wanted from it in the OSC). 
> > -yoctozepto > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.king at gmail.com Mon Aug 10 22:01:06 2020 From: thomas.king at gmail.com (Thomas King) Date: Mon, 10 Aug 2020 16:01:06 -0600 Subject: [Openstack-mentoring] Neutron subnet with DHCP relay - continued In-Reply-To: References: Message-ID: The node will PXE boot, but having the provisioning network separate from the control plane network, and having a specific route back to the remote subnet causes a LOT of issues. With the specific route, the remote node will PXE boot but not talk to the ironic API service on the controller node. Without the specific route, the remote node can talk to the ironic API but cannot PXE boot off the provisioning network. Unless I add a bunch of network namespace stuff, the simple answer is to move *everything* onto the control plane. The docs dissuade against this, however, apparently for security reasons. Moving everything onto the control plane network seems to be the obvious but less desirable choice. Tom King On Tue, Aug 4, 2020 at 4:22 PM Thomas King wrote: > Getting closer. I was able to create the segment and the subnet for the > remote network on that segment. > > When I attempted to provide the baremetal node, Neutron is unable to > create/attach a port to the remote node: > WARNING ironic.common.neutron [req-b3f373fc-e76a-4c13-9ebb-41cfc682d31b > 4946f15716c04f8585d013e364802c6c 1664a38fc668432ca6bee9189be142d9 - default > default] The local_link_connection is required for 'neutron' network > interface and is not present in the nodes > 3ed87e51-00c5-4b27-95c0-665c8337e49b port > ccc335c6-3521-48a5-927d-d7ee13f7f05b > > I changed its network interface from neutron back to flat and it went past > this. I'm now waiting to see if the node will PXE boot. > > On Tue, Aug 4, 2020 at 1:41 PM Thomas King wrote: > >> Changing the ml2 flat_networks from specific physical networks to a >> wildcard allowed me to create a segment. I may be unstuck. >> >> New config: >> [ml2_type_flat] >> flat_networks=* >> >> Now to try creating the subnet and try a remote provision. >> >> Tom King >> >> On Mon, Aug 3, 2020 at 3:58 PM Thomas King wrote: >> >>> I've been using named physical networks so long, I completely forgot >>> using wildcards! >>> >>> Is this the answer???? >>> >>> https://docs.openstack.org/mitaka/config-reference/networking/networking_options_reference.html#modular-layer-2-ml2-flat-type-configuration-options >>> >>> Tom King >>> >>> On Tue, Jul 28, 2020 at 3:46 PM Thomas King >>> wrote: >>> >>>> Ruslanas has been a tremendous help. To catch up the discussion lists... >>>> 1. I enabled Neutron segments. >>>> 2. I renamed the existing segments for each network so they'll make >>>> sense. >>>> 3. I attempted to create a segment for a remote subnet (it is using >>>> DHCP relay) and this was the error that is blocking me. This is where the >>>> docs do not cover: >>>> [root at sea-maas-controller ~(keystone_admin)]# openstack network >>>> segment create --physical-network remote146-30-32 --network-type flat >>>> --network baremetal seg-remote-146-30-32 >>>> BadRequestException: 400: Client Error for url: >>>> http://10.146.30.65:9696/v2.0/segments, Invalid input for operation: >>>> physical_network 'remote146-30-32' unknown for flat provider network. >>>> >>>> I've asked Ruslanas to clarify how their physical networks correspond >>>> to their remote networks. They have a single provider network and multiple >>>> segments tied to multiple physical networks. 
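Consolidating the commands scattered through this thread into one hedged sketch (placeholders are used wherever the thread does not give a concrete value; the ml2 wildcard has to be in place and neutron-server restarted before the segment create passes validation):

  # ml2_conf.ini
  #   [ml2_type_flat]
  #   flat_networks = *

  openstack network segment create --network baremetal --network-type flat \
      --physical-network remote146-30-32 seg-remote-146-30-32

  openstack subnet create --network baremetal --network-segment seg-remote-146-30-32 \
      --subnet-range <remote CIDR> --dhcp subnet-remote-146-30-32

  # The 'neutron' network interface additionally requires switch details on
  # each baremetal port (placeholder values):
  openstack baremetal port set <port-uuid> \
      --local-link-connection switch_id=<switch MAC> \
      --local-link-connection port_id=<switch port name>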
>>>> >>>> However, if anyone can shine some light on this, I would greatly >>>> appreciate it. How should neutron's configurations accommodate remote >>>> networks<->Neutron segments when I have only one physical network >>>> attachment for provisioning? >>>> >>>> Thanks! >>>> Tom King >>>> >>>> On Wed, Jul 15, 2020 at 3:33 PM Thomas King >>>> wrote: >>>> >>>>> That helps a lot, thank you! >>>>> >>>>> "I use only one network..." >>>>> This bit seems to go completely against the Neutron segments >>>>> documentation. When you have access, please let me know if Triple-O is >>>>> using segments or some other method. >>>>> >>>>> I greatly appreciate this, this is a tremendous help. >>>>> >>>>> Tom King >>>>> >>>>> On Wed, Jul 15, 2020 at 1:07 PM Ruslanas Gžibovskis >>>>> wrote: >>>>> >>>>>> Hi Thomas, >>>>>> >>>>>> I have a bit complicated setup from tripleo side :) I use only one >>>>>> network (only ControlPlane). thanks to Harold, he helped to make it work >>>>>> for me. >>>>>> >>>>>> Yes, as written in the tripleo docs for leaf networks, it use the >>>>>> same neutron network, different subnets. so neutron network is ctlplane (I >>>>>> think) and have ctlplane-subnet, remote-provision and remote-KI :)) that >>>>>> generates additional lines in "ip r s" output for routing "foreign" subnets >>>>>> through correct gw, if you would have isolated networks, by vlans and ports >>>>>> this would apply for each subnet different gw... I believe you >>>>>> know/understand that part. >>>>>> >>>>>> remote* subnets have dhcp-relay setup by network team... do not ask >>>>>> details for that. I do not know how to, but can ask :) >>>>>> >>>>>> >>>>>> in undercloud/tripleo i have 2 dhcp servers, one is for >>>>>> introspection, another for provide/cleanup and deployment process. >>>>>> >>>>>> all of those subnets have organization level tagged networks and are >>>>>> tagged on network devices, but they are untagged on provisioning >>>>>> interfaces/ports, as in general pxe should be untagged, but some nic's can >>>>>> do vlan untag on nic/bios level. but who cares!? >>>>>> >>>>>> I just did a brief check on your first post, I think I have simmilar >>>>>> setup to yours :)) I will check in around 12hours :)) more deaply, as will >>>>>> be at work :))) >>>>>> >>>>>> >>>>>> P.S. sorry for wrong terms, I am bad at naming. >>>>>> >>>>>> >>>>>> On Wed, 15 Jul 2020, 21:13 Thomas King, >>>>>> wrote: >>>>>> >>>>>>> Ruslanas, that would be excellent! >>>>>>> >>>>>>> I will reply to you directly for details later unless the maillist >>>>>>> would like the full thread. >>>>>>> >>>>>>> Some preliminary questions: >>>>>>> >>>>>>> - Do you have a separate physical interface for the segment(s) >>>>>>> used for your remote subnets? >>>>>>> The docs state each segment must have a unique physical network >>>>>>> name, which suggests a separate physical interface for each segment unless >>>>>>> I'm misunderstanding something. >>>>>>> - Are your provisioning segments all on the same Neutron >>>>>>> network? >>>>>>> - Are you using tagged switchports or access switchports to your >>>>>>> Ironic server(s)? >>>>>>> >>>>>>> Thanks, >>>>>>> Tom King >>>>>>> >>>>>>> On Wed, Jul 15, 2020 at 12:26 AM Ruslanas Gžibovskis < >>>>>>> ruslanas at lpic.lt> wrote: >>>>>>> >>>>>>>> I have deployed that with tripleO, but now we are recabling and >>>>>>>> redeploying it. 
So once I have it running I can share my configs, just name >>>>>>>> which you want :) >>>>>>>> >>>>>>>> On Tue, 14 Jul 2020 at 18:40, Thomas King >>>>>>>> wrote: >>>>>>>> >>>>>>>>> I have. That's the Triple-O docs and they don't go through the >>>>>>>>> normal .conf files to explain how it works outside of Triple-O. It has some >>>>>>>>> ideas but no running configurations. >>>>>>>>> >>>>>>>>> Tom King >>>>>>>>> >>>>>>>>> On Tue, Jul 14, 2020 at 3:01 AM Ruslanas Gžibovskis < >>>>>>>>> ruslanas at lpic.lt> wrote: >>>>>>>>> >>>>>>>>>> hi, have you checked: >>>>>>>>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/routed_spine_leaf_network.html >>>>>>>>>> ? >>>>>>>>>> I am following this link. I only have one network, having >>>>>>>>>> different issues tho ;) >>>>>>>>>> >>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Tue Aug 11 15:22:58 2020 From: knikolla at bu.edu (Nikolla, Kristi) Date: Tue, 11 Aug 2020 15:22:58 +0000 Subject: [keystone] Weekly meeting cancelled today Message-ID: Hi all, There are no items in today's weekly meeting agenda, and I'm unavailable to host/attend it due to a scheduling conflict. Therefore we can go ahead and cancel today's meeting. Thank you, and sorry for any inconvenience Kristi Nikolla From openstack at nemebean.com Tue Aug 11 20:20:43 2020 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 11 Aug 2020 15:20:43 -0500 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <671fec63-8bea-4215-c773-d8360e368a99@sap.com> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> Message-ID: On 7/28/20 3:02 AM, Johannes Kulik wrote: > Hi, > > On 7/27/20 7:08 PM, Dan Smith wrote: >> >> The primary concern was about something other than nova sitting on our >> bus making calls to our internal services. I imagine that the proposal >> to bake it into oslo.messaging is for the same purpose, and I'd probably >> have the same concern. At the time I think we agreed that if we were >> going to support direct-to-service health checks, they should be teensy >> HTTP servers with oslo healthchecks middleware. Further loading down >> rabbit with those pings doesn't seem like the best plan to >> me. Especially since Nova (compute) services already check in over RPC >> periodically and the success of that is discoverable en masse through >> the API. >> >> --Dan >> > > While I get this concern, we have seen the problem described by the > original poster in production multiple times: nova-compute reports to be > healthy, is seen as up through the API, but doesn't work on any messages > anymore. > A health-check going through rabbitmq would really help spotting those > situations, while having an additional HTTP server doesn't. I wonder if this does help though. It seems like a bug that a nova-compute service would stop processing messages and still be seen as up in the service status. Do we understand why that is happening? If not, I'm unclear that a ping living at the oslo.messaging layer is going to do a better job of exposing such an outage. The fact that oslo.messaging is responding does not necessarily equate to nova-compute functioning as expected. To be clear, this is not me nacking the ping feature. I just want to make sure we understand what is going on here so we don't add another unreliable healthchecking mechanism to the one we already have. 
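For reference, the externally driven probe being discussed boils down to something like the sketch below. It assumes the proposed, not-yet-merged 'ping' endpoint (or an equivalent endpoint added by hand) exists on the target service, and the transport URL, topic and server names are placeholders:

from oslo_config import cfg
import oslo_messaging

def rpc_ping(transport_url, topic, server, timeout=10):
    # Build a one-off RPC client pointed at a specific service's topic queue.
    transport = oslo_messaging.get_rpc_transport(cfg.CONF, url=transport_url)
    target = oslo_messaging.Target(topic=topic, server=server)
    client = oslo_messaging.RPCClient(transport, target)
    try:
        # call() blocks until the remote endpoint answers or the timeout hits.
        client.prepare(timeout=timeout).call({}, 'ping')
        return True
    except oslo_messaging.MessagingTimeout:
        return False
    finally:
        transport.cleanup()

if __name__ == '__main__':
    alive = rpc_ping('rabbit://user:password@rabbit.example.com:5672/',
                     topic='compute', server='compute-01.example.com')
    print('alive' if alive else 'no reply (timed out)')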
> > Have a nice day, > Johannes > From openstack at nemebean.com Tue Aug 11 20:28:05 2020 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 11 Aug 2020 15:28:05 -0500 Subject: [largescale-sig] RPC ping In-Reply-To: References: <20200727095744.GK31915@sync> Message-ID: On 8/3/20 9:21 AM, Mohammed Naser wrote: > 3. You mentioned you're moving towards Kubernetes, we're doing the > same and building an operator: > https://opendev.org/vexxhost/openstack-operator -- Because the > operator manages the whole thing and Kubernetes does it's thing too, > we started moving towards 1 (single) rabbitmq per service, which > reaaaaaaally helped a lot in stabilizing things. Oslo messaging is a > lot better at recovering when a single service IP is pointing towards > it because it doesn't do weird things like have threads trying to > connect to other Rabbit ports. Just a thought. On a related note, LINE actually broke it down even further than that. There are details of their design in [0], but essentially they have downstream changes where they can specify a transport per notification topic to further separate out rabbit traffic. The spec hasn't been implemented yet upstream, but I thought I'd mention it since it seems relevant to this discussion. 0: https://specs.openstack.org/openstack/oslo-specs/specs/victoria/support-transports-per-oslo-notifications.html From smooney at redhat.com Tue Aug 11 21:20:07 2020 From: smooney at redhat.com (Sean Mooney) Date: Tue, 11 Aug 2020 22:20:07 +0100 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> Message-ID: <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> On Tue, 2020-08-11 at 15:20 -0500, Ben Nemec wrote: > > On 7/28/20 3:02 AM, Johannes Kulik wrote: > > Hi, > > > > On 7/27/20 7:08 PM, Dan Smith wrote: > > > > > > The primary concern was about something other than nova sitting on our > > > bus making calls to our internal services. I imagine that the proposal > > > to bake it into oslo.messaging is for the same purpose, and I'd probably > > > have the same concern. At the time I think we agreed that if we were > > > going to support direct-to-service health checks, they should be teensy > > > HTTP servers with oslo healthchecks middleware. Further loading down > > > rabbit with those pings doesn't seem like the best plan to > > > me. Especially since Nova (compute) services already check in over RPC > > > periodically and the success of that is discoverable en masse through > > > the API. > > > > > > --Dan > > > > > > > While I get this concern, we have seen the problem described by the > > original poster in production multiple times: nova-compute reports to be > > healthy, is seen as up through the API, but doesn't work on any messages > > anymore. > > A health-check going through rabbitmq would really help spotting those > > situations, while having an additional HTTP server doesn't. > > I wonder if this does help though. It seems like a bug that a > nova-compute service would stop processing messages and still be seen as > up in the service status. it kind of is a bug this one to be precise https://bugs.launchpad.net/nova/+bug/1854992 > Do we understand why that is happening? 
Assuming it is https://bugs.launchpad.net/nova/+bug/1854992, then the reason the compute status is still up is that the compute service is running fine and sending heartbeats; the issue is that under certain failure modes the topic queue used to receive RPC topic sends can disappear. One way this can happen is if the rabbitmq server restarts, in which case the resend code in oslo will reconnect to the exchange but it will not necessarily recreate the topic queue.

> If > not, I'm unclear that a ping living at the oslo.messaging layer is going > to do a better job of exposing such an outage. The fact that > oslo.messaging is responding does not necessarily equate to nova-compute > functioning as expected.

Maybe to say that a little more clearly: https://bugs.launchpad.net/nova/+bug/1854992 has other causes beyond the RabbitMQ server crashing, but the underlying effect is the same: the queue that the compute service uses to receive RPC calls is destroyed and not recreated. A related oslo bug, https://bugs.launchpad.net/oslo.messaging/+bug/1661510, was "fixed" by adding the mandatory transport flag feature (you can probably mark that as fix released, by the way).

From a nova perspective, the intended way to fix the nova bug was to use the new mandatory flag, catch the MessageUndeliverable exception, and have the conductor/API recreate the compute service's topic queue and resend the AMQP message. An open question is whether the compute service will detect that and start processing the queue again. If that does not fix the problem, plan B was to add a self ping to the compute service, where the compute service, on a long timeout (once an hour, maybe once every 15 minutes at the most), would try to send a message to its own receive queue. If it got the MessageUndeliverable exception then the compute service would recreate its own queue. Adding an inter-service ping or triggering the ping externally is unlikely to help with the nova bug. Ideally we would prefer to have the conductor/API recreate the queue and resend the message if it detects the queue is missing, rather than have a self ping, as that does not add additional load to the message bus and only recreates the queue if it is needed.

I'm not sure https://bugs.launchpad.net/nova/+bug/1854992 is the bug that is motivating the creation of this oslo ping feature, but that feels premature if it is. I think it would be better to try to address this by having the sender recreate the queue if the delivery fails, and if that is not viable, then prototype the fix in nova. If the self ping fixes this missing queue error then we could extract the code into oslo.

> > To be clear, this is not me nacking the ping feature. I just want to > make sure we understand what is going on here so we don't add another > unreliable healthchecking mechanism to the one we already have. > > > > > Have a nice day, > > Johannes > > > >

From melwittt at gmail.com Tue Aug 11 21:53:07 2020 From: melwittt at gmail.com (melanie witt) Date: Tue, 11 Aug 2020 14:53:07 -0700 Subject: [gate][keystone] *-grenade-multinode jobs failing with UnicodeDecodeError in keystone Message-ID: <45926788-6dcf-8825-5bfd-b6353b5facf0@gmail.com> Howdy all, FYI the *-grenade-multinode gate jobs are currently failing with the following error in keystone: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 3: invalid start byte This appears to be an issue with a new default data format in msgpack v1.0 [1] which was brought in by a recent bump of upper constraints [2].
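A small standalone reproduction of that class of failure, using only the documented msgpack 1.0 default changes (this is not keystone's actual cache code):

import msgpack

# Simulate a cache entry written under the old (< 1.0) defaults, where
# use_bin_type=False stored text and binary payloads in the same ambiguous
# "raw" family.
legacy_entry = msgpack.packb({'data': b'\x87\xa3\xff\x01'}, use_bin_type=False)

# msgpack 1.0 unpacks with raw=False by default, i.e. it tries to decode
# every raw field as UTF-8, which fails on binary payloads like this one.
try:
    msgpack.unpackb(legacy_entry)
except UnicodeDecodeError as exc:
    print('1.0 default unpack failed:', exc)

# Asking for the old behaviour reads the legacy entry back unchanged.
print(msgpack.unpackb(legacy_entry, raw=True))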
*-grenade-multinode jobs are affected because they test a rolling upgrade where the controller is upgraded to the N release version but one compute node is on the N-1 release version. It looks like cached keystone tokens being used by the N-1 node are erroring out during msgpack unpacking because they are in the old data format and msgpack v1.0 has a new default data format. I've opened a bug [3] about and I'm trying out the following keystone patch to fix it: https://review.opendev.org/745752 Reviews appreciated. If this is not the best approach or if this affects other projects as well, alternatively we could revert the upper constraint bump to msgpack v1.0 while we figure out the best fix. Cheers, -melanie [1] https://github.com/msgpack/msgpack-python/blob/v1.0.0/README.md#major-breaking-changes-in-msgpack-10 [2] https://review.opendev.org/#/c/745437/2/upper-constraints.txt at 373 [3] https://launchpad.net/bugs/1891244 From mike.carden at gmail.com Tue Aug 11 22:00:44 2020 From: mike.carden at gmail.com (Mike Carden) Date: Wed, 12 Aug 2020 08:00:44 +1000 Subject: [horizon][dashboard] Disable admin and identity dashboard panel for user role In-Reply-To: References: Message-ID: Hi Amjad. You don't say what version of OpenStack you are running, but I thought I would just mention that in Queens at least, the Identity tab in Horizon is essential for users if they belong to more than ~20 Projects because the Project drop-down in the GUI won't display them all and the user needs the Identity tab to select any projects not shown in the drop-down. This may not apply to you, but I think it's worth being aware of. -- MC -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Aug 12 02:57:02 2020 From: melwittt at gmail.com (melanie witt) Date: Tue, 11 Aug 2020 19:57:02 -0700 Subject: [gate][keystone] *-grenade-multinode jobs failing with UnicodeDecodeError in keystone In-Reply-To: <45926788-6dcf-8825-5bfd-b6353b5facf0@gmail.com> References: <45926788-6dcf-8825-5bfd-b6353b5facf0@gmail.com> Message-ID: <67a115ba-f80a-ebe5-8689-922e3bbb9a40@gmail.com> On 8/11/20 14:53, melanie witt wrote: > Howdy all, > > FYI the *-grenade-multinode gate jobs are currently failing with the following error in keystone: > > UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 3: invalid start byte > > This appears to be an issue with a new default data format in msgpack v1.0 [1] which was brought in by a recent bump of upper constraints [2]. > > *-grenade-multinode jobs are affected because they test a rolling upgrade where the controller is upgraded to the N release version but one compute node is on the N-1 release version. It looks like cached keystone tokens being used by the N-1 node are erroring out during msgpack unpacking because they are in the old data format and msgpack v1.0 has a new default data format. > > I've opened a bug [3] about and I'm trying out the following keystone patch to fix it: > > https://review.opendev.org/745752 > > Reviews appreciated. > > If this is not the best approach or if this affects other projects as well, alternatively we could revert the upper constraint bump to msgpack v1.0 while we figure out the best fix. 
Here's a patch for reverting the upper constraint for msgpack in case that approach is preferred: https://review.opendev.org/745769 > [1] https://github.com/msgpack/msgpack-python/blob/v1.0.0/README.md#major-breaking-changes-in-msgpack-10 > [2] https://review.opendev.org/#/c/745437/2/upper-constraints.txt at 373 > [3] https://launchpad.net/bugs/1891244 > From dev.faz at gmail.com Wed Aug 12 04:44:03 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Wed, 12 Aug 2020 06:44:03 +0200 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <20200806140421.GN31915@sync> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <88c24f3a-7d29-aa39-ed12-803279cc90c1@openstack.org> <20200806140421.GN31915@sync> Message-ID: Hi, would be great if you could share your script. Fabian Arnaud Morin schrieb am Do., 6. Aug. 2020, 16:11: > Hey all, > > Thanks for your replies. > About the fact that nova already implement this, I will try again on my > side, but maybe it was not yet implemented in newton (I only tried nova > on newton version). Thank you for bringing that to me. > > About the healhcheck already done on nova side (and also on neutron). > As far as I understand, it's done using a specific rabbit queue, which > can work while others queues are not working. > The purpose of adding ping endpoint here is to be able to ping in all > topics, not only those used for healthcheck reports. > > Also, as mentionned by Thierry, what we need is a way to externally > do pings toward neutron agents and nova computes. > The patch itself is not going to add any load on rabbit. It really > depends on the way the operator will use it. > On my side, I built a small external oslo.messaging script which I can > use to do such pings. > > Cheers, > > -- > Arnaud Morin > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Aug 12 07:49:10 2020 From: melwittt at gmail.com (melanie witt) Date: Wed, 12 Aug 2020 00:49:10 -0700 Subject: [gate][keystone][nova][neutron] *-grenade-multinode jobs failing with UnicodeDecodeError in keystone In-Reply-To: <67a115ba-f80a-ebe5-8689-922e3bbb9a40@gmail.com> References: <45926788-6dcf-8825-5bfd-b6353b5facf0@gmail.com> <67a115ba-f80a-ebe5-8689-922e3bbb9a40@gmail.com> Message-ID: <18572a52-d105-9219-6b19-5fe23f18e3e0@gmail.com> Adding [nova][neutron] since their gates will continue to be blocked until one of the following proposed fixes merges. They are linked inline. On 8/11/20 19:57, melanie witt wrote: > On 8/11/20 14:53, melanie witt wrote: >> Howdy all, >> >> FYI the *-grenade-multinode gate jobs are currently failing with the >> following error in keystone: >> >>    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in >> position 3: invalid start byte >> >> This appears to be an issue with a new default data format in msgpack >> v1.0 [1] which was brought in by a recent bump of upper constraints [2]. >> >> *-grenade-multinode jobs are affected because they test a rolling >> upgrade where the controller is upgraded to the N release version but >> one compute node is on the N-1 release version. It looks like cached >> keystone tokens being used by the N-1 node are erroring out during >> msgpack unpacking because they are in the old data format and msgpack >> v1.0 has a new default data format. >> >> I've opened a bug [3] about and I'm trying out the following keystone >> patch to fix it: >> >> https://review.opendev.org/745752 >> >> Reviews appreciated. 
I tested ^ with a DNM patch to nova and nova-grenade-multinode passes with it. >> If this is not the best approach or if this affects other projects as >> well, alternatively we could revert the upper constraint bump to >> msgpack v1.0 while we figure out the best fix. > > Here's a patch for reverting the upper constraint for msgpack in case > that approach is preferred: > > https://review.opendev.org/745769 And this reqs pin ^ is also available if the reviewers find the keystone patch unsuitable. >> [1] >> https://github.com/msgpack/msgpack-python/blob/v1.0.0/README.md#major-breaking-changes-in-msgpack-10 >> >> [2] https://review.opendev.org/#/c/745437/2/upper-constraints.txt at 373 >> [3] https://launchpad.net/bugs/1891244 >> > From zhangbailin at inspur.com Wed Aug 12 08:14:37 2020 From: zhangbailin at inspur.com (=?gb2312?B?QnJpbiBaaGFuZyjVxbDZwdYp?=) Date: Wed, 12 Aug 2020 08:14:37 +0000 Subject: [oslo.cache][keystonemiddleware] enable-sasl-protocol Message-ID: <35f020916eb54189a6b4176deb3a2a48@inspur.com> Hi all, we would like to enable sasl protocol to oslo.cache and keystonemiddleware project to improve the security of authority. SASL(Simple Authentication and Security Layer): is a memchanism used to extend the verification ability of C/S mode. SASL is only the authentication process, which integrates the application layer and the system authentication mechanism. Need to review patches: https://review.opendev.org/#/q/status:open++branch:master+topic:bp/enable-sasl-protocol brinzhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Wed Aug 12 09:20:32 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 12 Aug 2020 11:20:32 +0200 Subject: [gate][keystone][nova][neutron] *-grenade-multinode jobs failing with UnicodeDecodeError in keystone In-Reply-To: <18572a52-d105-9219-6b19-5fe23f18e3e0@gmail.com> References: <45926788-6dcf-8825-5bfd-b6353b5facf0@gmail.com> <67a115ba-f80a-ebe5-8689-922e3bbb9a40@gmail.com> <18572a52-d105-9219-6b19-5fe23f18e3e0@gmail.com> Message-ID: <20200812092032.jcwjmy4yci6rjbzd@skaplons-mac> Hi, Thx Melanie for the proposed fix for this issue. On Wed, Aug 12, 2020 at 12:49:10AM -0700, melanie witt wrote: > Adding [nova][neutron] since their gates will continue to be blocked until > one of the following proposed fixes merges. They are linked inline. > > On 8/11/20 19:57, melanie witt wrote: > > On 8/11/20 14:53, melanie witt wrote: > > > Howdy all, > > > > > > FYI the *-grenade-multinode gate jobs are currently failing with the > > > following error in keystone: > > > > > >    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in > > > position 3: invalid start byte > > > > > > This appears to be an issue with a new default data format in > > > msgpack v1.0 [1] which was brought in by a recent bump of upper > > > constraints [2]. > > > > > > *-grenade-multinode jobs are affected because they test a rolling > > > upgrade where the controller is upgraded to the N release version > > > but one compute node is on the N-1 release version. It looks like > > > cached keystone tokens being used by the N-1 node are erroring out > > > during msgpack unpacking because they are in the old data format and > > > msgpack v1.0 has a new default data format. > > > > > > I've opened a bug [3] about and I'm trying out the following > > > keystone patch to fix it: > > > > > > https://review.opendev.org/745752 > > > > > > Reviews appreciated. 
> > I tested ^ with a DNM patch to nova and nova-grenade-multinode passes with > it. > > > > If this is not the best approach or if this affects other projects > > > as well, alternatively we could revert the upper constraint bump to > > > msgpack v1.0 while we figure out the best fix. > > > > Here's a patch for reverting the upper constraint for msgpack in case > > that approach is preferred: > > > > https://review.opendev.org/745769 > > And this reqs pin ^ is also available if the reviewers find the keystone > patch unsuitable. > > > > [1] https://github.com/msgpack/msgpack-python/blob/v1.0.0/README.md#major-breaking-changes-in-msgpack-10 > > > > > > [2] https://review.opendev.org/#/c/745437/2/upper-constraints.txt at 373 > > > [3] https://launchpad.net/bugs/1891244 > > > > > > > -- Slawek Kaplonski Senior software engineer Red Hat From dev.faz at gmail.com Wed Aug 12 10:14:31 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Wed, 12 Aug 2020 12:14:31 +0200 Subject: [nova][neutron][oslo][ops] rabbit bindings issue In-Reply-To: <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> Message-ID: Hi, just wrote some small scripts to reproduce our issue and send a msg to rabbitmq-list. https://groups.google.com/d/msg/rabbitmq-users/eC8jc-YEt8s/s8K_0KnXDQAJ Fabian Am Di., 11. Aug. 2020 um 12:31 Uhr schrieb Thierry Carrez < thierry at openstack.org>: > If you can reproduce it with current versions, I would suggest to file > an issue on https://github.com/rabbitmq/rabbitmq-server/issues/ > > The behavior you describe seems to match > https://github.com/rabbitmq/rabbitmq-server/issues/1873 but the > maintainers seem to think it's been fixed by a number of > somewhat-related changes in 3.7.13, because nobody reported issues > anymore :) > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Wed Aug 12 10:32:27 2020 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 12 Aug 2020 12:32:27 +0200 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> Message-ID: <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> Sean Mooney wrote: > On Tue, 2020-08-11 at 15:20 -0500, Ben Nemec wrote: >> I wonder if this does help though. It seems like a bug that a nova-compute service would stop processing messages and still be seen as up in the service status. Do we understand why that is happening? If not, I'm unclear that a ping living at the oslo.messaging layer is going to do a better job of exposing such an outage. The fact that oslo.messaging is responding does not necessarily equate to nova-compute functioning as expected. >> >> To be clear, this is not me nacking the ping feature. I just want to make sure we understand what is going on here so we don't add another unreliable healthchecking mechanism to the one we already have. > [...] > im not sure https://bugs.launchpad.net/nova/+bug/1854992 is the bug that is motiviting the creation of this oslo ping > feature but that feels premature if it is. i think it would be better try to adress this by the sender recreating the > queue if the deliver fails and if that is not viable then protpyope thge fix in nova. 
if the self ping fixes this > miss queue error then we could extract the cod into oslo. I think this is missing the point... This is not about working around a specific bug, it's about adding a way to detect a certain class of failure. It's more of an operational feature than a development bugfix. If I understood correctly, OVH is running that patch in production as a way to detect certain problems they regularly run into, something our existing monitor mechanisms fail to detect. That sounds like a worthwhile addition? Alternatively, if we can monitor the exact same class of failures using our existing systems (or by improving them rather than adding a new door), that works too. -- Thierry Carrez (ttx) From moguimar at redhat.com Wed Aug 12 10:52:11 2020 From: moguimar at redhat.com (Moises Guimaraes de Medeiros) Date: Wed, 12 Aug 2020 12:52:11 +0200 Subject: [oslo.cache][keystonemiddleware] enable-sasl-protocol In-Reply-To: <35f020916eb54189a6b4176deb3a2a48@inspur.com> References: <35f020916eb54189a6b4176deb3a2a48@inspur.com> Message-ID: Hi Brin, Thanks for the patches! I've dropped a few reviews already. Feel free to reach me also on #openstack-oslo if you have any questions. moguimar On Wed, Aug 12, 2020 at 10:16 AM Brin Zhang(张百林) wrote: > Hi all, we would like to enable sasl protocol to oslo.cache and > keystonemiddleware > > project to improve the security of authority. > > SASL(Simple Authentication and Security Layer): is a memchanism used to > extend the verification ability of C/S mode. SASL is only the > authentication process, which integrates the application layer and the > system authentication mechanism. > > > > Need to review patches: > https://review.opendev.org/#/q/status:open++branch:master+topic:bp/enable-sasl-protocol > > > > > > > > brinzhang > > > -- Moisés Guimarães Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Aug 12 11:05:28 2020 From: smooney at redhat.com (Sean Mooney) Date: Wed, 12 Aug 2020 12:05:28 +0100 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> Message-ID: <4fc5e9d57172a73608e8fdf7e70ff569dca5dfd4.camel@redhat.com> On Wed, 2020-08-12 at 12:32 +0200, Thierry Carrez wrote: > Sean Mooney wrote: > > On Tue, 2020-08-11 at 15:20 -0500, Ben Nemec wrote: > > > I wonder if this does help though. It seems like a bug that a nova-compute service would stop processing messages > > > and still be seen as up in the service status. Do we understand why that is happening? If not, I'm unclear that a > > > ping living at the oslo.messaging layer is going to do a better job of exposing such an outage. The fact that > > > oslo.messaging is responding does not necessarily equate to nova-compute functioning as expected. > > > > > > To be clear, this is not me nacking the ping feature. I just want to make sure we understand what is going on here > > > so we don't add another unreliable healthchecking mechanism to the one we already have. > > > > [...] > > im not sure https://bugs.launchpad.net/nova/+bug/1854992 is the bug that is motiviting the creation of this oslo > > ping > > feature but that feels premature if it is. 
i think it would be better try to adress this by the sender recreating > > the > > queue if the deliver fails and if that is not viable then protpyope thge fix in nova. if the self ping fixes this > > miss queue error then we could extract the cod into oslo. > > I think this is missing the point... This is not about working around a > specific bug, it's about adding a way to detect a certain class of > failure. It's more of an operational feature than a development bugfix.

Right, but we are concerned that there will be a negative performance impact to adding it, and it won't detect the one bug we are aware of of this type in a way that we could not also detect by using the mandatory flag. Nova already has a heartbeat that the agents send to the conductor to report they are still alive. This ping would work in the opposite direction by reaching out to the compute node over the RPC bus, but that would only detect the failure mode if the ping uses the topic queue, and it could only fix it if recreating the queue via the conductor is a viable solution. If it is, using the mandatory flag and just recreating it is a better solution, since we don't need to ping constantly in the background: if we get the exception we create the queue and retransmit. If the compute manager does not resubscribe to the topic when the queue is recreated automatically, then the new ping feature won't really help. We would need the compute service, or any other service that subscribes to the topic queue, to try to ping its own topic queue and, if that fails, recreate the subscription/queue. As far as I am aware that is not what the feature is proposing.

> > If I understood correctly, OVH is running that patch in production as a > way to detect certain problems they regularly run into, something our > existing monitor mechanisms fail to detect. That sounds like a > worthwhile addition?

I'm not sure what failure mode it will detect. If they can define that, it would help with understanding whether this is worthwhile or not.

> > Alternatively, if we can monitor the exact same class of failures using > our existing systems (or by improving them rather than adding a new > door), that works too.

We can monitor the existence of the queue at least from the rabbitmq API (it is disabled by default, but just enable the rabbitmq-management plugin), but I'm not sure what their current issue, the one this is trying to solve, actually is.

> From radoslaw.piliszek at gmail.com Wed Aug 12 11:45:06 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 12 Aug 2020 13:45:06 +0200 Subject: [tc][oslo] Move etcd3gw to OpenStack? Message-ID: Hey, Folks! I see it has been kinda proposed already [1] so that's mostly why I am asking about that now. From what I understand, etcd3gw is our best bet when trying to get coordination with etcd3. However, it has recently released a broken release [2] due to no testing (not to mention gating with tooz). I think it could benefit from OpenDev's existing tooling. And since OpenStack is an important client of it, and OpenStack prefers this client, it might be wise to put it in that namespace already. I guess the details would have to be discussed with dims (current owner) himself, but he seemed happy about it in [1]. I'm notifying oslo as well, as this would probably live best finally under oslo governance. Please let me know if any of the above is not true, so that I can amend my knowledge.
:-) [1] https://github.com/dims/etcd3-gateway/issues/29 [2] https://bugs.launchpad.net/kolla-ansible/+bug/1891314 -yoctozepto -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwm2012 at gmail.com Wed Aug 12 13:03:17 2020 From: pwm2012 at gmail.com (pwm) Date: Wed, 12 Aug 2020 21:03:17 +0800 Subject: DNS server instead of /etc/hosts file on Infra Server In-Reply-To: References: Message-ID: Hi, Plan to use PowerDNS server instead of the /etc/hosts file for resolving. Has anyone done this before? The PowerDNS support MySQL DB backend and a frontend GUI PowerDNS Admin which allows centralized easy maintenance. Thanks On Sun, Aug 9, 2020 at 11:54 PM pwm wrote: > Hi, > Anyone interested in replacing the /etc/hosts file entry with a DNS server > on the openstack-ansible deployment? > > Thank you > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Wed Aug 12 13:23:09 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 12 Aug 2020 08:23:09 -0500 Subject: [Release-job-failures] Release of openstack/python-glanceclient for ref refs/tags/3.1.2 failed In-Reply-To: References: Message-ID: On 8/12/20 4:36 AM, zuul at openstack.org wrote: > Build failed. > > - openstack-upload-github-mirror https://zuul.opendev.org/t/openstack/build/cba5dda29e8744059d637a97f358c59f : SUCCESS in 43s > - release-openstack-python https://zuul.opendev.org/t/openstack/build/b9628cf4f28d4bea95844539295ff520 : SUCCESS in 2m 54s > - announce-release https://zuul.opendev.org/t/openstack/build/9f9a5910815247049bf02ab612781620 : FAILURE in 24m 20s > - propose-update-constraints https://zuul.opendev.org/t/openstack/build/44353ec832794af2965e7e2d05d63442 : SUCCESS in 3m 28s announce-release job appears to have failed due to a temporary network issue accessing PyPi packages. Since the release announcement is not critical, no further action is needed. Sean From jonathan.rosser at rd.bbc.co.uk Wed Aug 12 13:28:17 2020 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Wed, 12 Aug 2020 14:28:17 +0100 Subject: DNS server instead of /etc/hosts file on Infra Server In-Reply-To: References: Message-ID: <7db29753-4710-a979-fe71-67a829fa55c3@rd.bbc.co.uk> Openstack-Ansible already supports optionally using the unbound dns server instead of managing /etc/hosts. Join #openstack-ansible on IRC if you need any help. Regards, Jonathan. On 12/08/2020 14:03, pwm wrote: > Hi, > Plan to use PowerDNS server instead of the /etc/hosts file for > resolving. Has anyone done this before? > The PowerDNS support MySQL DB backend and a frontend GUI PowerDNS > Admin which allows centralized easy maintenance. > > Thanks > > On Sun, Aug 9, 2020 at 11:54 PM pwm > wrote: > > Hi, > Anyone interested in replacing the /etc/hosts file entry with a > DNS server on the openstack-ansible deployment? > > Thank you > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Wed Aug 12 14:10:46 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 12 Aug 2020 10:10:46 -0400 Subject: [tc] monthly meeting summary Message-ID: Hi everyone, Here’s a summary of what happened in our TC monthly meeting last Thursday, August 6. # ATTENDEES (LINES SAID) - mnaser (100) - gmann (43) - diablo_rojo (20) - jungleboyj (16) - belmoreira (10) - evrardjp (8) - fungi (6) - zaneb (4) - knikolla (4) - njohnston (3) # MEETING SUMMARY 1. Rollcall (mnaser, 14:00:21) 2. 
Follow up on past action items (mnaser, 14:02:16) - https://review.opendev.org/#/c/744995/ (diablo_rojo, 14:03:14) - http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016336.html (gmann, 14:03:24) 3. OpenStack User-facing APIs and CLIs (belmoreira) (mnaser, 14:25:52) 4. W cycle goal selection start (mnaser, 14:34:39) 5. Completion of retirement cleanup (gmann) (mnaser, 14:40:48) - https://etherpad.opendev.org/p/tc-retirement-cleanup (mnaser, 14:41:02) - https://review.opendev.org/#/c/739291/1 (gmann, 14:42:17) - https://review.opendev.org/#/q/topic:cleanup-retirement+(status:open+OR+status:merged) (gmann, 14:42:51) # ACTION ITEMS - TC members to follow up and review "Resolution to define distributed leadership for projects" - mnaser schedule session with sig-arch and k8s steering committee - gmann continue to audit and clean-up tags - mnaser propose change to implement weekly meetings - njohnston and mugsie to work on getting goals groomed/proposed for W cycle - belmoreira start discussion around openstack user-facing apis & clis - gmann to merge changes to properly retire projects To read the full logs of the meeting, please refer to http://eavesdrop.openstack.org/meetings/tc/2020/tc.2020-08-06-14.00.log.html Thank you, Mohammed -- Mohammed Naser VEXXHOST, Inc. From mnaser at vexxhost.com Wed Aug 12 14:22:53 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 12 Aug 2020 10:22:53 -0400 Subject: [largescale-sig] RPC ping In-Reply-To: <20200806141132.GO31915@sync> References: <20200727095744.GK31915@sync> <20200806141132.GO31915@sync> Message-ID: On Thu, Aug 6, 2020 at 10:11 AM Arnaud Morin wrote: > > Hi Mohammed, > > 1 - That's something we would also like, but it's beyond the patch I > propose. > I need this patch not only for kubernetes, but also for monitoring my > legagy openstack agents running outside of k8s. > > 2 - Yes, latest version of rabbitmq is better on that point, but we > still see some weird issue (I will ask the community about it in another > topic). > > 3 - Thanks for this operator, we'll take a look! > By saying 1 rabbit per service, I understand 1 server, not 1 cluster, > right? > That sounds risky if you lose the server. The controllers are pretty stable and if a controller dies, Kubernetes will take care of restarting the pod somewhere else and everything will reconnect and things will be happy again. > I suppose you dont do that for the database? One database cluster per service, with 'old-school' replication because no one really does true multimaster in Galera with OpenStack anyways. > 4 - Nice, how to you monitor those consumptions? Using rabbit management > API? Prometheus RabbitMQ exporter, now migrating to the native one shipping in the new RabbitMQ releases. > Cheers, > > -- > Arnaud Morin > > On 03.08.20 - 10:21, Mohammed Naser wrote: > > I have a few operational suggestions on how I think we could do this best: > > > > 1. I think exposing a healthcheck endpoint that _actually_ runs the > > ping and responds with a 200 OK makes a lot more sense in terms of > > being able to run it inside something like Kubernetes, you end up with > > a "who makes the ping and who responds to it" type of scenario which > > can be tricky though I'm sure we can figure that out > > 2. I've found that newer releases of RabbitMQ really help with those > > un-usable queues after a split, I haven't had any issues at all with > > newer releases, so that could be something to help your life be a lot > > easier. > > 3. 
You mentioned you're moving towards Kubernetes, we're doing the > > same and building an operator: > > https://opendev.org/vexxhost/openstack-operator -- Because the > > operator manages the whole thing and Kubernetes does it's thing too, > > we started moving towards 1 (single) rabbitmq per service, which > > reaaaaaaally helped a lot in stabilizing things. Oslo messaging is a > > lot better at recovering when a single service IP is pointing towards > > it because it doesn't do weird things like have threads trying to > > connect to other Rabbit ports. Just a thought. > > 4. In terms of telemetry and making sure you avoid that issue, we > > track the consumption rates of queues inside OpenStack. OpenStack > > consumption rate should be constant and never growing, anytime it > > grows, we instantly detect that something is fishy. However, the > > other issue comes in that when you restart any openstack service, it > > 'forgets' all it's existing queues and then you have a set of building > > up queues until they automatically expire which happens around 30 > > minutes-ish, so it makes that alarm of "things are not being consumed" > > a little noisy if you're restarting services > > > > Sorry for the wall of super unorganized text, all over the place here > > but thought I'd chime in with my 2 cents :) > > > > On Mon, Jul 27, 2020 at 6:04 AM Arnaud Morin wrote: > > > > > > Hey all, > > > > > > TLDR: I propose a change to oslo_messaging to allow doing a ping over RPC, > > > this is useful to monitor liveness of agents. > > > > > > > > > Few weeks ago, I proposed a patch to oslo_messaging [1], which is adding a > > > ping endpoint to RPC dispatcher. > > > It means that every openstack service which is using oslo_messaging RPC > > > endpoints (almosts all OpenStack services and agents - e.g. neutron > > > server + agents, nova + computes, etc.) will then be able to answer to a > > > specific "ping" call over RPC. > > > > > > I decided to propose this patch in my company mainly for 2 reasons: > > > 1 - we are struggling monitoring our nova compute and neutron agents in a > > > correct way: > > > > > > 1.1 - sometimes our agents are disconnected from RPC, but the python process > > > is still running. > > > 1.2 - sometimes the agent is still connected, but the queue / binding on > > > rabbit cluster is not working anymore (after a rabbit split for > > > example). This one is very hard to debug, because the agent is still > > > reporting health correctly on neutron server, but it's not able to > > > receive messages anymore. > > > > > > > > > 2 - we are trying to monitor agents running in k8s pods: > > > when running a python agent (neutron l3-agent for example) in a k8s pod, we > > > wanted to find a way to monitor if it is still live of not. > > > > > > > > > Adding a RPC ping endpoint could help us solve both these issues. > > > Note that we still need an external mechanism (out of OpenStack) to do this > > > ping. > > > We also think it could be nice for other OpenStackers, and especially > > > large scale ops. > > > > > > Feel free to comment. > > > > > > > > > [1] https://review.opendev.org/#/c/735385/ > > > > > > > > > -- > > > Arnaud Morin > > > > > > > > > > > > -- > > Mohammed Naser > > VEXXHOST, Inc. -- Mohammed Naser VEXXHOST, Inc. 
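As a rough illustration of the queue-level tracking described in point 4 above, a check against the RabbitMQ management plugin's HTTP API could look like the sketch below. Host, credentials and the backlog threshold are placeholders, not anyone's production values:

import requests

RABBIT_API = 'http://rabbit.example.com:15672/api/queues/%2F'  # %2F = default vhost
AUTH = ('monitoring', 'secret')  # a read-only management user

def find_suspect_queues(backlog_threshold=100):
    suspects = []
    for queue in requests.get(RABBIT_API, auth=AUTH, timeout=10).json():
        messages = queue.get('messages') or 0
        consumers = queue.get('consumers') or 0
        # A queue that keeps messages around with no consumers usually means
        # an agent lost its subscription (or the binding is gone) even though
        # its process still looks alive.
        if consumers == 0 or messages > backlog_threshold:
            suspects.append((queue['name'], messages, consumers))
    return suspects

if __name__ == '__main__':
    for name, messages, consumers in find_suspect_queues():
        print('%s: %d messages, %d consumers' % (name, messages, consumers))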
From dev.faz at gmail.com Wed Aug 12 14:25:49 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Wed, 12 Aug 2020 16:25:49 +0200 Subject: [nova][neutron][oslo][ops] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> Message-ID: Hi, just could prove, that "durable queues" seem to workaround the issue. If I enable durable queues, im no longer able to reproduce my issue. Afaik durable queues have downsides - esp if a node fails and the queue is not (jet) synced. Anyone information about this? Fabian Am Mi., 12. Aug. 2020 um 12:14 Uhr schrieb Fabian Zimmermann < dev.faz at gmail.com>: > Hi, > > just wrote some small scripts to reproduce our issue and send a msg to > rabbitmq-list. > > https://groups.google.com/d/msg/rabbitmq-users/eC8jc-YEt8s/s8K_0KnXDQAJ > > Fabian > > > Am Di., 11. Aug. 2020 um 12:31 Uhr schrieb Thierry Carrez < > thierry at openstack.org>: > >> If you can reproduce it with current versions, I would suggest to file >> an issue on https://github.com/rabbitmq/rabbitmq-server/issues/ >> >> The behavior you describe seems to match >> https://github.com/rabbitmq/rabbitmq-server/issues/1873 but the >> maintainers seem to think it's been fixed by a number of >> somewhat-related changes in 3.7.13, because nobody reported issues >> anymore :) >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Wed Aug 12 14:30:00 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 12 Aug 2020 10:30:00 -0400 Subject: [tc] weekly summary Message-ID: Hi everyone, Here’s an update for what happened in the OpenStack TC this week. You can get more information by checking for changes in openstack/governance repository. We've also included a few references to some important mailing list threads that you should check out. 
# Patches ## Open Reviews - Create starter-kit:kubernetes-in-virt tag https://review.opendev.org/736369 - Move towards dual office hours https://review.opendev.org/745201 - Clean up expired i18n SIG extra-ATCs https://review.opendev.org/745565 - Resolution to define distributed leadership for projects https://review.opendev.org/744995 - Move towards single office hour https://review.opendev.org/745200 - Add legacy repository validation https://review.opendev.org/737559 - Drop all exceptions for legacy validation https://review.opendev.org/745403 - [draft] Add assert:supports-standalone https://review.opendev.org/722399 - Clean up expired i18n SIG extra-ATCs https://review.opendev.org/745565 - Add legacy repository validation https://review.opendev.org/737559 - Pierre Riteau as CloudKitty PTL for Victoria https://review.opendev.org/745653 - Resolution to define distributed leadership for projects https://review.opendev.org/744995 - Create starter-kit:kubernetes-in-virt tag https://review.opendev.org/736369 - Move towards dual office hours https://review.opendev.org/745201 - Move towards single office hour https://review.opendev.org/745200 ## Project Updates - Deprecate os_congress project https://review.opendev.org/742533 - Add Ceph iSCSI charm to OpenStack charms https://review.opendev.org/744480 - Add Keystone Kerberos charm to OpenStack charms https://review.opendev.org/743769 - Add python-dracclient to be owned by Hardware Vendor SIG https://review.opendev.org/745564 ## General Changes - Reverse sort series in selected goals https://review.opendev.org/744897 - Declare supported runtimes for Wallaby release https://review.opendev.org/743847 - Sort SIG names in repo owner list https://review.opendev.org/745563 - Drop neutron-vpnaas from legacy projects https://review.opendev.org/745401 ## Abandoned Changes - Migrate testing to ubuntu focal https://review.opendev.org/740851 # Email Threads - CloudKitty Status: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016171.html - OSC vs python-*clients: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016409.html - Proposed Wallaby Schedule: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016391.html - New Office Hour Plans: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016372.html # Other Reminders - Cycle-Trailing Release Deadline Aug 13 Thanks for reading! Mohammed & Kendall -- Mohammed Naser VEXXHOST, Inc. From dev.faz at gmail.com Wed Aug 12 15:03:40 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Wed, 12 Aug 2020 17:03:40 +0200 Subject: [largescale-sig] RPC ping In-Reply-To: References: <20200727095744.GK31915@sync> <20200806141132.GO31915@sync> Message-ID: Hi, Am Mi., 12. Aug. 2020 um 16:30 Uhr schrieb Mohammed Naser < mnaser at vexxhost.com>: > On Thu, Aug 6, 2020 at 10:11 AM Arnaud Morin > wrote: > The controllers are pretty stable and if a controller dies, Kubernetes > will take care of restarting the pod somewhere else and everything > will reconnect and things will be happy again. > sounds really interesting. Do you have any docs how to use / do a poc of this setup? Fabian -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sean.mcginnis at gmx.com Wed Aug 12 15:21:31 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 12 Aug 2020 10:21:31 -0500 Subject: [all] Proposed Wallaby cycle schedule In-Reply-To: <2e56de68-c416-e3ea-f3da-caaf9399287d@gmx.com> References: <2e56de68-c416-e3ea-f3da-caaf9399287d@gmx.com> Message-ID: <0083db2a-0ef7-99fa-0c45-fd170f7d7902@gmx.com> > > The current thinking is it will likely take place in May (nothing is > set, just an educated guess, so please don't use that for any other > planning). So for the sake of figuring out the release schedule, we are > targeting a release date in early May. Hopefully this will then align > well with event plans. > > I have a proposed release schedule up for review here: > > https://review.opendev.org/#/c/744729/ > > For ease of viewing (until the job logs are garbage collected), you can > see the rendered schedule here: > > https://0e6b8aeca433e85b429b-46fd243db6dc394538bd0555f339eba5.ssl.cf1.rackcdn.com/744729/3/check/openstack-tox-docs/4f76901/docs/wallaby/schedule.html > > > There are always outside conflicts, but I think this has aligned mostly > well with major holidays. But please feel free to comment on the patch > if you see any major issues that we may have not considered. > One more update to this. Some concerns were raised around alignment with the planned Ubuntu release schedule. Plus some general sentiment for wanting to be closer to a 6 month schedule. As an alternative option, I have proposed a 26 week option: https://review.opendev.org/#/c/745911/ This would mean there would be a largish gap between when the X release starts and when we might hold the PTG for that development. That could be good or bad. Depending on the in-person event situation, it is also unknown if we would need to wait for a larger scheduled event, or if we would be able to hold a virtual event sooner. So lot's of unknowns. Getting community feedback on these options would be useful. If one schedule or the other seems better to you, please add comments to the patches. Here is a rendered schedule for the 26 week option: https://f7c086752d1ed6ae5f02-3fd01ef5e4a590ae96edf7e9bfcef60c.ssl.cf1.rackcdn.com/745911/2/check/openstack-tox-docs/5be238a/docs/wallaby/schedule.html Thanks! Sean From openstack at nemebean.com Wed Aug 12 15:50:21 2020 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 12 Aug 2020 10:50:21 -0500 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> Message-ID: <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> On 8/12/20 5:32 AM, Thierry Carrez wrote: > Sean Mooney wrote: >> On Tue, 2020-08-11 at 15:20 -0500, Ben Nemec wrote: >>> I wonder if this does help though. It seems like a bug that a >>> nova-compute service would stop processing messages and still be seen >>> as up in the service status. Do we understand why that is happening? >>> If not, I'm unclear that a ping living at the oslo.messaging layer is >>> going to do a better job of exposing such an outage. The fact that >>> oslo.messaging is responding does not necessarily equate to >>> nova-compute functioning as expected. >>> >>> To be clear, this is not me nacking the ping feature. 
I just want to >>> make sure we understand what is going on here so we don't add another >>> unreliable healthchecking mechanism to the one we already have. >> [...] >> im not sure https://bugs.launchpad.net/nova/+bug/1854992 is the bug >> that is motiviting the creation of this oslo ping >> feature but that feels premature if it is. i think it would be better >> try to adress this by the sender recreating the >> queue if the deliver fails and if that is not viable then protpyope >> thge fix in nova. if the self ping fixes this >> miss queue error then we could extract the cod into oslo. > > I think this is missing the point... This is not about working around a > specific bug, it's about adding a way to detect a certain class of > failure. It's more of an operational feature than a development bugfix. > > If I understood correctly, OVH is running that patch in production as a > way to detect certain problems they regularly run into, something our > existing monitor mechanisms fail to detect. That sounds like a > worthwhile addition? Okay, I don't think I was aware that this was already being used. If someone already finds it useful and it's opt-in then I'm not inclined to block it. My main concern was that we were adding a feature that didn't actually address the problem at hand. I _would_ feel better about it if someone could give an example of a type of failure this is detecting that is missed by other monitoring methods though. Both because having a concrete example of a use case for the feature is good, and because if it turns out that the problems this is detecting are things like the Nova bug Sean is talking about (which I don't think this would catch anyway, since the topic is missing and there's nothing to ping) then there may be other changes we can/should make to improve things. > > Alternatively, if we can monitor the exact same class of failures using > our existing systems (or by improving them rather than adding a new > door), that works too. > From thierry at openstack.org Wed Aug 12 16:14:58 2020 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 12 Aug 2020 18:14:58 +0200 Subject: [largescale-sig] Next meeting: August 12, 16utc In-Reply-To: <6e7a4e43-08f4-3030-2eb0-9311f27d9647@openstack.org> References: <6e7a4e43-08f4-3030-2eb0-9311f27d9647@openstack.org> Message-ID: We just held the meeting, it was very short, as only mdelavergne and myself were on. None of the expected US-based recruits joined. We'll likely have to beat a larger drum for the next US-EU meeting in 4 weeks. Meeting logs at: http://eavesdrop.openstack.org/meetings/large_scale_sig/2020/large_scale_sig.2020-08-12-16.00.html TODOs: - amorin to add some meat to the wiki page before we push the Nova doc patch further - all to describe briefly how you solved metrics/billing in your deployment in https://etherpad.openstack.org/p/large-scale-sig-documentation Next meetings: Aug 26, 8:00UTC; Sep 9, 16:00UTC (#openstack-meeting-3) -- Thierry Carrez (ttx) From mnaser at vexxhost.com Wed Aug 12 16:56:12 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 12 Aug 2020 12:56:12 -0400 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez wrote: > > Thomas Goirand wrote: > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: > >> Thanks, Pierre for helping with this. 
> >> > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) ) > >> but I am not sure if he got any response back. > > No response so far, but they may all be in company summer vacation. > > > The end of the very good maintenance of Cloudkitty matched the date when > > objectif libre was sold to Linkbynet. Maybe the new owner don't care enough? > > > > This is very disappointing as I've been using it for some time already, > > and that I was satisfied by it (ie: it does the job...), and especially > > that latest releases are able to scale correctly. > > > > I very much would love if Pierre Riteau was successful in taking over. > > Good luck Pierre! I'll try to help whenever I can and if I'm not too busy. > > Given the volunteers (Pierre, Rafael, Luis) I would support the TC using > its unholy powers to add extra core reviewers to cloudkitty. https://review.opendev.org/#/c/745653 is currently merging and fungi will be adding Pierre as a core. Thank you for helping. > If the current PTL comes back, I'm sure they will appreciate the help, > and can always fix/revert things before Victoria release. > > -- > Thierry Carrez (ttx) > -- Mohammed Naser VEXXHOST, Inc. From openstack at nemebean.com Wed Aug 12 17:02:34 2020 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 12 Aug 2020 12:02:34 -0500 Subject: [tc][oslo] Move etcd3gw to OpenStack? In-Reply-To: References: Message-ID: <0c77a128-acea-75f3-faa2-b3d79c3991aa@nemebean.com> I'm fine with this for all the reasons mentioned below. It's not a high volume project so it shouldn't be a big problem to bring it into Oslo. Plus it would give us an excuse to make dims an Oslo core again. ;-) On 8/12/20 6:45 AM, Radosław Piliszek wrote: > Hey, Folks! > > I see it has been kinda proposed already [1] so that's mostly why I am > asking about that now. > > From what I understand, etcd3gw is our best bet when trying to get > coordination with etcd3. > However, it has recently released a broken release [2] due to no testing > (not to mention gating with tooz). > I think it could benefit from OpenDev's existing tooling. > And since OpenStack is an important client of it and OpenStack > preferring this client, it might be wise to put it in that namespace > already. > > I guess the details would have to be discussed with dims (current owner) > himself but he seemed happy about it in [1]. > > I'm notifying oslo as well as this would probably live best finally > under oslo governance. > > Please let me know if any of the above is not true, so that I can amend > my knowledge. :-) > > [1] https://github.com/dims/etcd3-gateway/issues/29 > [2] https://bugs.launchpad.net/kolla-ansible/+bug/1891314 > > -yoctozepto > From rafaelweingartner at gmail.com Wed Aug 12 17:06:53 2020 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Wed, 12 Aug 2020 14:06:53 -0300 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Awesome! Thank you guys for the help. We have few PRs open there that are ready (or close to be ready) to be merged. On Wed, Aug 12, 2020 at 1:59 PM Mohammed Naser wrote: > On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez > wrote: > > > > Thomas Goirand wrote: > > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: > > >> Thanks, Pierre for helping with this. > > >> > > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) < > justin.ferrieu at objectif-libre.com>) > > >> but I am not sure if he got any response back. 
> > > > No response so far, but they may all be in company summer vacation. > > > > > The end of the very good maintenance of Cloudkitty matched the date > when > > > objectif libre was sold to Linkbynet. Maybe the new owner don't care > enough? > > > > > > This is very disappointing as I've been using it for some time already, > > > and that I was satisfied by it (ie: it does the job...), and especially > > > that latest releases are able to scale correctly. > > > > > > I very much would love if Pierre Riteau was successful in taking over. > > > Good luck Pierre! I'll try to help whenever I can and if I'm not too > busy. > > > > Given the volunteers (Pierre, Rafael, Luis) I would support the TC using > > its unholy powers to add extra core reviewers to cloudkitty. > > https://review.opendev.org/#/c/745653 is currently merging and fungi will > be > adding Pierre as a core. > > Thank you for helping. > > > If the current PTL comes back, I'm sure they will appreciate the help, > > and can always fix/revert things before Victoria release. > > > > -- > > Thierry Carrez (ttx) > > > > > -- > Mohammed Naser > VEXXHOST, Inc. > > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Wed Aug 12 20:37:05 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 12 Aug 2020 22:37:05 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: I have now received core reviewer privileges. Thank you to TC members for trusting us with the CloudKitty project. I would like to kick things off by resuming IRC meetings. They're set to run every two weeks (on odd weeks) on Monday at 1400 UTC in #cloudkitty. Is this a convenient time slot for all potential contributors to the project? On Wed, 12 Aug 2020 at 19:08, Rafael Weingärtner wrote: > > Awesome! Thank you guys for the help. > We have few PRs open there that are ready (or close to be ready) to be merged. > > On Wed, Aug 12, 2020 at 1:59 PM Mohammed Naser wrote: >> >> On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez wrote: >> > >> > Thomas Goirand wrote: >> > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: >> > >> Thanks, Pierre for helping with this. >> > >> >> > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) ) >> > >> but I am not sure if he got any response back. >> > >> > No response so far, but they may all be in company summer vacation. >> > >> > > The end of the very good maintenance of Cloudkitty matched the date when >> > > objectif libre was sold to Linkbynet. Maybe the new owner don't care enough? >> > > >> > > This is very disappointing as I've been using it for some time already, >> > > and that I was satisfied by it (ie: it does the job...), and especially >> > > that latest releases are able to scale correctly. >> > > >> > > I very much would love if Pierre Riteau was successful in taking over. >> > > Good luck Pierre! I'll try to help whenever I can and if I'm not too busy. >> > >> > Given the volunteers (Pierre, Rafael, Luis) I would support the TC using >> > its unholy powers to add extra core reviewers to cloudkitty. >> >> https://review.opendev.org/#/c/745653 is currently merging and fungi will be >> adding Pierre as a core. >> >> Thank you for helping. >> >> > If the current PTL comes back, I'm sure they will appreciate the help, >> > and can always fix/revert things before Victoria release. 
>> > >> > -- >> > Thierry Carrez (ttx) >> > >> >> >> -- >> Mohammed Naser >> VEXXHOST, Inc. >> > > > -- > Rafael Weingärtner From rafaelweingartner at gmail.com Wed Aug 12 20:40:57 2020 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Wed, 12 Aug 2020 17:40:57 -0300 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Sounds good to me. On Wed, Aug 12, 2020 at 5:37 PM Pierre Riteau wrote: > I have now received core reviewer privileges. Thank you to TC members > for trusting us with the CloudKitty project. > > I would like to kick things off by resuming IRC meetings. They're set > to run every two weeks (on odd weeks) on Monday at 1400 UTC in > #cloudkitty. Is this a convenient time slot for all potential > contributors to the project? > > On Wed, 12 Aug 2020 at 19:08, Rafael Weingärtner > wrote: > > > > Awesome! Thank you guys for the help. > > We have few PRs open there that are ready (or close to be ready) to be > merged. > > > > On Wed, Aug 12, 2020 at 1:59 PM Mohammed Naser > wrote: > >> > >> On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez > wrote: > >> > > >> > Thomas Goirand wrote: > >> > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: > >> > >> Thanks, Pierre for helping with this. > >> > >> > >> > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) < > justin.ferrieu at objectif-libre.com>) > >> > >> but I am not sure if he got any response back. > >> > > >> > No response so far, but they may all be in company summer vacation. > >> > > >> > > The end of the very good maintenance of Cloudkitty matched the date > when > >> > > objectif libre was sold to Linkbynet. Maybe the new owner don't > care enough? > >> > > > >> > > This is very disappointing as I've been using it for some time > already, > >> > > and that I was satisfied by it (ie: it does the job...), and > especially > >> > > that latest releases are able to scale correctly. > >> > > > >> > > I very much would love if Pierre Riteau was successful in taking > over. > >> > > Good luck Pierre! I'll try to help whenever I can and if I'm not > too busy. > >> > > >> > Given the volunteers (Pierre, Rafael, Luis) I would support the TC > using > >> > its unholy powers to add extra core reviewers to cloudkitty. > >> > >> https://review.opendev.org/#/c/745653 is currently merging and fungi > will be > >> adding Pierre as a core. > >> > >> Thank you for helping. > >> > >> > If the current PTL comes back, I'm sure they will appreciate the help, > >> > and can always fix/revert things before Victoria release. > >> > > >> > -- > >> > Thierry Carrez (ttx) > >> > > >> > >> > >> -- > >> Mohammed Naser > >> VEXXHOST, Inc. > >> > > > > > > -- > > Rafael Weingärtner > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.ramirez at opencloud.es Wed Aug 12 21:38:30 2020 From: luis.ramirez at opencloud.es (Luis Ramirez) Date: Wed, 12 Aug 2020 23:38:30 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Sounds good to me. El El mié, 12 ago 2020 a las 22:41, Rafael Weingärtner < rafaelweingartner at gmail.com> escribió: > Sounds good to me. > > On Wed, Aug 12, 2020 at 5:37 PM Pierre Riteau wrote: > >> I have now received core reviewer privileges. Thank you to TC members >> >> >> for trusting us with the CloudKitty project. 
>> >> >> >> >> >> I would like to kick things off by resuming IRC meetings. They're set >> >> >> to run every two weeks (on odd weeks) on Monday at 1400 UTC in >> >> >> #cloudkitty. Is this a convenient time slot for all potential >> >> >> contributors to the project? >> >> >> >> >> >> On Wed, 12 Aug 2020 at 19:08, Rafael Weingärtner >> >> >> wrote: >> >> >> > >> >> >> > Awesome! Thank you guys for the help. >> >> >> > We have few PRs open there that are ready (or close to be ready) to be >> merged. >> >> >> > >> >> >> > On Wed, Aug 12, 2020 at 1:59 PM Mohammed Naser >> wrote: >> >> >> >> >> >> >> >> On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez >> wrote: >> >> >> >> > >> >> >> >> > Thomas Goirand wrote: >> >> >> >> > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: >> >> >> >> > >> Thanks, Pierre for helping with this. >> >> >> >> > >> >> >> >> >> > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) < >> justin.ferrieu at objectif-libre.com>) >> >> >> >> > >> but I am not sure if he got any response back. >> >> >> >> > >> >> >> >> > No response so far, but they may all be in company summer vacation. >> >> >> >> > >> >> >> >> > > The end of the very good maintenance of Cloudkitty matched the >> date when >> >> >> >> > > objectif libre was sold to Linkbynet. Maybe the new owner don't >> care enough? >> >> >> >> > > >> >> >> >> > > This is very disappointing as I've been using it for some time >> already, >> >> >> >> > > and that I was satisfied by it (ie: it does the job...), and >> especially >> >> >> >> > > that latest releases are able to scale correctly. >> >> >> >> > > >> >> >> >> > > I very much would love if Pierre Riteau was successful in taking >> over. >> >> >> >> > > Good luck Pierre! I'll try to help whenever I can and if I'm not >> too busy. >> >> >> >> > >> >> >> >> > Given the volunteers (Pierre, Rafael, Luis) I would support the TC >> using >> >> >> >> > its unholy powers to add extra core reviewers to cloudkitty. >> >> >> >> >> >> >> >> https://review.opendev.org/#/c/745653 is currently merging and fungi >> will be >> >> >> >> adding Pierre as a core. >> >> >> >> >> >> >> >> Thank you for helping. >> >> >> >> >> >> >> >> > If the current PTL comes back, I'm sure they will appreciate the >> help, >> >> >> >> > and can always fix/revert things before Victoria release. >> >> >> >> > >> >> >> >> > -- >> >> >> >> > Thierry Carrez (ttx) >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> -- >> >> >> >> Mohammed Naser >> >> >> >> VEXXHOST, Inc. >> >> >> >> >> >> >> > >> >> >> > >> >> >> > -- >> >> >> > Rafael Weingärtner >> >> >> > > -- > Rafael Weingärtner > > > -- Br, Luis Rmz Blockchain, DevOps & Open Source Cloud Solutions Architect ---------------------------------------- Founder & CEO OpenCloud.es luis.ramirez at opencloud.es Skype ID: d.overload Hangouts: luis.ramirez at opencloud.es +34 911 950 123 / +39 392 1289553 / +49 152 26917722 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From melwittt at gmail.com Wed Aug 12 23:22:38 2020 From: melwittt at gmail.com (melanie witt) Date: Wed, 12 Aug 2020 16:22:38 -0700 Subject: [gate][keystone][nova][neutron] *-grenade-multinode jobs failing with UnicodeDecodeError in keystone In-Reply-To: <18572a52-d105-9219-6b19-5fe23f18e3e0@gmail.com> References: <45926788-6dcf-8825-5bfd-b6353b5facf0@gmail.com> <67a115ba-f80a-ebe5-8689-922e3bbb9a40@gmail.com> <18572a52-d105-9219-6b19-5fe23f18e3e0@gmail.com> Message-ID: On 8/12/20 00:49, melanie witt wrote: >>> I've opened a bug [3] about and I'm trying out the following keystone >>> patch to fix it: >>> >>> https://review.opendev.org/745752 >>> >>> Reviews appreciated. > > I tested ^ with a DNM patch to nova and nova-grenade-multinode passes > with it. The fix has merged and it is now safe to recheck your patches. Thank you all for the code reviews. Cheers, -melanie >>> [1] >>> https://github.com/msgpack/msgpack-python/blob/v1.0.0/README.md#major-breaking-changes-in-msgpack-10 >>> >>> [2] https://review.opendev.org/#/c/745437/2/upper-constraints.txt at 373 >>> [3] https://launchpad.net/bugs/1891244 From jayadityagupta11 at gmail.com Thu Aug 13 08:03:01 2020 From: jayadityagupta11 at gmail.com (jayaditya gupta) Date: Thu, 13 Aug 2020 10:03:01 +0200 Subject: [openstackclient] Implementing nova migration cmds in OSC Message-ID: Hello , i am trying to implement some nova migrations commands to openstackclient. Commands 1. migration-list : list all migrations ever happened 2. server-migration-list : Get the migrations list of specified server 3. server-migration-show : show currently going on migration of specified server. 4. live migration abort feature 5.live-migration-force-complete Please have a look at this patch : https://review.opendev.org/#/c/742210/ and share your insight, what should be the correct way to implement it ? should it be a root command or part of the openstack server migrate command? Best Regards Jayaditya Gupta -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Thu Aug 13 08:24:26 2020 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 13 Aug 2020 10:24:26 +0200 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> Message-ID: Ben Nemec wrote: > On 8/12/20 5:32 AM, Thierry Carrez wrote: >> Sean Mooney wrote: >>> On Tue, 2020-08-11 at 15:20 -0500, Ben Nemec wrote: >>>> I wonder if this does help though. It seems like a bug that a >>>> nova-compute service would stop processing messages and still be >>>> seen as up in the service status. Do we understand why that is >>>> happening? If not, I'm unclear that a ping living at the >>>> oslo.messaging layer is going to do a better job of exposing such an >>>> outage. The fact that oslo.messaging is responding does not >>>> necessarily equate to nova-compute functioning as expected. >>>> >>>> To be clear, this is not me nacking the ping feature. I just want to >>>> make sure we understand what is going on here so we don't add >>>> another unreliable healthchecking mechanism to the one we already have. >>> [...] 
>>> im not sure https://bugs.launchpad.net/nova/+bug/1854992 is the bug >>> that is motiviting the creation of this oslo ping >>> feature but that feels premature if it is. i think it would be better >>> try to adress this by the sender recreating the >>> queue if the deliver fails and if that is not viable then protpyope >>> thge fix in nova. if the self ping fixes this >>> miss queue error then we could extract the cod into oslo. >> >> I think this is missing the point... This is not about working around >> a specific bug, it's about adding a way to detect a certain class of >> failure. It's more of an operational feature than a development bugfix. >> >> If I understood correctly, OVH is running that patch in production as >> a way to detect certain problems they regularly run into, something >> our existing monitor mechanisms fail to detect. That sounds like a >> worthwhile addition? > > Okay, I don't think I was aware that this was already being used. If > someone already finds it useful and it's opt-in then I'm not inclined to > block it. My main concern was that we were adding a feature that didn't > actually address the problem at hand. > > I _would_ feel better about it if someone could give an example of a > type of failure this is detecting that is missed by other monitoring > methods though. Both because having a concrete example of a use case for > the feature is good, and because if it turns out that the problems this > is detecting are things like the Nova bug Sean is talking about (which I > don't think this would catch anyway, since the topic is missing and > there's nothing to ping) then there may be other changes we can/should > make to improve things. Right. Let's wait for Arnaud to come back from vacation and confirm that (1) that patch is not a shot in the dark: it allows them to expose a class of issues in production (2) they fail to expose that same class of issues using other existing mechanisms, including those just suggested in this thread I just wanted to avoid early rejection of this health check ability on the grounds that the situation it exposes should just not happen. Or that, if enabled and heavily used, it would have a performance impact. -- Thierry Carrez (ttx) From pierre at stackhpc.com Thu Aug 13 11:35:42 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Thu, 13 Aug 2020 13:35:42 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Thank you both. I've merged a few patches to fix CI and finalise the Ussuri release (for example release notes were missing). I gave core reviewer privileges to Rafael and Luis. Let's try to merge patches with two +2 votes from now on. On Wed, 12 Aug 2020 at 23:38, Luis Ramirez wrote: > > Sounds good to me. > > El El mié, 12 ago 2020 a las 22:41, Rafael Weingärtner escribió: >> >> Sounds good to me. >> >> On Wed, Aug 12, 2020 at 5:37 PM Pierre Riteau wrote: >>> >>> I have now received core reviewer privileges. Thank you to TC members >>> >>> >>> for trusting us with the CloudKitty project. >>> >>> >>> >>> >>> >>> I would like to kick things off by resuming IRC meetings. They're set >>> >>> >>> to run every two weeks (on odd weeks) on Monday at 1400 UTC in >>> >>> >>> #cloudkitty. Is this a convenient time slot for all potential >>> >>> >>> contributors to the project? >>> >>> >>> >>> >>> >>> On Wed, 12 Aug 2020 at 19:08, Rafael Weingärtner >>> >>> >>> wrote: >>> >>> >>> > >>> >>> >>> > Awesome! 
Thank you guys for the help. >>> >>> >>> > We have few PRs open there that are ready (or close to be ready) to be merged. >>> >>> >>> > >>> >>> >>> > On Wed, Aug 12, 2020 at 1:59 PM Mohammed Naser wrote: >>> >>> >>> >> >>> >>> >>> >> On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez wrote: >>> >>> >>> >> > >>> >>> >>> >> > Thomas Goirand wrote: >>> >>> >>> >> > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: >>> >>> >>> >> > >> Thanks, Pierre for helping with this. >>> >>> >>> >> > >> >>> >>> >>> >> > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) ) >>> >>> >>> >> > >> but I am not sure if he got any response back. >>> >>> >>> >> > >>> >>> >>> >> > No response so far, but they may all be in company summer vacation. >>> >>> >>> >> > >>> >>> >>> >> > > The end of the very good maintenance of Cloudkitty matched the date when >>> >>> >>> >> > > objectif libre was sold to Linkbynet. Maybe the new owner don't care enough? >>> >>> >>> >> > > >>> >>> >>> >> > > This is very disappointing as I've been using it for some time already, >>> >>> >>> >> > > and that I was satisfied by it (ie: it does the job...), and especially >>> >>> >>> >> > > that latest releases are able to scale correctly. >>> >>> >>> >> > > >>> >>> >>> >> > > I very much would love if Pierre Riteau was successful in taking over. >>> >>> >>> >> > > Good luck Pierre! I'll try to help whenever I can and if I'm not too busy. >>> >>> >>> >> > >>> >>> >>> >> > Given the volunteers (Pierre, Rafael, Luis) I would support the TC using >>> >>> >>> >> > its unholy powers to add extra core reviewers to cloudkitty. >>> >>> >>> >> >>> >>> >>> >> https://review.opendev.org/#/c/745653 is currently merging and fungi will be >>> >>> >>> >> adding Pierre as a core. >>> >>> >>> >> >>> >>> >>> >> Thank you for helping. >>> >>> >>> >> >>> >>> >>> >> > If the current PTL comes back, I'm sure they will appreciate the help, >>> >>> >>> >> > and can always fix/revert things before Victoria release. >>> >>> >>> >> > >>> >>> >>> >> > -- >>> >>> >>> >> > Thierry Carrez (ttx) >>> >>> >>> >> > >>> >>> >>> >> >>> >>> >>> >> >>> >>> >>> >> -- >>> >>> >>> >> Mohammed Naser >>> >>> >>> >> VEXXHOST, Inc. >>> >>> >>> >> >>> >>> >>> > >>> >>> >>> > >>> >>> >>> > -- >>> >>> >>> > Rafael Weingärtner >>> >>> >> >> >> -- >> Rafael Weingärtner >> >> > -- > Br, > Luis Rmz > Blockchain, DevOps & Open Source Cloud Solutions Architect > ---------------------------------------- > Founder & CEO > OpenCloud.es > luis.ramirez at opencloud.es > Skype ID: d.overload > Hangouts: luis.ramirez at opencloud.es > +34 911 950 123 / +39 392 1289553 / +49 152 26917722 From rafaelweingartner at gmail.com Thu Aug 13 11:44:20 2020 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Thu, 13 Aug 2020 08:44:20 -0300 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Awesome, thanks! I will try to dedicate a few hours every week to review CloudKitty patches. On Thu, Aug 13, 2020 at 8:36 AM Pierre Riteau wrote: > Thank you both. > > I've merged a few patches to fix CI and finalise the Ussuri release > (for example release notes were missing). > I gave core reviewer privileges to Rafael and Luis. Let's try to merge > patches with two +2 votes from now on. > > On Wed, 12 Aug 2020 at 23:38, Luis Ramirez > wrote: > > > > Sounds good to me. 
> > > > El El mié, 12 ago 2020 a las 22:41, Rafael Weingärtner < > rafaelweingartner at gmail.com> escribió: > >> > >> Sounds good to me. > >> > >> On Wed, Aug 12, 2020 at 5:37 PM Pierre Riteau > wrote: > >>> > >>> I have now received core reviewer privileges. Thank you to TC members > >>> > >>> > >>> for trusting us with the CloudKitty project. > >>> > >>> > >>> > >>> > >>> > >>> I would like to kick things off by resuming IRC meetings. They're set > >>> > >>> > >>> to run every two weeks (on odd weeks) on Monday at 1400 UTC in > >>> > >>> > >>> #cloudkitty. Is this a convenient time slot for all potential > >>> > >>> > >>> contributors to the project? > >>> > >>> > >>> > >>> > >>> > >>> On Wed, 12 Aug 2020 at 19:08, Rafael Weingärtner > >>> > >>> > >>> wrote: > >>> > >>> > >>> > > >>> > >>> > >>> > Awesome! Thank you guys for the help. > >>> > >>> > >>> > We have few PRs open there that are ready (or close to be ready) to > be merged. > >>> > >>> > >>> > > >>> > >>> > >>> > On Wed, Aug 12, 2020 at 1:59 PM Mohammed Naser > wrote: > >>> > >>> > >>> >> > >>> > >>> > >>> >> On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez < > thierry at openstack.org> wrote: > >>> > >>> > >>> >> > > >>> > >>> > >>> >> > Thomas Goirand wrote: > >>> > >>> > >>> >> > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: > >>> > >>> > >>> >> > >> Thanks, Pierre for helping with this. > >>> > >>> > >>> >> > >> > >>> > >>> > >>> >> > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) < > justin.ferrieu at objectif-libre.com>) > >>> > >>> > >>> >> > >> but I am not sure if he got any response back. > >>> > >>> > >>> >> > > >>> > >>> > >>> >> > No response so far, but they may all be in company summer > vacation. > >>> > >>> > >>> >> > > >>> > >>> > >>> >> > > The end of the very good maintenance of Cloudkitty matched the > date when > >>> > >>> > >>> >> > > objectif libre was sold to Linkbynet. Maybe the new owner don't > care enough? > >>> > >>> > >>> >> > > > >>> > >>> > >>> >> > > This is very disappointing as I've been using it for some time > already, > >>> > >>> > >>> >> > > and that I was satisfied by it (ie: it does the job...), and > especially > >>> > >>> > >>> >> > > that latest releases are able to scale correctly. > >>> > >>> > >>> >> > > > >>> > >>> > >>> >> > > I very much would love if Pierre Riteau was successful in > taking over. > >>> > >>> > >>> >> > > Good luck Pierre! I'll try to help whenever I can and if I'm > not too busy. > >>> > >>> > >>> >> > > >>> > >>> > >>> >> > Given the volunteers (Pierre, Rafael, Luis) I would support the > TC using > >>> > >>> > >>> >> > its unholy powers to add extra core reviewers to cloudkitty. > >>> > >>> > >>> >> > >>> > >>> > >>> >> https://review.opendev.org/#/c/745653 is currently merging and > fungi will be > >>> > >>> > >>> >> adding Pierre as a core. > >>> > >>> > >>> >> > >>> > >>> > >>> >> Thank you for helping. > >>> > >>> > >>> >> > >>> > >>> > >>> >> > If the current PTL comes back, I'm sure they will appreciate the > help, > >>> > >>> > >>> >> > and can always fix/revert things before Victoria release. > >>> > >>> > >>> >> > > >>> > >>> > >>> >> > -- > >>> > >>> > >>> >> > Thierry Carrez (ttx) > >>> > >>> > >>> >> > > >>> > >>> > >>> >> > >>> > >>> > >>> >> > >>> > >>> > >>> >> -- > >>> > >>> > >>> >> Mohammed Naser > >>> > >>> > >>> >> VEXXHOST, Inc. 
> >>> > >>> > >>> >> > >>> > >>> > >>> > > >>> > >>> > >>> > > >>> > >>> > >>> > -- > >>> > >>> > >>> > Rafael Weingärtner > >>> > >>> > >> > >> > >> -- > >> Rafael Weingärtner > >> > >> > > -- > > Br, > > Luis Rmz > > Blockchain, DevOps & Open Source Cloud Solutions Architect > > ---------------------------------------- > > Founder & CEO > > OpenCloud.es > > luis.ramirez at opencloud.es > > Skype ID: d.overload > > Hangouts: luis.ramirez at opencloud.es > > +34 911 950 123 / +39 392 1289553 / +49 152 26917722 > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu Aug 13 12:14:06 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 13 Aug 2020 13:14:06 +0100 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> Message-ID: <6e68d1a3cfc4efff91d3668bb53805dc469673c6.camel@redhat.com> On Thu, 2020-08-13 at 10:24 +0200, Thierry Carrez wrote: > Ben Nemec wrote: > > On 8/12/20 5:32 AM, Thierry Carrez wrote: > > > Sean Mooney wrote: > > > > On Tue, 2020-08-11 at 15:20 -0500, Ben Nemec wrote: > > > > > I wonder if this does help though. It seems like a bug that a > > > > > nova-compute service would stop processing messages and still be > > > > > seen as up in the service status. Do we understand why that is > > > > > happening? If not, I'm unclear that a ping living at the > > > > > oslo.messaging layer is going to do a better job of exposing such an > > > > > outage. The fact that oslo.messaging is responding does not > > > > > necessarily equate to nova-compute functioning as expected. > > > > > > > > > > To be clear, this is not me nacking the ping feature. I just want to > > > > > make sure we understand what is going on here so we don't add > > > > > another unreliable healthchecking mechanism to the one we already have. > > > > > > > > [...] > > > > im not sure https://bugs.launchpad.net/nova/+bug/1854992 is the bug > > > > that is motiviting the creation of this oslo ping > > > > feature but that feels premature if it is. i think it would be better > > > > try to adress this by the sender recreating the > > > > queue if the deliver fails and if that is not viable then protpyope > > > > thge fix in nova. if the self ping fixes this > > > > miss queue error then we could extract the cod into oslo. > > > > > > I think this is missing the point... This is not about working around > > > a specific bug, it's about adding a way to detect a certain class of > > > failure. It's more of an operational feature than a development bugfix. > > > > > > If I understood correctly, OVH is running that patch in production as > > > a way to detect certain problems they regularly run into, something > > > our existing monitor mechanisms fail to detect. That sounds like a > > > worthwhile addition? > > > > Okay, I don't think I was aware that this was already being used. If > > someone already finds it useful and it's opt-in then I'm not inclined to > > block it. My main concern was that we were adding a feature that didn't > > actually address the problem at hand. 
> > > > I _would_ feel better about it if someone could give an example of a > > type of failure this is detecting that is missed by other monitoring > > methods though. Both because having a concrete example of a use case for > > the feature is good, and because if it turns out that the problems this > > is detecting are things like the Nova bug Sean is talking about (which I > > don't think this would catch anyway, since the topic is missing and > > there's nothing to ping) then there may be other changes we can/should > > make to improve things. > > Right. Let's wait for Arnaud to come back from vacation and confirm that > > (1) that patch is not a shot in the dark: it allows them to expose a > class of issues in production > > (2) they fail to expose that same class of issues using other existing > mechanisms, including those just suggested in this thread > > I just wanted to avoid early rejection of this health check ability on > the grounds that the situation it exposes should just not happen. Or > that, if enabled and heavily used, it would have a performance impact. I think the inital push back from nova is we already have ping rpc function https://github.com/openstack/nova/blob/c6218428e9b29a2c52808ec7d27b4b21aadc0299/nova/baserpc.py#L55-L76 so if a geneirc metion called ping is added it will break nova. the reset of the push back is related to not haveing a concrete usecase, including concern over perfroamce consideration and external services potenailly acessing the rpc bus which is coniserd an internal api. e.g. we woudl not want an external monitoring solution connecting to the rpc bus and invoking arbitary RPC calls, ping is well pretty safe but form a design point of view while litening to notification is fine we dont want anything outside of the openstack services actully sending message on the rpc bus. so if this does actully detect somethign we can otherwise detect and the use cases involves using it within the openstack services not form an external source then i think that is fine but we proably need to use another name (alive? status?) or otherewise modify nova so that there is no conflict. > From dev.faz at gmail.com Thu Aug 13 13:13:45 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Thu, 13 Aug 2020 15:13:45 +0200 Subject: [nova][neutron][oslo][ops] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> Message-ID: Hi, just did some short tests today in our test-environment (without durable queues and without replication): * started a rally task to generate some load * kill-9-ed rabbitmq on one node * rally task immediately stopped and the cloud (mostly) stopped working after some debugging i found (again) exchanges which had bindings to queues, but these bindings didnt forward any msgs. Wrote a small script to detect these broken bindings and will now check if this is "reproducible" then I will try "durable queues" and "durable queues with replication" to see if this helps. Even if I would expect rabbitmq should be able to handle this without these "hidden broken bindings" This just FYI. Fabian -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From moreira.belmiro.email.lists at gmail.com Thu Aug 13 13:15:41 2020 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Thu, 13 Aug 2020 15:15:41 +0200 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: Message-ID: Hi, we would really appreciate your comments on this. Especially the OSC team and all the project teams that are facing issues migrating their clients. Let us know, Belmiro On Mon, Aug 10, 2020 at 10:13 AM Belmiro Moreira < moreira.belmiro.email.lists at gmail.com> wrote: > Hi, > during the last PTG the TC discussed the problem of supporting different > clients (OpenStack Client - OSC vs python-*clients) [1]. > Currently, we don't have feature parity between the OSC and the > python-*clients. > > Different OpenStack projects invest in different clients. > This can be a huge problem for users/ops. Depending on the projects > deployed in their infrastructures, they need to use different clients for > different tasks. > It's confusing because of the partial implementation in the OSC. > > There was also the proposal to enforce new functionality only in the SDK > (and optionally the OSC) and not the project’s specific clients to stop > increasing the disparity between the two. > > We would like to understand first the problems and missing pieces that > projects are facing to move into OSC and help to overcome them. > Let us know. > > Belmiro, > on behalf of the TC > > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2020-June/015418.html > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekuvaja at redhat.com Thu Aug 13 13:36:34 2020 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 13 Aug 2020 14:36:34 +0100 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: Message-ID: On Thu, Aug 13, 2020 at 2:19 PM Belmiro Moreira < moreira.belmiro.email.lists at gmail.com> wrote: > Hi, > we would really appreciate your comments on this. > Especially the OSC team and all the project teams that are facing issues > migrating their clients. > > Let us know, > Belmiro > > In Glance perspective we already stated that we're more than happy to try endorsing osc again once it has stabilized the Images API facing code and maintained feature parity for a few cycles. Just stopping developing python-glanceclient will only result in no up-to-date stable client for Images API developed under OpenStack Governance. I really don't think forcing to fork python-glanceclient to keep development going outside of OpenStack Governance will be the better solution here. - jokke On Mon, Aug 10, 2020 at 10:13 AM Belmiro Moreira < > moreira.belmiro.email.lists at gmail.com> wrote: > >> Hi, >> during the last PTG the TC discussed the problem of supporting different >> clients (OpenStack Client - OSC vs python-*clients) [1]. >> Currently, we don't have feature parity between the OSC and the >> python-*clients. >> >> Different OpenStack projects invest in different clients. >> This can be a huge problem for users/ops. Depending on the projects >> deployed in their infrastructures, they need to use different clients for >> different tasks. >> It's confusing because of the partial implementation in the OSC. >> >> There was also the proposal to enforce new functionality only in the SDK >> (and optionally the OSC) and not the project’s specific clients to stop >> increasing the disparity between the two. 
>> >> We would like to understand first the problems and missing pieces that >> projects are facing to move into OSC and help to overcome them. >> Let us know. >> >> Belmiro, >> on behalf of the TC >> >> [1] >> http://lists.openstack.org/pipermail/openstack-discuss/2020-June/015418.html >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwm2012 at gmail.com Thu Aug 13 13:38:20 2020 From: pwm2012 at gmail.com (pwm) Date: Thu, 13 Aug 2020 21:38:20 +0800 Subject: DNS server instead of /etc/hosts file on Infra Server In-Reply-To: <7db29753-4710-a979-fe71-67a829fa55c3@rd.bbc.co.uk> References: <7db29753-4710-a979-fe71-67a829fa55c3@rd.bbc.co.uk> Message-ID: Great will check it out. Thanks, Jonathan. On Wed, Aug 12, 2020 at 9:39 PM Jonathan Rosser < jonathan.rosser at rd.bbc.co.uk> wrote: > Openstack-Ansible already supports optionally using the unbound dns server > instead of managing > /etc/hosts. > > Join #openstack-ansible on IRC if you need any help. > > Regards, > Jonathan. > On 12/08/2020 14:03, pwm wrote: > > Hi, > Plan to use PowerDNS server instead of the /etc/hosts file for resolving. > Has anyone done this before? > The PowerDNS support MySQL DB backend and a frontend GUI PowerDNS Admin > which allows centralized easy maintenance. > > Thanks > > On Sun, Aug 9, 2020 at 11:54 PM pwm wrote: > >> Hi, >> Anyone interested in replacing the /etc/hosts file entry with a DNS >> server on the openstack-ansible deployment? >> >> Thank you >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.ramirez at opencloud.es Thu Aug 13 13:59:49 2020 From: luis.ramirez at opencloud.es (Luis Ramirez) Date: Thu, 13 Aug 2020 15:59:49 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Great! I'll try to do the same. Br, Luis Rmz Blockchain, DevOps & Open Source Cloud Solutions Architect ---------------------------------------- Founder & CEO OpenCloud.es luis.ramirez at opencloud.es Skype ID: d.overload Hangouts: luis.ramirez at opencloud.es [image: ] +34 911 950 123 / [image: ]+39 392 1289553 / [image: ]+49 152 26917722 / Česká republika: +420 774 274 882 ----------------------------------------------------- El jue., 13 ago. 2020 a las 13:44, Rafael Weingärtner (< rafaelweingartner at gmail.com>) escribió: > Awesome, thanks! > I will try to dedicate a few hours every week to review CloudKitty patches. > > On Thu, Aug 13, 2020 at 8:36 AM Pierre Riteau wrote: > >> Thank you both. >> >> I've merged a few patches to fix CI and finalise the Ussuri release >> (for example release notes were missing). >> I gave core reviewer privileges to Rafael and Luis. Let's try to merge >> patches with two +2 votes from now on. >> >> On Wed, 12 Aug 2020 at 23:38, Luis Ramirez >> wrote: >> > >> > Sounds good to me. >> > >> > El El mié, 12 ago 2020 a las 22:41, Rafael Weingärtner < >> rafaelweingartner at gmail.com> escribió: >> >> >> >> Sounds good to me. >> >> >> >> On Wed, Aug 12, 2020 at 5:37 PM Pierre Riteau >> wrote: >> >>> >> >>> I have now received core reviewer privileges. Thank you to TC members >> >>> >> >>> >> >>> for trusting us with the CloudKitty project. >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> I would like to kick things off by resuming IRC meetings. They're set >> >>> >> >>> >> >>> to run every two weeks (on odd weeks) on Monday at 1400 UTC in >> >>> >> >>> >> >>> #cloudkitty. 
Is this a convenient time slot for all potential >> >>> >> >>> >> >>> contributors to the project? >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> On Wed, 12 Aug 2020 at 19:08, Rafael Weingärtner >> >>> >> >>> >> >>> wrote: >> >>> >> >>> >> >>> > >> >>> >> >>> >> >>> > Awesome! Thank you guys for the help. >> >>> >> >>> >> >>> > We have few PRs open there that are ready (or close to be ready) to >> be merged. >> >>> >> >>> >> >>> > >> >>> >> >>> >> >>> > On Wed, Aug 12, 2020 at 1:59 PM Mohammed Naser >> wrote: >> >>> >> >>> >> >>> >> >> >>> >> >>> >> >>> >> On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez < >> thierry at openstack.org> wrote: >> >>> >> >>> >> >>> >> > >> >>> >> >>> >> >>> >> > Thomas Goirand wrote: >> >>> >> >>> >> >>> >> > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: >> >>> >> >>> >> >>> >> > >> Thanks, Pierre for helping with this. >> >>> >> >>> >> >>> >> > >> >> >>> >> >>> >> >>> >> > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) < >> justin.ferrieu at objectif-libre.com>) >> >>> >> >>> >> >>> >> > >> but I am not sure if he got any response back. >> >>> >> >>> >> >>> >> > >> >>> >> >>> >> >>> >> > No response so far, but they may all be in company summer >> vacation. >> >>> >> >>> >> >>> >> > >> >>> >> >>> >> >>> >> > > The end of the very good maintenance of Cloudkitty matched the >> date when >> >>> >> >>> >> >>> >> > > objectif libre was sold to Linkbynet. Maybe the new owner >> don't care enough? >> >>> >> >>> >> >>> >> > > >> >>> >> >>> >> >>> >> > > This is very disappointing as I've been using it for some time >> already, >> >>> >> >>> >> >>> >> > > and that I was satisfied by it (ie: it does the job...), and >> especially >> >>> >> >>> >> >>> >> > > that latest releases are able to scale correctly. >> >>> >> >>> >> >>> >> > > >> >>> >> >>> >> >>> >> > > I very much would love if Pierre Riteau was successful in >> taking over. >> >>> >> >>> >> >>> >> > > Good luck Pierre! I'll try to help whenever I can and if I'm >> not too busy. >> >>> >> >>> >> >>> >> > >> >>> >> >>> >> >>> >> > Given the volunteers (Pierre, Rafael, Luis) I would support the >> TC using >> >>> >> >>> >> >>> >> > its unholy powers to add extra core reviewers to cloudkitty. >> >>> >> >>> >> >>> >> >> >>> >> >>> >> >>> >> https://review.opendev.org/#/c/745653 is currently merging and >> fungi will be >> >>> >> >>> >> >>> >> adding Pierre as a core. >> >>> >> >>> >> >>> >> >> >>> >> >>> >> >>> >> Thank you for helping. >> >>> >> >>> >> >>> >> >> >>> >> >>> >> >>> >> > If the current PTL comes back, I'm sure they will appreciate the >> help, >> >>> >> >>> >> >>> >> > and can always fix/revert things before Victoria release. >> >>> >> >>> >> >>> >> > >> >>> >> >>> >> >>> >> > -- >> >>> >> >>> >> >>> >> > Thierry Carrez (ttx) >> >>> >> >>> >> >>> >> > >> >>> >> >>> >> >>> >> >> >>> >> >>> >> >>> >> >> >>> >> >>> >> >>> >> -- >> >>> >> >>> >> >>> >> Mohammed Naser >> >>> >> >>> >> >>> >> VEXXHOST, Inc. 
>> >>> >> >>> >> >>> >> >> >>> >> >>> >> >>> > >> >>> >> >>> >> >>> > >> >>> >> >>> >> >>> > -- >> >>> >> >>> >> >>> > Rafael Weingärtner >> >>> >> >>> >> >> >> >> >> >> -- >> >> Rafael Weingärtner >> >> >> >> >> > -- >> > Br, >> > Luis Rmz >> > Blockchain, DevOps & Open Source Cloud Solutions Architect >> > ---------------------------------------- >> > Founder & CEO >> > OpenCloud.es >> > luis.ramirez at opencloud.es >> > Skype ID: d.overload >> > Hangouts: luis.ramirez at opencloud.es >> > +34 911 950 123 / +39 392 1289553 / +49 152 26917722 >> > > > -- > Rafael Weingärtner > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashlee at openstack.org Thu Aug 13 14:07:40 2020 From: ashlee at openstack.org (Ashlee Ferguson) Date: Thu, 13 Aug 2020 09:07:40 -0500 Subject: Community Voting for The Virtual Summit Sessions is Open! Message-ID: <02EC11B6-6D57-4DFA-94B6-F920E90A4FF6@openstack.org> Community voting for the virtual Open Infrastructure Summit sessions is open! You can VOTE HERE , but what does that mean? Now that the Call for Presentations has closed, all submissions are available for community vote and input. After community voting closes, the volunteer Programming Committee members will receive the results to review to help them determine the final selections for Summit schedule. While community votes are meant to help inform the decision, Programming Committee members are expected to exercise judgment in their area of expertise and help ensure diversity of sessions and speakers. View full details of the session selection process . In order to vote, you need an OSF community membership. If you do not have an account, please create one by going to openstack.org/join . If you need to reset your password, you can do that here . Hurry, voting closes Monday, August 17 at 11:59pm Pacific Time. Don’t forget to Register for the Summit for free! Visit https://www.openstack.org/summit/2020/ for all other Summit-related information. Interested in sponsoring? Visit this page . If you have any questions, please email summit at openstack.org . Cheers, Ashlee Ashlee Ferguson Community & Events Coordinator OpenStack Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From jasowang at redhat.com Thu Aug 13 04:24:50 2020 From: jasowang at redhat.com (Jason Wang) Date: Thu, 13 Aug 2020 12:24:50 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200810074631.GA29059@joy-OptiPlex-7040> References: <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> Message-ID: On 2020/8/10 下午3:46, Yan Zhao wrote: >> driver is it handled by? > It looks that the devlink is for network device specific, and in > devlink.h, it says > include/uapi/linux/devlink.h - Network physical device Netlink > interface, Actually not, I think there used to have some discussion last year and the conclusion is to remove this comment. It supports IB and probably vDPA in the future. > I feel like it's not very appropriate for a GPU driver to use > this interface. Is that right? I think not though most of the users are switch or ethernet devices. It doesn't prevent you from inventing new abstractions. 
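As a rough illustration (a sketch, not tied to any project, assuming a
devlink-aware driver and an iproute2 recent enough to support JSON output),
the per-device information devlink already exposes can be pulled from
userspace like this:

import json
import subprocess

def devlink_dev_info(dev="pci/0000:03:00.0"):
    # "devlink dev info" reports the driver name, serial number and
    # fixed/running/stored firmware versions for the given device
    out = subprocess.run(
        ["devlink", "-j", "dev", "info", dev],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)

if __name__ == "__main__":
    print(devlink_dev_info())

Those are the same kinds of attributes a compatibility check would want to
reason about.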
Note that devlink is based on netlink, netlink has been widely used by various subsystems other than networking. Thanks > > Thanks > Yan > > From alifshit at redhat.com Thu Aug 13 14:30:49 2020 From: alifshit at redhat.com (Artom Lifshitz) Date: Thu, 13 Aug 2020 10:30:49 -0400 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> Message-ID: On Mon, Aug 10, 2020 at 4:40 AM Luigi Toscano wrote: > > On Monday, 10 August 2020 10:26:24 CEST Radosław Piliszek wrote: > > On Mon, Aug 10, 2020 at 10:19 AM Belmiro Moreira < > > > > moreira.belmiro.email.lists at gmail.com> wrote: > > > Hi, > > > during the last PTG the TC discussed the problem of supporting different > > > clients (OpenStack Client - OSC vs python-*clients) [1]. > > > Currently, we don't have feature parity between the OSC and the > > > python-*clients. > > > > Is it true of any client? I guess some are just OSC plugins 100%. > > Do we know which clients have this disparity? > > Personally, I encountered this with Glance the most and Cinder to some > > extent (but I believe over the course of action Cinder got all features I > > wanted from it in the OSC). > > As far as I know there is still a huge problem with microversion handling > which impacts some cinder features. It has been discussed in the past and > still present. Yeah, my understanding is that osc will never "properly" support microversions. Openstacksdk is the future in that sense, and my understanding is that the osc team is "porting" osc to use the sdk. Given these two thing, when we (Nova) talked about this with the osc folks, we decided that rather than catch up osc to python-novaclient, we'd rather focus our efforts on the sdk. I've been slowly doing that [1], starting with the earlier Nova microversions. The eventual long term goal is for the Nova team to *only* support the sdk, and drop python-novaclient entirely, but that's a long time away. [1] https://review.opendev.org/#/q/status:open+project:openstack/openstacksdk+branch:master+topic:story/2007929 > > > -- > Luigi > > > From moguimar at redhat.com Thu Aug 13 15:06:31 2020 From: moguimar at redhat.com (Moises Guimaraes de Medeiros) Date: Thu, 13 Aug 2020 17:06:31 +0200 Subject: [oslo] Proposing Lance Bragstad as oslo.cache core Message-ID: Hello everybody, It is my pleasure to propose Lance Bragstad (lbragstad) as a new member of the oslo.core core team. Lance has been a big contributor to the project and is known as a walking version of the Keystone documentation, which happens to be one of the biggest consumers of oslo.cache. Obviously we think he'd make a good addition to the core team. If there are no objections, I'll make that happen in a week. Thanks. -- Moisés Guimarães Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Aug 13 15:08:31 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 13 Aug 2020 10:08:31 -0500 Subject: [oslo] Proposing Lance Bragstad as oslo.cache core In-Reply-To: References: Message-ID: <82be76d9-0898-63a7-d339-48e9f6db540c@gmx.com> On 8/13/20 10:06 AM, Moises Guimaraes de Medeiros wrote: > Hello everybody, > > It is my pleasure to propose Lance Bragstad (lbragstad) as a new > member of the oslo.core core team. 
> > Lance has been a big contributor to the project and is known as a > walking version of the Keystone documentation, which happens to be one > of the biggest consumers of oslo.cache. > > Obviously we think he'd make a good addition to the core team. If > there are no objections, I'll make that happen in a week. > +1! -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu Aug 13 15:18:45 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 13 Aug 2020 16:18:45 +0100 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> Message-ID: <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> On Thu, 2020-08-13 at 10:30 -0400, Artom Lifshitz wrote: > On Mon, Aug 10, 2020 at 4:40 AM Luigi Toscano wrote: > > > > On Monday, 10 August 2020 10:26:24 CEST Radosław Piliszek wrote: > > > On Mon, Aug 10, 2020 at 10:19 AM Belmiro Moreira < > > > > > > moreira.belmiro.email.lists at gmail.com> wrote: > > > > Hi, > > > > during the last PTG the TC discussed the problem of supporting different > > > > clients (OpenStack Client - OSC vs python-*clients) [1]. > > > > Currently, we don't have feature parity between the OSC and the > > > > python-*clients. > > > > > > Is it true of any client? I guess some are just OSC plugins 100%. > > > Do we know which clients have this disparity? > > > Personally, I encountered this with Glance the most and Cinder to some > > > extent (but I believe over the course of action Cinder got all features I > > > wanted from it in the OSC). > > > > As far as I know there is still a huge problem with microversion handling > > which impacts some cinder features. It has been discussed in the past and > > still present. > > Yeah, my understanding is that osc will never "properly" support > microversions. it does already properly support micorversion the issue is not everyone agrees on what properly means. the behavior of the project clients was considered broken by many. it has been poirpose that we explcity allow a way to opt in to the auto negociation via a new "auto" sentaial value and i have also suggested that we should tag each comman with the minium microversion that parmater or command requires and decault to that minium based on teh arges you passed. both of those imporvement dont break the philosipy of providing stable cli behavior across cloud and would imporve the ux. defaulting to the minium microversion needed for the arguments passed would solve most of the ux issues and adding an auto sentical would resolve the rest while still keeping the correct microversion behvior it already has. the glance and cinder gaps are not really related to microverions by the way. its just that no one has done the work and cinder an glance have not require contiuptors to update osc as part of adding new features. nova has not required that either but there were some who worked on nova that cared enough about osc to mention it in code review or submit patches themsevles. the glance team does not really have the resouces to do that and the osc team does not have the resouce to maintain clis for all teams. so over tiem as service poject added new feature the gaps have increase since there were not people tyring to keep it in sync. > Openstacksdk is the future in that sense, and my > understanding is that the osc team is "porting" osc to use the sdk. 
> Given these two thing, when we (Nova) talked about this with the osc > folks, we decided that rather than catch up osc to python-novaclient, > we'd rather focus our efforts on the sdk. well that is not entirly a good caraterisation. we want to catch up osc too but the suggest was to support eveything in osc then it would be easier to add osc support since it just has to call the sdk functions. we did not say we dont want to close the gaps in osc. > I've been slowly doing that > [1], starting with the earlier Nova microversions. The eventual long > term goal is for the Nova team to *only* support the sdk, and drop > python-novaclient entirely, but that's a long time away. > > [1] https://review.opendev.org/#/q/status:open+project:openstack/openstacksdk+branch:master+topic:story/2007929 > > > > > > > -- > > Luigi > > > > > > > > From openstack at nemebean.com Thu Aug 13 15:28:12 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 13 Aug 2020 10:28:12 -0500 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <6e68d1a3cfc4efff91d3668bb53805dc469673c6.camel@redhat.com> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> <6e68d1a3cfc4efff91d3668bb53805dc469673c6.camel@redhat.com> Message-ID: On 8/13/20 7:14 AM, Sean Mooney wrote: > On Thu, 2020-08-13 at 10:24 +0200, Thierry Carrez wrote: >> Ben Nemec wrote: >>> On 8/12/20 5:32 AM, Thierry Carrez wrote: >>>> Sean Mooney wrote: >>>>> On Tue, 2020-08-11 at 15:20 -0500, Ben Nemec wrote: >>>>>> I wonder if this does help though. It seems like a bug that a >>>>>> nova-compute service would stop processing messages and still be >>>>>> seen as up in the service status. Do we understand why that is >>>>>> happening? If not, I'm unclear that a ping living at the >>>>>> oslo.messaging layer is going to do a better job of exposing such an >>>>>> outage. The fact that oslo.messaging is responding does not >>>>>> necessarily equate to nova-compute functioning as expected. >>>>>> >>>>>> To be clear, this is not me nacking the ping feature. I just want to >>>>>> make sure we understand what is going on here so we don't add >>>>>> another unreliable healthchecking mechanism to the one we already have. >>>>> >>>>> [...] >>>>> im not sure https://bugs.launchpad.net/nova/+bug/1854992 is the bug >>>>> that is motiviting the creation of this oslo ping >>>>> feature but that feels premature if it is. i think it would be better >>>>> try to adress this by the sender recreating the >>>>> queue if the deliver fails and if that is not viable then protpyope >>>>> thge fix in nova. if the self ping fixes this >>>>> miss queue error then we could extract the cod into oslo. >>>> >>>> I think this is missing the point... This is not about working around >>>> a specific bug, it's about adding a way to detect a certain class of >>>> failure. It's more of an operational feature than a development bugfix. >>>> >>>> If I understood correctly, OVH is running that patch in production as >>>> a way to detect certain problems they regularly run into, something >>>> our existing monitor mechanisms fail to detect. That sounds like a >>>> worthwhile addition? >>> >>> Okay, I don't think I was aware that this was already being used. 
If >>> someone already finds it useful and it's opt-in then I'm not inclined to >>> block it. My main concern was that we were adding a feature that didn't >>> actually address the problem at hand. >>> >>> I _would_ feel better about it if someone could give an example of a >>> type of failure this is detecting that is missed by other monitoring >>> methods though. Both because having a concrete example of a use case for >>> the feature is good, and because if it turns out that the problems this >>> is detecting are things like the Nova bug Sean is talking about (which I >>> don't think this would catch anyway, since the topic is missing and >>> there's nothing to ping) then there may be other changes we can/should >>> make to improve things. >> >> Right. Let's wait for Arnaud to come back from vacation and confirm that >> >> (1) that patch is not a shot in the dark: it allows them to expose a >> class of issues in production >> >> (2) they fail to expose that same class of issues using other existing >> mechanisms, including those just suggested in this thread >> >> I just wanted to avoid early rejection of this health check ability on >> the grounds that the situation it exposes should just not happen. Or >> that, if enabled and heavily used, it would have a performance impact. > I think the inital push back from nova is we already have ping rpc function > https://github.com/openstack/nova/blob/c6218428e9b29a2c52808ec7d27b4b21aadc0299/nova/baserpc.py#L55-L76 > so if a geneirc metion called ping is added it will break nova. It occurred to me after I commented on the review that we have tempest running on oslo.messaging changes and it passed on the patch for this. I suppose it's possible that it broke some error handling in Nova that just isn't tested, but maybe the new ping could function as a cross-project replacement for the Nova ping? Anyway, it's still be to deduplicate the name, but I felt kind of dumb about having asked if it was tested when the test results were right in front of me. ;-) > > the reset of the push back is related to not haveing a concrete usecase, including concern over > perfroamce consideration and external services potenailly acessing the rpc bus which is coniserd an internal > api. e.g. we woudl not want an external monitoring solution connecting to the rpc bus and invoking arbitary > RPC calls, ping is well pretty safe but form a design point of view while litening to notification is fine > we dont want anything outside of the openstack services actully sending message on the rpc bus. I'm not concerned about the performance impact here. It's an optional feature, so anyone using it is choosing to take that hit. Having external stuff on the RPC bus is more of a gray area, but it's not like we can stop operators from doing that. I think it's probably better to provide a well-defined endpoint for them to talk to rather than have everyone implement their own slightly different RPC ping mechanism. The docs for this feature should be very explicit that this is the only thing external code should be calling. > > so if this does actully detect somethign we can otherwise detect and the use cases involves using it within > the openstack services not form an external source then i think that is fine but we proably need to use another > name (alive? status?) or otherewise modify nova so that there is no conflict. >> > If I understand your analysis of the bug correctly, this would have caught that type of outage after all since the failure was asymmetric. 
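To make that concrete, the sort of probe being discussed can already be
expressed against the existing nova base ping that Sean linked. The following
is only a sketch: the transport URL, topic and host names are placeholders,
and a real check would reuse the service's own oslo.config setup and request
context rather than an empty dict.

import oslo_messaging
from oslo_config import cfg

def rpc_ping(transport_url, topic, server, timeout=10):
    # target the per-host queue of a service (e.g. topic "compute",
    # server "compute-01") and call the base ping from nova/baserpc.py
    transport = oslo_messaging.get_rpc_transport(cfg.CONF, url=transport_url)
    target = oslo_messaging.Target(topic=topic, server=server,
                                   namespace='baseapi', version='1.0')
    client = oslo_messaging.RPCClient(transport, target, timeout=timeout)
    try:
        # a timeout here means the queue exists but nothing consumed the
        # message, or the reply never made it back
        return client.call({}, 'ping', arg='healthcheck')
    except oslo_messaging.MessagingTimeout:
        return None
    finally:
        transport.cleanup()

if __name__ == '__main__':
    reply = rpc_ping('rabbit://guest:guest@127.0.0.1:5672/',
                     'compute', 'compute-01')
    print('alive' if reply else 'no reply')
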
The compute node was still able to send its status updates to Nova, but wasn't receiving any messages. A ping would have detected that situation. From openstack at nemebean.com Thu Aug 13 15:28:43 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 13 Aug 2020 10:28:43 -0500 Subject: [oslo] Proposing Lance Bragstad as oslo.cache core In-Reply-To: References: Message-ID: <305f4be5-950d-97a6-63d0-fe59d56bfe8d@nemebean.com> +1! On 8/13/20 10:06 AM, Moises Guimaraes de Medeiros wrote: > Hello everybody, > > It is my pleasure to propose Lance Bragstad (lbragstad) as a new member > of the oslo.core core team. > > Lance has been a big contributor to the project and is known as a > walking version of the Keystone documentation, which happens to be one > of the biggest consumers of oslo.cache. > > Obviously we think he'd make a good addition to the core team. If there > are no objections, I'll make that happen in a week. > > Thanks. > > -- > > Moisés Guimarães > > Software Engineer > > Red Hat > > > From allison at openstack.org Thu Aug 13 15:28:55 2020 From: allison at openstack.org (Allison Price) Date: Thu, 13 Aug 2020 10:28:55 -0500 Subject: Running OpenStack? Take the 2020 User Survey now! Message-ID: <1E70817D-D322-4B3E-8940-762DBAA7708A@openstack.org> Hi everyone, There is only one week left before we are closing the 2020 OpenStack User Survey [1]! If you are running OpenStack, please take a few minutes to log your deployment—all information will remain anonymous unless you indicate otherwise. If you have completed a User Survey before, all you have to do is update your information and answer a few new questions. Anonymous feedback will be passed along to the upstream project teams, and anonymized data will be available in the analytics dashboard [2]. The deadline to add your deployment to this round of analysis is Thursday, August 20. Let me know if you have any questions or issues completing. Thanks! Allison [1] https://www.openstack.org/user-survey/survey-2020 [2] https://www.openstack.org/analytics -------------- next part -------------- An HTML attachment was scrubbed... URL: From abishop at redhat.com Thu Aug 13 15:42:30 2020 From: abishop at redhat.com (Alan Bishop) Date: Thu, 13 Aug 2020 08:42:30 -0700 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> Message-ID: On Thu, Aug 13, 2020 at 8:27 AM Sean Mooney wrote: > On Thu, 2020-08-13 at 10:30 -0400, Artom Lifshitz wrote: > > On Mon, Aug 10, 2020 at 4:40 AM Luigi Toscano > wrote: > > > > > > On Monday, 10 August 2020 10:26:24 CEST Radosław Piliszek wrote: > > > > On Mon, Aug 10, 2020 at 10:19 AM Belmiro Moreira < > > > > > > > > moreira.belmiro.email.lists at gmail.com> wrote: > > > > > Hi, > > > > > during the last PTG the TC discussed the problem of supporting > different > > > > > clients (OpenStack Client - OSC vs python-*clients) [1]. > > > > > Currently, we don't have feature parity between the OSC and the > > > > > python-*clients. > > > > > > > > Is it true of any client? I guess some are just OSC plugins 100%. > > > > Do we know which clients have this disparity? > > > > Personally, I encountered this with Glance the most and Cinder to > some > > > > extent (but I believe over the course of action Cinder got all > features I > > > > wanted from it in the OSC). 
> > > > > > As far as I know there is still a huge problem with microversion > handling > > > which impacts some cinder features. It has been discussed in the past > and > > > still present. > > > > Yeah, my understanding is that osc will never "properly" support > > microversions. > it does already properly support micorversion the issue is not everyone > agrees > on what properly means. the behavior of the project clients was considered > broken > by many. it has been poirpose that we explcity allow a way to opt in to > the auto negociation > via a new "auto" sentaial value and i have also suggested that we should > tag each comman with the minium > microversion that parmater or command requires and decault to that minium > based on teh arges you passed. > > both of those imporvement dont break the philosipy of providing stable cli > behavior across cloud and would > imporve the ux. defaulting to the minium microversion needed for the > arguments passed would solve most of the ux > issues and adding an auto sentical would resolve the rest while still > keeping the correct microversion behvior it > already has. > > the glance and cinder gaps are not really related to microverions by the > way. > its just that no one has done the work and cinder an glance have not > require contiuptors to update > Updates to osc from cinder's side are pretty much stalled due to lack of support for microversions. A patch for that was rejected and we've had trouble getting an update on a viable path forward. See comment in https://review.opendev.org/590807. Alan > osc as part of adding new features. nova has not required that either but > there were some who worked on nova > that cared enough about osc to mention it in code review or submit patches > themsevles. the glance team does > not really have the resouces to do that and the osc team does not have the > resouce to maintain clis for all teams. > > so over tiem as service poject added new feature the gaps have increase > since there were not people tyring to keep it in > sync. > > > Openstacksdk is the future in that sense, and my > > understanding is that the osc team is "porting" osc to use the sdk. > > Given these two thing, when we (Nova) talked about this with the osc > > folks, we decided that rather than catch up osc to python-novaclient, > > we'd rather focus our efforts on the sdk. > well that is not entirly a good caraterisation. we want to catch up osc too > but the suggest was to support eveything in osc then it would be easier to > add osc support > since it just has to call the sdk functions. we did not say we dont want > to close the gaps in osc. > > > I've been slowly doing that > > [1], starting with the earlier Nova microversions. The eventual long > > term goal is for the Nova team to *only* support the sdk, and drop > > python-novaclient entirely, but that's a long time away. > > > > [1] > https://review.opendev.org/#/q/status:open+project:openstack/openstacksdk+branch:master+topic:story/2007929 > > > > > > > > > > > -- > > > Luigi > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Thu Aug 13 16:07:18 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 13 Aug 2020 17:07:18 +0100 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> <6e68d1a3cfc4efff91d3668bb53805dc469673c6.camel@redhat.com> Message-ID: <65204b738f13fcea16b9b6d5a68149c89be73e6a.camel@redhat.com> On Thu, 2020-08-13 at 10:28 -0500, Ben Nemec wrote: > > On 8/13/20 7:14 AM, Sean Mooney wrote: > > On Thu, 2020-08-13 at 10:24 +0200, Thierry Carrez wrote: > > > Ben Nemec wrote: > > > > On 8/12/20 5:32 AM, Thierry Carrez wrote: > > > > > Sean Mooney wrote: > > > > > > On Tue, 2020-08-11 at 15:20 -0500, Ben Nemec wrote: > > > > > > > I wonder if this does help though. It seems like a bug that a > > > > > > > nova-compute service would stop processing messages and still be > > > > > > > seen as up in the service status. Do we understand why that is > > > > > > > happening? If not, I'm unclear that a ping living at the > > > > > > > oslo.messaging layer is going to do a better job of exposing such an > > > > > > > outage. The fact that oslo.messaging is responding does not > > > > > > > necessarily equate to nova-compute functioning as expected. > > > > > > > > > > > > > > To be clear, this is not me nacking the ping feature. I just want to > > > > > > > make sure we understand what is going on here so we don't add > > > > > > > another unreliable healthchecking mechanism to the one we already have. > > > > > > > > > > > > [...] > > > > > > im not sure https://bugs.launchpad.net/nova/+bug/1854992 is the bug > > > > > > that is motiviting the creation of this oslo ping > > > > > > feature but that feels premature if it is. i think it would be better > > > > > > try to adress this by the sender recreating the > > > > > > queue if the deliver fails and if that is not viable then protpyope > > > > > > thge fix in nova. if the self ping fixes this > > > > > > miss queue error then we could extract the cod into oslo. > > > > > > > > > > I think this is missing the point... This is not about working around > > > > > a specific bug, it's about adding a way to detect a certain class of > > > > > failure. It's more of an operational feature than a development bugfix. > > > > > > > > > > If I understood correctly, OVH is running that patch in production as > > > > > a way to detect certain problems they regularly run into, something > > > > > our existing monitor mechanisms fail to detect. That sounds like a > > > > > worthwhile addition? > > > > > > > > Okay, I don't think I was aware that this was already being used. If > > > > someone already finds it useful and it's opt-in then I'm not inclined to > > > > block it. My main concern was that we were adding a feature that didn't > > > > actually address the problem at hand. > > > > > > > > I _would_ feel better about it if someone could give an example of a > > > > type of failure this is detecting that is missed by other monitoring > > > > methods though. 
Both because having a concrete example of a use case for > > > > the feature is good, and because if it turns out that the problems this > > > > is detecting are things like the Nova bug Sean is talking about (which I > > > > don't think this would catch anyway, since the topic is missing and > > > > there's nothing to ping) then there may be other changes we can/should > > > > make to improve things. > > > > > > Right. Let's wait for Arnaud to come back from vacation and confirm that > > > > > > (1) that patch is not a shot in the dark: it allows them to expose a > > > class of issues in production > > > > > > (2) they fail to expose that same class of issues using other existing > > > mechanisms, including those just suggested in this thread > > > > > > I just wanted to avoid early rejection of this health check ability on > > > the grounds that the situation it exposes should just not happen. Or > > > that, if enabled and heavily used, it would have a performance impact. > > > > I think the inital push back from nova is we already have ping rpc function > > https://github.com/openstack/nova/blob/c6218428e9b29a2c52808ec7d27b4b21aadc0299/nova/baserpc.py#L55-L76 > > so if a geneirc metion called ping is added it will break nova. > > It occurred to me after I commented on the review that we have tempest > running on oslo.messaging changes and it passed on the patch for this. I > suppose it's possible that it broke some error handling in Nova that > just isn't tested, but maybe the new ping could function as a > cross-project replacement for the Nova ping? proably yes its only used in one place https://opendev.org/openstack/nova/src/branch/master/nova/conductor/api.py#L66-L72 which is only used here in the nova service base class https://github.com/openstack/nova/blob/0b613729ff975f69587a17cc7818c09f7683ebf2/nova/service.py#L126 os worst case i think its just going to cause the service to start before the conductor is ready however they have to tolerate the conductor restarting ectra anyway so i dont think it will break anything too badly. i dont see why we coudl not use a generic version instead. > > Anyway, it's still be to deduplicate the name, but I felt kind of dumb > about having asked if it was tested when the test results were right in > front of me. ;-) > > > > > the reset of the push back is related to not haveing a concrete usecase, including concern over > > perfroamce consideration and external services potenailly acessing the rpc bus which is coniserd an internal > > api. e.g. we woudl not want an external monitoring solution connecting to the rpc bus and invoking arbitary > > RPC calls, ping is well pretty safe but form a design point of view while litening to notification is fine > > we dont want anything outside of the openstack services actully sending message on the rpc bus. > > I'm not concerned about the performance impact here. It's an optional > feature, so anyone using it is choosing to take that hit. > > Having external stuff on the RPC bus is more of a gray area, but it's > not like we can stop operators from doing that. well upstream certenly we cant really stop them. downstream on the other hadn without going through the certification process to have your product certifed to work with our downstream distrobution directlly invoking RPC endpoint would invlaidate your support. so form a dwonstream perpective we do have ways to prevent that via docs and makeing it clear that it not supported. 
we can technically do that upstream but cant really enforce it, its opensouce software after all if you break it then you get to keep the broken pices. > I think it's probably > better to provide a well-defined endpoint for them to talk to rather > than have everyone implement their own slightly different RPC ping > mechanism. The docs for this feature should be very explicit that this > is the only thing external code should be calling. ya i think that is a good approch. i would still prefer if people used say middelware to add a service ping admin api endpoint instead of driectly calling the rpc endpoint to avoid exposing rabbitmq but that is out of scope of this discussion. > > > > > so if this does actully detect somethign we can otherwise detect and the use cases involves using it within > > the openstack services not form an external source then i think that is fine but we proably need to use another > > name (alive? status?) or otherewise modify nova so that there is no conflict. > > > > > If I understand your analysis of the bug correctly, this would have > caught that type of outage after all since the failure was asymmetric. am im not sure it might yes looking at https://review.opendev.org/#/c/735385/6 its not clear to me how the endpoint is invoked. is it doing a topic send or a direct send? to detech the failure you would need to invoke a ping on the compute service and that ping would have to been encured on the to nova topic exchante with a routing key of compute. if the compute topic queue was broken either because it was nolonger bound to the correct topic or due to some other rabbitmq error then you woudl either get a message undeilverbale error of some kind with the mandaroy flag or likely a timeout without the mandaroty flag. so if the ping would be routed usign a topic too compute. then yes it would find this. although we can also detech this ourselves and fix it using the mandatory flag i think by just recreating the queue wehn it extis but we get an undeliverable message, at least i think we can rabbit is not my main are of expertiese so it woudl be nice is someone that know more about it can weigh in on that. > The compute node was still able to send its status updates to Nova, but > wasn't receiving any messages. A ping would have detected that situation. > From kennelson11 at gmail.com Thu Aug 13 16:19:27 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Thu, 13 Aug 2020 09:19:27 -0700 Subject: [PTL][SIG][TC] vPTG October 2020 Team Signup Message-ID: Greetings! As you hopefully already know, our next PTG will be virtual again, and held from Monday October 26th to Friday October 30th. We will have the same schedule set up available as last time with three windows of time spread across the day to cover all timezones with breaks in between. *To signup your team, you must complete **BOTH** the survey[1] AND reserve time in the ethercalc[2] by September 11th at 7:00 UTC.* We ask that the PTL/SIG Chair/Team lead sign up for time to have their discussions in with 4 rules/guidelines. 1. Cross project discussions (like SIGs or support project teams) should be scheduled towards the start of the week so that any discussions that might shape those of other teams happen first. 2. No team should sign up for more than 4 hours per UTC day to help keep participants actively engaged. 3. No team should sign up for more than 16 hours across all time slots to avoid burning out our contributors and to enable participation in multiple teams discussions. 
Again, you need to fill out BOTH the ethercalc AND the survey to complete your team's sign up. If you have any issues with signing up your team, due to conflict or otherwise, please let me know! While we are trying to empower you to make your own decisions as to when you meet and for how long (after all, you know your needs and teams timezones better than we do), we are here to help! Once your team is signed up, please register! And remind your team to register! Registration is free, but since it will be how we contact you with passwords, event details, etc. it is still important! Continue to check back for updates at openstack.org/ptg. -the Kendalls (diablo_rojo & wendallkaters) [1] Team Survey: https://openstackfoundation.formstack.com/forms/june2020_virtual_ptg_survey [2] Ethercalc Signup: https://ethercalc.openstack.org/126u8ek25noy [3] PTG Registration: https://october2020ptg.eventbrite.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu Aug 13 16:21:31 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 13 Aug 2020 11:21:31 -0500 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <65204b738f13fcea16b9b6d5a68149c89be73e6a.camel@redhat.com> References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> <6e68d1a3cfc4efff91d3668bb53805dc469673c6.camel@redhat.com> <65204b738f13fcea16b9b6d5a68149c89be73e6a.camel@redhat.com> Message-ID: On 8/13/20 11:07 AM, Sean Mooney wrote: >> I think it's probably >> better to provide a well-defined endpoint for them to talk to rather >> than have everyone implement their own slightly different RPC ping >> mechanism. The docs for this feature should be very explicit that this >> is the only thing external code should be calling. > ya i think that is a good approch. > i would still prefer if people used say middelware to add a service ping admin api endpoint > instead of driectly calling the rpc endpoint to avoid exposing rabbitmq but that is out of scope of this discussion. Completely agree. In the long run I would like to see this replaced with better integrated healthchecking in OpenStack, but we've been talking about that for years and have made minimal progress. > >> >>> >>> so if this does actully detect somethign we can otherwise detect and the use cases involves using it within >>> the openstack services not form an external source then i think that is fine but we proably need to use another >>> name (alive? status?) or otherewise modify nova so that there is no conflict. >>>> >> >> If I understand your analysis of the bug correctly, this would have >> caught that type of outage after all since the failure was asymmetric. > am im not sure > it might yes looking at https://review.opendev.org/#/c/735385/6 > its not clear to me how the endpoint is invoked. is it doing a topic send or a direct send? > to detech the failure you would need to invoke a ping on the compute service and that ping would > have to been encured on the to nova topic exchante with a routing key of compute. 
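To make the "topic send with a routing key of compute" question above concrete, here is a minimal client-side sketch using plain oslo.messaging. It is not taken from the patch under review (https://review.opendev.org/#/c/735385/); the method name and the assumption that the ping is exposed as an ordinary RPC endpoint on the service's topic are mine, and the hostname is made up.

    # Sketch only, assuming transport_url is already set in the loaded config
    # and that the ping is reachable as a normal RPC method; the real method
    # name in the proposed feature may differ.
    import oslo_messaging
    from oslo_config import cfg

    transport = oslo_messaging.get_rpc_transport(cfg.CONF)
    # Topic send: routed through the service's topic exchange; server= pins
    # the call to one host's queue (the hostname here is hypothetical).
    target = oslo_messaging.Target(topic='compute', server='compute-01')
    client = oslo_messaging.RPCClient(transport, target, timeout=10)
    try:
        client.call({}, 'oslo_rpc_server_ping')
    except oslo_messaging.MessagingTimeout:
        # The queue can still exist while nothing healthy consumes it, which
        # is the asymmetric failure discussed above: a timeout here is the
        # signal the monitoring side is after.
        print('compute-01 did not answer the RPC ping')

A direct send (reply-queue style) would not exercise the compute topic binding at all, so a check like this only tells you something useful if it goes through the same topic routing the service normally listens on.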
> > if the compute topic queue was broken either because it was nolonger bound to the correct topic or due to some other > rabbitmq error then you woudl either get a message undeilverbale error of some kind with the mandaroy flag or likely a > timeout without the mandaroty flag. so if the ping would be routed usign a topic too compute. > then yes it would find this. > > although we can also detech this ourselves and fix it using the mandatory flag i think by just recreating the queue wehn > it extis but we get an undeliverable message, at least i think we can rabbit is not my main are of expertiese so it > woudl be nice is someone that know more about it can weigh in on that. I pinged Ken this morning to take a look at that. He should be able to tell us whether it's a good idea or crazy talk. :-) From ekuvaja at redhat.com Thu Aug 13 16:27:16 2020 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 13 Aug 2020 17:27:16 +0100 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> Message-ID: On Thu, Aug 13, 2020 at 4:46 PM Alan Bishop wrote: > > > On Thu, Aug 13, 2020 at 8:27 AM Sean Mooney wrote: > >> On Thu, 2020-08-13 at 10:30 -0400, Artom Lifshitz wrote: >> > On Mon, Aug 10, 2020 at 4:40 AM Luigi Toscano >> wrote: >> > > >> > > On Monday, 10 August 2020 10:26:24 CEST Radosław Piliszek wrote: >> > > > On Mon, Aug 10, 2020 at 10:19 AM Belmiro Moreira < >> > > > >> > > > moreira.belmiro.email.lists at gmail.com> wrote: >> > > > > Hi, >> > > > > during the last PTG the TC discussed the problem of supporting >> different >> > > > > clients (OpenStack Client - OSC vs python-*clients) [1]. >> > > > > Currently, we don't have feature parity between the OSC and the >> > > > > python-*clients. >> > > > >> > > > Is it true of any client? I guess some are just OSC plugins 100%. >> > > > Do we know which clients have this disparity? >> > > > Personally, I encountered this with Glance the most and Cinder to >> some >> > > > extent (but I believe over the course of action Cinder got all >> features I >> > > > wanted from it in the OSC). >> > > >> > > As far as I know there is still a huge problem with microversion >> handling >> > > which impacts some cinder features. It has been discussed in the past >> and >> > > still present. >> > >> > Yeah, my understanding is that osc will never "properly" support >> > microversions. >> it does already properly support micorversion the issue is not everyone >> agrees >> on what properly means. the behavior of the project clients was >> considered broken >> by many. it has been poirpose that we explcity allow a way to opt in to >> the auto negociation >> via a new "auto" sentaial value and i have also suggested that we should >> tag each comman with the minium >> microversion that parmater or command requires and decault to that minium >> based on teh arges you passed. >> >> both of those imporvement dont break the philosipy of providing stable >> cli behavior across cloud and would >> imporve the ux. defaulting to the minium microversion needed for the >> arguments passed would solve most of the ux >> issues and adding an auto sentical would resolve the rest while still >> keeping the correct microversion behvior it >> already has. >> >> the glance and cinder gaps are not really related to microverions by the >> way. 
>> its just that no one has done the work and cinder an glance have not >> require contiuptors to update >> > > Updates to osc from cinder's side are pretty much stalled due to lack of > support for microversions. A patch for that was rejected and we've had > trouble getting an update on a viable path forward. See comment in > https://review.opendev.org/590807. > > Alan > > >> osc as part of adding new features. nova has not required that either but >> there were some who worked on nova >> that cared enough about osc to mention it in code review or submit >> patches themsevles. the glance team does >> not really have the resouces to do that and the osc team does not have >> the resouce to maintain clis for all teams. >> >> so over tiem as service poject added new feature the gaps have increase >> since there were not people tyring to keep it in >> sync. >> >> > Openstacksdk is the future in that sense, and my >> > understanding is that the osc team is "porting" osc to use the sdk. >> > Given these two thing, when we (Nova) talked about this with the osc >> > folks, we decided that rather than catch up osc to python-novaclient, >> > we'd rather focus our efforts on the sdk. >> well that is not entirly a good caraterisation. we want to catch up osc >> too >> but the suggest was to support eveything in osc then it would be easier >> to add osc support >> since it just has to call the sdk functions. we did not say we dont want >> to close the gaps in osc. >> >> > I've been slowly doing that >> > [1], starting with the earlier Nova microversions. The eventual long >> > term goal is for the Nova team to *only* support the sdk, and drop >> > python-novaclient entirely, but that's a long time away. >> > >> > [1] >> https://review.opendev.org/#/q/status:open+project:openstack/openstacksdk+branch:master+topic:story/2007929 >> > >> > > >> > > >> > > -- >> > > Luigi >> > > >> > > >> > > >> > >> > >> > So if I understand the whole picture correctly the situation has actually nothing to do with directly working OSC in favor of python-*client provided CLIs but actually moving everything to OSSDK so it can be supported and used by OSC to be used as default client for everything? As that seems to be a consensus that it's not enough to get the OSC to do the right thing if the client lib under the hood is still python-*client and specially if microversions. My question at this point is, do we (as a community) have enough bodies dedicated to OSSDK _and_ OSC to make this sustainable? I'm being sincere here as I have not been part of the development of either of those projects. But if my assumption above is correct, I think we should talk about these things with their real names rather than trying to mask this being just OSC vs python-*client CLI thing. - jokke -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu Aug 13 16:31:34 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 13 Aug 2020 11:31:34 -0500 Subject: [PTL][SIG][TC] vPTG October 2020 Team Signup In-Reply-To: References: Message-ID: On 8/13/20 11:19 AM, Kendall Nelson wrote: > [2] Ethercalc Signup: https://ethercalc.openstack.org/126u8ek25noy This is taking me to the ethercalc from last time. I assume that wasn't intentional? 
From fungi at yuggoth.org Thu Aug 13 16:41:31 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 13 Aug 2020 16:41:31 +0000 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> Message-ID: <20200813164131.bdmhankpd2qxycux@yuggoth.org> On 2020-08-13 17:27:16 +0100 (+0100), Erno Kuvaja wrote: [...] > My question at this point is, do we (as a community) have enough > bodies dedicated to OSSDK _and_ OSC to make this sustainable? I'm > being sincere here as I have not been part of the development of > either of those projects. But if my assumption above is correct, I > think we should talk about these things with their real names > rather than trying to mask this being just OSC vs python-*client > CLI thing. Hopefully this doesn't come across as a glib response, but if people didn't have to maintain multiple CLIs and SDKs then maybe they would have enough time to collaborate on a universal CLI/SDK pair instead. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From kennelson11 at gmail.com Thu Aug 13 16:43:03 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Thu, 13 Aug 2020 09:43:03 -0700 Subject: [PTL][SIG][TC] vPTG October 2020 Team Signup In-Reply-To: References: Message-ID: SIGH. Yes. Here is the new ethercalc: https://ethercalc.openstack.org/7xp2pcbh1ncb Sorry for the confusion! -Kendall (diablo_rojo) On Thu, Aug 13, 2020 at 9:31 AM Ben Nemec wrote: > > > On 8/13/20 11:19 AM, Kendall Nelson wrote: > > [2] Ethercalc Signup: https://ethercalc.openstack.org/126u8ek25noy > > This is taking me to the ethercalc from last time. I assume that wasn't > intentional? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alifshit at redhat.com Thu Aug 13 17:03:05 2020 From: alifshit at redhat.com (Artom Lifshitz) Date: Thu, 13 Aug 2020 13:03:05 -0400 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: <20200813164131.bdmhankpd2qxycux@yuggoth.org> References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> <20200813164131.bdmhankpd2qxycux@yuggoth.org> Message-ID: On Thu, Aug 13, 2020 at 12:45 PM Jeremy Stanley wrote: > > On 2020-08-13 17:27:16 +0100 (+0100), Erno Kuvaja wrote: > [...] > > My question at this point is, do we (as a community) have enough > > bodies dedicated to OSSDK _and_ OSC to make this sustainable? I'm > > being sincere here as I have not been part of the development of > > either of those projects. But if my assumption above is correct, I > > think we should talk about these things with their real names > > rather than trying to mask this being just OSC vs python-*client > > CLI thing. > > Hopefully this doesn't come across as a glib response, but if people > didn't have to maintain multiple CLIs and SDKs then maybe they would > have enough time to collaborate on a universal CLI/SDK pair instead. Agreed - but historically that's not what happened, so the question now is how to improve the situation. My understanding is that osc is effectively dead, except as a shell around the sdk, since that's where the future lies. So in my mind, efforts should be concentrated on two fronts: 1. Continue converting osc to use the sdk 2. 
Catch up the SDK This is a bit of a chicken and egg problem, because any gaps in sdk mean you can't convert osc to use those missing bits, but ideally any patches to osc that aren't sdk conversions would get blocked (though I have obviously absolutely no say in the matter, this is just wishful thinking). The project teams can work on 2 for their project (so like I've been slowly doing for Nova), the osc team can work on 1. > -- > Jeremy Stanley From kennelson11 at gmail.com Thu Aug 13 17:09:13 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Thu, 13 Aug 2020 10:09:13 -0700 Subject: [PTL][SIG][TC] vPTG October 2020 Team Signup In-Reply-To: References: Message-ID: Sigh. I guess I should have known better than to send this out without having a cup of tea first. The survey link in the original email is also from the last PTG. Please use this survey link: https://openstackfoundation.formstack.com/forms/oct2020_vptg_survey -Kendall (diablo_rojo) On Thu, Aug 13, 2020 at 9:43 AM Kendall Nelson wrote: > SIGH. Yes. Here is the new ethercalc: > > https://ethercalc.openstack.org/7xp2pcbh1ncb > > Sorry for the confusion! > > -Kendall (diablo_rojo) > > On Thu, Aug 13, 2020 at 9:31 AM Ben Nemec wrote: > >> >> >> On 8/13/20 11:19 AM, Kendall Nelson wrote: >> > [2] Ethercalc Signup: https://ethercalc.openstack.org/126u8ek25noy >> >> This is taking me to the ethercalc from last time. I assume that wasn't >> intentional? >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu Aug 13 17:09:42 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 13 Aug 2020 12:09:42 -0500 Subject: [oslo] vPTG scheduling Message-ID: <59c4975f-67c4-8351-caef-b4937e641741@nemebean.com> Continuing my policy of EAFP scheduling, I've signed us up for two hours starting at our regular meeting time. This has worked well for our past couple of virtual events so I didn't see any reason to change it. If that time doesn't work for you, please let me know ASAP so we can make alternate arrangements. Thanks. -Ben From fungi at yuggoth.org Thu Aug 13 17:20:19 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 13 Aug 2020 17:20:19 +0000 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> <20200813164131.bdmhankpd2qxycux@yuggoth.org> Message-ID: <20200813172018.pw3mo6viekvzb7wx@yuggoth.org> On 2020-08-13 13:03:05 -0400 (-0400), Artom Lifshitz wrote: [...] > This is a bit of a chicken and egg problem, because any gaps in > sdk mean you can't convert osc to use those missing bits, but > ideally any patches to osc that aren't sdk conversions would get > blocked (though I have obviously absolutely no say in the matter, > this is just wishful thinking). [...] I think you do have a say. At the very least, this is why it's being discussed on the mailing list, but also as a contributor you get to vote on TC members to represent your interests in these sorts of decisions, and for that matter the team's leadership has been very willing to give interested folks more direct decision making ability as evidenced by the large core review group for the SDK repo. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From kgiusti at gmail.com Thu Aug 13 18:55:54 2020 From: kgiusti at gmail.com (Ken Giusti) Date: Thu, 13 Aug 2020 14:55:54 -0400 Subject: [oslo] Proposing Lance Bragstad as oslo.cache core In-Reply-To: References: Message-ID: +1 for Lance! On Thu, Aug 13, 2020 at 11:17 AM Moises Guimaraes de Medeiros < moguimar at redhat.com> wrote: > Hello everybody, > > It is my pleasure to propose Lance Bragstad (lbragstad) as a new member > of the oslo.core core team. > > Lance has been a big contributor to the project and is known as a walking > version of the Keystone documentation, which happens to be one of the > biggest consumers of oslo.cache. > > Obviously we think he'd make a good addition to the core team. If there > are no objections, I'll make that happen in a week. > > Thanks. > > -- > > Moisés Guimarães > > Software Engineer > > Red Hat > > > -- Ken Giusti (kgiusti at gmail.com) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cohuck at redhat.com Thu Aug 13 15:33:47 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Thu, 13 Aug 2020 17:33:47 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200807135942.5d56a202.cohuck@redhat.com> References: <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <4cf2824c803c96496e846c5b06767db305e9fb5a.camel@redhat.com> <20200807135942.5d56a202.cohuck@redhat.com> Message-ID: <20200813173347.239801fa.cohuck@redhat.com> On Fri, 7 Aug 2020 13:59:42 +0200 Cornelia Huck wrote: > On Wed, 05 Aug 2020 12:35:01 +0100 > Sean Mooney wrote: > > > On Wed, 2020-08-05 at 12:53 +0200, Jiri Pirko wrote: > > > Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote: > > (...) > > > > > software_version: device driver's version. > > > > in .[.bugfix] scheme, where there is no > > > > compatibility across major versions, minor versions have > > > > forward compatibility (ex. 1-> 2 is ok, 2 -> 1 is not) and > > > > bugfix version number indicates some degree of internal > > > > improvement that is not visible to the user in terms of > > > > features or compatibility, > > > > > > > > vendor specific attributes: each vendor may define different attributes > > > > device id : device id of a physical devices or mdev's parent pci device. > > > > it could be equal to pci id for pci devices > > > > aggregator: used together with mdev_type. e.g. aggregator=2 together > > > > with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel > > > > graphics device. > > > > remote_url: for a local NVMe VF, it may be configured with a remote > > > > url of a remote storage and all data is stored in the > > > > remote side specified by the remote url. > > > > ... > > just a minor not that i find ^ much more simmple to understand then > > the current proposal with self and compatiable. > > if i have well defiend attibute that i can parse and understand that allow > > me to calulate the what is and is not compatible that is likely going to > > more useful as you wont have to keep maintianing a list of other compatible > > devices every time a new sku is released. 
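As an aside on the "<major>.<minor>[.bugfix]" scheme quoted above, the comparison it implies is small enough to sketch. This is only an illustration of the proposed semantics (no compatibility across major versions, forward-only compatibility for minor versions, bugfix level ignored), not an agreed-upon interface; the function name is invented.

    # Illustration of the quoted rule only.
    def software_version_compatible(src, dst):
        src_major, src_minor = (int(p) for p in src.split('.')[:2])
        dst_major, dst_minor = (int(p) for p in dst.split('.')[:2])
        if src_major != dst_major:
            return False                 # no compatibility across majors
        return dst_minor >= src_minor    # 1 -> 2 ok, 2 -> 1 not

    assert software_version_compatible('1.0.0', '1.2.3')
    assert not software_version_compatible('1.2.0', '1.0.0')
    assert not software_version_compatible('2.0.0', '1.9.9')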
> > > > in anycase thank for actully shareing ^ as it make it simpler to reson about what > > you have previously proposed. > > So, what would be the most helpful format? A 'software_version' field > that follows the conventions outlined above, and other (possibly > optional) fields that have to match? Just to get a different perspective, I've been trying to come up with what would be useful for a very different kind of device, namely vfio-ccw. (Adding Eric to cc: for that.) software_version makes sense for everybody, so it should be a standard attribute. For the vfio-ccw type, we have only one vendor driver (vfio-ccw_IO). Given a subchannel A, we want to make sure that subchannel B has a reasonable chance of being compatible. I guess that means: - same subchannel type (I/O) - same chpid type (e.g. all FICON; I assume there are no 'mixed' setups -- Eric?) - same number of chpids? Maybe we can live without that and just inject some machine checks, I don't know. Same chpid numbers is something we cannot guarantee, especially if we want to migrate cross-CEC (to another machine.) Other possibly interesting information is not available at the subchannel level (vfio-ccw is a subchannel driver.) So, looking at a concrete subchannel on one of my machines, it would look something like the following: software_version=1.0.0 type=vfio-ccw <-- would be vfio-pci on the example above subchannel_type=0 chpid_type=0x1a chpid_mask=0xf0 <-- not sure if needed/wanted Does that make sense? From farman at linux.ibm.com Thu Aug 13 19:02:53 2020 From: farman at linux.ibm.com (Eric Farman) Date: Thu, 13 Aug 2020 15:02:53 -0400 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200813173347.239801fa.cohuck@redhat.com> References: <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <4cf2824c803c96496e846c5b06767db305e9fb5a.camel@redhat.com> <20200807135942.5d56a202.cohuck@redhat.com> <20200813173347.239801fa.cohuck@redhat.com> Message-ID: <315669b0-5c75-d359-a912-62ebab496abf@linux.ibm.com> On 8/13/20 11:33 AM, Cornelia Huck wrote: > On Fri, 7 Aug 2020 13:59:42 +0200 > Cornelia Huck wrote: > >> On Wed, 05 Aug 2020 12:35:01 +0100 >> Sean Mooney wrote: >> >>> On Wed, 2020-08-05 at 12:53 +0200, Jiri Pirko wrote: >>>> Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote: >> >> (...) >> >>>>> software_version: device driver's version. >>>>> in .[.bugfix] scheme, where there is no >>>>> compatibility across major versions, minor versions have >>>>> forward compatibility (ex. 1-> 2 is ok, 2 -> 1 is not) and >>>>> bugfix version number indicates some degree of internal >>>>> improvement that is not visible to the user in terms of >>>>> features or compatibility, >>>>> >>>>> vendor specific attributes: each vendor may define different attributes >>>>> device id : device id of a physical devices or mdev's parent pci device. >>>>> it could be equal to pci id for pci devices >>>>> aggregator: used together with mdev_type. e.g. aggregator=2 together >>>>> with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel >>>>> graphics device. 
>>>>> remote_url: for a local NVMe VF, it may be configured with a remote >>>>> url of a remote storage and all data is stored in the >>>>> remote side specified by the remote url. >>>>> ... >>> just a minor not that i find ^ much more simmple to understand then >>> the current proposal with self and compatiable. >>> if i have well defiend attibute that i can parse and understand that allow >>> me to calulate the what is and is not compatible that is likely going to >>> more useful as you wont have to keep maintianing a list of other compatible >>> devices every time a new sku is released. >>> >>> in anycase thank for actully shareing ^ as it make it simpler to reson about what >>> you have previously proposed. >> >> So, what would be the most helpful format? A 'software_version' field >> that follows the conventions outlined above, and other (possibly >> optional) fields that have to match? > > Just to get a different perspective, I've been trying to come up with > what would be useful for a very different kind of device, namely > vfio-ccw. (Adding Eric to cc: for that.) > > software_version makes sense for everybody, so it should be a standard > attribute. > > For the vfio-ccw type, we have only one vendor driver (vfio-ccw_IO). > > Given a subchannel A, we want to make sure that subchannel B has a > reasonable chance of being compatible. I guess that means: > > - same subchannel type (I/O) > - same chpid type (e.g. all FICON; I assume there are no 'mixed' setups > -- Eric?) Correct. > - same number of chpids? Maybe we can live without that and just inject > some machine checks, I don't know. Same chpid numbers is something we > cannot guarantee, especially if we want to migrate cross-CEC (to > another machine.) I think we'd live without it, because I wouldn't expect it to be consistent between systems. > > Other possibly interesting information is not available at the > subchannel level (vfio-ccw is a subchannel driver.) I presume you're alluding to the DASD uid (dasdinfo -x) here? > > So, looking at a concrete subchannel on one of my machines, it would > look something like the following: > > > software_version=1.0.0 > type=vfio-ccw <-- would be vfio-pci on the example above > > subchannel_type=0 > > chpid_type=0x1a > chpid_mask=0xf0 <-- not sure if needed/wanted > > Does that make sense? > From alex.kavanagh at canonical.com Thu Aug 13 19:21:48 2020 From: alex.kavanagh at canonical.com (Alex Kavanagh) Date: Thu, 13 Aug 2020 20:21:48 +0100 Subject: [charms] OpenStack Charms 20.08 release is now available Message-ID: The 20.08 release of the OpenStack Charms is now available. This release brings several new features to the existing OpenStack Charms deployments for Queens, Rocky, Stein, Train, Ussuri, and many stable combinations of Ubuntu + OpenStack. Please see the Release Notes for full details: https://docs.openstack.org/charm-guide/latest/2008.html == Highlights == * New charm: neutron-api-plugin-arista There is a new supported subordinate charm that provides Arista switch ML2 plugin support to the OpenStack Neutron API service: neutron-api-plugin-arista. * New charms: Trilio The Trilio charms (trilio-data-mover, trilio-dm-api, trilio-horizon-plugin, and trilio-wlm) have been promoted to supported status. These charms deploy TrilioVault, a commercial snapshot and restore solution for OpenStack. * New charm: keystone-kerberos The keystone-kerberos subordinate charm allows for per-domain authentication via a Kerberos ticket, thereby providing an additional layer of security. 
It is used in conjunction with the keystone charm. * MySQL InnoDB Cluster TLS communication TLS communication between MySQL InnoDB Cluster and its cloud clients is now supported. Due to the circular dependency between the vault and mysql-innodb-cluster applications, this is a post-deployment feature. * Gnocchi S3 support The gnocchi charm can now be configured to use S3 as a storage backend. This feature is available starting with OpenStack Stein. * Charm cinder-ceph supports a new relation When both the nova-compute and cinder-ceph applications are deployed a new relation is now required. This should not affect most currently deployed clouds. * Glance Simplestreams Sync The glance-simplestreams-sync charm now installs simplestreams as a snap. As such, the 'channel' configuration option should be used in place of the ‘source’ option. == OpenStack Charms team == The OpenStack Charms team can be contacted on the #openstack-charms IRC channel on Freenode. == Thank you == Lots of thanks to the below 37 charm contributors who squashed 114 bugs*, enabled support for a new release of OpenStack, improved documentation, and added exciting new functionality! Alex Kavanagh Aurelien Lourot James Page Peter Matulis Liam Young Hervé Beraud Corey Bryant Frode Nordahl David Ames Ryan Beisner Chris MacNaughton Dmitrii Shcherbakov Drew Freiberger Edward Hope-Morley Facundo Ciccioli Andreas Jaeger Pedro Guimarães Nobuto Murata Arif Ali Felipe Reyes Ponnuvel Palaniyappan Brett Alvaro Uria Marco Filipe Moutinho da Silva Alejandro Santoyo Gonzalez Camille Rodriguez oliveiradan Tiago Pasqualini Erlon R. Cruz Trent Lloyd Nikolay Vinogradov Andrew McLeod Mauricio Faria de Oliveira Vern Hart Jeff Hillman Rodrigo Barbieri Nicolas Bock * The contributor and bug numbers are based on the OpenStack Victoria development cycle. -- OpenStack Charms Team -------------- next part -------------- An HTML attachment was scrubbed... URL: From kgiusti at gmail.com Thu Aug 13 21:17:51 2020 From: kgiusti at gmail.com (Ken Giusti) Date: Thu, 13 Aug 2020 17:17:51 -0400 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: References: <20200727095744.GK31915@sync> <3d238530-6c84-d611-da4c-553ba836fc02@nemebean.com> <671fec63-8bea-4215-c773-d8360e368a99@sap.com> <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> <6e68d1a3cfc4efff91d3668bb53805dc469673c6.camel@redhat.com> <65204b738f13fcea16b9b6d5a68149c89be73e6a.camel@redhat.com> Message-ID: On Thu, Aug 13, 2020 at 12:30 PM Ben Nemec wrote: > > > On 8/13/20 11:07 AM, Sean Mooney wrote: > >> I think it's probably > >> better to provide a well-defined endpoint for them to talk to rather > >> than have everyone implement their own slightly different RPC ping > >> mechanism. The docs for this feature should be very explicit that this > >> is the only thing external code should be calling. > > ya i think that is a good approch. > > i would still prefer if people used say middelware to add a service ping > admin api endpoint > > instead of driectly calling the rpc endpoint to avoid exposing rabbitmq > but that is out of scope of this discussion. > > Completely agree. In the long run I would like to see this replaced with > better integrated healthchecking in OpenStack, but we've been talking > about that for years and have made minimal progress. 
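For readers following the "well-defined endpoint" point quoted above, a service-side sketch of what such an endpoint could look like with plain oslo.messaging is below. The class and method names are invented for illustration (and, as noted earlier in the thread, a bare "ping" would collide with Nova's existing RPC method, so a real implementation needs a more specific name); this is not the code in the proposed patch.

    # Sketch under assumptions: a single trivial, documented RPC method that
    # monitoring is explicitly allowed to call; everything else on the bus
    # stays internal.
    import oslo_messaging
    from oslo_config import cfg

    class HealthEndpoint(object):
        def ping(self, ctxt, **kwargs):
            # Reaching this code proves the topic queue is bound and a
            # consumer is draining it -- nothing more.
            return 'pong'

    transport = oslo_messaging.get_rpc_transport(cfg.CONF)
    target = oslo_messaging.Target(topic='compute', server='compute-01')
    server = oslo_messaging.get_rpc_server(transport, target,
                                           [HealthEndpoint()],
                                           executor='threading')
    server.start()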
> > > > >> > >>> > >>> so if this does actully detect somethign we can otherwise detect and > the use cases involves using it within > >>> the openstack services not form an external source then i think that > is fine but we proably need to use another > >>> name (alive? status?) or otherewise modify nova so that there is no > conflict. > >>>> > >> > >> If I understand your analysis of the bug correctly, this would have > >> caught that type of outage after all since the failure was asymmetric. > > am im not sure > > it might yes looking at https://review.opendev.org/#/c/735385/6 > > its not clear to me how the endpoint is invoked. is it doing a topic > send or a direct send? > > to detech the failure you would need to invoke a ping on the compute > service and that ping would > > have to been encured on the to nova topic exchante with a routing key of > compute. > > > > if the compute topic queue was broken either because it was nolonger > bound to the correct topic or due to some other > > rabbitmq error then you woudl either get a message undeilverbale error > of some kind with the mandaroy flag or likely a > > timeout without the mandaroty flag. so if the ping would be routed usign > a topic too compute. > > then yes it would find this. > > > > although we can also detech this ourselves and fix it using the > mandatory flag i think by just recreating the queue wehn > > it extis but we get an undeliverable message, at least i think we can > rabbit is not my main are of expertiese so it > > woudl be nice is someone that know more about it can weigh in on that. > > I pinged Ken this morning to take a look at that. He should be able to > tell us whether it's a good idea or crazy talk. :-) > Like I can tell the difference between crazy and good ideas. Ben I thought you knew me better. ;) As discussed you can enable the mandatory flag on a per RPCClient instance, for example: _topts = oslo_messaging.TransportOptions(at_least_once=True) client = oslo_messaging.RPCClient(self.transport, self.target, timeout=conf.timeout, version_cap=conf.target_version, transport_options=_topts).prepare() This will cause an rpc call/cast to fail if rabbitmq cannot find a queue for the rpc request message [note the difference between 'queuing the message' and 'having the message consumed' - the mandatory flag has nothing to do with whether or not the message is eventually consumed]. Keep in mind that there may be some cases where having no active consumers is ok and you do not want to get a delivery failure exception - specifically fanout or perhaps cast. Depends on the use case. If there are fanout use cases that fail or degrade if all present services don't get a message then the mandatory flag will not detect an error if a subset of the bindings are lost. My biggest concern with this type of failure (lost binding) is that apparently the consumer is none the wiser when it happens. Without some sort of event issued by rabbitmq the RPC server cannot detect this problem and take corrective actions (or at least I cannot think of any ATM). -- Ken Giusti (kgiusti at gmail.com) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rosmaita.fossdev at gmail.com Thu Aug 13 21:36:10 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 13 Aug 2020 17:36:10 -0400 Subject: [cinder] victoria mid-cycle part 2 summary available Message-ID: <30dce86d-4afd-f2cb-2b84-61b730c279b6@gmail.com> In case you missed yesterday's R-9 virtual mid-cycle session, I've updated the victoria mid-cycle wiki with a summary: https://wiki.openstack.org/wiki/CinderVictoriaMidCycleSummary It will eventually include a link to the recording (in case you want to see what you missed or if you want to re-live the excitement). We had a productive meeting yesterday, thanks to all who participated. Unfortunately, some people had trouble connecting to the videoconference. Please contact me off-list so we can figure out whether this was a one-time fail or if we need to look at some other videoconf solution for future meetings. cheers, brian From rosmaita.fossdev at gmail.com Thu Aug 13 21:53:44 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 13 Aug 2020 17:53:44 -0400 Subject: [cinder] victoria os-brick release coming soon Message-ID: Just a quick reminder that the victoria os-brick release is 3 weeks away. Reviews may be a bit slower than usual given that people may be taking some end-of-the-summer vacation, so if you have an important patch for os-brick, please take the initiative to raise awareness in the #openstack-cinder IRC channel if it's not getting the attention it deserves. cheers, brian From rosmaita.fossdev at gmail.com Thu Aug 13 22:05:00 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 13 Aug 2020 18:05:00 -0400 Subject: [cinder] victoria new feature status checkpoint next week Message-ID: <8b4f6f98-fbcb-9354-94aa-e01ef0912ebb@gmail.com> If you are working on a Cinder feature for Victoria that hasn't merged yet, please add it to the agenda for next week's cinder weekly meeting on 19 August at 1400 UTC: https://etherpad.opendev.org/p/cinder-victoria-meetings If your feature requires client support, keep in mind that the final release for client libraries is in four weeks. Any client changes must be reviewed, tested, and merged before 10 September. Keep in mind that 7 September is a holiday for many Cinder core reviewers, so it is likely that we will have reduced reviewer bandwidth around the time of the Feature Freeze. So please plan ahead. cheers, brian From rosmaita.fossdev at gmail.com Thu Aug 13 22:15:27 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 13 Aug 2020 18:15:27 -0400 Subject: [cinder] driver features declaration for victoria next week Message-ID: Hello Cinder driver maintainers, This is a reminder that new features added to Cinder drivers for the Victoria release must be merged at the time of the OpenStack-wide Feature Freeze, which is coming up soon (10 September, to be specific). In order to avoid the Festival of Insane Driver Reviewing that we had last cycle, if you have un-merged driver features that you would like to land in Victoria, please post a blueprint in Launchpad listing the Gerrit reviews of the associated patches before the next Cinder weekly meeting (that is, before 19 August at 1400 UTC). This will help the team prioritize reviews and give you candid early feedback on whether the features look ready. You can look among the Ussuri blueprints for examples; contact me in IRC if you have any questions. 
Due to the 7 September holiday in the USA, there will be reduced reviewing bandwidth right around the Feature Freeze, so that's why I'm asking you to plan ahead. cheers, brian From rosmaita.fossdev at gmail.com Thu Aug 13 22:45:08 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 13 Aug 2020 18:45:08 -0400 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> <20200813164131.bdmhankpd2qxycux@yuggoth.org> Message-ID: <2956d6bd-320e-34ea-64a0-1001e102d75c@gmail.com> On 8/13/20 1:03 PM, Artom Lifshitz wrote: > On Thu, Aug 13, 2020 at 12:45 PM Jeremy Stanley wrote: >> >> On 2020-08-13 17:27:16 +0100 (+0100), Erno Kuvaja wrote: >> [...] >>> My question at this point is, do we (as a community) have enough >>> bodies dedicated to OSSDK _and_ OSC to make this sustainable? I'm >>> being sincere here as I have not been part of the development of >>> either of those projects. But if my assumption above is correct, I >>> think we should talk about these things with their real names >>> rather than trying to mask this being just OSC vs python-*client >>> CLI thing. >> >> Hopefully this doesn't come across as a glib response, but if people >> didn't have to maintain multiple CLIs and SDKs then maybe they would >> have enough time to collaborate on a universal CLI/SDK pair instead. > > Agreed - but historically that's not what happened, so the question > now is how to improve the situation. My understanding is that osc is > effectively dead, except as a shell around the sdk, since that's where > the future lies. So in my mind, efforts should be concentrated on two > fronts: > > 1. Continue converting osc to use the sdk > 2. Catch up the SDK My understanding is that the SDK is supposed to be an opinionated entry point to the APIs? Or am I thinking of some other project? I'm bringing this up because people say they want a single unified CLI, but when I've pushed operators about this, they want a CLI in Victoria that implements all the admin operations exposed by the Victorian-era APIs. A CLI built on an opinionated SDK is not going to do that. I could use some clarification on the goal and strategy here. If it's to provide a unified opinionated CLI, then I don't see how that helps us to eventually eliminate the project-specific CLIs. And if it's to provide one CLI that rules them all, the individual projects (well, Cinder, anyway) can't stop adding functionality to cinderclient CLI until the openstackclient CLI has feature parity. At least now, you can use one CLI to do all cinder-related stuff. If we stop cinderclient CLI development, then you'll need to use openstackclient for some things (old features + the latest features) and the cinderclient for all the in between features, which doesn't seem like progress to me. Thus it would be helpful to have some clarification about the nature of the proposal we're discussing. > > This is a bit of a chicken and egg problem, because any gaps in sdk > mean you can't convert osc to use those missing bits, but ideally any > patches to osc that aren't sdk conversions would get blocked (though I > have obviously absolutely no say in the matter, this is just wishful > thinking). The project teams can work on 2 for their project (so like > I've been slowly doing for Nova), the osc team can work on 1. 
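To make the question about the SDK's "opinionated" nature a bit more concrete, here is a rough sketch of the two styles people usually mean when they talk about openstacksdk replacing the per-project clients: the thin per-service proxy layer and the higher-level "cloud" layer. The cloud name and the image/flavor/network values are placeholders.

    # Sketch only; assumes a clouds.yaml entry named "mycloud" exists.
    import openstack

    conn = openstack.connect(cloud='mycloud')

    # Proxy layer: close to what python-novaclient exposes today.
    for server in conn.compute.servers():
        print(server.name, server.status)

    # Cloud layer: the opinionated part, hiding cross-service plumbing
    # such as image/flavor/network lookups and polling until ACTIVE.
    server = conn.create_server(name='demo', image='cirros',
                                flavor='m1.tiny', network='private',
                                wait=True)

Both layers live in the same package, which is why "finish the SDK first, then have OSC call into it" keeps coming up in this thread.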
> > > >> -- >> Jeremy Stanley > > From rosmaita.fossdev at gmail.com Thu Aug 13 22:51:34 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 13 Aug 2020 18:51:34 -0400 Subject: [cinder] driver maintainers: 3rd party CI checkpoint reminder Message-ID: <71c6bc06-06a0-6957-1755-063e76c57b2f@gmail.com> Hello Cinder driver maintainers, Around the time of the Feature Freeze (10 September), the Cinder team will be looking at the Third Party CIs to assess compliance [0]. Out of compliance drivers will be marked as 'unsupported' in the Victoria release. We can avoid a lot of unpleasantness if you take this opportunity to review the current situation of your driver's CI [1] and, if necessary, take appropriate steps to get it back into compliance before 10 September. cheers, brian [0] https://docs.openstack.org/cinder/latest/drivers-all-about.html#driver-compliance [1] http://cinderstats.ivehearditbothways.com/cireport.txt From fungi at yuggoth.org Thu Aug 13 23:02:50 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 13 Aug 2020 23:02:50 +0000 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: <2956d6bd-320e-34ea-64a0-1001e102d75c@gmail.com> References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> <20200813164131.bdmhankpd2qxycux@yuggoth.org> <2956d6bd-320e-34ea-64a0-1001e102d75c@gmail.com> Message-ID: <20200813230250.63rbvs4xaznpcejd@yuggoth.org> On 2020-08-13 18:45:08 -0400 (-0400), Brian Rosmaita wrote: [...] > My understanding is that the SDK is supposed to be an opinionated > entry point to the APIs? Or am I thinking of some other project? [...] It's modelled as several layers: direct REST API access, functional access (similar to what our classic python-*client libs provided), and an opinionated layer with more business logic and plaster over cloud-specific interoperability problems (formerly the Shade library which grew out of Nodepool). Callers can mix-n-match the layers, like use a higher level call to get a Keystone token and then use it to authenticate REST API methods. https://docs.openstack.org/openstacksdk/latest/user/index.html#api-documentation -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From zbitter at redhat.com Fri Aug 14 00:37:36 2020 From: zbitter at redhat.com (Zane Bitter) Date: Thu, 13 Aug 2020 20:37:36 -0400 Subject: [Ocata][Heat] Strange error returned after stack creation failure -r aw template with id xxx not found In-Reply-To: References: <7fe6626a-0abb-97ca-fbfb-2066f426b9bf@redhat.com> Message-ID: On 24/07/20 10:59 am, Laurent Dumont wrote: > Hey Zane, > > Thank you so much for the details - super interesting. We've worked with > the Vendor to try and reproduce while we had our logs for Heat turned to > DEBUG. Unfortunately, all of the creations they have attempted since > have worked. It first failed 4 times out of 5 and has since worked... Interesting - sounds like a timing issue, but I haven't spotted any code that looks like it could fail by going too fast. > It's one of those problems! We'll keep trying to reproduce. Just to be > sure, the actual yaml is stored in the DB and then accessed to create > the actual Heat ressources? Yep, correct. 
It's stored and the ID is passed in the RPC message here: https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L308 https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L372-L374 https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L336-L337 and then when the other engine receives the create_stack RPC message it uses the stored template instead of one passed in the message like you would get from a create call initiated via the ReST API: https://opendev.org/openstack/heat/src/branch/master/heat/engine/service.py#L847-L851 https://opendev.org/openstack/heat/src/branch/master/heat/engine/service.py#L731-L732 - ZB > > Thanks! > > On Wed, Jul 22, 2020 at 3:46 PM Zane Bitter > wrote: > > On 21/07/20 8:03 pm, Laurent Dumont wrote: > > Hi! > > > > We are currently troubleshooting a Heat stack issue where one of the > > stack (one of 25 or so) is failing to be created properly (seemingly > > randomly). > > > > The actual error returned by Heat is quite strange and Google has > been > > quite sparse in terms of references. > > > > The actual error looks like the following (I've sanitized some of > the > > names): > > > > Resource CREATE failed: resources.potato: Resource CREATE failed: > > resources[0]: raw template with id 22273 not found > > When creating a nested stack, rather than just calling the RPC > method to > create a new stack, Heat stores the template in the database first and > passes the ID in the RPC message.[1] (It turns out that by doing it > this > way we can save massive amounts of memory when processing a large tree > of nested stacks.) My best guess is that this message indicates that > the > template row has been deleted by the time the other engine goes to look > at it. > > I don't see how you could have got an ID like 22273 without the > template > having been successfully stored at some point. > > The template is only supposed to be deleted if the RPC call returns > with > an error.[2] The only way I can think of for that to happen before an > attempt to create the child stack is if the RPC call times out, but the > original message is eventually picked up by an engine. I would check > your logs for RPC timeouts and consider increasing them. > > What does the status_reason look like at one level above in the tree? > That should indicate the first error that caused the template to be > deleted. 
> > >     heat resource-list STACK_NAME_HERE -n 50 > > >  +------------------+--------------------------------------+-------------------------+-----------------+----------------------+--------------------------------------------------------------------------+ > >     | resource_name    | physical_resource_id                 | > >     resource_type           | resource_status | updated_time >     | > >     stack_name > >          | > > >  +------------------+--------------------------------------+-------------------------+-----------------+----------------------+--------------------------------------------------------------------------+ > >     | potato              | RESOURCE_ID_HERE | > OS::Heat::ResourceGroup | > >     CREATE_FAILED   | 2020-07-18 T19:52:10Z | > >     nested_stack_1_STACK_NAME_HERE                  | > >     | potato_server_group | RESOURCE_ID_HERE | > OS::Nova::ServerGroup   | > >     CREATE_COMPLETE | 2020-07-21T19:52:10Z | > >     nested_stack_1_STACK_NAME_HERE                  | > >     | 0                |                                      | > >     potato1.yaml     | CREATE_FAILED   | 2020-07-18T19:52:12Z | > >     nested_stack_2_STACK_NAME_HERE | > >     | 1                |                                      | > >     potato1.yaml     | INIT_COMPLETE   | 2020-07- 18 T19:52:12Z | > >     nested_stack_2_STACK_NAME_HERE | > > >  +------------------+--------------------------------------+-------------------------+-----------------+----------------------+--------------------------------------------------------------------------+ > > > > > > The template itself is pretty simple and attempts to create a > > ServerGroup and 2 VMs (as part of the ResourceGroup). My feeling > is that > > one the creation of those machines fails and Heat get's a little > cooky > > and returns an error that might not be the actual root cause. I > would > > have expected the VM to show up in the resource list but I just > see the > > source "yaml". > > It's clear from the above output that the scaled unit of the resource > group is in fact a template (not an OS::Nova::Server), and the error is > occurring trying to create a stack from that template (potato1.yaml) - > before Heat even has a chance to start creating the server. > > > Has anyone seen something similar in the past? > > Nope. > > cheers, > Zane. > > [1] > https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L367-L384 > [2] > https://opendev.org/openstack/heat/src/branch/master/heat/engine/resources/stack_resource.py#L335-L342 > > From radoslaw.piliszek at gmail.com Fri Aug 14 07:50:55 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 14 Aug 2020 09:50:55 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: Message-ID: Hi, it's been a month since I wrote the original (quoted) email, so I retry it with CC to the PTL and a recently (this year) active core. I see there have been no meetings and neither Masakari IRC channel nor review queues have been getting much attention during that time period. I am, therefore, offering my help to maintain the project. Regarding the original topic, I would opt for running Masakari meetings during the time I proposed so that interested parties could join and I know there is at least some interest based on recent IRC activity (i.e. there exist people who want to use and discuss Masakari - apart from me that is :-) ). 
-yoctozepto On Mon, Jul 13, 2020 at 9:53 PM Radosław Piliszek wrote: > > Hello Fellow cloud-HA-seekers, > > I wanted to attend Masakari meetings but I found the current schedule unfit. > Is there a chance to change the schedule? The day is fine but a shift > by +3 hours would be nice. > > Anyhow, I wanted to discuss [1]. I've already proposed a change > implementing it and looking forward to positive reviews. :-) That > said, please reply on the change directly, or mail me or catch me on > IRC, whichever option sounds best to you. > > [1] https://blueprints.launchpad.net/masakari/+spec/customisable-ha-enabled-instance-metadata-key > > -yoctozepto From yan.y.zhao at intel.com Fri Aug 14 05:16:01 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Fri, 14 Aug 2020 13:16:01 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> Message-ID: <20200814051601.GD15344@joy-OptiPlex-7040> On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > driver is it handled by? > > It looks that the devlink is for network device specific, and in > > devlink.h, it says > > include/uapi/linux/devlink.h - Network physical device Netlink > > interface, > > > Actually not, I think there used to have some discussion last year and the > conclusion is to remove this comment. > > It supports IB and probably vDPA in the future. > hmm... sorry, I didn't find the referred discussion. only below discussion regarding to why to add devlink. https://www.mail-archive.com/netdev at vger.kernel.org/msg95801.html >This doesn't seem to be too much related to networking? Why can't something >like this be in sysfs? It is related to networking quite bit. There has been couple of iteration of this, including sysfs and configfs implementations. There has been a consensus reached that this should be done by netlink. I believe netlink is really the best for this purpose. Sysfs is not a good idea https://www.mail-archive.com/netdev at vger.kernel.org/msg96102.html >there is already a way to change eth/ib via >echo 'eth' > /sys/bus/pci/drivers/mlx4_core/0000:02:00.0/mlx4_port1 > >sounds like this is another way to achieve the same? It is. However the current way is driver-specific, not correct. For mlx5, we need the same, it cannot be done in this way. Do devlink is the correct way to go. https://lwn.net/Articles/674867/ There a is need for some userspace API that would allow to expose things that are not directly related to any device class like net_device of ib_device, but rather chip-wide/switch-ASIC-wide stuff. 
Use cases: 1) get/set of port type (Ethernet/InfiniBand) 2) monitoring of hardware messages to and from chip 3) setting up port splitters - split port into multiple ones and squash again, enables usage of splitter cable 4) setting up shared buffers - shared among multiple ports within one chip we actually can also retrieve the same information through sysfs, .e.g |- [path to device] |--- migration | |--- self | | |---device_api | | |---mdev_type | | |---software_version | | |---device_id | | |---aggregator | |--- compatible | | |---device_api | | |---mdev_type | | |---software_version | | |---device_id | | |---aggregator > > > I feel like it's not very appropriate for a GPU driver to use > > this interface. Is that right? > > > I think not though most of the users are switch or ethernet devices. It > doesn't prevent you from inventing new abstractions. so need to patch devlink core and the userspace devlink tool? e.g. devlink migration > Note that devlink is based on netlink, netlink has been widely used by > various subsystems other than networking. the advantage of netlink I see is that it can monitor device status and notify upper layer that migration database needs to get updated. But not sure whether openstack would like to use this capability. As Sean said, it's heavy for openstack. it's heavy for vendor driver as well :) And devlink monitor now listens the notification and dumps the state changes. If we want to use it, need to let it forward the notification and dumped info to openstack, right? Thanks Yan From pierre at stackhpc.com Fri Aug 14 07:56:42 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Fri, 14 Aug 2020 09:56:42 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: Message-ID: You may also want to try contacting suzhengwei (https://launchpad.net/~sue.sam), we had a discussion in June about potential integration between Masakari and Blazar. On Fri, 14 Aug 2020 at 09:52, Radosław Piliszek wrote: > > Hi, > > it's been a month since I wrote the original (quoted) email, so I > retry it with CC to the PTL and a recently (this year) active core. > > I see there have been no meetings and neither Masakari IRC channel nor > review queues have been getting much attention during that time > period. > I am, therefore, offering my help to maintain the project. > > Regarding the original topic, I would opt for running Masakari > meetings during the time I proposed so that interested parties could > join and I know there is at least some interest based on recent IRC > activity (i.e. there exist people who want to use and discuss Masakari > - apart from me that is :-) ). > > -yoctozepto > > > On Mon, Jul 13, 2020 at 9:53 PM Radosław Piliszek > wrote: > > > > Hello Fellow cloud-HA-seekers, > > > > I wanted to attend Masakari meetings but I found the current schedule unfit. > > Is there a chance to change the schedule? The day is fine but a shift > > by +3 hours would be nice. > > > > Anyhow, I wanted to discuss [1]. I've already proposed a change > > implementing it and looking forward to positive reviews. :-) That > > said, please reply on the change directly, or mail me or catch me on > > IRC, whichever option sounds best to you. 
> > > > [1] https://blueprints.launchpad.net/masakari/+spec/customisable-ha-enabled-instance-metadata-key > > > > -yoctozepto > From alexander.dibbo at stfc.ac.uk Fri Aug 14 10:49:34 2020 From: alexander.dibbo at stfc.ac.uk (Alexander Dibbo - UKRI STFC) Date: Fri, 14 Aug 2020 10:49:34 +0000 Subject: Issue with heat and magnum Message-ID: <08439410328b4d1ab7ca684d5af2c7c7@stfc.ac.uk> Hi, I am having an issue with magnum creating clusters when I have multiple active heat-engine daemons running. I get the following error in the heat engine logs: 2020-08-14 10:36:30.237 598383 INFO heat.engine.resource [req-a2c862eb-370c-4e91-a2c6-dca32c7872ce - - - - -] signal SoftwareDeployment "master_config_deployment" [67ba9ce2-aba5-4c15-a7ea -6b774659a0e2] Stack "kubernetes-test-26-3uzjqqob47fh-kube_masters-mhctjio2b4gh-0-pbhumflm5mn5" [dc66e4d9-0c9b-4b18-a2c6-dd9724fa51a9] : Authentication cannot be scoped to multiple target s. Pick one of: project, domain, trust or unscoped 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource Traceback (most recent call last): 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 2462, in _handle_signal 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource signal_result = self.handle_signal(details) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/heat/software_deployment.py", line 514, in handle_signal 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource timeutils.utcnow().isoformat()) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/rpc/client.py", line 788, in signal_software_deployment 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource version='1.6') 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/rpc/client.py", line 89, in call 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource return client.call(ctxt, method, **kwargs) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 165, in call 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource msg_ctxt = self.serializer.serialize_context(ctxt) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/common/messaging.py", line 46, in serialize_context 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource _context = ctxt.to_dict() 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 185, in to_dict 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource 'roles': self.roles, 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 315, in roles 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource self._load_keystone_data() 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/tenacity/__init__.py", line 292, in wrapped_f 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource return self.call(f, *args, **kw) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/tenacity/__init__.py", line 358, in call 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource do = self.iter(retry_state=retry_state) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File 
"/usr/lib/python2.7/site-packages/tenacity/__init__.py", line 319, in iter 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource return fut.result() 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource return self.__get_result() 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/tenacity/__init__.py", line 361, in call 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource result = fn(*args, **kwargs) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 306, in _load_keystone_data 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource auth_ref = self.auth_plugin.get_access(self.keystone_session) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 134, in get_access 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource self.auth_ref = self.get_auth_ref(session) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 208, in get_auth_ref 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource return self._plugin.get_auth_ref(session, **kwargs) 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/v3/base.py", line 144, in get_auth_ref 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource message='Authentication cannot be scoped to multiple' 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource AuthorizationFailure: Authentication cannot be scoped to multiple targets. Pick one of: project, domain, trust or unscoped 2020-08-14 10:36:30.237 598383 ERROR heat.engine.resource 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service [req-a2c862eb-370c-4e91-a2c6-dca32c7872ce - - - - -] Unhandled error in asynchronous task: ResourceFailure: AuthorizationFailure: resources.master_config_deployment: Authentication cannot be scoped to multiple targets. 
Pick one of: project, domain, trust or unscoped 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service Traceback (most recent call last): 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service File "/usr/lib/python2.7/site-packages/heat/engine/service.py", line 132, in log_exceptions 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service gt.wait() 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 181, in wait 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service return self._exit_event.wait() 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 132, in wait 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service current.throw(*self._exc) 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 221, in main 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service result = function(*args, **kwargs) 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service File "/usr/lib/python2.7/site-packages/heat/engine/service.py", line 123, in _start_with_trace 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service return func(*args, **kwargs) 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service File "/usr/lib/python2.7/site-packages/heat/engine/service.py", line 1871, in _resource_signal 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service needs_metadata_updates = rsrc.signal(details, need_check) 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 2500, in signal 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service self._handle_signal(details) 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 2480, in _handle_signal 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service raise failure 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service ResourceFailure: AuthorizationFailure: resources.master_config_deployment: Authentication cannot be scoped to multiple targets. Pi ck one of: project, domain, trust or unscoped 2020-08-14 10:36:30.890 598383 ERROR heat.engine.service Each of the individual heat-engine daemons create magnum clusters correctly when they are the only ones online. Attached are the heat and magnum config files. Any ideas where to look would be appreciated? Regards Alexander Dibbo - Cloud Architect / Cloud Operations Group Leader For STFC Cloud Documentation visit https://stfc-cloud-docs.readthedocs.io To raise a support ticket with the cloud team please email cloud-support at gridpp.rl.ac.uk To receive notifications about the service please subscribe to our mailing list at: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=STFC-CLOUD To receive fast notifications or to discuss usage of the cloud please join our Slack: https://stfc-cloud.slack.com/ This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. 
UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. Opinions, conclusions or other information in this message and attachments that are not related directly to UKRI business are solely those of the author and do not represent the views of UKRI. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: heat.conf.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: magnum.conf.txt URL: From dev.faz at gmail.com Fri Aug 14 11:21:04 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Fri, 14 Aug 2020 13:21:04 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> Message-ID: Hello again, just a short update about the results of my tests. I currently see 2 ways of running openstack+rabbitmq 1. without durable-queues and without replication - just one rabbitmq-process which gets (somehow) restarted if it fails. 2. durable-queues and replication Any other combination of these settings leads to more or less issues with * broken / non working bindings * broken queues I think vexxhost is running (1) with their openstack-operator - for reasons. I added [kolla], because kolla-ansible is installing rabbitmq with replication but without durable-queues. May someone point me to the best way to document these findings to some official doc? I think a lot of installations out there will run into issues if - under load - a node fails. Fabian Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < dev.faz at gmail.com>: > Hi, > > just did some short tests today in our test-environment (without durable > queues and without replication): > > * started a rally task to generate some load > * kill-9-ed rabbitmq on one node > * rally task immediately stopped and the cloud (mostly) stopped working > > after some debugging i found (again) exchanges which had bindings to > queues, but these bindings didnt forward any msgs. > Wrote a small script to detect these broken bindings and will now check if > this is "reproducible" > > then I will try "durable queues" and "durable queues with replication" to > see if this helps. Even if I would expect > rabbitmq should be able to handle this without these "hidden broken > bindings" > > This just FYI. > > Fabian > -------------- next part -------------- An HTML attachment was scrubbed... URL: From its-openstack at zohocorp.com Fri Aug 14 11:42:12 2020 From: its-openstack at zohocorp.com (its-openstack at zohocorp.com) Date: Fri, 14 Aug 2020 17:12:12 +0530 Subject: Openstack-Train VCPU issue in Hyper-V Message-ID: <173ecc7045a.1134ca19a23846.8868151533455235252@zohocorp.com> Dear Team,    We are using Openstack-Train in our organization.We have created windows server 2016 Std R2 instances with this flavor m5.xlarge ( RAM - 65536 , Disk - 500 , VCPUs - 16 ).Once Hyper-V future enabled in this instances VCPU count is automatically reduced to 1 core after restart.Even we have enabled nested virtualisation in openstack compute server.Please help us to short out this issue. #cat /sys/module/kvm_intel/parameters/nested Y Regards, Sysadmin. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From smooney at redhat.com  Fri Aug 14 12:46:39 2020
From: smooney at redhat.com (Sean Mooney)
Date: Fri, 14 Aug 2020 13:46:39 +0100
Subject: Openstack-Train VCPU issue in Hyper-V
In-Reply-To: <173ecc7045a.1134ca19a23846.8868151533455235252@zohocorp.com>
References: <173ecc7045a.1134ca19a23846.8868151533455235252@zohocorp.com>
Message-ID: <98557b2765564577d5305ace4bff195777f7c857.camel@redhat.com>

On Fri, 2020-08-14 at 17:12 +0530, its-openstack at zohocorp.com wrote:
> Dear Team,
>
> We are using Openstack-Train in our organization. We have created windows
> server 2016 Std R2 instances with this flavor m5.xlarge (RAM - 65536,
> Disk - 500, VCPUs - 16). Once the Hyper-V feature is enabled in these
> instances, the VCPU count is automatically reduced to 1 core after restart.
> We have even enabled nested virtualisation on the openstack compute server.

Just to confirm: are you using the Hyper-V driver? If so, this sounds like a
Hyper-V bug rather than an OpenStack bug. Have you reached out to Microsoft
for support with this issue? OpenStack itself does not guarantee that nested
virt will be available or will work, and does not guarantee that it will work
across operating systems.

> Please help us to sort out this issue.

I'm off today so I won't be monitoring this - I just saw your email while I
was doing something else - but without more info on what your configuration
is and how it is failing, I don't think people will be able to help you root
cause your issue. The other thing to be aware of is that this is not a
support list. People might have time to help and often do try to help, but
beyond their good nature, if you have an issue you cannot resolve yourself
with pointers from the community, you may need to reach out to your OpenStack
vendor for support, or if you don't have one, engage one of your engineers to
work with the upstream community to root cause and fix the issue. There is no
vendor-customer support relationship between upstream and those that install
it; the list acts as a way for people that develop and use OpenStack to help
each other voluntarily.

> #cat /sys/module/kvm_intel/parameters/nested
> Y

Are you setting this on the host? If so, that implies you are using the
libvirt driver and are actually running Windows Server as a guest and trying
to enable Hyper-V inside a Windows guest. That is not how your email
initially reads, and it is a different part of the code base. When you say
you enable the Hyper-V feature, is that in the Windows OS on the host, or in
a Windows OS in a VM hosted on a Linux host? I don't know of any way that
could alter the vCPUs allocated to the VM. If you are using the libvirt
driver, can you provide the XML before and after you enable the Hyper-V
feature and reboot? If they are still the same, then this is a Windows kernel
bug. If you are not enabling the Hyper-V feature in the Windows OS and are
instead referring to modifying the libvirt XML to add Hyper-V feature flags
to the guest, that is not supported: you are never allowed to modify a
Nova-created guest XML. The way to enable the Hyper-V enlightenments is to
set metadata on the Glance image declaring it a Windows image; Nova will then
enable the Hyper-V enlightenment feature flags in the XML.

> Regards,
> Sysadmin.
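For illustration, a minimal sketch of the image-metadata approach described
above, assuming the libvirt driver; the image name and instance UUID are
placeholders, not values from this thread:

  # mark the image as a Windows guest so the libvirt driver adds the
  # Hyper-V enlightenment flags to the domain XML it generates
  openstack image set --property os_type=windows windows-2016-std

  # optionally expose the host CPU (including VMX) to guests for nested virt;
  # set on each compute node in nova.conf, then restart nova-compute:
  # [libvirt]
  # cpu_mode = host-passthrough

  # compare the generated XML before/after enabling the Hyper-V role in the guest
  virsh dumpxml <instance-uuid> | grep -A 10 hyperv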
From sean.mcginnis at gmx.com Fri Aug 14 12:56:38 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 14 Aug 2020 07:56:38 -0500 Subject: [all][TC] OpenStack Client (OSC) vs python-*clients In-Reply-To: <2956d6bd-320e-34ea-64a0-1001e102d75c@gmail.com> References: <1668118.VLH7GnMWUR@whitebase.usersys.redhat.com> <9cbf9d69a9beb30d03af71e42a3e2446a516292a.camel@redhat.com> <20200813164131.bdmhankpd2qxycux@yuggoth.org> <2956d6bd-320e-34ea-64a0-1001e102d75c@gmail.com> Message-ID: > And if it's to provide one CLI that rules them all, the individual > projects (well, Cinder, anyway) can't stop adding functionality to > cinderclient CLI until the openstackclient CLI has feature parity.  At > least now, you can use one CLI to do all cinder-related stuff.  If we > stop cinderclient CLI development, then you'll need to use > openstackclient for some things (old features + the latest features) > and the cinderclient for all the in between features, which doesn't > seem like progress to me. And in reality, I don't think Cinder can even drop cinderclient even if we get feature parity. We have python-brick-cinderclient-ext that is used in conjunction with python-cinderclient for some standalone use cases. From marino.mrc at gmail.com Fri Aug 14 13:17:50 2020 From: marino.mrc at gmail.com (Marco Marino) Date: Fri, 14 Aug 2020 15:17:50 +0200 Subject: [tripleo] Specify different interface name in single nic vlans without external network installation Message-ID: Hi, I'm trying to install openstack using tripleo on preprovisioned servers. My (desired) environment is quite simple: 1 controller and 1 compute node. Here is what I did: - Installed undercloud with 192.168.25.0/24 as a ctlplan subnet. local_ip = 192.168.25.2; undercloud_public_host=192.168.25.4; undercloud_admin_host = 192.168.25.3 - Installed 2 servers (compute and controller) with centos 8. Hardware requirements are satisfied (32 GB of ram, 100GB disk....) - Manually created user stack on compute and controller and added it to the sudoers list. - Manually Installed openstack repositories (sudo -E tripleo-repos -b ussuri current-tripleo-rdo) - Manually installed openstack required openstack packages: sudo yum install python3-heat-agent* -y Now I'd like to use 192.168.25.0/24 as "installation network" (network used by ansible) and I'm trying to configure one single nic with vlans. Please note that on all servers (undercloud included) I have 2 physical interfaces with the same name: enp1s0 and enp7s0. enp1s0 is used for a completely openstack-detached network: 192.168.2.0/24 and enp7s0 is used for 192.168.25.0/24 More precisely: Controller-0 = 192.168.25.10 Compute-0 = 192.168.25.20 Furthermore, I confirm that I can reach from both nodes using ping and curl (on port 8004). And I added 2 lines in /etc/hosts on undercloud: 192.168.25.10 controller-0 192.168.25.20 compute-0 Now I'm really confused about what to do. 
I tried with: openstack overcloud deploy --templates --disable-validations -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e templates/network-environment-overrides.yaml -e templates/ctlplane-assignment.yaml -e templates/nameservers.yaml -e templates/node-info.yaml -e templates/hostnamemap.yaml -n templates/network_data.yaml node-info.yaml: http://paste.openstack.org/show/796840/ hostnamemap.yaml: http://paste.openstack.org/show/796841/ network-environment-overrides.yaml: http://paste.openstack.org/show/796842/ ctlplane-assignment.yaml: http://paste.openstack.org/show/796843/ nameservers.yaml: http://paste.openstack.org/show/796844/ network_data.yaml: http://paste.openstack.org/show/796845/ How can I specify nic2 for single nic vlans without external network? Please can you provide an example with "openstack overcloud deploy" complete command? I'm reading documentation but I do not understand how to do and it's really frustrating. Thank you, Marco -------------- next part -------------- An HTML attachment was scrubbed... URL: From alterriu at gmail.com Fri Aug 14 14:13:23 2020 From: alterriu at gmail.com (Popoi Zen) Date: Fri, 14 Aug 2020 21:13:23 +0700 Subject: [neutron] How to specify overlay network interface when using OVN and Geneve? Message-ID: Hi, I have used my google fu but I cant find any reference. Just want to know how to specify overlay network when Im using geneve as my overlay protocol? -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Fri Aug 14 14:23:44 2020 From: zigo at debian.org (Thomas Goirand) Date: Fri, 14 Aug 2020 16:23:44 +0200 Subject: [neutron] Implementing BGP over network:routed for IPv6 in Neutron, with DVR capabilities Message-ID: <45544000-52dd-2f05-1c18-235b495d62de@debian.org> Hi, When these patches are approved: https://review.opendev.org/486450 https://review.opendev.org/669395 we will effectively have BGP announcing for floating IPs and router gateways, with a provider network as next BGP HOP. I tested in experimentally, and it does work. There's more work to be done on it to make it better (like, eliminating GARP requests and getting neutron-dynamic-routing to know when a floating moves from one segment to another), as seen in the commends of #669395, but it works. Now, I'd like to have the same feature for IPv6. Having a segmented IPv6 L2 network already works, though isn't this always going through the network nodes still? I see no reason why IPv6 would always go through network nodes, and I would like to eliminate this SPOF. Has anyone worked on this? Or is there anyone with some advice on how to start? Is there some blueprints somewhere? I'm not sure what this implies, and where to start my research on this. But I really would love, moving forward, to have such a feature. Would anyone (try to) contribute this with me? 
Cheers, Thomas Goirand (zigo) From smooney at redhat.com Fri Aug 14 12:30:00 2020 From: smooney at redhat.com (Sean Mooney) Date: Fri, 14 Aug 2020 13:30:00 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200814051601.GD15344@joy-OptiPlex-7040> References: <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> Message-ID: On Fri, 2020-08-14 at 13:16 +0800, Yan Zhao wrote: > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > driver is it handled by? > > > > > > It looks that the devlink is for network device specific, and in > > > devlink.h, it says > > > include/uapi/linux/devlink.h - Network physical device Netlink > > > interface, > > > > > > Actually not, I think there used to have some discussion last year and the > > conclusion is to remove this comment. > > > > It supports IB and probably vDPA in the future. > > > > hmm... sorry, I didn't find the referred discussion. only below discussion > regarding to why to add devlink. > > https://www.mail-archive.com/netdev at vger.kernel.org/msg95801.html > >This doesn't seem to be too much related to networking? Why can't something > >like this be in sysfs? > > It is related to networking quite bit. There has been couple of > iteration of this, including sysfs and configfs implementations. There > has been a consensus reached that this should be done by netlink. I > believe netlink is really the best for this purpose. Sysfs is not a good > idea > > https://www.mail-archive.com/netdev at vger.kernel.org/msg96102.html > >there is already a way to change eth/ib via > >echo 'eth' > /sys/bus/pci/drivers/mlx4_core/0000:02:00.0/mlx4_port1 > > > >sounds like this is another way to achieve the same? > > It is. However the current way is driver-specific, not correct. > For mlx5, we need the same, it cannot be done in this way. Do devlink is > the correct way to go. im not sure i agree with that. standardising a filesystem based api that is used across all vendors is also a valid option. that said if devlink is the right choice form a kerenl perspective by all means use it but i have not heard a convincing argument for why it actually better. with tthat said we have been uing tools like ethtool to manage aspect of nics for decades so its not that strange an idea to use a tool and binary protocoal rather then a text based interface for this but there are advantages to both approches. > > https://lwn.net/Articles/674867/ > There a is need for some userspace API that would allow to expose things > that are not directly related to any device class like net_device of > ib_device, but rather chip-wide/switch-ASIC-wide stuff. 
> > Use cases: > 1) get/set of port type (Ethernet/InfiniBand) > 2) monitoring of hardware messages to and from chip > 3) setting up port splitters - split port into multiple ones and squash again, > enables usage of splitter cable > 4) setting up shared buffers - shared among multiple ports within one chip > > > > we actually can also retrieve the same information through sysfs, .e.g > > > - [path to device] > > |--- migration > | |--- self > | | |---device_api > | | |---mdev_type > | | |---software_version > | | |---device_id > | | |---aggregator > | |--- compatible > | | |---device_api > | | |---mdev_type > | | |---software_version > | | |---device_id > | | |---aggregator > > > > > > > > I feel like it's not very appropriate for a GPU driver to use > > > this interface. Is that right? > > > > > > I think not though most of the users are switch or ethernet devices. It > > doesn't prevent you from inventing new abstractions. > > so need to patch devlink core and the userspace devlink tool? > e.g. devlink migration and devlink python libs if openstack was to use it directly. we do have caes where we just frok a process and execaute a comannd in a shell with or without elevated privladge but we really dont like doing that due to the performacne impacat and security implciations so where we can use python bindign over c apis we do. pyroute2 is the only python lib i know off of the top of my head that support devlink so we would need to enhacne it to support this new devlink api. there may be otherss i have not really looked in the past since we dont need to use devlink at all today. > > > Note that devlink is based on netlink, netlink has been widely used by > > various subsystems other than networking. > > the advantage of netlink I see is that it can monitor device status and > notify upper layer that migration database needs to get updated. > But not sure whether openstack would like to use this capability. > As Sean said, it's heavy for openstack. it's heavy for vendor driver > as well :) > > And devlink monitor now listens the notification and dumps the state > changes. If we want to use it, need to let it forward the notification > and dumped info to openstack, right? i dont think we would use direct devlink monitoring in nova even if it was avaiable. we could but we already poll libvirt and the system for other resouce periodicly. we likely wouldl just add monitoriv via devlink to that periodic task. we certenly would not use it to detect a migration or a need to update a migration database(not sure what that is) in reality if we can consume this info indirectly via a libvirt api that will be the appcoh we will take at least for the libvirt driver in nova. for cyborg they may take a different appoch. we already use pyroute2 in 2 projects, os-vif and neutron and it does have devlink support so the burden of using devlink is not that high for openstack but its a less frineadly interface for configuration tools like ansiable vs a filesystem based approch. > > Thanks > Yan > From satish.txt at gmail.com Fri Aug 14 14:59:28 2020 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 14 Aug 2020 10:59:28 -0400 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> Message-ID: Fabian, what do you mean? >> I think vexxhost is running (1) with their openstack-operator - for reasons. 
On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann wrote: > > Hello again, > > just a short update about the results of my tests. > > I currently see 2 ways of running openstack+rabbitmq > > 1. without durable-queues and without replication - just one rabbitmq-process which gets (somehow) restarted if it fails. > 2. durable-queues and replication > > Any other combination of these settings leads to more or less issues with > > * broken / non working bindings > * broken queues > > I think vexxhost is running (1) with their openstack-operator - for reasons. > > I added [kolla], because kolla-ansible is installing rabbitmq with replication but without durable-queues. > > May someone point me to the best way to document these findings to some official doc? > I think a lot of installations out there will run into issues if - under load - a node fails. > > Fabian > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann : >> >> Hi, >> >> just did some short tests today in our test-environment (without durable queues and without replication): >> >> * started a rally task to generate some load >> * kill-9-ed rabbitmq on one node >> * rally task immediately stopped and the cloud (mostly) stopped working >> >> after some debugging i found (again) exchanges which had bindings to queues, but these bindings didnt forward any msgs. >> Wrote a small script to detect these broken bindings and will now check if this is "reproducible" >> >> then I will try "durable queues" and "durable queues with replication" to see if this helps. Even if I would expect >> rabbitmq should be able to handle this without these "hidden broken bindings" >> >> This just FYI. >> >> Fabian From samuel.mutel at gmail.com Fri Aug 14 15:18:31 2020 From: samuel.mutel at gmail.com (Samuel Mutel) Date: Fri, 14 Aug 2020 17:18:31 +0200 Subject: [Telemetry] Error when sending to prometheus pushgateway In-Reply-To: References: <731c90df-8830-1804-10a8-a9a97a3e2f55@matthias-runge.de> Message-ID: Hello, I didn't find the issue. Somebody could help me ? Thanks. Le mer. 8 juil. 2020 à 17:55, Samuel Mutel a écrit : > Hello, > > Thanks for your help. I tried to test the pushgateway manually and it > seems to work fine. The pushgateway wrote some things on the stdout. > But when I start the ceilometer, nothing happens. I tried to change the IP > to use 127.0.0.1 but nothing. 
> > Here is my ceilometer.conf: > >> [DEFAULT] >> auth_strategy = keystone >> debug = False >> event_dispatchers = gnocchi >> meter_dispatchers = gnocchi >> transport_url = rabbit://openstack:xxxxxx at xx.xx.x.xx >> ,openstack:xxxxxxx at xx.xx.x.xx,openstack:xxxxxxxxx at xx.xx.x.xx/ >> >> [cache] >> backend = dogpile.cache.memcached >> enabled = True >> memcache_servers = xx.xx.x.xx:11211,xx.xx.x.xx:11211,xx.xx.x.xx:11211 >> >> [keystone_authtoken] >> auth_type = password >> auth_uri = https://xxxxxxxxxxxx:5000/v3 >> auth_url = https://xxxxxxxxxxxx:5000 >> memcached_servers = xx.xx.x.xx:11211,xx.xx.x.xx:11211,xx.xx.x.xx:11211 >> password = xxxxxx >> project_domain_id = default >> project_name = service >> region_name = RegionOne >> user_domain_id = default >> username = ceilometer >> www_authenticate_uri = https://xxxxxxxxxxxx:5000 >> >> [notification] >> pipelines = meter >> >> [oslo_messaging_notifications] >> driver = messagingv2 >> >> [oslo_middleware] >> enable_proxy_headers_parsing = True >> >> [publisher] >> telemetry_secret = xxxxxxxxx >> >> [service_credentials] >> auth_type = password >> auth_url =https://xxxxxxxxxxxx:5000 >> password = xxxxxxxxx >> project_domain_id = default >> project_name = service >> region_name = RegionOne >> user_domain_id = default >> username = ceilometer >> > > Here is my event_pipeline.yaml: > >> sources: >> - name: meter_file >> events: >> - "*" >> sinks: >> - prometheus >> >> sinks: >> - name: prometheus >> publishers: >> - prometheus://127.0.0.1:9091/metrics/job/ceilometer >> > > Here is my pipeline.yaml: > >> sources: >> - name: meter_file >> interval: 30 >> meters: >> - "*" >> sinks: >> - prometheus >> >> sinks: >> - name: prometheus >> publishers: >> - prometheus://127.0.0.1:9091/metrics/job/ceilometer >> > > Here is my polling.yaml: > >> --- >> sources: >> - name: some_pollsters >> interval: 300 >> meters: >> - cpu >> - cpu_l3_cache >> - memory.usage >> - network.incoming.bytes >> - network.incoming.packets >> - network.outgoing.bytes >> - network.outgoing.packets >> - disk.device.read.bytes >> - disk.device.read.requests >> - disk.device.write.bytes >> - disk.device.write.requests >> - hardware.cpu.util >> - hardware.memory.used >> - hardware.memory.total >> - hardware.memory.buffer >> - hardware.memory.cached >> - hardware.memory.swap.avail >> - hardware.memory.swap.total >> - hardware.system_stats.io.outgoing.blocks >> - hardware.system_stats.io.incoming.blocks >> - hardware.network.ip.incoming.datagrams >> - hardware.network.ip.outgoing.datagrams >> > > Here is my ceilometer-rootwrap: > >> # Configuration for ceilometer-rootwrap >> # This file should be owned by (and only-writeable by) the root user >> >> [DEFAULT] >> # List of directories to load filter definitions from (separated by ','). >> # These directories MUST all be only writeable by root ! >> filters_path=/etc/ceilometer/rootwrap.d,/usr/share/ceilometer/rootwrap >> >> # List of directories to search executables in, in case filters do not >> # explicitely specify a full path (separated by ',') >> # If not specified, defaults to system PATH environment variable. >> # These directories MUST all be only writeable by root ! >> exec_dirs=/sbin,/usr/sbin,/bin,/usr/bin,/usr/local/sbin,/usr/local/bin >> >> # Enable logging to syslog >> # Default value is False >> use_syslog=False >> >> # Which syslog facility to use. >> # Valid values include auth, authpriv, syslog, user0, user1... >> # Default value is 'syslog' >> syslog_log_facility=syslog >> >> # Which messages to log. 
>> # INFO means log all usage >> # ERROR means only log unsuccessful attempts >> syslog_log_level=ERROR >> > > What configuration is wrong ? > > Le ven. 3 juil. 2020 à 13:53, Matthias Runge a > écrit : > >> Okay, that doesn't really help with debugging though. >> >> Method not allowed is returned eg. when the endpoint expected an http >> push where your browser did an http get (that's correct). >> >> What I'd do next is to configure ceilometer to send to a different http >> endpoint (like a webserver on your workstation, just for debugging >> purposes). >> >> Verify that the push gateway works as expected, >> https://github.com/prometheus/pushgateway >> has some curl commands mentioned for debugging purposes. >> >> >> Matthias >> >> On 03/07/2020 13:07, Samuel Mutel wrote: >> > If I go to http://10.60.4.11:9091/metrics/job/ceilometer with the web >> > browser I receive: Method Not Allowed but i think it's normal. >> > http://10.60.4.11:9091/metrics is working with metrics. >> > >> > The pushgateway and the ceilometer is working on the same host for my >> > test so no network/firewall issue. >> > >> > Logs of the pushgateway is only these ones: >> > level=info ts=2020-07-03T11:04:35.907Z caller=main.go:83 msg="starting >> > pushgateway" version="(version=1.2.0, branch=HEAD, >> > revision=b7e0167e9574f4f88404dde9653ee1d3c940f2eb)" >> > level=info ts=2020-07-03T11:04:35.908Z caller=main.go:84 >> > build_context="(go=go1.13.8, user=root at 0e823ccfff84, >> > date=20200311-18:51:01)" >> > level=info ts=2020-07-03T11:04:35.911Z caller=main.go:137 >> > listen_address=:9091 >> > >> > Le ven. 3 juil. 2020 à 12:14, Matthias Runge > > > a écrit : >> > >> > On 03/07/2020 11:25, Samuel Mutel wrote: >> > > Hello, >> > > >> > > I have two questions about ceilometer (openstack version rocky). >> > > >> > > * First of all, it seems that ceilometer is sending metrics >> > every hour >> > > and I don't understand why. >> > > * Next, I am not able to setup ceilometer to send metrics to >> > > prometheus pushgateway. 
>> > > >> > > Here is my configuration: >> > > >> > > sources: >> > > - name: meter_file >> > > interval: 30 >> > > meters: >> > > - "*" >> > > sinks: >> > > - prometheus >> > > >> > > sinks: >> > > - name: prometheus >> > > publishers: >> > > - >> > prometheus://10.60.4.11:9091/metrics/job/ceilometer >> > >> > > >> > > >> > > >> > > Here is the error I received: >> > > >> > > vcpus{resource_id="7fab268b-ca7c-4692-a103-af4a69f817e4"} 2 >> > > # TYPE memory gauge >> > > memory{resource_id="7fab268b-ca7c-4692-a103-af4a69f817e4"} >> 2048 >> > > # TYPE disk.ephemeral.size gauge >> > > >> > >> disk.ephemeral.size{resource_id="7fab268b-ca7c-4692-a103-af4a69f817e4"} >> > > 0 >> > > # TYPE disk.root.size gauge >> > > >> > disk.root.size{resource_id="7fab268b-ca7c-4692-a103-af4a69f817e4"} >> 0 >> > > : HTTPError: 400 Client Error: Bad Request for url: >> > > http://10.60.4.11:9091/metrics/job/ceilometer >> > > 2020-07-01 17:00:12.272 11375 ERROR ceilometer.publisher.http >> > > Traceback (most recent call last): >> > > 2020-07-01 17:00:12.272 11375 ERROR ceilometer.publisher.http >> > File >> > > >> "/usr/lib/python2.7/dist-packages/ceilometer/publisher/http.py", >> > > line 178, in _do_post >> > > 2020-07-01 17:00:12.272 11375 ERROR ceilometer.publisher.http >> >> > > res.raise_for_status() >> > > 2020-07-01 17:00:12.272 11375 ERROR ceilometer.publisher.http >> > File >> > > "/usr/lib/python2.7/dist-packages/requests/models.py", line >> > 935, in >> > > raise_for_status >> > > 2020-07-01 17:00:12.272 11375 ERROR ceilometer.publisher.http >> >> > > raise HTTPError(http_error_msg, response=self) >> > > 2020-07-01 17:00:12.272 11375 ERROR ceilometer.publisher.http >> > > HTTPError: 400 Client Error: Bad Request for url: >> > > http://10.60.4.11:9091/metrics/job/ceilometer >> > > 2020-07-01 17:00:12.272 11375 ERROR ceilometer.publisher.http >> > > >> > > >> > > Thanks for your help on this topic. >> > >> > >> > Hi, >> > >> > first obvious question: >> > >> > are you sure that there is something listening under >> > http://10.60.4.11:9091/metrics/job/ceilometer ? >> > >> > Would you have some error logs from the other side? It seems that >> > ceilometer is trying to dispatch as expected. >> > >> > Matthias >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Fri Aug 14 15:42:03 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 14 Aug 2020 10:42:03 -0500 Subject: [release] Release countdown for week R-8 Aug 17 - 21 Message-ID: <20200814154203.GA4129932@sm-workstation> General Information ------------------- We are getting close to some of the end of cycle deadlines. Please be aware of the upcoming non-client library freeze on September 3. The following cycle-with-intermediary deliverables only did one release during the ussuri cycle, and have not done any intermediary release yet during this cycle. The cycle-with-rc release model is more suited for deliverables that plan to be released only once per cycle. As a result, we have suggested [1] as a potential release model change for the following deliverables: adjutant-ui adjutant cloudkitty heat-agents magnum-ui monasca-thresh monasca-ui [1] https://review.opendev.org/#/q/topic:victoria-cwi PTLs and release liaisons for each of those deliverables can either +1 the release model change, or propose an intermediary release for that deliverable. In absence of answer by the end of R-8 week we'll abandon the patch. 
We also published a couple of options for a proposed release schedule for the upcoming Wallaby cycle. Please check out the separate thread: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016391.html 26 week schedule: https://review.opendev.org/745911/ 29 week schedule: https://review.opendev.org/744729/ Upcoming Deadlines & Dates -------------------------- Non-client library freeze: September 3 (R-6 week) Client library freeze: September 10 (R-5 week) Victoria-3 milestone: September 10 (R-5 week) Victoria release: October 14 From dev.faz at gmail.com Fri Aug 14 16:45:56 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Fri, 14 Aug 2020 18:45:56 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> Message-ID: Hi, i read somewhere that vexxhosts kubernetes openstack-Operator is running one rabbitmq Container per Service. Just the kubernetes self healing is used as "ha" for rabbitmq. That seems to match with my finding: run rabbitmq standalone and use an external system to restart rabbitmq if required. Fabian Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > Fabian, > > what do you mean? > > >> I think vexxhost is running (1) with their openstack-operator - for > reasons. > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > wrote: > > > > Hello again, > > > > just a short update about the results of my tests. > > > > I currently see 2 ways of running openstack+rabbitmq > > > > 1. without durable-queues and without replication - just one > rabbitmq-process which gets (somehow) restarted if it fails. > > 2. durable-queues and replication > > > > Any other combination of these settings leads to more or less issues with > > > > * broken / non working bindings > > * broken queues > > > > I think vexxhost is running (1) with their openstack-operator - for > reasons. > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > replication but without durable-queues. > > > > May someone point me to the best way to document these findings to some > official doc? > > I think a lot of installations out there will run into issues if - under > load - a node fails. > > > > Fabian > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > dev.faz at gmail.com>: > >> > >> Hi, > >> > >> just did some short tests today in our test-environment (without > durable queues and without replication): > >> > >> * started a rally task to generate some load > >> * kill-9-ed rabbitmq on one node > >> * rally task immediately stopped and the cloud (mostly) stopped working > >> > >> after some debugging i found (again) exchanges which had bindings to > queues, but these bindings didnt forward any msgs. > >> Wrote a small script to detect these broken bindings and will now check > if this is "reproducible" > >> > >> then I will try "durable queues" and "durable queues with replication" > to see if this helps. Even if I would expect > >> rabbitmq should be able to handle this without these "hidden broken > bindings" > >> > >> This just FYI. > >> > >> Fabian > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Fri Aug 14 19:09:22 2020 From: smooney at redhat.com (Sean Mooney) Date: Fri, 14 Aug 2020 20:09:22 +0100 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> Message-ID: <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > Hi, > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > one rabbitmq Container per Service. Just the kubernetes self healing is > used as "ha" for rabbitmq. > > That seems to match with my finding: run rabbitmq standalone and use an > external system to restart rabbitmq if required. thats the design that was orginally planned for kolla-kubernetes orrignally each service was to be deployed with its own rabbit mq server if it required one and if it crashed it woudl just be recreated by k8s. it perfromace better then a cluster and if you trust k8s or the external service enough to ensure it is recteated it should be as effective a solution. you dont even need k8s to do that but it seams to be a good fit if your prepared to ocationally loose inflight rpcs. if you not then you can configure rabbit to persite all message to disk and mont that on a shared file system like nfs or cephfs so that when the rabbit instance is recreated the queue contency is perserved. assuming you can take the perfromance hit of writing all messages to disk that is. > > Fabian > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > Fabian, > > > > what do you mean? > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > reasons. > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > wrote: > > > > > > Hello again, > > > > > > just a short update about the results of my tests. > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > 1. without durable-queues and without replication - just one > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > 2. durable-queues and replication > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > * broken / non working bindings > > > * broken queues > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > reasons. > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > replication but without durable-queues. > > > > > > May someone point me to the best way to document these findings to some > > > > official doc? > > > I think a lot of installations out there will run into issues if - under > > > > load - a node fails. > > > > > > Fabian > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > dev.faz at gmail.com>: > > > > > > > > Hi, > > > > > > > > just did some short tests today in our test-environment (without > > > > durable queues and without replication): > > > > > > > > * started a rally task to generate some load > > > > * kill-9-ed rabbitmq on one node > > > > * rally task immediately stopped and the cloud (mostly) stopped working > > > > > > > > after some debugging i found (again) exchanges which had bindings to > > > > queues, but these bindings didnt forward any msgs. 
> > > > Wrote a small script to detect these broken bindings and will now check > > > > if this is "reproducible" > > > > > > > > then I will try "durable queues" and "durable queues with replication" > > > > to see if this helps. Even if I would expect > > > > rabbitmq should be able to handle this without these "hidden broken > > > > bindings" > > > > > > > > This just FYI. > > > > > > > > Fabian From pramchan at yahoo.com Sat Aug 15 02:04:30 2020 From: pramchan at yahoo.com (prakash RAMCHANDRAN) Date: Sat, 15 Aug 2020 02:04:30 +0000 (UTC) Subject: [Interop-WG] Inviting cross-projects discussions for new re-branding efforts (Oct 26) References: <2035126983.2350763.1597457070531.ref@mail.yahoo.com> Message-ID: <2035126983.2350763.1597457070531@mail.yahoo.com> Hi all, I have booked for two hours slot to enable re-branding efforts we are looking to unleash in early 2021.As part of "Open Infrastructure Summit" we kick-start with Inter-op for next decade. Monday October 26 13UTC - 15UTC InteropWG We would like to encourage Open Infrastructure Projects to enlighten the stage with Out-of-Box  thinking & requests for Interop in Marketplace in OSF. - Integrated Projects in OpenStack have well served thru last decades dream team,  that has stood the Tempest tests for RefStackV1 being base for OPNFV-CNTT / ONAP/ and CVP/OVP1 of LFN - Its the turn to the Open Infra Projects like Kata, Airship, Zuul, StarlingX and potential https://openinfralabs.org/ to innovate and suggest the world    How OSF can leverage next-gen  Infra with k8s cluster as baseline for Milt-cluster , Hybrid Cloud, Muti-cloud RefStackV2 for upstream usage for Telco and Edge Clouds We need all Graduated and Incumbent Projects to propose how we can Re-Brand them for Open Infra Containerized workloads. Do you want to use Magnum, Zun, Kolla & Kuryer - refer https://etherpad.opendev.org/p/interop Should we collaborate with LFN re-imagining efforts via our RefStack2  plans as base for Open Infra Summit efforts to give Industry a wake up call to collaborate? Please reply with comments below, where are the global innovators hiding behind Alps &  Himalaya, come and swing your ping pongs balls or Cricket Bats. The Rocky mountains curve balls will always haunt you if you don't speak-up./*================================================================================================================Your comments +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*/ Lets join the collaboration for unfinished transition to Containers world with Interoperability. Our committee members are all aligned to back you on our journey and ensure we bring the ideas that matter and execution that lits fire.https://www.openstack.org/summit/2020/vote-for-presentations#/24735 ThanksPrakash RamchandranFor Interop WG / OSF -------------- next part -------------- An HTML attachment was scrubbed... URL: From reza.b2008 at gmail.com Sat Aug 15 13:08:42 2020 From: reza.b2008 at gmail.com (Reza Bakhshayeshi) Date: Sat, 15 Aug 2020 17:38:42 +0430 Subject: VM doesn't have internet - OpenStack Ussuri with OVN networking Message-ID: Hi all, I've set up OpenStack Ussuri with OVN networking manually, VMs can ping each other through an internal network. I've created a provider network with valid IP subnet, and my problem is VMs don't have internet access before and after assigning floating IP. 
I've encountered the same problem on TripleO (with dvr), and I just wanted to investigate the problem by manual installation (without HA and DVR), but the same happened.

Everything seems working properly, I can't see any error in logs, here is agent list output:

[root at controller ~]# openstack network agent list
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+-------------------------------+
| ID | Agent Type | Host | Availability Zone | Alive | State | Binary |
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+-------------------------------+
| 1ade76ae-6caf-4942-8df3-e3bc39d2f12d | OVN Controller Gateway agent | controller.localdomain | n/a | :-) | UP | ovn-controller |
| 484f123f-5935-44ce-aee7-4102271d9f11 | OVN Controller agent | compute.localdomain | n/a | :-) | UP | ovn-controller |
| 01235c13-4f32-4c4f-8cf6-e4b8d59a438a | OVN Metadata agent | compute.localdomain | n/a | :-) | UP | networking-ovn-metadata-agent |
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+-------------------------------+

On the controller I got br-ex with a valid IP address. here is the external-ids table on controller and compute node:

[root at controller ~]# ovs-vsctl get Open_vSwitch . external-ids
{hostname=controller.localdomain, ovn-bridge=br-int, ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="10.0.0.11", ovn-encap-type=geneve, ovn-remote="tcp:10.0.0.11:6642", rundir="/var/run/openvswitch", system-id="1ade76ae-6caf-4942-8df3-e3bc39d2f12d"}

[root at compute ~]# ovs-vsctl get Open_vSwitch . external-ids
{hostname=compute.localdomain, ovn-bridge=br-int, ovn-encap-ip="10.0.0.31", ovn-encap-type=geneve, ovn-remote="tcp:10.0.0.11:6642", rundir="/var/run/openvswitch", system-id="484f123f-5935-44ce-aee7-4102271d9f11"}

and I have:

[root at controller ~]# ovn-nbctl show
switch 72fd5c08-6852-4d7e-b9b4-7e0a1ccdd976 (neutron-b8c66c3d-f47a-42a5-bd2d-c40c435c0376) (aka net01)
    port cf99f43b-0a18-4b91-9ca5-b6ed3f86d994
        type: localport
        addresses: ["fa:16:3e:d0:df:82 192.168.0.100"]
    port 4268f511-bee3-4da0-8835-b9a8664101c4
        addresses: ["fa:16:3e:35:f2:02 192.168.0.135"]
    port 846919e8-cde5-4ba3-b003-0c06e73676ed
        type: router
        router-port: lrp-846919e8-cde5-4ba3-b003-0c06e73676ed
switch bb22224e-e1d1-4bb2-b57e-1058e9fc33a7 (neutron-9614546f-b216-4554-9bfe-e8d6bb11d927) (aka provider)
    port 2f05c7bc-ad0f-4a41-bbd8-5fef1f5bfd2c
        type: localport
        addresses: ["fa:16:3e:17:7b:5b X.X.X.X"]
    port provnet-9614546f-b216-4554-9bfe-e8d6bb11d927
        type: localnet
        addresses: ["unknown"]
    port 23fcdc9d-2d11-40c9-881e-c78e871a3314
        type: router
        router-port: lrp-23fcdc9d-2d11-40c9-881e-c78e871a3314
router 0bd35585-b0a3-4c8f-b71b-cb87c9fad060 (neutron-8cdcd0d2-752c-4130-87bb-d2b7af803ec9) (aka router01)
    port lrp-846919e8-cde5-4ba3-b003-0c06e73676ed
        mac: "fa:16:3e:4d:c3:f9"
        networks: ["192.168.0.1/24"]
    port lrp-23fcdc9d-2d11-40c9-881e-c78e871a3314
        mac: "fa:16:3e:94:89:8e"
        networks: ["X.X.X.X/22"]
    gateway chassis: [1ade76ae-6caf-4942-8df3-e3bc39d2f12d 484f123f-5935-44ce-aee7-4102271d9f11]
    nat 8ef6167a-bc28-4caf-8af5-d0bf12a62545
        external ip: " X.X.X.X "
        logical ip: "192.168.0.135"
        type: "dnat_and_snat"
    nat ba32ab93-3d2b-4199-b634-802f0f438338
        external ip: " X.X.X.X "
        logical ip: "192.168.0.0/24"
        type: "snat"

I replaced valid IPs with X.X.X.X

Any suggestion would be grateful.
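For completeness, this is how the provider bridge mapping on the gateway chassis can be checked and, if missing, set. The physnet name "provider" and the bridge "br-ex" below are placeholders, not values from this deployment, and must match the physical_network of the Neutron provider network:

  # show the mapping currently known to ovn-controller (may be empty)
  ovs-vsctl get Open_vSwitch . external-ids:ovn-bridge-mappings

  # typical way to map the Neutron physical_network to the external bridge
  # on the chassis that has ovn-cms-options=enable-chassis-as-gw
  ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=provider:br-ex

  # the chassis configuration as seen by the OVN southbound DB
  ovn-sbctl list chassis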
Regards, Reza -------------- next part -------------- An HTML attachment was scrubbed... URL: From midhunlaln66 at gmail.com Sat Aug 15 13:28:41 2020 From: midhunlaln66 at gmail.com (Midhunlal Nb) Date: Sat, 15 Aug 2020 18:58:41 +0530 Subject: Trouble to launch a instance in open stack Message-ID: Hi all, --> I created an openstack set up in my networking lab. ---> Vmware installed in one of the blade server then ubuntu 18.04 installed as OS. ---> openstack 5.3.1 version successfully installed in this os ---> In my lab we are using 192.168.x.x/16 network ---> In openstack I created an external network with 192.168.x.x/16 ----> In openstack I created an internal network with 172.16.x.x/16(for testing) -----> then I created 1 external network(provider network),1 router,1 private cloud . ------> In internal network I created 2 instance (172.16.0.2&172.16.0.3)this two instance pinging each other and i am able assign floating ip(192.168.x.x) to this instance ----> Now my problem is I created a original instance with my network(192.168.x.x)in provider network(that instance directly attached to external (provider network)) --->This instance launched successfully and interface ip also our internal ip and dns also showing correct but i am not able to ping our any one of the lab network ip,internet also not available. please help me on this Thanks & Regards Midhunlal N B +918921245637 From noonedeadpunk at ya.ru Sat Aug 15 14:47:06 2020 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Sat, 15 Aug 2020 17:47:06 +0300 Subject: [openstack-ansible] New core on board! Message-ID: <60371597502548@mail.yandex.ru> Hey everyone! Today we have several reasons to celebrate! First of all, we have released OSA Ussuri with tag 21.0.0 (better late than never :p). And even more exciting announcement, that Andrew Bonney is our new OpenStack-Ansible Core reviewer! Even though usual proposal process has been skipped this time, I think everyone will agree that Andrew deserved it and it's high time we congratulated him with becoming part of our team. Welcome on board, Andrew! -- Kind Regards, Dmitriy Rabotyagov From gsteinmuller at vexxhost.com Sat Aug 15 17:16:49 2020 From: gsteinmuller at vexxhost.com (=?UTF-8?Q?Guilherme_Steinm=C3=BCller?=) Date: Sat, 15 Aug 2020 14:16:49 -0300 Subject: [openstack-ansible] New core on board! In-Reply-To: <60371597502548@mail.yandex.ru> References: <60371597502548@mail.yandex.ru> Message-ID: +1 Welcome, Andrew! Regards, Guilherme On Sat, Aug 15, 2020 at 11:52 AM Dmitriy Rabotyagov wrote: > Hey everyone! > > Today we have several reasons to celebrate! > > First of all, we have released OSA Ussuri with tag 21.0.0 (better late > than never :p). > > And even more exciting announcement, that Andrew Bonney is our new > OpenStack-Ansible Core reviewer! Even though usual proposal process has > been skipped this time, I think everyone will agree that Andrew deserved it > and it's high time we congratulated him with becoming part of our team. > > Welcome on board, Andrew! > > -- > Kind Regards, > Dmitriy Rabotyagov > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Sat Aug 15 17:36:35 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Sat, 15 Aug 2020 19:36:35 +0200 Subject: [neutron] How to specify overlay network interface when using OVN and Geneve? In-Reply-To: References: Message-ID: <20200815173635.3z66wzg475d4kzm2@skaplons-mac> Hi, You can do that by configuring bridge_mappings on compute node(s). 
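A minimal sketch of what that looks like on a compute node, with "physnet1", "br-provider" and "eth1" as placeholder names rather than values from any particular deployment:

  # tell ovn-controller which OVS bridge backs the physical network "physnet1"
  ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=physnet1:br-provider

  # the bridge just needs the physical interface plugged into it
  ovs-vsctl --may-exist add-br br-provider
  ovs-vsctl --may-exist add-port br-provider eth1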
It is described in the doc [1]. On Fri, Aug 14, 2020 at 09:13:23PM +0700, Popoi Zen wrote: > Hi, I have used my google fu but I cant find any reference. Just want to > know how to specify overlay network when Im using geneve as my overlay > protocol? [1] https://docs.openstack.org/neutron/latest/admin/ovn/refarch/provider-networks.html -- Slawek Kaplonski Principal software engineer Red Hat From smooney at redhat.com Sat Aug 15 18:02:33 2020 From: smooney at redhat.com (Sean Mooney) Date: Sat, 15 Aug 2020 19:02:33 +0100 Subject: [neutron] How to specify overlay network interface when using OVN and Geneve? In-Reply-To: <20200815173635.3z66wzg475d4kzm2@skaplons-mac> References: <20200815173635.3z66wzg475d4kzm2@skaplons-mac> Message-ID: <79cf1f2a19cc242d0030e7ba3c39311aa176e6bf.camel@redhat.com> On Sat, 2020-08-15 at 19:36 +0200, Slawek Kaplonski wrote: > Hi, > > You can do that by configuring bridge_mappings on compute node(s). > It is described in the doc [1]. when they said overlay network i think they meant the geneve tunnels in which casue you contole the interface that is used by adjusting your routing table to use the interface you desire. that can involve movein ipt to bridges or interface to correctly set up the routes depending on your configurtion. but ya if you were refering to provider networks the link slawek porovide is proably what you want. > > On Fri, Aug 14, 2020 at 09:13:23PM +0700, Popoi Zen wrote: > > Hi, I have used my google fu but I cant find any reference. Just want to > > know how to specify overlay network when Im using geneve as my overlay > > protocol? > > [1] https://docs.openstack.org/neutron/latest/admin/ovn/refarch/provider-networks.html > From satish.txt at gmail.com Sat Aug 15 18:10:54 2020 From: satish.txt at gmail.com (Satish Patel) Date: Sat, 15 Aug 2020 14:10:54 -0400 Subject: [openstack-ansible] New core on board! In-Reply-To: References: Message-ID: Congrats Andrew Sent from my iPhone > On Aug 15, 2020, at 1:25 PM, Guilherme Steinmüller wrote: > >  > +1 > > Welcome, Andrew! > > Regards, > Guilherme > >> On Sat, Aug 15, 2020 at 11:52 AM Dmitriy Rabotyagov wrote: >> Hey everyone! >> >> Today we have several reasons to celebrate! >> >> First of all, we have released OSA Ussuri with tag 21.0.0 (better late than never :p). >> >> And even more exciting announcement, that Andrew Bonney is our new OpenStack-Ansible Core reviewer! Even though usual proposal process has been skipped this time, I think everyone will agree that Andrew deserved it and it's high time we congratulated him with becoming part of our team. >> >> Welcome on board, Andrew! >> >> -- >> Kind Regards, >> Dmitriy Rabotyagov >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Sat Aug 15 20:22:42 2020 From: zigo at debian.org (Thomas Goirand) Date: Sat, 15 Aug 2020 22:22:42 +0200 Subject: [Interop-WG] Inviting cross-projects discussions for new re-branding efforts (Oct 26) In-Reply-To: <2035126983.2350763.1597457070531@mail.yahoo.com> References: <2035126983.2350763.1597457070531.ref@mail.yahoo.com> <2035126983.2350763.1597457070531@mail.yahoo.com> Message-ID: On 8/15/20 4:04 AM, prakash RAMCHANDRAN wrote: > Hi all, > > I have booked for two hours slot to enable re-branding efforts we are > looking to unleash in early 2021. > As part of "Open Infrastructure Summit" we kick-start with Inter-op for > next decade. 
> > Monday October 2613UTC - 15UTCInteropWG > > We would like to encourage Open Infrastructure Projects to enlighten the > stage with Out-of-Box  thinking & requests for Interop in Marketplace in > OSF. > > - Integrated Projects in OpenStack have well served thru last decades > dream team,  that has stood the Tempest tests for RefStackV1 being base > for OPNFV-CNTT / ONAP/ and CVP/OVP1 of LFN > > - Its the turn to the Open Infra Projects like Kata, Airship, Zuul, > StarlingX and potential https://openinfralabs.org/ to innovate and > suggest the world >     How OSF can leverage next-gen  Infra with k8s cluster as baseline > for Milt-cluster , Hybrid Cloud, Muti-cloud RefStackV2 for upstream > usage for Telco and Edge Clouds > > We need all Graduated and Incumbent Projects to propose how we can > Re-Brand them for Open Infra Containerized workloads. > > Do you want to use Magnum, Zun, Kolla & Kuryer - > refer https://etherpad.opendev.org/p/interop > > Should we collaborate with LFN re-imagining efforts via our RefStack2  > plans as base for Open Infra Summit efforts to give Industry a wake up > call to collaborate? > > Please reply with comments below, where are the global innovators hiding > behind Alps &  Himalaya, come and swing your ping pongs balls or Cricket > Bats. The Rocky mountains curve balls will always haunt you if you don't > speak-up. I'm sorry if this is an abrupt response to your enthusiastic email, but I'd very much prefer if we made efforts to fill the gap with missing features, and getting understaffed projects on good rails, rather than pushing for more buzz words. I have in mind: - a networking stack that really scales, with IPv6 not as second citizen (ie: that must use centralized network nodes) - stuff like server recue working fully, even with boot from volume - finish the encrypted volume thingy (it's a joke: live-migration with them don't work because of rights issues on Barbican...) - finish the project specific client to openstack client migration (it's taking years....) Also, re-staffing projects like horizon, cloudkitty, telemetry, you-name-it... seems like another challenge of the next decade. Cheers, Thomas Goirand (zigo) From satish.txt at gmail.com Sun Aug 16 00:13:44 2020 From: satish.txt at gmail.com (Satish Patel) Date: Sat, 15 Aug 2020 20:13:44 -0400 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> Message-ID: Hi Sean, Sounds good, but running rabbitmq for each service going to be little overhead also, how do you scale cluster (Yes we can use cellv2 but its not something everyone like to do because of complexity). If we thinks rabbitMQ is growing pain then why community not looking for alternative option (kafka) etc..? On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney wrote: > > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > > Hi, > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > one rabbitmq Container per Service. Just the kubernetes self healing is > > used as "ha" for rabbitmq. > > > > That seems to match with my finding: run rabbitmq standalone and use an > > external system to restart rabbitmq if required. 
> thats the design that was orginally planned for kolla-kubernetes orrignally > > each service was to be deployed with its own rabbit mq server if it required one > and if it crashed it woudl just be recreated by k8s. it perfromace better then a cluster > and if you trust k8s or the external service enough to ensure it is recteated it > should be as effective a solution. you dont even need k8s to do that but it seams to be > a good fit if your prepared to ocationally loose inflight rpcs. > if you not then you can configure rabbit to persite all message to disk and mont that on a shared > file system like nfs or cephfs so that when the rabbit instance is recreated the queue contency is > perserved. assuming you can take the perfromance hit of writing all messages to disk that is. > > > > Fabian > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > > > Fabian, > > > > > > what do you mean? > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > reasons. > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > wrote: > > > > > > > > Hello again, > > > > > > > > just a short update about the results of my tests. > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > 1. without durable-queues and without replication - just one > > > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > 2. durable-queues and replication > > > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > > > * broken / non working bindings > > > > * broken queues > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > reasons. > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > > > replication but without durable-queues. > > > > > > > > May someone point me to the best way to document these findings to some > > > > > > official doc? > > > > I think a lot of installations out there will run into issues if - under > > > > > > load - a node fails. > > > > > > > > Fabian > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > > > dev.faz at gmail.com>: > > > > > > > > > > Hi, > > > > > > > > > > just did some short tests today in our test-environment (without > > > > > > durable queues and without replication): > > > > > > > > > > * started a rally task to generate some load > > > > > * kill-9-ed rabbitmq on one node > > > > > * rally task immediately stopped and the cloud (mostly) stopped working > > > > > > > > > > after some debugging i found (again) exchanges which had bindings to > > > > > > queues, but these bindings didnt forward any msgs. > > > > > Wrote a small script to detect these broken bindings and will now check > > > > > > if this is "reproducible" > > > > > > > > > > then I will try "durable queues" and "durable queues with replication" > > > > > > to see if this helps. Even if I would expect > > > > > rabbitmq should be able to handle this without these "hidden broken > > > > > > bindings" > > > > > > > > > > This just FYI. > > > > > > > > > > Fabian > From alterriu at gmail.com Sun Aug 16 02:58:57 2020 From: alterriu at gmail.com (Popoi Zen) Date: Sun, 16 Aug 2020 09:58:57 +0700 Subject: [neutron] How to specify overlay network interface when using OVN and Geneve? 
In-Reply-To: <79cf1f2a19cc242d0030e7ba3c39311aa176e6bf.camel@redhat.com> References: <20200815173635.3z66wzg475d4kzm2@skaplons-mac> <79cf1f2a19cc242d0030e7ba3c39311aa176e6bf.camel@redhat.com> Message-ID: Yeah, what I mean is tunnel network between instance when instance communicate using selfservice network, can I specify from which host interface/NIC that traffic goes through? I found this: `ovs-vsctl set open . external-ids:ovn-encap-ip=IP_ADDRESS` is it righ? And btw, what is the best practise when using OVN? Did I need setup bridge for overlay interface and provider interface on my controller too? Since, as my understanding, inbound/outbound will have direct access from compute node by default on OVN. And in this guide [1] bridge only configured on compute nodes. [1] https://docs.openstack.org/neutron/ussuri/install/ovn/manual_install.html On Sun, Aug 16, 2020 at 1:02 AM Sean Mooney wrote: > On Sat, 2020-08-15 at 19:36 +0200, Slawek Kaplonski wrote: > > Hi, > > > > You can do that by configuring bridge_mappings on compute node(s). > > It is described in the doc [1]. > when they said overlay network i think they meant the geneve tunnels in > which > casue you contole the interface that is used by adjusting your routing > table to use the interface you desire. > that can involve movein ipt to bridges or interface to correctly set up > the routes depending on your > configurtion. > > but ya if you were refering to provider networks the link slawek porovide > is proably what you want. > > > > On Fri, Aug 14, 2020 at 09:13:23PM +0700, Popoi Zen wrote: > > > Hi, I have used my google fu but I cant find any reference. Just want > to > > > know how to specify overlay network when Im using geneve as my overlay > > > protocol? > > > > [1] > https://docs.openstack.org/neutron/latest/admin/ovn/refarch/provider-networks.html > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alterriu at gmail.com Sun Aug 16 03:08:08 2020 From: alterriu at gmail.com (Popoi Zen) Date: Sun, 16 Aug 2020 10:08:08 +0700 Subject: [neutron][ovn][sfc] Is it possible to use SFC (Service Function Chaining) on provider network? Message-ID: I have look some guide about SFC, but it seems that SFC only used on private/selfservice network. Is it possible to steer traffic between instance when they use provider network? I always getting error when using provider network. Maybe, can I push flow rule direct on OVN database or something like that? -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Sun Aug 16 05:40:55 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Sun, 16 Aug 2020 07:40:55 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> Message-ID: Hi, Already looked in Oslo.messaging, but rabbitmq is the only stable driver :( Kafka is marked as experimental and (if the docs are correct) is only usable for notifications. Would love to switch to an alternate. Fabian Satish Patel schrieb am So., 16. Aug. 2020, 02:13: > Hi Sean, > > Sounds good, but running rabbitmq for each service going to be little > overhead also, how do you scale cluster (Yes we can use cellv2 but its > not something everyone like to do because of complexity). If we thinks > rabbitMQ is growing pain then why community not looking for > alternative option (kafka) etc..? 
> > On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney wrote: > > > > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > > > Hi, > > > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is > running > > > one rabbitmq Container per Service. Just the kubernetes self healing is > > > used as "ha" for rabbitmq. > > > > > > That seems to match with my finding: run rabbitmq standalone and use an > > > external system to restart rabbitmq if required. > > thats the design that was orginally planned for kolla-kubernetes > orrignally > > > > each service was to be deployed with its own rabbit mq server if it > required one > > and if it crashed it woudl just be recreated by k8s. it perfromace > better then a cluster > > and if you trust k8s or the external service enough to ensure it is > recteated it > > should be as effective a solution. you dont even need k8s to do that but > it seams to be > > a good fit if your prepared to ocationally loose inflight rpcs. > > if you not then you can configure rabbit to persite all message to disk > and mont that on a shared > > file system like nfs or cephfs so that when the rabbit instance is > recreated the queue contency is > > perserved. assuming you can take the perfromance hit of writing all > messages to disk that is. > > > > > > Fabian > > > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, > 16:59: > > > > > > > Fabian, > > > > > > > > what do you mean? > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - > for > > > > > > > > reasons. > > > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > > > wrote: > > > > > > > > > > Hello again, > > > > > > > > > > just a short update about the results of my tests. > > > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > > > 1. without durable-queues and without replication - just one > > > > > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > > 2. durable-queues and replication > > > > > > > > > > Any other combination of these settings leads to more or less > issues with > > > > > > > > > > * broken / non working bindings > > > > > * broken queues > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > reasons. > > > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > > > > > replication but without durable-queues. > > > > > > > > > > May someone point me to the best way to document these findings to > some > > > > > > > > official doc? > > > > > I think a lot of installations out there will run into issues if - > under > > > > > > > > load - a node fails. > > > > > > > > > > Fabian > > > > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > > > > > dev.faz at gmail.com>: > > > > > > > > > > > > Hi, > > > > > > > > > > > > just did some short tests today in our test-environment (without > > > > > > > > durable queues and without replication): > > > > > > > > > > > > * started a rally task to generate some load > > > > > > * kill-9-ed rabbitmq on one node > > > > > > * rally task immediately stopped and the cloud (mostly) stopped > working > > > > > > > > > > > > after some debugging i found (again) exchanges which had > bindings to > > > > > > > > queues, but these bindings didnt forward any msgs. 
> > > > > > Wrote a small script to detect these broken bindings and will > now check > > > > > > > > if this is "reproducible" > > > > > > > > > > > > then I will try "durable queues" and "durable queues with > replication" > > > > > > > > to see if this helps. Even if I would expect > > > > > > rabbitmq should be able to handle this without these "hidden > broken > > > > > > > > bindings" > > > > > > > > > > > > This just FYI. > > > > > > > > > > > > Fabian > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.com Sun Aug 16 08:48:13 2020 From: tobias.urdin at binero.com (Tobias Urdin) Date: Sun, 16 Aug 2020 08:48:13 +0000 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> , Message-ID: <303EB7E0-C584-42A7-BF7A-D1EAABDD1AD7@binero.com> Hello, Kind of off topic but I’ve been starting doing some research to see if a KubeMQ driver could be added to oslo.messaging Best regards On 16 Aug 2020, at 07:44, Fabian Zimmermann wrote:  Hi, Already looked in Oslo.messaging, but rabbitmq is the only stable driver :( Kafka is marked as experimental and (if the docs are correct) is only usable for notifications. Would love to switch to an alternate. Fabian Satish Patel > schrieb am So., 16. Aug. 2020, 02:13: Hi Sean, Sounds good, but running rabbitmq for each service going to be little overhead also, how do you scale cluster (Yes we can use cellv2 but its not something everyone like to do because of complexity). If we thinks rabbitMQ is growing pain then why community not looking for alternative option (kafka) etc..? On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney > wrote: > > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > > Hi, > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > one rabbitmq Container per Service. Just the kubernetes self healing is > > used as "ha" for rabbitmq. > > > > That seems to match with my finding: run rabbitmq standalone and use an > > external system to restart rabbitmq if required. > thats the design that was orginally planned for kolla-kubernetes orrignally > > each service was to be deployed with its own rabbit mq server if it required one > and if it crashed it woudl just be recreated by k8s. it perfromace better then a cluster > and if you trust k8s or the external service enough to ensure it is recteated it > should be as effective a solution. you dont even need k8s to do that but it seams to be > a good fit if your prepared to ocationally loose inflight rpcs. > if you not then you can configure rabbit to persite all message to disk and mont that on a shared > file system like nfs or cephfs so that when the rabbit instance is recreated the queue contency is > perserved. assuming you can take the perfromance hit of writing all messages to disk that is. > > > > Fabian > > > > Satish Patel > schrieb am Fr., 14. Aug. 2020, 16:59: > > > > > Fabian, > > > > > > what do you mean? > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > reasons. > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > > wrote: > > > > > > > > Hello again, > > > > > > > > just a short update about the results of my tests. > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > 1. 
without durable-queues and without replication - just one > > > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > 2. durable-queues and replication > > > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > > > * broken / non working bindings > > > > * broken queues > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > reasons. > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > > > replication but without durable-queues. > > > > > > > > May someone point me to the best way to document these findings to some > > > > > > official doc? > > > > I think a lot of installations out there will run into issues if - under > > > > > > load - a node fails. > > > > > > > > Fabian > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > > > dev.faz at gmail.com>: > > > > > > > > > > Hi, > > > > > > > > > > just did some short tests today in our test-environment (without > > > > > > durable queues and without replication): > > > > > > > > > > * started a rally task to generate some load > > > > > * kill-9-ed rabbitmq on one node > > > > > * rally task immediately stopped and the cloud (mostly) stopped working > > > > > > > > > > after some debugging i found (again) exchanges which had bindings to > > > > > > queues, but these bindings didnt forward any msgs. > > > > > Wrote a small script to detect these broken bindings and will now check > > > > > > if this is "reproducible" > > > > > > > > > > then I will try "durable queues" and "durable queues with replication" > > > > > > to see if this helps. Even if I would expect > > > > > rabbitmq should be able to handle this without these "hidden broken > > > > > > bindings" > > > > > > > > > > This just FYI. > > > > > > > > > > Fabian > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Sun Aug 16 13:37:18 2020 From: smooney at redhat.com (Sean Mooney) Date: Sun, 16 Aug 2020 14:37:18 +0100 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> Message-ID: On Sat, 2020-08-15 at 20:13 -0400, Satish Patel wrote: > Hi Sean, > > Sounds good, but running rabbitmq for each service going to be little > overhead also, how do you scale cluster (Yes we can use cellv2 but its > not something everyone like to do because of complexity). my understanding is that when using rabbitmq adding multiple rabbitmq servers in a cluster lowers througput vs jsut 1 rabbitmq instance for any given excahnge. that is because the content of the queue need to be syconised across the cluster. so if cinder nova and neutron share a 3 node cluster and your compaure that to the same service deployed with cinder nova and neuton each having there on rabbitmq service then the independent deployment will tend to out perform the clustered solution. im not really sure if that has change i know tha thow clustering has been donw has evovled over the years but in the past clustering was the adversary of scaling. > If we thinks > rabbitMQ is growing pain then why community not looking for > alternative option (kafka) etc..? we have looked at alternivives several times rabbit mq wroks well enough ans scales well enough for most deployments. 
there other amqp implimantation that scale better then rabbit, activemq and qpid are both reported to scale better but they perfrom worse out of the box and need to be carfully tuned in the past zeromq has been supported but peole did not maintain it. kafka i dont think is a good alternative but nats https://nats.io/ might be. for what its worth all nova deployment are cellv2 deployments with 1 cell from around pike/rocky and its really not that complex. cells_v1 was much more complex bug part of the redesign for cells_v2 was makeing sure there is only 1 code path. adding a second cell just need another cell db and conductor to be deployed assuming you startted with a super conductor in the first place. the issue is cells is only a nova feature no other service have cells so it does not help you with cinder or neutron. as such cinder an neutron likely be the services that hit scaling limits first. adopign cells in other services is not nessaryally the right approch either but when we talk about scale we do need to keep in mind that cells is just for nova today. > > On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney wrote: > > > > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > > > Hi, > > > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > > one rabbitmq Container per Service. Just the kubernetes self healing is > > > used as "ha" for rabbitmq. > > > > > > That seems to match with my finding: run rabbitmq standalone and use an > > > external system to restart rabbitmq if required. > > > > thats the design that was orginally planned for kolla-kubernetes orrignally > > > > each service was to be deployed with its own rabbit mq server if it required one > > and if it crashed it woudl just be recreated by k8s. it perfromace better then a cluster > > and if you trust k8s or the external service enough to ensure it is recteated it > > should be as effective a solution. you dont even need k8s to do that but it seams to be > > a good fit if your prepared to ocationally loose inflight rpcs. > > if you not then you can configure rabbit to persite all message to disk and mont that on a shared > > file system like nfs or cephfs so that when the rabbit instance is recreated the queue contency is > > perserved. assuming you can take the perfromance hit of writing all messages to disk that is. > > > > > > Fabian > > > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > > > > > Fabian, > > > > > > > > what do you mean? > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > reasons. > > > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > > wrote: > > > > > > > > > > Hello again, > > > > > > > > > > just a short update about the results of my tests. > > > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > > > 1. without durable-queues and without replication - just one > > > > > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > > 2. durable-queues and replication > > > > > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > > > > > * broken / non working bindings > > > > > * broken queues > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > reasons. > > > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > > > > > replication but without durable-queues. 
> > > > > > > > > > May someone point me to the best way to document these findings to some > > > > > > > > official doc? > > > > > I think a lot of installations out there will run into issues if - under > > > > > > > > load - a node fails. > > > > > > > > > > Fabian > > > > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > > > > > dev.faz at gmail.com>: > > > > > > > > > > > > Hi, > > > > > > > > > > > > just did some short tests today in our test-environment (without > > > > > > > > durable queues and without replication): > > > > > > > > > > > > * started a rally task to generate some load > > > > > > * kill-9-ed rabbitmq on one node > > > > > > * rally task immediately stopped and the cloud (mostly) stopped working > > > > > > > > > > > > after some debugging i found (again) exchanges which had bindings to > > > > > > > > queues, but these bindings didnt forward any msgs. > > > > > > Wrote a small script to detect these broken bindings and will now check > > > > > > > > if this is "reproducible" > > > > > > > > > > > > then I will try "durable queues" and "durable queues with replication" > > > > > > > > to see if this helps. Even if I would expect > > > > > > rabbitmq should be able to handle this without these "hidden broken > > > > > > > > bindings" > > > > > > > > > > > > This just FYI. > > > > > > > > > > > > Fabian > > From tonyliu0592 at hotmail.com Sun Aug 16 18:41:20 2020 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Sun, 16 Aug 2020 18:41:20 +0000 Subject: [neutron] How to specify overlay network interface when using OVN and Geneve? In-Reply-To: References: <20200815173635.3z66wzg475d4kzm2@skaplons-mac> <79cf1f2a19cc242d0030e7ba3c39311aa176e6bf.camel@redhat.com> Message-ID: I am using Kolla Ansible that deploys OVN well for me. You can set tunnel_interface to specify the tunnel interface. Tony > -----Original Message----- > From: Popoi Zen > Sent: Saturday, August 15, 2020 7:59 PM > To: Sean Mooney > Cc: Slawek Kaplonski ; openstack- > discuss at lists.openstack.org > Subject: Re: [neutron] How to specify overlay network interface when > using OVN and Geneve? > > Yeah, what I mean is tunnel network between instance when instance > communicate using selfservice network, can I specify from which host > interface/NIC that traffic goes through? I found this: `ovs-vsctl set > open . external-ids:ovn-encap-ip=IP_ADDRESS` is it righ? > > And btw, what is the best practise when using OVN? Did I need setup > bridge for overlay interface and provider interface on my controller too? > Since, as my understanding, inbound/outbound will have direct access > from compute node by default on OVN. And in this guide [1] bridge only > configured on compute nodes. > > [1] > https://docs.openstack.org/neutron/ussuri/install/ovn/manual_install.htm > l > > On Sun, Aug 16, 2020 at 1:02 AM Sean Mooney > wrote: > > > On Sat, 2020-08-15 at 19:36 +0200, Slawek Kaplonski wrote: > > Hi, > > > > You can do that by configuring bridge_mappings on compute node(s). > > It is described in the doc [1]. > when they said overlay network i think they meant the geneve > tunnels in which > casue you contole the interface that is used by adjusting your > routing table to use the interface you desire. > that can involve movein ipt to bridges or interface to correctly > set up the routes depending on your > configurtion. > > but ya if you were refering to provider networks the link slawek > porovide is proably what you want. 
> > > > On Fri, Aug 14, 2020 at 09:13:23PM +0700, Popoi Zen wrote: > > > Hi, I have used my google fu but I cant find any reference. > Just want to > > > know how to specify overlay network when Im using geneve as my > overlay > > > protocol? > > > > [1] > https://docs.openstack.org/neutron/latest/admin/ovn/refarch/provider- > networks.html > > > > From adriant at catalystcloud.nz Mon Aug 17 04:42:32 2020 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Mon, 17 Aug 2020 16:42:32 +1200 Subject: [requirements][oslo] Inclusion of CONFspirator in openstack/requirements Message-ID: Hey OpenStackers! I'm hoping to add CONFspirator to openstack/requirements as I'm using it Adjutant: https://review.opendev.org/#/c/746436/ The library has been in Adjutant for a while but I didn't add it to openstack/requirements, so I'm trying to remedy that now. I think it is different enough from oslo.config and I think the features/differences are ones that are unlikely to ever make sense in oslo.config without breaking it for people who do use it as it is, or adding too much complexity. I wanted to use oslo.config but quickly found that the way I was currently doing config in Adjutant was heavily dependent on yaml, and the ability to nest things. I was in a bind because I didn't have a declarative config system like oslo.config, and the config for Adjutant was a mess to maintain and understand (even for me, and I wrote it) with random parts of the code pulling config that may or may not have been set/declared. After finding oslo.config was not suitable for my rather weird needs, I took oslo.config as a starting point and ended up writing another library specific to my requirements in Adjutant, and rather than keeping it internal to Adjutant, moved it to an external library. CONFspirator was built for a weird and complex edge case, because I have plugins that need to dynamically load config on startup, which then has to be lazy_loaded. I also have weird overlay logic for defaults that can be overridden, and building it into the library made Adjutant simpler. I also have nested config groups that need to be named dynamically to allow plugin classes to be extended without subclasses sharing the same config group name. I built something specific to my needs, that just so happens to also be a potentially useful library for people wanting something like oslo.config but that is targeted towards yaml and toml, and the ability to nest groups. 
The docs are here: https://confspirator.readthedocs.io/ The code is here: https://gitlab.com/catalyst-cloud/confspirator And for those interested in how I use it in Adjutant here are some places of interest (be warned, it may be a rabbit hole): https://opendev.org/openstack/adjutant/src/branch/master/adjutant/config https://opendev.org/openstack/adjutant/src/branch/master/adjutant/feature_set.py https://opendev.org/openstack/adjutant/src/branch/master/adjutant/core.py https://opendev.org/openstack/adjutant/src/branch/master/adjutant/api/v1/openstack.py#L35-L44 https://opendev.org/openstack/adjutant/src/branch/master/adjutant/actions/v1/projects.py#L155-L164 https://opendev.org/openstack/adjutant/src/branch/master/adjutant/actions/v1/base.py#L146 https://opendev.org/openstack/adjutant/src/branch/master/adjutant/tasks/v1/base.py#L30 https://opendev.org/openstack/adjutant/src/branch/master/adjutant/tasks/v1/base.py#L293 If there are strong opinions about working to add this to oslo.config, let's chat, as I'm not against merging this into it somehow if we find a way that make sense, but while some aspects where similar, I felt that this was cleaner without being part of oslo.config because the mindset I was building towards seemed different and oslo.config didn't need my complexity. Cheers, Adrian -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdulko at redhat.com Mon Aug 17 07:46:32 2020 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Mon, 17 Aug 2020 09:46:32 +0200 Subject: [kuryr] vPTG October 2020 Message-ID: <54f84af6378e1507d1f04c0aab733922cdc2c8bd.camel@redhat.com> Hello all, There's a vPTG October 2020 project signup process going on and I'd like to ask if you want me to reserve an hour or two there for a sync up on the priorities and plans of various parts of the team. Thanks, Michał From akekane at redhat.com Mon Aug 17 07:56:36 2020 From: akekane at redhat.com (Abhishek Kekane) Date: Mon, 17 Aug 2020 13:26:36 +0530 Subject: [glance] Virtual PTG October 2020 Message-ID: Hi Team, There is a project signup process going on for virtual PTG October 2020. I will like to book slots for the same between 1400 UTC to 1700 UTC. Please let me know your convenience for the same. Thanks & Best Regards, Abhishek Kekane -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephenfin at redhat.com Mon Aug 17 08:48:23 2020 From: stephenfin at redhat.com (Stephen Finucane) Date: Mon, 17 Aug 2020 09:48:23 +0100 Subject: [oslo] Proposing Lance Bragstad as oslo.cache core In-Reply-To: References: Message-ID: <5df90046486505f18d1fb812a12e26d1c68cf311.camel@redhat.com> On Thu, 2020-08-13 at 17:06 +0200, Moises Guimaraes de Medeiros wrote: > Hello everybody, > > > > It is my pleasure to propose Lance Bragstad (lbragstad) as a new > member of the oslo.core core team. > Lance has been a big contributor to the project and is known as a > walking version of the Keystone documentation, which happens to be > one of the biggest consumers of oslo.cache. > > > > Obviously we think he'd make a good addition to the core team. If > there are no objections, I'll make that happen in a week. > > > > Thanks. > > +1 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hberaud at redhat.com Mon Aug 17 09:37:01 2020 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 17 Aug 2020 11:37:01 +0200 Subject: [oslo] Proposing Lance Bragstad as oslo.cache core In-Reply-To: <5df90046486505f18d1fb812a12e26d1c68cf311.camel@redhat.com> References: <5df90046486505f18d1fb812a12e26d1c68cf311.camel@redhat.com> Message-ID: +1 Le lun. 17 août 2020 à 10:52, Stephen Finucane a écrit : > On Thu, 2020-08-13 at 17:06 +0200, Moises Guimaraes de Medeiros wrote: > > Hello everybody, > > It is my pleasure to propose Lance Bragstad (lbragstad) as a new member > of the oslo.core core team. > > Lance has been a big contributor to the project and is known as a walking > version of the Keystone documentation, which happens to be one of the > biggest consumers of oslo.cache. > > Obviously we think he'd make a good addition to the core team. If there > are no objections, I'll make that happen in a week. > > Thanks. > > > +1 > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Mon Aug 17 12:01:55 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 17 Aug 2020 08:01:55 -0400 Subject: [neutron][ops] API for viewing HA router states Message-ID: Hi all, Over the past few days, we were troubleshooting an issue that ended up having a root cause where keepalived has somehow ended up active in two different L3 agents. We've yet to find the root cause of how this happened but removing it and adding it resolved the issue for us. As we work on improving our monitoring, we wanted to implement something that gets us the info of # of active routers to check if there's a router that has >1 active L3 agent but it's hard because hitting the /l3-agents endpoint on _every_ single router hurts a lot on performance. Is there something else that we can watch which might be more productive? FYI -- this all goes in the open and will end up inside the openstack-exporter: https://github.com/openstack-exporter/openstack-exporter and the Helm charts will end up with the alerts: https://github.com/openstack-exporter/helm-charts Thanks! Mohammed -- Mohammed Naser VEXXHOST, Inc. From bence.romsics at gmail.com Mon Aug 17 12:18:13 2020 From: bence.romsics at gmail.com (Bence Romsics) Date: Mon, 17 Aug 2020 14:18:13 +0200 Subject: [neutron] bug deputy report for week of 2020-08-10 Message-ID: Hi, This is last week's buglist. Probably due to summer vacations but we have a few bugs without an owner. 
High:

* https://bugs.launchpad.net/neutron/+bug/1891307
  SSH fails in neutron-ovn-tripleo-ci-centos-8-containers-multinode job
  gate-failure, unassigned

* https://bugs.launchpad.net/neutron/+bug/1891309
  Designate integration - internal server error in Neutron
  gate-failure, unassigned

* https://bugs.launchpad.net/neutron/+bug/1891517
  neutron.tests.unit.common.test_utils.TimerTestCase.test__enter_with_timeout fails once in a while
  gate-failure, proposed fix: https://review.opendev.org/746154

* https://bugs.launchpad.net/neutron/+bug/1891673
  qrouter ns ip rules not deleted when fip removed from vm
  proposed fix: https://review.opendev.org/746336

Needs further triage by someone knowing the designate integration better than I do:

* https://bugs.launchpad.net/neutron/+bug/1891333
  strange behavior of dns_domain with designate multi domain

* https://bugs.launchpad.net/neutron/+bug/1891512
  neutron designate DNS dns_domain assignment issue

Low:

* https://bugs.launchpad.net/neutron/+bug/1891243
  neutron tempest failure: neutron_tempest_plugin.api.test_extensions.ExtensionsTest.test_list_extensions_includes_all
  OVN sample devstack conf did not enable all service plugins needed for tempest tests
  proposed fix: https://review.opendev.org/745829

Wishlist:

* https://bugs.launchpad.net/neutron/+bug/1891360
  Floating IP agent gateway IP addresses not released when deleting dead DVR L3 agents
  unassigned

* https://bugs.launchpad.net/neutron/+bug/1891448
  L3 agent mode transition between dvr and dvr_no_external
  unassigned

RFE:

* https://bugs.launchpad.net/neutron/+bug/1891334
  [RFE] Enable change of CIDR on a subnet

Best regards,
Bence (rubasov)

From dev.faz at gmail.com  Mon Aug 17 13:54:31 2020
From: dev.faz at gmail.com (Fabian Zimmermann)
Date: Mon, 17 Aug 2020 15:54:31 +0200
Subject: [neutron][ops] API for viewing HA router states
In-Reply-To:
References:
Message-ID:

Hi,

I can just tell you that we are doing a similar check for the dhcp-agent, but there we just execute a suitable SQL statement to detect more than 1 agent / AZ. Doing the same for L3 shouldn't be that hard, but I don't know if this is what you are looking for?

Fabian

Am Mo., 17. Aug. 2020 um 14:11 Uhr schrieb Mohammed Naser < mnaser at vexxhost.com>:

> Hi all,
>
> Over the past few days, we were troubleshooting an issue that ended up
> having a root cause where keepalived has somehow ended up active in
> two different L3 agents. We've yet to find the root cause of how this
> happened but removing it and adding it resolved the issue for us.
>
> As we work on improving our monitoring, we wanted to implement
> something that gets us the info of # of active routers to check if
> there's a router that has >1 active L3 agent but it's hard because
> hitting the /l3-agents endpoint on _every_ single router hurts a lot
> on performance.
>
> Is there something else that we can watch which might be more
> productive? FYI -- this all goes in the open and will end up inside
> the openstack-exporter:
> https://github.com/openstack-exporter/openstack-exporter and the Helm
> charts will end up with the alerts:
> https://github.com/openstack-exporter/helm-charts
>
> Thanks!
> Mohammed
>
> --
> Mohammed Naser
> VEXXHOST, Inc.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
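As a sketch of what the equivalent L3 check could look like: the query below assumes the usual "neutron" database and the ha_router_agent_port_bindings table that stores the keepalived state reported by the agents, so verify both against the schema of the release in use before relying on it:

  # routers that currently report more than one binding in the "active" state
  mysql neutron -e "
  SELECT router_id, COUNT(*) AS active_l3_agents
    FROM ha_router_agent_port_bindings
   WHERE state = 'active'
   GROUP BY router_id
  HAVING COUNT(*) > 1;"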
URL: From ltoscano at redhat.com Mon Aug 17 13:57:27 2020 From: ltoscano at redhat.com (Luigi Toscano) Date: Mon, 17 Aug 2020 09:57:27 -0400 (EDT) Subject: [all][goals] Switch legacy Zuul jobs to native - update #2 In-Reply-To: <1991766177.46483554.1597672564973.JavaMail.zimbra@redhat.com> Message-ID: <54384483.46483679.1597672647276.JavaMail.zimbra@redhat.com> Hi, Much progress has happened since the first report, almost 4 weeks ago. Let's summarize the main documents: - the goal: https://governance.openstack.org/tc/goals/selected/victoria/native-zuulv3-jobs.html - the document above now includes the reference to the up-to-date Zuul v3 porting guide: https://docs.openstack.org/project-team-guide/zuulv3.html - the etherpad which tracks the current status: https://etherpad.opendev.org/p/goal-victoria-native-zuulv3-migration - the previous report: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016058.html If you still have legacy jobs around, please prioritize this porting work. Victoria branching is not far away (actually, quite close for client libraries). This is the list of projects which still need to complete the work. Most of the legacy jobs derive from legacy-dsvm-base, and those are the most important ones. There is a limited amount of jobs which derive from legacy-base that should be taken into account as well, though. - barbican (*) - blazar (*) - cinder (/) (-> yes, this is my fault for the record :) - designate (+) - ec2-api (*) - freezer (*) - heat (/) - infra (/) - ironic (*) - karbor (*) - magnum (*) - manila (/) - but only devstack-base - monasca (+) - murano (/) - neutron (*) - nova (*) - oslo (*) - senlin - trove (+) - vitrage - zaqar The symbol close to the project name provides more detals about the status: (+) means that the project cores are at least aware of the issue, (*) means that there was active pending reviews for some of the remaining jobs, (/) means that there was past activity but no open reviews currently I'd just like to remind everyone that, while not part of the main goal, backporting the new jobs to the older branches when possible will make future maintenance easier. Ciao -- Luigi From dev.faz at gmail.com Mon Aug 17 14:03:39 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Mon, 17 Aug 2020 16:03:39 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> Message-ID: Just to keep the list updated. If you run with durable_queues and replication, there is still a possibility, that a short living queue will *not* jet be replicated and a node failure will mark these queue as "unreachable". This wouldnt be a problem, if openstack would create a new queue, but i fear it would just try to reuse the existing after reconnect. So, after all - it seems the less buggy way would be * use durable-queue and replication for long-running queues/exchanges * use non-durable-queue without replication for short (fanout, reply_) queues This should allow the short-living ones to destroy themself on node failure, and the long living ones should be able to be as available as possible. Absolutely untested - so use with caution, but here is a possible policy-regex: ^(?!amq\.)(?!reply_)(?!.*fanout).* Fabian Am So., 16. Aug. 
2020 um 15:37 Uhr schrieb Sean Mooney : > > On Sat, 2020-08-15 at 20:13 -0400, Satish Patel wrote: > > Hi Sean, > > > > Sounds good, but running rabbitmq for each service going to be little > > overhead also, how do you scale cluster (Yes we can use cellv2 but its > > not something everyone like to do because of complexity). > > my understanding is that when using rabbitmq adding multiple rabbitmq servers in a cluster lowers > througput vs jsut 1 rabbitmq instance for any given excahnge. that is because the content of > the queue need to be syconised across the cluster. so if cinder nova and neutron share > a 3 node cluster and your compaure that to the same service deployed with cinder nova and neuton > each having there on rabbitmq service then the independent deployment will tend to out perform the > clustered solution. im not really sure if that has change i know tha thow clustering has been donw has evovled > over the years but in the past clustering was the adversary of scaling. > > > If we thinks > > rabbitMQ is growing pain then why community not looking for > > alternative option (kafka) etc..? > we have looked at alternivives several times > rabbit mq wroks well enough ans scales well enough for most deployments. > there other amqp implimantation that scale better then rabbit, > activemq and qpid are both reported to scale better but they perfrom worse > out of the box and need to be carfully tuned > > in the past zeromq has been supported but peole did not maintain it. > > kafka i dont think is a good alternative but nats https://nats.io/ might be. > > for what its worth all nova deployment are cellv2 deployments with 1 cell from around pike/rocky > and its really not that complex. cells_v1 was much more complex bug part of the redesign > for cells_v2 was makeing sure there is only 1 code path. adding a second cell just need another > cell db and conductor to be deployed assuming you startted with a super conductor in the first > place. the issue is cells is only a nova feature no other service have cells so it does not help > you with cinder or neutron. as such cinder an neutron likely be the services that hit scaling limits first. > adopign cells in other services is not nessaryally the right approch either but when we talk about scale > we do need to keep in mind that cells is just for nova today. > > > > > > On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney wrote: > > > > > > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > > > > Hi, > > > > > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > > > one rabbitmq Container per Service. Just the kubernetes self healing is > > > > used as "ha" for rabbitmq. > > > > > > > > That seems to match with my finding: run rabbitmq standalone and use an > > > > external system to restart rabbitmq if required. > > > > > > thats the design that was orginally planned for kolla-kubernetes orrignally > > > > > > each service was to be deployed with its own rabbit mq server if it required one > > > and if it crashed it woudl just be recreated by k8s. it perfromace better then a cluster > > > and if you trust k8s or the external service enough to ensure it is recteated it > > > should be as effective a solution. you dont even need k8s to do that but it seams to be > > > a good fit if your prepared to ocationally loose inflight rpcs. 
> > > if you not then you can configure rabbit to persite all message to disk and mont that on a shared > > > file system like nfs or cephfs so that when the rabbit instance is recreated the queue contency is > > > perserved. assuming you can take the perfromance hit of writing all messages to disk that is. > > > > > > > > Fabian > > > > > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > > > > > > > Fabian, > > > > > > > > > > what do you mean? > > > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > > > reasons. > > > > > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > > > wrote: > > > > > > > > > > > > Hello again, > > > > > > > > > > > > just a short update about the results of my tests. > > > > > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > > > > > 1. without durable-queues and without replication - just one > > > > > > > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > > > 2. durable-queues and replication > > > > > > > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > > > > > > > * broken / non working bindings > > > > > > * broken queues > > > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > > > reasons. > > > > > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > > > > > > > replication but without durable-queues. > > > > > > > > > > > > May someone point me to the best way to document these findings to some > > > > > > > > > > official doc? > > > > > > I think a lot of installations out there will run into issues if - under > > > > > > > > > > load - a node fails. > > > > > > > > > > > > Fabian > > > > > > > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > > > > > > > dev.faz at gmail.com>: > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > just did some short tests today in our test-environment (without > > > > > > > > > > durable queues and without replication): > > > > > > > > > > > > > > * started a rally task to generate some load > > > > > > > * kill-9-ed rabbitmq on one node > > > > > > > * rally task immediately stopped and the cloud (mostly) stopped working > > > > > > > > > > > > > > after some debugging i found (again) exchanges which had bindings to > > > > > > > > > > queues, but these bindings didnt forward any msgs. > > > > > > > Wrote a small script to detect these broken bindings and will now check > > > > > > > > > > if this is "reproducible" > > > > > > > > > > > > > > then I will try "durable queues" and "durable queues with replication" > > > > > > > > > > to see if this helps. Even if I would expect > > > > > > > rabbitmq should be able to handle this without these "hidden broken > > > > > > > > > > bindings" > > > > > > > > > > > > > > This just FYI. > > > > > > > > > > > > > > Fabian > > > > > From amuller at redhat.com Mon Aug 17 14:03:25 2020 From: amuller at redhat.com (Assaf Muller) Date: Mon, 17 Aug 2020 10:03:25 -0400 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: Message-ID: On Mon, Aug 17, 2020 at 9:59 AM Fabian Zimmermann wrote: > > Hi, > > I can just tell you that we are doing a similar check for dhcp-agent, but here we just execute a suitable SQL-statement to detect more than 1 agent / AZ. 
> > Doing the same for L3 shouldn't be that hard, but I dont know if this is what you are looking for? There's already an API for this: neutron l3-agent-list-hosting-router It will show you the HA state per L3 agent for the given router. > > Fabian > > > Am Mo., 17. Aug. 2020 um 14:11 Uhr schrieb Mohammed Naser : >> >> Hi all, >> >> Over the past few days, we were troubleshooting an issue that ended up >> having a root cause where keepalived has somehow ended up active in >> two different L3 agents. We've yet to find the root cause of how this >> happened but removing it and adding it resolved the issue for us. >> >> As we work on improving our monitoring, we wanted to implement >> something that gets us the info of # of active routers to check if >> there's a router that has >1 active L3 agent but it's hard because >> hitting the /l3-agents endpoint on _every_ single router hurts a lot >> on performance. >> >> Is there something else that we can watch which might be more >> productive? FYI -- this all goes in the open and will end up inside >> the openstack-exporter: >> https://github.com/openstack-exporter/openstack-exporter and the Helm >> charts will end up with the alerts: >> https://github.com/openstack-exporter/helm-charts >> >> Thanks! >> Mohammed >> >> -- >> Mohammed Naser >> VEXXHOST, Inc. >> From dev.faz at gmail.com Mon Aug 17 14:05:07 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Mon, 17 Aug 2020 16:05:07 +0200 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: Message-ID: Hi, yes for 1 router, but doing this in a loop for hundreds is not so performant ;) Fabian Am Mo., 17. Aug. 2020 um 16:04 Uhr schrieb Assaf Muller : > > On Mon, Aug 17, 2020 at 9:59 AM Fabian Zimmermann wrote: > > > > Hi, > > > > I can just tell you that we are doing a similar check for dhcp-agent, but here we just execute a suitable SQL-statement to detect more than 1 agent / AZ. > > > > Doing the same for L3 shouldn't be that hard, but I dont know if this is what you are looking for? > > There's already an API for this: > neutron l3-agent-list-hosting-router > > It will show you the HA state per L3 agent for the given router. > > > > > Fabian > > > > > > Am Mo., 17. Aug. 2020 um 14:11 Uhr schrieb Mohammed Naser : > >> > >> Hi all, > >> > >> Over the past few days, we were troubleshooting an issue that ended up > >> having a root cause where keepalived has somehow ended up active in > >> two different L3 agents. We've yet to find the root cause of how this > >> happened but removing it and adding it resolved the issue for us. > >> > >> As we work on improving our monitoring, we wanted to implement > >> something that gets us the info of # of active routers to check if > >> there's a router that has >1 active L3 agent but it's hard because > >> hitting the /l3-agents endpoint on _every_ single router hurts a lot > >> on performance. > >> > >> Is there something else that we can watch which might be more > >> productive? FYI -- this all goes in the open and will end up inside > >> the openstack-exporter: > >> https://github.com/openstack-exporter/openstack-exporter and the Helm > >> charts will end up with the alerts: > >> https://github.com/openstack-exporter/helm-charts > >> > >> Thanks! > >> Mohammed > >> > >> -- > >> Mohammed Naser > >> VEXXHOST, Inc. 
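A minimal sketch of the SQL-based check described in this thread, applied to L3 HA state rather than DHCP agents. It assumes direct read access to the Neutron database and the ha_router_agent_port_bindings table (table and column names are taken from recent Neutron releases and may differ in yours); it flags any router for which more than one L3 agent reports an "active" keepalived state:

import pymysql

# Placeholder connection details; point them at the Neutron database.
conn = pymysql.connect(host="db-host", user="neutron",
                       password="secret", database="neutron")

# ha_router_agent_port_bindings keeps one row per (router, L3 agent) pair with
# the keepalived state last reported by that agent; more than one 'active' row
# for the same router is the split-brain condition discussed above.
QUERY = """
    SELECT router_id, COUNT(*) AS active_agents
    FROM ha_router_agent_port_bindings
    WHERE state = 'active'
    GROUP BY router_id
    HAVING COUNT(*) > 1
"""

with conn.cursor() as cur:
    cur.execute(QUERY)
    for router_id, active_agents in cur.fetchall():
        print("router %s has %d active L3 agents" % (router_id, active_agents))

conn.close()

A single aggregate query like this scales to thousands of routers far better than one API call per router, at the cost of depending on the database schema rather than the public API.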
> >> > From yan.y.zhao at intel.com Mon Aug 17 01:52:43 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Mon, 17 Aug 2020 09:52:43 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> Message-ID: <20200817015243.GE15344@joy-OptiPlex-7040> On Fri, Aug 14, 2020 at 01:30:00PM +0100, Sean Mooney wrote: > On Fri, 2020-08-14 at 13:16 +0800, Yan Zhao wrote: > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > driver is it handled by? > > > > > > > > It looks that the devlink is for network device specific, and in > > > > devlink.h, it says > > > > include/uapi/linux/devlink.h - Network physical device Netlink > > > > interface, > > > > > > > > > Actually not, I think there used to have some discussion last year and the > > > conclusion is to remove this comment. > > > > > > It supports IB and probably vDPA in the future. > > > > > > > hmm... sorry, I didn't find the referred discussion. only below discussion > > regarding to why to add devlink. > > > > https://www.mail-archive.com/netdev at vger.kernel.org/msg95801.html > > >This doesn't seem to be too much related to networking? Why can't something > > >like this be in sysfs? > > > > It is related to networking quite bit. There has been couple of > > iteration of this, including sysfs and configfs implementations. There > > has been a consensus reached that this should be done by netlink. I > > believe netlink is really the best for this purpose. Sysfs is not a good > > idea > > > > https://www.mail-archive.com/netdev at vger.kernel.org/msg96102.html > > >there is already a way to change eth/ib via > > >echo 'eth' > /sys/bus/pci/drivers/mlx4_core/0000:02:00.0/mlx4_port1 > > > > > >sounds like this is another way to achieve the same? > > > > It is. However the current way is driver-specific, not correct. > > For mlx5, we need the same, it cannot be done in this way. Do devlink is > > the correct way to go. > im not sure i agree with that. > standardising a filesystem based api that is used across all vendors is also a valid > option. that said if devlink is the right choice form a kerenl perspective by all > means use it but i have not heard a convincing argument for why it actually better. > with tthat said we have been uing tools like ethtool to manage aspect of nics for decades > so its not that strange an idea to use a tool and binary protocoal rather then a text > based interface for this but there are advantages to both approches. > > Yes, I agree with you. > > https://lwn.net/Articles/674867/ > > There a is need for some userspace API that would allow to expose things > > that are not directly related to any device class like net_device of > > ib_device, but rather chip-wide/switch-ASIC-wide stuff. 
> > > > Use cases: > > 1) get/set of port type (Ethernet/InfiniBand) > > 2) monitoring of hardware messages to and from chip > > 3) setting up port splitters - split port into multiple ones and squash again, > > enables usage of splitter cable > > 4) setting up shared buffers - shared among multiple ports within one chip > > > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > - [path to device] > > > > |--- migration > > | |--- self > > | | |---device_api > > | | |---mdev_type > > | | |---software_version > > | | |---device_id > > | | |---aggregator > > | |--- compatible > > | | |---device_api > > | | |---mdev_type > > | | |---software_version > > | | |---device_id > > | | |---aggregator > > > > > > > > > > > > > I feel like it's not very appropriate for a GPU driver to use > > > > this interface. Is that right? > > > > > > > > > I think not though most of the users are switch or ethernet devices. It > > > doesn't prevent you from inventing new abstractions. > > > > so need to patch devlink core and the userspace devlink tool? > > e.g. devlink migration > and devlink python libs if openstack was to use it directly. > we do have caes where we just frok a process and execaute a comannd in a shell > with or without elevated privladge but we really dont like doing that due to > the performacne impacat and security implciations so where we can use python bindign > over c apis we do. pyroute2 is the only python lib i know off of the top of my head > that support devlink so we would need to enhacne it to support this new devlink api. > there may be otherss i have not really looked in the past since we dont need to use > devlink at all today. > > > > > Note that devlink is based on netlink, netlink has been widely used by > > > various subsystems other than networking. > > > > the advantage of netlink I see is that it can monitor device status and > > notify upper layer that migration database needs to get updated. > > But not sure whether openstack would like to use this capability. > > As Sean said, it's heavy for openstack. it's heavy for vendor driver > > as well :) > > > > And devlink monitor now listens the notification and dumps the state > > changes. If we want to use it, need to let it forward the notification > > and dumped info to openstack, right? > i dont think we would use direct devlink monitoring in nova even if it was avaiable. > we could but we already poll libvirt and the system for other resouce periodicly. so, if we use file system based approach, could openstack periodically check and update the migration info? e.g. every minute, read /sys//migration/self/*, and if there are any file disappearing or appearing or content changes, just let the placement know. Then when about to start migration, check source device's /sys//migration/compatible/* and searches the placement if there are existing device matching to it, if yes, create vm with the device and migrate to it; if not, and if it's an mdev, try to create a matching one and migrate to it. (to create a matching mdev, I guess openstack can follow below sequence: 1. find a target device with the same device id (e.g. parent pci id) 2. create an mdev with matching mdev type 3. adjust other vendor specific attributes 4. if 2 or 3 fails, go to 1 again ) is this approach feasible? > we likely wouldl just add monitoriv via devlink to that periodic task. 
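As a rough illustration of the periodic-polling idea proposed above, a user-space sketch of the compatibility check. The migration/self and migration/compatible attribute layout is only the proposal being discussed in this thread, not an existing kernel interface, and the sysfs paths below are placeholders:

import os

ATTRS = ("device_api", "mdev_type", "software_version", "device_id", "aggregator")

def read_migration_attrs(dev_path, side):
    # side is "self" on the source device or "compatible" on a candidate target.
    base = os.path.join(dev_path, "migration", side)
    attrs = {}
    for name in ATTRS:
        path = os.path.join(base, name)
        if os.path.exists(path):
            with open(path) as f:
                attrs[name] = f.read().strip()
    return attrs

def is_compatible(src_path, dst_path):
    # Compare what the source says about itself with what the target declares
    # it can accept. Plain equality is an oversimplification: software_version,
    # for instance, is meant to allow minor-version forward compatibility.
    src = read_migration_attrs(src_path, "self")
    dst = read_migration_attrs(dst_path, "compatible")
    return all(src.get(name) == value for name, value in dst.items())

# Placeholder device paths for a source mdev and a candidate target.
print(is_compatible("/sys/bus/mdev/devices/SRC-UUID",
                    "/sys/bus/mdev/devices/DST-UUID"))

Something along these lines could run from the same periodic task that already polls libvirt, updating the placement traits only when the attribute set changes.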
> we certenly would not use it to detect a migration or a need to update a migration database(not sure what that is) by migration database, I meant the traits in the placement. :) if a periodic monitoring or devlink is required, then periodically monitor sysfs is also viable, right? > > in reality if we can consume this info indirectly via a libvirt api that will > be the appcoh we will take at least for the libvirt driver in nova. for cyborg > they may take a different appoch. we already use pyroute2 in 2 projects, os-vif and > neutron and it does have devlink support so the burden of using devlink is not that > high for openstack but its a less frineadly interface for configuration tools like > ansiable vs a filesystem based approch. > > From cohuck at redhat.com Mon Aug 17 06:38:28 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Mon, 17 Aug 2020 08:38:28 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <315669b0-5c75-d359-a912-62ebab496abf@linux.ibm.com> References: <20200727072440.GA28676@joy-OptiPlex-7040> <20200727162321.7097070e@x1.home> <20200729080503.GB28676@joy-OptiPlex-7040> <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <4cf2824c803c96496e846c5b06767db305e9fb5a.camel@redhat.com> <20200807135942.5d56a202.cohuck@redhat.com> <20200813173347.239801fa.cohuck@redhat.com> <315669b0-5c75-d359-a912-62ebab496abf@linux.ibm.com> Message-ID: <20200817083828.187315ef.cohuck@redhat.com> On Thu, 13 Aug 2020 15:02:53 -0400 Eric Farman wrote: > On 8/13/20 11:33 AM, Cornelia Huck wrote: > > On Fri, 7 Aug 2020 13:59:42 +0200 > > Cornelia Huck wrote: > > > >> On Wed, 05 Aug 2020 12:35:01 +0100 > >> Sean Mooney wrote: > >> > >>> On Wed, 2020-08-05 at 12:53 +0200, Jiri Pirko wrote: > >>>> Wed, Aug 05, 2020 at 11:33:38AM CEST, yan.y.zhao at intel.com wrote: > >> > >> (...) > >> > >>>>> software_version: device driver's version. > >>>>> in .[.bugfix] scheme, where there is no > >>>>> compatibility across major versions, minor versions have > >>>>> forward compatibility (ex. 1-> 2 is ok, 2 -> 1 is not) and > >>>>> bugfix version number indicates some degree of internal > >>>>> improvement that is not visible to the user in terms of > >>>>> features or compatibility, > >>>>> > >>>>> vendor specific attributes: each vendor may define different attributes > >>>>> device id : device id of a physical devices or mdev's parent pci device. > >>>>> it could be equal to pci id for pci devices > >>>>> aggregator: used together with mdev_type. e.g. aggregator=2 together > >>>>> with i915-GVTg_V5_4 means 2*1/4=1/2 of a gen9 Intel > >>>>> graphics device. > >>>>> remote_url: for a local NVMe VF, it may be configured with a remote > >>>>> url of a remote storage and all data is stored in the > >>>>> remote side specified by the remote url. > >>>>> ... > >>> just a minor not that i find ^ much more simmple to understand then > >>> the current proposal with self and compatiable. > >>> if i have well defiend attibute that i can parse and understand that allow > >>> me to calulate the what is and is not compatible that is likely going to > >>> more useful as you wont have to keep maintianing a list of other compatible > >>> devices every time a new sku is released. 
> >>> > >>> in anycase thank for actully shareing ^ as it make it simpler to reson about what > >>> you have previously proposed. > >> > >> So, what would be the most helpful format? A 'software_version' field > >> that follows the conventions outlined above, and other (possibly > >> optional) fields that have to match? > > > > Just to get a different perspective, I've been trying to come up with > > what would be useful for a very different kind of device, namely > > vfio-ccw. (Adding Eric to cc: for that.) > > > > software_version makes sense for everybody, so it should be a standard > > attribute. > > > > For the vfio-ccw type, we have only one vendor driver (vfio-ccw_IO). > > > > Given a subchannel A, we want to make sure that subchannel B has a > > reasonable chance of being compatible. I guess that means: > > > > - same subchannel type (I/O) > > - same chpid type (e.g. all FICON; I assume there are no 'mixed' setups > > -- Eric?) > > Correct. > > > - same number of chpids? Maybe we can live without that and just inject > > some machine checks, I don't know. Same chpid numbers is something we > > cannot guarantee, especially if we want to migrate cross-CEC (to > > another machine.) > > I think we'd live without it, because I wouldn't expect it to be > consistent between systems. Yes, and the guest needs to be able to deal with changing path configurations anyway. > > > > > Other possibly interesting information is not available at the > > subchannel level (vfio-ccw is a subchannel driver.) > > I presume you're alluding to the DASD uid (dasdinfo -x) here? Yes, or the even more basic Sense ID information. > > > > > So, looking at a concrete subchannel on one of my machines, it would > > look something like the following: > > > > > > software_version=1.0.0 > > type=vfio-ccw <-- would be vfio-pci on the example above > > > > subchannel_type=0 > > > > chpid_type=0x1a > > chpid_mask=0xf0 <-- not sure if needed/wanted Let's just drop the chpid_mask here. > > > > Does that make sense? Would be interesting if someone could come up with some possible information for a third type of device. From jegor at greenedge.cloud Mon Aug 17 10:15:11 2020 From: jegor at greenedge.cloud (Jegor van Opdorp) Date: Mon, 17 Aug 2020 10:15:11 +0000 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: , Message-ID: We're also using masakari and willing to help maintain it! ________________________________ From: Mark Goddard Sent: Monday, August 17, 2020 12:12 PM To: Jegor van Opdorp Subject: Fwd: [tc][masakari] Project aliveness (was: [masakari] Meetings) ---------- Forwarded message --------- From: Radosław Piliszek Date: Fri, 14 Aug 2020 at 08:53 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) To: openstack-discuss Cc: Sampath Priyankara (samP) , Tushar Patil (tpatil) Hi, it's been a month since I wrote the original (quoted) email, so I retry it with CC to the PTL and a recently (this year) active core. I see there have been no meetings and neither Masakari IRC channel nor review queues have been getting much attention during that time period. I am, therefore, offering my help to maintain the project. Regarding the original topic, I would opt for running Masakari meetings during the time I proposed so that interested parties could join and I know there is at least some interest based on recent IRC activity (i.e. there exist people who want to use and discuss Masakari - apart from me that is :-) ). 
-yoctozepto On Mon, Jul 13, 2020 at 9:53 PM Radosław Piliszek wrote: > > Hello Fellow cloud-HA-seekers, > > I wanted to attend Masakari meetings but I found the current schedule unfit. > Is there a chance to change the schedule? The day is fine but a shift > by +3 hours would be nice. > > Anyhow, I wanted to discuss [1]. I've already proposed a change > implementing it and looking forward to positive reviews. :-) That > said, please reply on the change directly, or mail me or catch me on > IRC, whichever option sounds best to you. > > [1] https://blueprints.launchpad.net/masakari/+spec/customisable-ha-enabled-instance-metadata-key > > -yoctozepto -------------- next part -------------- An HTML attachment was scrubbed... URL: From dbengt at redhat.com Mon Aug 17 12:08:05 2020 From: dbengt at redhat.com (Daniel Bengtsson) Date: Mon, 17 Aug 2020 14:08:05 +0200 Subject: Can't fetch from opendev. Message-ID: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> Hi everyone, I have tried to fetch the repository tripleo-heat-templates from opendev. I was not able to do that: http://paste.openstack.org/show/796882/ With github it works. I have asked to another colleague to try, he have the same problem. From arnaud.morin at gmail.com Mon Aug 17 14:17:37 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Mon, 17 Aug 2020 14:17:37 +0000 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> Message-ID: <20200817141737.GU31915@sync> Hey Fabian, I was thinking the same, and I found the "default" values from openstack-ansible: https://github.com/openstack/openstack-ansible-rabbitmq_server/blob/fc27e735a68b64cb3c67dd8abeaf324803a9845b/defaults/main.yml#L172 pattern: '^(?!(amq\.)|(.*_fanout_)|(reply_)).*' Which are setting HA for all except amq.* *_fanout_* reply_* So that would make sense? -- Arnaud Morin On 17.08.20 - 16:03, Fabian Zimmermann wrote: > Just to keep the list updated. > > If you run with durable_queues and replication, there is still a > possibility, that a short living queue will *not* jet be replicated > and a node failure will mark these queue as "unreachable". This > wouldnt be a problem, if openstack would create a new queue, but i > fear it would just try to reuse the existing after reconnect. > > So, after all - it seems the less buggy way would be > > * use durable-queue and replication for long-running queues/exchanges > * use non-durable-queue without replication for short (fanout, reply_) queues > > This should allow the short-living ones to destroy themself on node > failure, and the long living ones should be able to be as available as > possible. > > Absolutely untested - so use with caution, but here is a possible > policy-regex: ^(?!amq\.)(?!reply_)(?!.*fanout).* > > Fabian > > > Am So., 16. Aug. 2020 um 15:37 Uhr schrieb Sean Mooney : > > > > On Sat, 2020-08-15 at 20:13 -0400, Satish Patel wrote: > > > Hi Sean, > > > > > > Sounds good, but running rabbitmq for each service going to be little > > > overhead also, how do you scale cluster (Yes we can use cellv2 but its > > > not something everyone like to do because of complexity). > > > > my understanding is that when using rabbitmq adding multiple rabbitmq servers in a cluster lowers > > througput vs jsut 1 rabbitmq instance for any given excahnge. that is because the content of > > the queue need to be syconised across the cluster. 
so if cinder nova and neutron share > > a 3 node cluster and your compaure that to the same service deployed with cinder nova and neuton > > each having there on rabbitmq service then the independent deployment will tend to out perform the > > clustered solution. im not really sure if that has change i know tha thow clustering has been donw has evovled > > over the years but in the past clustering was the adversary of scaling. > > > > > If we thinks > > > rabbitMQ is growing pain then why community not looking for > > > alternative option (kafka) etc..? > > we have looked at alternivives several times > > rabbit mq wroks well enough ans scales well enough for most deployments. > > there other amqp implimantation that scale better then rabbit, > > activemq and qpid are both reported to scale better but they perfrom worse > > out of the box and need to be carfully tuned > > > > in the past zeromq has been supported but peole did not maintain it. > > > > kafka i dont think is a good alternative but nats https://nats.io/ might be. > > > > for what its worth all nova deployment are cellv2 deployments with 1 cell from around pike/rocky > > and its really not that complex. cells_v1 was much more complex bug part of the redesign > > for cells_v2 was makeing sure there is only 1 code path. adding a second cell just need another > > cell db and conductor to be deployed assuming you startted with a super conductor in the first > > place. the issue is cells is only a nova feature no other service have cells so it does not help > > you with cinder or neutron. as such cinder an neutron likely be the services that hit scaling limits first. > > adopign cells in other services is not nessaryally the right approch either but when we talk about scale > > we do need to keep in mind that cells is just for nova today. > > > > > > > > > > On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney wrote: > > > > > > > > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > > > > > Hi, > > > > > > > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > > > > one rabbitmq Container per Service. Just the kubernetes self healing is > > > > > used as "ha" for rabbitmq. > > > > > > > > > > That seems to match with my finding: run rabbitmq standalone and use an > > > > > external system to restart rabbitmq if required. > > > > > > > > thats the design that was orginally planned for kolla-kubernetes orrignally > > > > > > > > each service was to be deployed with its own rabbit mq server if it required one > > > > and if it crashed it woudl just be recreated by k8s. it perfromace better then a cluster > > > > and if you trust k8s or the external service enough to ensure it is recteated it > > > > should be as effective a solution. you dont even need k8s to do that but it seams to be > > > > a good fit if your prepared to ocationally loose inflight rpcs. > > > > if you not then you can configure rabbit to persite all message to disk and mont that on a shared > > > > file system like nfs or cephfs so that when the rabbit instance is recreated the queue contency is > > > > perserved. assuming you can take the perfromance hit of writing all messages to disk that is. > > > > > > > > > > Fabian > > > > > > > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > > > > > > > > > Fabian, > > > > > > > > > > > > what do you mean? > > > > > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > > > > > reasons. 
> > > > > > > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > > > > wrote: > > > > > > > > > > > > > > Hello again, > > > > > > > > > > > > > > just a short update about the results of my tests. > > > > > > > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > > > > > > > 1. without durable-queues and without replication - just one > > > > > > > > > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > > > > 2. durable-queues and replication > > > > > > > > > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > > > > > > > > > * broken / non working bindings > > > > > > > * broken queues > > > > > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > > > > > reasons. > > > > > > > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > > > > > > > > > replication but without durable-queues. > > > > > > > > > > > > > > May someone point me to the best way to document these findings to some > > > > > > > > > > > > official doc? > > > > > > > I think a lot of installations out there will run into issues if - under > > > > > > > > > > > > load - a node fails. > > > > > > > > > > > > > > Fabian > > > > > > > > > > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > > > > > > > > > dev.faz at gmail.com>: > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > just did some short tests today in our test-environment (without > > > > > > > > > > > > durable queues and without replication): > > > > > > > > > > > > > > > > * started a rally task to generate some load > > > > > > > > * kill-9-ed rabbitmq on one node > > > > > > > > * rally task immediately stopped and the cloud (mostly) stopped working > > > > > > > > > > > > > > > > after some debugging i found (again) exchanges which had bindings to > > > > > > > > > > > > queues, but these bindings didnt forward any msgs. > > > > > > > > Wrote a small script to detect these broken bindings and will now check > > > > > > > > > > > > if this is "reproducible" > > > > > > > > > > > > > > > > then I will try "durable queues" and "durable queues with replication" > > > > > > > > > > > > to see if this helps. Even if I would expect > > > > > > > > rabbitmq should be able to handle this without these "hidden broken > > > > > > > > > > > > bindings" > > > > > > > > > > > > > > > > This just FYI. > > > > > > > > > > > > > > > > Fabian > > > > > > > > > From mnaser at vexxhost.com Mon Aug 17 14:17:39 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 17 Aug 2020 10:17:39 -0400 Subject: Can't fetch from opendev. In-Reply-To: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> References: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> Message-ID: Hi there, I've reported this to the OpenDev team at #opendev on IRC, I think one of the Gitea backends is likely unhappy. Thanks Mohammed On Mon, Aug 17, 2020 at 10:15 AM Daniel Bengtsson wrote: > > Hi everyone, > > I have tried to fetch the repository tripleo-heat-templates from > opendev. I was not able to do that: > > http://paste.openstack.org/show/796882/ > > With github it works. I have asked to another colleague to try, he have > the same problem. > > -- Mohammed Naser VEXXHOST, Inc. 
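Returning to the RabbitMQ mirroring patterns quoted in the [kolla] thread above, a hedged sketch of applying such a policy through the management HTTP API rather than rabbitmqctl. It assumes the management plugin is enabled on port 15672; host, credentials and vhost are placeholders, and classic-queue mirroring behaviour can differ between RabbitMQ versions:

import requests

# Default vhost "/" is URL-encoded as %2F in the management API.
url = "http://rabbit-host:15672/api/policies/%2F/ha-openstack"

policy = {
    # Same intent as the patterns discussed above: mirror everything except
    # amq.*, *_fanout_* and reply_* queues, which are short-lived anyway.
    "pattern": r"^(?!(amq\.)|(.*_fanout_)|(reply_)).*",
    "definition": {"ha-mode": "all", "ha-sync-mode": "automatic"},
    "apply-to": "queues",
    "priority": 0,
}

resp = requests.put(url, auth=("guest", "guest"), json=policy)
resp.raise_for_status()

Whether the excluded queues should also be non-durable is exactly the open question in this thread, so treat the pattern as a starting point to test under load rather than a recommendation.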
From mnaser at vexxhost.com Mon Aug 17 14:18:51 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 17 Aug 2020 10:18:51 -0400 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: Message-ID: Hi all: What Fabian is describing is exactly the problem we're having, there are _many_ routers in these environments so we'd be looking at N requests which can get out of control quickly Thanks Mohammed On Mon, Aug 17, 2020 at 10:05 AM Fabian Zimmermann wrote: > > Hi, > > yes for 1 router, but doing this in a loop for hundreds is not so performant ;) > > Fabian > > Am Mo., 17. Aug. 2020 um 16:04 Uhr schrieb Assaf Muller : > > > > On Mon, Aug 17, 2020 at 9:59 AM Fabian Zimmermann wrote: > > > > > > Hi, > > > > > > I can just tell you that we are doing a similar check for dhcp-agent, but here we just execute a suitable SQL-statement to detect more than 1 agent / AZ. > > > > > > Doing the same for L3 shouldn't be that hard, but I dont know if this is what you are looking for? > > > > There's already an API for this: > > neutron l3-agent-list-hosting-router > > > > It will show you the HA state per L3 agent for the given router. > > > > > > > > Fabian > > > > > > > > > Am Mo., 17. Aug. 2020 um 14:11 Uhr schrieb Mohammed Naser : > > >> > > >> Hi all, > > >> > > >> Over the past few days, we were troubleshooting an issue that ended up > > >> having a root cause where keepalived has somehow ended up active in > > >> two different L3 agents. We've yet to find the root cause of how this > > >> happened but removing it and adding it resolved the issue for us. > > >> > > >> As we work on improving our monitoring, we wanted to implement > > >> something that gets us the info of # of active routers to check if > > >> there's a router that has >1 active L3 agent but it's hard because > > >> hitting the /l3-agents endpoint on _every_ single router hurts a lot > > >> on performance. > > >> > > >> Is there something else that we can watch which might be more > > >> productive? FYI -- this all goes in the open and will end up inside > > >> the openstack-exporter: > > >> https://github.com/openstack-exporter/openstack-exporter and the Helm > > >> charts will end up with the alerts: > > >> https://github.com/openstack-exporter/helm-charts > > >> > > >> Thanks! > > >> Mohammed > > >> > > >> -- > > >> Mohammed Naser > > >> VEXXHOST, Inc. > > >> > > -- Mohammed Naser VEXXHOST, Inc. From dev.faz at gmail.com Mon Aug 17 14:21:34 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Mon, 17 Aug 2020 16:21:34 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: <20200817141737.GU31915@sync> References: <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> <20200817141737.GU31915@sync> Message-ID: Hi, oh, that's great! So, someone at openstack-ansible already detected this and just forgot to update the docs.openstack.org ;) I tested my regex and it seems to fix my issue (atm). I will run an openstack rally load test with the regex above to check what happens if I terminate a rabbitmq while load is hitting the system. Fabian Am Mo., 17. Aug. 
2020 um 16:17 Uhr schrieb Arnaud Morin : > > Hey Fabian, > > I was thinking the same, and I found the "default" values from > openstack-ansible: > https://github.com/openstack/openstack-ansible-rabbitmq_server/blob/fc27e735a68b64cb3c67dd8abeaf324803a9845b/defaults/main.yml#L172 > > pattern: '^(?!(amq\.)|(.*_fanout_)|(reply_)).*' > > Which are setting HA for all except > amq.* > *_fanout_* > reply_* > > So that would make sense? > > -- > Arnaud Morin > > On 17.08.20 - 16:03, Fabian Zimmermann wrote: > > Just to keep the list updated. > > > > If you run with durable_queues and replication, there is still a > > possibility, that a short living queue will *not* jet be replicated > > and a node failure will mark these queue as "unreachable". This > > wouldnt be a problem, if openstack would create a new queue, but i > > fear it would just try to reuse the existing after reconnect. > > > > So, after all - it seems the less buggy way would be > > > > * use durable-queue and replication for long-running queues/exchanges > > * use non-durable-queue without replication for short (fanout, reply_) queues > > > > This should allow the short-living ones to destroy themself on node > > failure, and the long living ones should be able to be as available as > > possible. > > > > Absolutely untested - so use with caution, but here is a possible > > policy-regex: ^(?!amq\.)(?!reply_)(?!.*fanout).* > > > > Fabian > > > > > > Am So., 16. Aug. 2020 um 15:37 Uhr schrieb Sean Mooney : > > > > > > On Sat, 2020-08-15 at 20:13 -0400, Satish Patel wrote: > > > > Hi Sean, > > > > > > > > Sounds good, but running rabbitmq for each service going to be little > > > > overhead also, how do you scale cluster (Yes we can use cellv2 but its > > > > not something everyone like to do because of complexity). > > > > > > my understanding is that when using rabbitmq adding multiple rabbitmq servers in a cluster lowers > > > througput vs jsut 1 rabbitmq instance for any given excahnge. that is because the content of > > > the queue need to be syconised across the cluster. so if cinder nova and neutron share > > > a 3 node cluster and your compaure that to the same service deployed with cinder nova and neuton > > > each having there on rabbitmq service then the independent deployment will tend to out perform the > > > clustered solution. im not really sure if that has change i know tha thow clustering has been donw has evovled > > > over the years but in the past clustering was the adversary of scaling. > > > > > > > If we thinks > > > > rabbitMQ is growing pain then why community not looking for > > > > alternative option (kafka) etc..? > > > we have looked at alternivives several times > > > rabbit mq wroks well enough ans scales well enough for most deployments. > > > there other amqp implimantation that scale better then rabbit, > > > activemq and qpid are both reported to scale better but they perfrom worse > > > out of the box and need to be carfully tuned > > > > > > in the past zeromq has been supported but peole did not maintain it. > > > > > > kafka i dont think is a good alternative but nats https://nats.io/ might be. > > > > > > for what its worth all nova deployment are cellv2 deployments with 1 cell from around pike/rocky > > > and its really not that complex. cells_v1 was much more complex bug part of the redesign > > > for cells_v2 was makeing sure there is only 1 code path. 
adding a second cell just need another > > > cell db and conductor to be deployed assuming you startted with a super conductor in the first > > > place. the issue is cells is only a nova feature no other service have cells so it does not help > > > you with cinder or neutron. as such cinder an neutron likely be the services that hit scaling limits first. > > > adopign cells in other services is not nessaryally the right approch either but when we talk about scale > > > we do need to keep in mind that cells is just for nova today. > > > > > > > > > > > > > > On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney wrote: > > > > > > > > > > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > > > > > > Hi, > > > > > > > > > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > > > > > one rabbitmq Container per Service. Just the kubernetes self healing is > > > > > > used as "ha" for rabbitmq. > > > > > > > > > > > > That seems to match with my finding: run rabbitmq standalone and use an > > > > > > external system to restart rabbitmq if required. > > > > > > > > > > thats the design that was orginally planned for kolla-kubernetes orrignally > > > > > > > > > > each service was to be deployed with its own rabbit mq server if it required one > > > > > and if it crashed it woudl just be recreated by k8s. it perfromace better then a cluster > > > > > and if you trust k8s or the external service enough to ensure it is recteated it > > > > > should be as effective a solution. you dont even need k8s to do that but it seams to be > > > > > a good fit if your prepared to ocationally loose inflight rpcs. > > > > > if you not then you can configure rabbit to persite all message to disk and mont that on a shared > > > > > file system like nfs or cephfs so that when the rabbit instance is recreated the queue contency is > > > > > perserved. assuming you can take the perfromance hit of writing all messages to disk that is. > > > > > > > > > > > > Fabian > > > > > > > > > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > > > > > > > > > > > Fabian, > > > > > > > > > > > > > > what do you mean? > > > > > > > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > > > > > > > reasons. > > > > > > > > > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > > > > > wrote: > > > > > > > > > > > > > > > > Hello again, > > > > > > > > > > > > > > > > just a short update about the results of my tests. > > > > > > > > > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > > > > > > > > > 1. without durable-queues and without replication - just one > > > > > > > > > > > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > > > > > 2. durable-queues and replication > > > > > > > > > > > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > > > > > > > > > > > * broken / non working bindings > > > > > > > > * broken queues > > > > > > > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > > > > > > > reasons. > > > > > > > > > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > > > > > > > > > > > replication but without durable-queues. > > > > > > > > > > > > > > > > May someone point me to the best way to document these findings to some > > > > > > > > > > > > > > official doc? 
> > > > > > > > I think a lot of installations out there will run into issues if - under > > > > > > > > > > > > > > load - a node fails. > > > > > > > > > > > > > > > > Fabian > > > > > > > > > > > > > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > > > > > > > > > > > dev.faz at gmail.com>: > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > just did some short tests today in our test-environment (without > > > > > > > > > > > > > > durable queues and without replication): > > > > > > > > > > > > > > > > > > * started a rally task to generate some load > > > > > > > > > * kill-9-ed rabbitmq on one node > > > > > > > > > * rally task immediately stopped and the cloud (mostly) stopped working > > > > > > > > > > > > > > > > > > after some debugging i found (again) exchanges which had bindings to > > > > > > > > > > > > > > queues, but these bindings didnt forward any msgs. > > > > > > > > > Wrote a small script to detect these broken bindings and will now check > > > > > > > > > > > > > > if this is "reproducible" > > > > > > > > > > > > > > > > > > then I will try "durable queues" and "durable queues with replication" > > > > > > > > > > > > > > to see if this helps. Even if I would expect > > > > > > > > > rabbitmq should be able to handle this without these "hidden broken > > > > > > > > > > > > > > bindings" > > > > > > > > > > > > > > > > > > This just FYI. > > > > > > > > > > > > > > > > > > Fabian > > > > > > > > > > > > > From fungi at yuggoth.org Mon Aug 17 14:37:03 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 17 Aug 2020 14:37:03 +0000 Subject: Can't fetch from opendev. In-Reply-To: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> References: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> Message-ID: <20200817143703.c5rh3eqcl3ihxy4m@yuggoth.org> [keeping Daniel in Cc as he doesn't appear to be subscribed] On 2020-08-17 14:08:05 +0200 (+0200), Daniel Bengtsson wrote: [...] > I have tried to fetch the repository tripleo-heat-templates from > opendev. I was not able to do that: > > http://paste.openstack.org/show/796882/ > > With github it works. I have asked to another colleague to try, he > have the same problem. What command(s) did you run and what error message is Git giving you? That paste doesn't look like an error, just a trace of the internal operations which were performed. Are you and your colleague both connecting from the same network? Possibly the same corporate network or the same VPN? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From amuller at redhat.com Mon Aug 17 15:39:44 2020 From: amuller at redhat.com (Assaf Muller) Date: Mon, 17 Aug 2020 11:39:44 -0400 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: Message-ID: On Mon, Aug 17, 2020 at 10:19 AM Mohammed Naser wrote: > > Hi all: > > What Fabian is describing is exactly the problem we're having, there > are _many_ routers in these environments so we'd be looking at N > requests which can get out of control quickly I think it's a clear use case to implement a new API endpoint that returns HA state per agent for *all* routers in a single call. Should be easy to implement. 
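Until an aggregated endpoint like the one suggested above exists, one interim workaround is to parallelise the per-router lookups instead of issuing them serially. A rough sketch, assuming an already-authenticated python-neutronclient Client object (thread-safety of the shared client/session should be verified, and error handling is omitted):

from concurrent.futures import ThreadPoolExecutor

def count_active_agents(neutron, router_id):
    # Same data as "neutron l3-agent-list-hosting-router", one router at a time.
    agents = neutron.list_l3_agent_hosting_routers(router_id).get("agents", [])
    return router_id, sum(1 for a in agents if a.get("ha_state") == "active")

def find_split_brain_routers(neutron, max_workers=20):
    ha_routers = [r for r in neutron.list_routers().get("routers", [])
                  if r.get("ha")]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda r: count_active_agents(neutron, r["id"]),
                           ha_routers)
    return [rid for rid, active in results if active != 1]

Routers with zero active agents are usually as interesting as those with two, hence the "!= 1" check; either way this is still N+1 API calls, which is why a bulk endpoint would help the exporter use case.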
> > Thanks > Mohammed > > On Mon, Aug 17, 2020 at 10:05 AM Fabian Zimmermann wrote: > > > > Hi, > > > > yes for 1 router, but doing this in a loop for hundreds is not so performant ;) > > > > Fabian > > > > Am Mo., 17. Aug. 2020 um 16:04 Uhr schrieb Assaf Muller : > > > > > > On Mon, Aug 17, 2020 at 9:59 AM Fabian Zimmermann wrote: > > > > > > > > Hi, > > > > > > > > I can just tell you that we are doing a similar check for dhcp-agent, but here we just execute a suitable SQL-statement to detect more than 1 agent / AZ. > > > > > > > > Doing the same for L3 shouldn't be that hard, but I dont know if this is what you are looking for? > > > > > > There's already an API for this: > > > neutron l3-agent-list-hosting-router > > > > > > It will show you the HA state per L3 agent for the given router. > > > > > > > > > > > Fabian > > > > > > > > > > > > Am Mo., 17. Aug. 2020 um 14:11 Uhr schrieb Mohammed Naser : > > > >> > > > >> Hi all, > > > >> > > > >> Over the past few days, we were troubleshooting an issue that ended up > > > >> having a root cause where keepalived has somehow ended up active in > > > >> two different L3 agents. We've yet to find the root cause of how this > > > >> happened but removing it and adding it resolved the issue for us. > > > >> > > > >> As we work on improving our monitoring, we wanted to implement > > > >> something that gets us the info of # of active routers to check if > > > >> there's a router that has >1 active L3 agent but it's hard because > > > >> hitting the /l3-agents endpoint on _every_ single router hurts a lot > > > >> on performance. > > > >> > > > >> Is there something else that we can watch which might be more > > > >> productive? FYI -- this all goes in the open and will end up inside > > > >> the openstack-exporter: > > > >> https://github.com/openstack-exporter/openstack-exporter and the Helm > > > >> charts will end up with the alerts: > > > >> https://github.com/openstack-exporter/helm-charts > > > >> > > > >> Thanks! > > > >> Mohammed > > > >> > > > >> -- > > > >> Mohammed Naser > > > >> VEXXHOST, Inc. > > > >> > > > > > > > -- > Mohammed Naser > VEXXHOST, Inc. > From corey.bryant at canonical.com Mon Aug 17 15:59:20 2020 From: corey.bryant at canonical.com (Corey Bryant) Date: Mon, 17 Aug 2020 11:59:20 -0400 Subject: cross-team action items Message-ID: These were the items from the cross team today that need action from our team: Bootstack: What’s the plan for Ceph-osd w/ openstack-on-lxd for Bluestore? Can we give an answer as to whether we plan to deprecate or solve this, one way or another? Bootstack: Gnocchi thread - can we respond to thread to make official decision so they can plan accordingly? ie. if upstream is not supported, we likely won't support, so if we can clarify then bootstack can remove from standard deploys. SEG: LP#1891096 - configuration database support for mimic+: Any value that exists in the configuration database will no longer receive updates from ceph.conf. Will be subscribed to field-high. Thanks, Corey -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at nemebean.com Mon Aug 17 16:13:15 2020 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 17 Aug 2020 11:13:15 -0500 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: <303EB7E0-C584-42A7-BF7A-D1EAABDD1AD7@binero.com> References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> <303EB7E0-C584-42A7-BF7A-D1EAABDD1AD7@binero.com> Message-ID: <95b6b4d8-c70e-7f78-2659-4d12d315b42d@nemebean.com> On 8/16/20 3:48 AM, Tobias Urdin wrote: > Hello, > > Kind of off topic but I’ve been starting doing some research to see if a > KubeMQ driver could be added to oslo.messaging You may want to take a look at https://docs.openstack.org/oslo.messaging/latest/contributor/supported-messaging-drivers.html We've had bad luck with adding new drivers to oslo.messaging in the past, so we've tried to come up with a policy that gives them the best possible chance of being successful. It does set a rather high bar for integration though. Also take a look at https://review.opendev.org/#/c/692784/ A lot of the discussion there may be relevant to another new driver. > > Best regards > >> On 16 Aug 2020, at 07:44, Fabian Zimmermann wrote: >> >>  >> Hi, >> >> Already looked in Oslo.messaging, but rabbitmq is the only stable >> driver :( >> >> Kafka is marked as experimental and (if the docs are correct) is only >> usable for notifications. >> >> Would love to switch to an alternate. >> >>  Fabian >> >> Satish Patel > >> schrieb am So., 16. Aug. 2020, 02:13: >> >> Hi Sean, >> >> Sounds good, but running rabbitmq for each service going to be little >> overhead also, how do you scale cluster (Yes we can use cellv2 but its >> not something everyone like to do because of complexity). If we thinks >> rabbitMQ is growing pain then why community not looking for >> alternative option (kafka) etc..? >> >> On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney > > wrote: >> > >> > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: >> > > Hi, >> > > >> > > i read somewhere that vexxhosts kubernetes openstack-Operator >> is running >> > > one rabbitmq Container per Service. Just the kubernetes self >> healing is >> > > used as "ha" for rabbitmq. >> > > >> > > That seems to match with my finding: run rabbitmq standalone >> and use an >> > > external system to restart rabbitmq if required. >> > thats the design that was orginally planned for kolla-kubernetes >> orrignally >> > >> > each service was to be deployed with its own rabbit mq server if >> it required one >> > and if it crashed it woudl just be recreated by k8s. it >> perfromace better then a cluster >> > and if you trust k8s or the external service enough to ensure it >> is recteated it >> > should be as effective a solution. you dont even need k8s to do >> that but it seams to be >> > a good fit if  your prepared to ocationally loose inflight rpcs. >> > if you not then you can configure rabbit to persite all message >> to disk and mont that on a shared >> > file system like nfs or cephfs so that when the rabbit instance >> is recreated the queue contency is >> > perserved. assuming you can take the perfromance hit of writing >> all messages to disk that is. >> > > >> > >  Fabian >> > > >> > > Satish Patel > > schrieb am Fr., 14. Aug. 2020, 16:59: >> > > >> > > > Fabian, >> > > > >> > > > what do you mean? >> > > > >> > > > > > I think vexxhost is running (1) with their >> openstack-operator - for >> > > > >> > > > reasons. 
>> > > > >> > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann >> > >> > > > wrote: >> > > > > >> > > > > Hello again, >> > > > > >> > > > > just a short update about the results of my tests. >> > > > > >> > > > > I currently see 2 ways of running openstack+rabbitmq >> > > > > >> > > > > 1. without durable-queues and without replication - just one >> > > > >> > > > rabbitmq-process which gets (somehow) restarted if it fails. >> > > > > 2. durable-queues and replication >> > > > > >> > > > > Any other combination of these settings leads to more or >> less issues with >> > > > > >> > > > > * broken / non working bindings >> > > > > * broken queues >> > > > > >> > > > > I think vexxhost is running (1) with their >> openstack-operator - for >> > > > >> > > > reasons. >> > > > > >> > > > > I added [kolla], because kolla-ansible is installing >> rabbitmq with >> > > > >> > > > replication but without durable-queues. >> > > > > >> > > > > May someone point me to the best way to document these >> findings to some >> > > > >> > > > official doc? >> > > > > I think a lot of installations out there will run into >> issues if - under >> > > > >> > > > load - a node fails. >> > > > > >> > > > >  Fabian >> > > > > >> > > > > >> > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < >> > > > >> > > > dev.faz at gmail.com >: >> > > > > > >> > > > > > Hi, >> > > > > > >> > > > > > just did some short tests today in our test-environment >> (without >> > > > >> > > > durable queues and without replication): >> > > > > > >> > > > > > * started a rally task to generate some load >> > > > > > * kill-9-ed rabbitmq on one node >> > > > > > * rally task immediately stopped and the cloud (mostly) >> stopped working >> > > > > > >> > > > > > after some debugging i found (again) exchanges which had >> bindings to >> > > > >> > > > queues, but these bindings didnt forward any msgs. >> > > > > > Wrote a small script to detect these broken bindings and >> will now check >> > > > >> > > > if this is "reproducible" >> > > > > > >> > > > > > then I will try "durable queues" and "durable queues >> with replication" >> > > > >> > > > to see if this helps. Even if I would expect >> > > > > > rabbitmq should be able to handle this without these >> "hidden broken >> > > > >> > > > bindings" >> > > > > > >> > > > > > This just FYI. >> > > > > > >> > > > > >  Fabian >> > >> From elfosardo at gmail.com Mon Aug 17 16:29:56 2020 From: elfosardo at gmail.com (Riccardo Pittau) Date: Mon, 17 Aug 2020 18:29:56 +0200 Subject: [ironic] next Victoria meetup Message-ID: Hello everyone! The time for the next Ironic virtual meetup is close! It will be an opportunity to review what has been done in the last months, exchange ideas and plan for the time before the upcoming victoria release, with an eye towards the future. We're aiming to have the virtual meetup the first week of September (Monday August 31 - Friday September 4) and split it in two days, with one two-hours slot per day. Please vote for your best time slots here: https://doodle.com/poll/pi4x3kuxamf4nnpu We're planning to leave the vote open at least for the entire week until Friday August 21, so to have enough time to announce the final slots and planning early next week. Thanks! A si biri Riccardo -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at nemebean.com Mon Aug 17 16:35:56 2020 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 17 Aug 2020 11:35:56 -0500 Subject: [oslo] Feature Freeze is approaching Message-ID: Hi Oslo contributors, Oslo observes a feature freeze that is earlier than other projects. This is to allow time for features in Oslo to be adopted in the services before their feature freeze. And it's coming up soon. Aug. 28th is the Oslo feature freeze date for this cycle. That leaves about two weeks for features to be merged in Oslo libraries. After that, any features to be merged will require a feature freeze exception, which can be requested on the list. If you have any questions about this feel free to contact me here or on IRC (bnemec in #openstack-oslo). Thanks! -Ben From melwittt at gmail.com Mon Aug 17 17:50:15 2020 From: melwittt at gmail.com (melanie witt) Date: Mon, 17 Aug 2020 10:50:15 -0700 Subject: [neutron][gate] verbose q-svc log files and e-r indexing Message-ID: Hi all, Recently we've noticed elastic search indexing is behind 115 hours [1] and we looked for abnormally large log files being generated in the gate. We found that the q-svc log is very large, one example being 71.6M [2]. There is a lot of Time-Cost profiling output in the log, like this: Aug 17 14:22:23.210076 ubuntu-bionic-ovh-bhs1-0019298855 neutron-server[5168]: DEBUG neutron_lib.utils.helpers [req-75719db1-4abf-4500-bb0a-6d24e82cd4fd req-d88e7052-7da9-4bc9-8b35-5730ae76dcad service neutron] Time-cost: call 48e628cc-8c3a-408d-a36f-b219524480e0 function apply_funcs start {{(pid=5554) wrapper /usr/local/lib/python3.6/dist-packages/neutron_lib/utils/helpers.py:218}} We saw that there was a recent-ish change to remove some of the profiling output [3] but it was only for the get_objects method. Looking at the total number of lines in the file vs the number of lines without apply_funcs Time-Cost output: $ wc -l screen-q-svc.txt 186387 screen-q-svc.txt $ grep -v "function apply_funcs" screen-q-svc.txt|wc -l 102593 Would it be possible to remove this profiling output from the gate log to give elastic search indexing a better chance at keeping up? Or is there something else I've missed that could be made less verbose in the logging? Thanks for your help. Cheers, -melanie [1] http://status.openstack.org/elastic-recheck [2] https://b6ba3b9af8fd7de57099-18aa39cea11f738aa67ebd6bc9fb5e4c.ssl.cf2.rackcdn.com/744958/4/check/tempest-integrated-compute/4421bf9/controller/logs/screen-q-svc.txt [3] https://review.opendev.org/741540 From mnaser at vexxhost.com Mon Aug 17 18:36:34 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 17 Aug 2020 14:36:34 -0400 Subject: [tc] weekly update Message-ID: Hi everyone, Here’s an update for what happened in the OpenStack TC this week. You can get more information by checking for changes in openstack/governance repository. We've also included a few references to some important mailing list threads that you should check out. 
# Patches ## Open Reviews - Move towards single office hour https://review.opendev.org/745200 - Drop all exceptions for legacy validation https://review.opendev.org/745403 - Create starter-kit:kubernetes-in-virt tag https://review.opendev.org/736369 - Fix names inside check-review-status https://review.opendev.org/745913 - Resolution to define distributed leadership for projects https://review.opendev.org/744995 - Move towards dual office hours in diff TZ https://review.opendev.org/746167 - [draft] Add assert:supports-standalone https://review.opendev.org/722399 ## Project Updates - Add python-dracclient to be owned by Hardware Vendor SIG https://review.opendev.org/745564 ## General Changes - Add legacy repository validation https://review.opendev.org/737559 - Clean up expired i18n SIG extra-ATCs https://review.opendev.org/745565 - Sort SIG names in repo owner list https://review.opendev.org/745563 - Drop neutron-vpnaas from legacy projects https://review.opendev.org/745401 - Pierre Riteau as CloudKitty PTL for Victoria https://review.opendev.org/745653 - Declare supported runtimes for Wallaby release https://review.opendev.org/743847 ## Abandoned Changes - Move towards dual office hours https://review.opendev.org/745201 # Email Threads - Zuul Native Jobs Goal Update #2: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016561.html - Masakari Project Aliveness: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016520.html - vPTG October 2020 Signup: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016497.html - OpenStack Client vs python-*clients: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016409.html # Other Reminders - Virtual Summit Community voting closes Monday, August 17 at 11:59pm Pacific Time Thanks for reading! Mohammed & Kendall -- Mohammed Naser VEXXHOST, Inc. From kennelson11 at gmail.com Mon Aug 17 20:46:32 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 17 Aug 2020 13:46:32 -0700 Subject: [all][PTL][TC] Forum Brainstorming Message-ID: Hello Everyone! The Virtual Forum is approaching. We would love 1 volunteer from the community for the Forum Selection Committee. Ideally, the volunteer would already be serving in some capacity in a governance role for your project. In addition to calling for volunteers for the Forum selection committee, this email kicks off the brainstorming period before the CFP tool opens for formal Forum submissions. The categories for brainstorming etherpads have already been setup here[1]. Please add your etherpads and ideas there! The CFP tool will open on August 31st and will close September 14th. For information on the upcoming virtual Summit[2]. For more information on the Forum[3]. Please reach out to jimmy at openstack.org or knelson at openstack.org if you're interested. Volunteers should respond on or before August 31, 2020. Thanks! Kendall (diablo_rojo) [1] Virtual Forum 2020 Wiki: https://wiki.openstack.org/wiki/Forum/Virtual2020 [2] Virtual Open Infra Summit Site: https://www.openstack.org/summit/2020 [3] General Forum Wiki: https://wiki.openstack.org/wiki/Forum -------------- next part -------------- An HTML attachment was scrubbed... URL: From dbengt at redhat.com Mon Aug 17 14:19:29 2020 From: dbengt at redhat.com (Daniel Bengtsson) Date: Mon, 17 Aug 2020 16:19:29 +0200 Subject: Can't fetch from opendev. 
In-Reply-To: References: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> Message-ID: <90a2f1fd-b450-fcc0-ccf2-d5ed1e0b7533@redhat.com> On 8/17/20 4:17 PM, Mohammed Naser wrote: > I've reported this to the OpenDev team at #opendev on IRC, I think one > of the Gitea backends is likely unhappy. Thanks a lot for your answer and the report. From cjeanner at redhat.com Tue Aug 18 07:29:20 2020 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Tue, 18 Aug 2020 09:29:20 +0200 Subject: [tripleo] Moving tripleo-ansible-inventory script to tripleo-common? Message-ID: <0e91db84-723b-d14b-d654-fdc74a0a42eb@redhat.com> Hello there! I'm wondering if we could move the "tripleo-ansible-inventory" script from the tripleo-validations repo to tripleo-common. The main motivation here is to make things consistent: - that script calls content from tripleo-common, nothing from tripleo-validations. - that script isn't only for the validations, so it makes more sense to install it via tripleo-common - in fact, we should probably push that inventory thing as an `openstack tripleo' sub-command, but that's another story So, is there any opposition to this proposal? Cheers, C. -- Cédric Jeanneret (He/Him/His) Sr. Software Engineer - OpenStack Platform Deployment Framework TC Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From ramishra at redhat.com Tue Aug 18 07:53:40 2020 From: ramishra at redhat.com (Rabi Mishra) Date: Tue, 18 Aug 2020 13:23:40 +0530 Subject: [tripleo] Moving tripleo-ansible-inventory script to tripleo-common? In-Reply-To: <0e91db84-723b-d14b-d654-fdc74a0a42eb@redhat.com> References: <0e91db84-723b-d14b-d654-fdc74a0a42eb@redhat.com> Message-ID: On Tue, Aug 18, 2020 at 1:07 PM Cédric Jeanneret wrote: > Hello there! > > I'm wondering if we could move the "tripleo-ansible-inventory" script > from the tripleo-validations repo to tripleo-common. > TBH, I don't know the history, but it would be better if we remove all scripts from tripleo-common and use it just as a utility library (now that Mistral is gone). Most of the existing scripts probably have an existing command in tripleoclient. We can implement missing ones including "tripleo-ansible-inventory" in python-tripleoclient. > > The main motivation here is to make things consistent: > - that script calls content from tripleo-common, nothing from > tripleo-validations. > - that script isn't only for the validations, so it makes more sense to > install it via tripleo-common > - in fact, we should probably push that inventory thing as an `openstack > tripleo' sub-command, but that's another story > > So, is there any opposition to this proposal? > > Cheers, > > C. > > > -- > Cédric Jeanneret (He/Him/His) > Sr. Software Engineer - OpenStack Platform > Deployment Framework TC > Red Hat EMEA > https://www.redhat.com/ > > -- Regards, Rabi Mishra -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjeanner at redhat.com Tue Aug 18 08:03:02 2020 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Tue, 18 Aug 2020 10:03:02 +0200 Subject: [tripleo] Moving tripleo-ansible-inventory script to tripleo-common? In-Reply-To: References: <0e91db84-723b-d14b-d654-fdc74a0a42eb@redhat.com> Message-ID: On 8/18/20 9:53 AM, Rabi Mishra wrote: > > > On Tue, Aug 18, 2020 at 1:07 PM Cédric Jeanneret > wrote: > > Hello there! 
> > I'm wondering if we could move the "tripleo-ansible-inventory" script > from the tripleo-validations repo to tripleo-common. > > > TBH, I don't know the history, but it would be better if we remove all > scripts from tripleo-common and use it just as a utility library (now > that Mistral is gone). Most of the existing scripts probably have an > existing command in tripleoclient. We can implement  missing ones > including "tripleo-ansible-inventory" in python-tripleoclient. would probably be better to implement it directly in tripleoclient imho. In any cases, it has nothing to do in tripleo-validations... I can't connect to launchpad, they are having some auth issue, I can't create an RFE there :(. > > > The main motivation here is to make things consistent: > - that script calls content from tripleo-common, nothing from > tripleo-validations. > - that script isn't only for the validations, so it makes more sense to > install it via tripleo-common > - in fact, we should probably push that inventory thing as an `openstack > tripleo' sub-command, but that's another story > > So, is there any opposition to this proposal? > > Cheers, > > C. > > > -- > Cédric Jeanneret (He/Him/His) > Sr. Software Engineer - OpenStack Platform > Deployment Framework TC > Red Hat EMEA > https://www.redhat.com/ > > > > -- > Regards, > Rabi Mishra > -- Cédric Jeanneret (He/Him/His) Sr. Software Engineer - OpenStack Platform Deployment Framework TC Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From jaosorior at redhat.com Tue Aug 18 08:20:47 2020 From: jaosorior at redhat.com (Juan Osorio Robles) Date: Tue, 18 Aug 2020 11:20:47 +0300 Subject: [tripleo] Moving tripleo-ansible-inventory script to tripleo-common? In-Reply-To: References: <0e91db84-723b-d14b-d654-fdc74a0a42eb@redhat.com> Message-ID: IIRC it was in tripleo-validations since that was the first and only user of the script at the time. When it got into use by other flows it just never got moved. On Tue, 18 Aug 2020 at 11:11, Cédric Jeanneret wrote: > > > On 8/18/20 9:53 AM, Rabi Mishra wrote: > > > > > > On Tue, Aug 18, 2020 at 1:07 PM Cédric Jeanneret > > wrote: > > > > Hello there! > > > > I'm wondering if we could move the "tripleo-ansible-inventory" script > > from the tripleo-validations repo to tripleo-common. > > > > > > TBH, I don't know the history, but it would be better if we remove all > > scripts from tripleo-common and use it just as a utility library (now > > that Mistral is gone). Most of the existing scripts probably have an > > existing command in tripleoclient. We can implement missing ones > > including "tripleo-ansible-inventory" in python-tripleoclient. > > would probably be better to implement it directly in tripleoclient imho. > In any cases, it has nothing to do in tripleo-validations... > Moving it to tripleoclient makes sense IMO. > I can't connect to launchpad, they are having some auth issue, I can't > create an RFE there :(. > > > > > > > The main motivation here is to make things consistent: > > - that script calls content from tripleo-common, nothing from > > tripleo-validations. 
> > - that script isn't only for the validations, so it makes more sense > to > > install it via tripleo-common > > - in fact, we should probably push that inventory thing as an > `openstack > > tripleo' sub-command, but that's another story > > > > So, is there any opposition to this proposal? > > > > Cheers, > > > > C. > > > > > > -- > > Cédric Jeanneret (He/Him/His) > > Sr. Software Engineer - OpenStack Platform > > Deployment Framework TC > > Red Hat EMEA > > https://www.redhat.com/ > > > > > > > > -- > > Regards, > > Rabi Mishra > > > > -- > Cédric Jeanneret (He/Him/His) > Sr. Software Engineer - OpenStack Platform > Deployment Framework TC > Red Hat EMEA > https://www.redhat.com/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbultel at redhat.com Tue Aug 18 08:26:56 2020 From: mbultel at redhat.com (Mathieu Bultel) Date: Tue, 18 Aug 2020 10:26:56 +0200 Subject: [tripleo] Moving tripleo-ansible-inventory script to tripleo-common? In-Reply-To: References: <0e91db84-723b-d14b-d654-fdc74a0a42eb@redhat.com> Message-ID: On Tue, Aug 18, 2020 at 10:08 AM Cédric Jeanneret wrote: > > > On 8/18/20 9:53 AM, Rabi Mishra wrote: > > > > > > On Tue, Aug 18, 2020 at 1:07 PM Cédric Jeanneret > > wrote: > > > > Hello there! > > > > I'm wondering if we could move the "tripleo-ansible-inventory" script > > from the tripleo-validations repo to tripleo-common. > > > > > > TBH, I don't know the history, but it would be better if we remove all > > scripts from tripleo-common and use it just as a utility library (now > > that Mistral is gone). Most of the existing scripts probably have an > > existing command in tripleoclient. We can implement missing ones > > including "tripleo-ansible-inventory" in python-tripleoclient. > > would probably be better to implement it directly in tripleoclient imho. > In any cases, it has nothing to do in tripleo-validations... > +1 with that, it will probably be better to move everything in tripleoclient. > > I can't connect to launchpad, they are having some auth issue, I can't > create an RFE there :(. > > > > > > > The main motivation here is to make things consistent: > > - that script calls content from tripleo-common, nothing from > > tripleo-validations. > > - that script isn't only for the validations, so it makes more sense > to > > install it via tripleo-common > > - in fact, we should probably push that inventory thing as an > `openstack > > tripleo' sub-command, but that's another story > > > > So, is there any opposition to this proposal? > > > > Cheers, > > > > C. > > > > > > -- > > Cédric Jeanneret (He/Him/His) > > Sr. Software Engineer - OpenStack Platform > > Deployment Framework TC > > Red Hat EMEA > > https://www.redhat.com/ > > > > > > > > -- > > Regards, > > Rabi Mishra > > > > -- > Cédric Jeanneret (He/Him/His) > Sr. Software Engineer - OpenStack Platform > Deployment Framework TC > Red Hat EMEA > https://www.redhat.com/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skaplons at redhat.com Tue Aug 18 10:28:20 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 18 Aug 2020 12:28:20 +0200 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: Message-ID: <20200818102820.l2vxfqhpmetw6gft@skaplons-mac> Hi, On Mon, Aug 17, 2020 at 11:39:44AM -0400, Assaf Muller wrote: > On Mon, Aug 17, 2020 at 10:19 AM Mohammed Naser wrote: > > > > Hi all: > > > > What Fabian is describing is exactly the problem we're having, there > > are _many_ routers in these environments so we'd be looking at N > > requests which can get out of control quickly > > I think it's a clear use case to implement a new API endpoint that > returns HA state per agent for *all* routers in a single call. Should > be easy to implement. I agree with that. Can You maybe propose official RFE for that and describe there Your use case - see [1] for details. > > > > > Thanks > > Mohammed > > > > On Mon, Aug 17, 2020 at 10:05 AM Fabian Zimmermann wrote: > > > > > > Hi, > > > > > > yes for 1 router, but doing this in a loop for hundreds is not so performant ;) > > > > > > Fabian > > > > > > Am Mo., 17. Aug. 2020 um 16:04 Uhr schrieb Assaf Muller : > > > > > > > > On Mon, Aug 17, 2020 at 9:59 AM Fabian Zimmermann wrote: > > > > > > > > > > Hi, > > > > > > > > > > I can just tell you that we are doing a similar check for dhcp-agent, but here we just execute a suitable SQL-statement to detect more than 1 agent / AZ. > > > > > > > > > > Doing the same for L3 shouldn't be that hard, but I dont know if this is what you are looking for? > > > > > > > > There's already an API for this: > > > > neutron l3-agent-list-hosting-router > > > > > > > > It will show you the HA state per L3 agent for the given router. > > > > > > > > > > > > > > Fabian > > > > > > > > > > > > > > > Am Mo., 17. Aug. 2020 um 14:11 Uhr schrieb Mohammed Naser : > > > > >> > > > > >> Hi all, > > > > >> > > > > >> Over the past few days, we were troubleshooting an issue that ended up > > > > >> having a root cause where keepalived has somehow ended up active in > > > > >> two different L3 agents. We've yet to find the root cause of how this > > > > >> happened but removing it and adding it resolved the issue for us. > > > > >> > > > > >> As we work on improving our monitoring, we wanted to implement > > > > >> something that gets us the info of # of active routers to check if > > > > >> there's a router that has >1 active L3 agent but it's hard because > > > > >> hitting the /l3-agents endpoint on _every_ single router hurts a lot > > > > >> on performance. > > > > >> > > > > >> Is there something else that we can watch which might be more > > > > >> productive? FYI -- this all goes in the open and will end up inside > > > > >> the openstack-exporter: > > > > >> https://github.com/openstack-exporter/openstack-exporter and the Helm > > > > >> charts will end up with the alerts: > > > > >> https://github.com/openstack-exporter/helm-charts > > > > >> > > > > >> Thanks! > > > > >> Mohammed > > > > >> > > > > >> -- > > > > >> Mohammed Naser > > > > >> VEXXHOST, Inc. > > > > >> > > > > > > > > > > > > -- > > Mohammed Naser > > VEXXHOST, Inc. 
> > > > [1] https://docs.openstack.org/neutron/latest/contributor/policies/blueprints.html#neutron-request-for-feature-enhancements -- Slawek Kaplonski Principal software engineer Red Hat From skaplons at redhat.com Tue Aug 18 10:33:23 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 18 Aug 2020 12:33:23 +0200 Subject: [neutron][gate] verbose q-svc log files and e-r indexing In-Reply-To: References: Message-ID: <20200818103323.wq5upyjn4nzsqhx7@skaplons-mac> Hi, I opened LP for that [1] and I will propose some fix for it ASAP. On Mon, Aug 17, 2020 at 10:50:15AM -0700, melanie witt wrote: > Hi all, > > Recently we've noticed elastic search indexing is behind 115 hours [1] and we looked for abnormally large log files being generated in the gate. > > We found that the q-svc log is very large, one example being 71.6M [2]. There is a lot of Time-Cost profiling output in the log, like this: > > Aug 17 14:22:23.210076 ubuntu-bionic-ovh-bhs1-0019298855 neutron-server[5168]: DEBUG neutron_lib.utils.helpers [req-75719db1-4abf-4500-bb0a-6d24e82cd4fd req-d88e7052-7da9-4bc9-8b35-5730ae76dcad service neutron] Time-cost: call 48e628cc-8c3a-408d-a36f-b219524480e0 function apply_funcs start {{(pid=5554) wrapper /usr/local/lib/python3.6/dist-packages/neutron_lib/utils/helpers.py:218}} > > We saw that there was a recent-ish change to remove some of the profiling output [3] but it was only for the get_objects method. > > Looking at the total number of lines in the file vs the number of lines without apply_funcs Time-Cost output: > > $ wc -l screen-q-svc.txt > 186387 screen-q-svc.txt > > $ grep -v "function apply_funcs" screen-q-svc.txt|wc -l > 102593 > > Would it be possible to remove this profiling output from the gate log to give elastic search indexing a better chance at keeping up? Or is there something else I've missed that could be made less verbose in the logging? > > Thanks for your help. > > Cheers, > -melanie > > [1] http://status.openstack.org/elastic-recheck > [2] https://b6ba3b9af8fd7de57099-18aa39cea11f738aa67ebd6bc9fb5e4c.ssl.cf2.rackcdn.com/744958/4/check/tempest-integrated-compute/4421bf9/controller/logs/screen-q-svc.txt > [3] https://review.opendev.org/741540 > [1] https://bugs.launchpad.net/neutron/+bug/1892017 -- Slawek Kaplonski Principal software engineer Red Hat From thierry at openstack.org Tue Aug 18 10:44:43 2020 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 18 Aug 2020 12:44:43 +0200 Subject: [simplification] Making ask.openstack.org read-only Message-ID: Hi everyone, This has been discussed several times on this mailing list in the past, but we never got to actually pull the plug. Ask.openstack.org was launched in 2013. The reason for hosting our own setup was to be able to support multiple languages, while StackOverflow rejected our proposal to have our own openstack-branded StackExchange site. The Chinese ask.o.o side never really took off. The English side also never really worked perfectly (like email alerts are hopelessly broken), but we figured it would get better with time if a big community formed around it. Fast-forward to 2020 and the instance is lacking volunteers to help run it, while the code (and our customization of it) has become more complicated to maintain. It regularly fails one way or another, and questions there often go unanswered, making us look bad. Of the top 30 users, most have abandoned the platform since 2017, leaving only Bernd Bausch actively engaging and helping moderate questions lately. 
We have called for volunteers several times, but the offers for help never really materialized. At the same time, people are asking OpenStack questions on StackOverflow, and sometimes getting answers there[1]. The fragmentation of the "questions" space is not helping users getting good answers. I think it's time to pull the plug, make ask.openstack.org read-only (so that links to old answers are not lost) and redirect users to the mailing-list and the "OpenStack" tag on StackOverflow. I picked StackOverflow since it seems to have the most openstack questions (2,574 on SO, 76 on SuperUser and 430 on ServerFault). We discussed that option several times, but I now proposed a change to actually make it happen: https://review.opendev.org/#/c/746497/ It's always a difficult decision to make to kill a resource, but I feel like in this case, consolidation and simplification would help. Thoughts, comments? [1] https://stackoverflow.com/questions/tagged/openstack -- Thierry From arnaud.morin at gmail.com Tue Aug 18 12:07:08 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Tue, 18 Aug 2020 12:07:08 +0000 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> Message-ID: <20200818120708.GV31915@sync> Hey all, About the vexxhost strategy to use only one rabbit server and manage HA through rabbit. Do you plan to do the same for MariaDB/MySQL? -- Arnaud Morin On 14.08.20 - 18:45, Fabian Zimmermann wrote: > Hi, > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > one rabbitmq Container per Service. Just the kubernetes self healing is > used as "ha" for rabbitmq. > > That seems to match with my finding: run rabbitmq standalone and use an > external system to restart rabbitmq if required. > > Fabian > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > Fabian, > > > > what do you mean? > > > > >> I think vexxhost is running (1) with their openstack-operator - for > > reasons. > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > wrote: > > > > > > Hello again, > > > > > > just a short update about the results of my tests. > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > 1. without durable-queues and without replication - just one > > rabbitmq-process which gets (somehow) restarted if it fails. > > > 2. durable-queues and replication > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > * broken / non working bindings > > > * broken queues > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > reasons. > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > replication but without durable-queues. > > > > > > May someone point me to the best way to document these findings to some > > official doc? > > > I think a lot of installations out there will run into issues if - under > > load - a node fails. > > > > > > Fabian > > > > > > > > > Am Do., 13. Aug. 
2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > dev.faz at gmail.com>: > > >> > > >> Hi, > > >> > > >> just did some short tests today in our test-environment (without > > durable queues and without replication): > > >> > > >> * started a rally task to generate some load > > >> * kill-9-ed rabbitmq on one node > > >> * rally task immediately stopped and the cloud (mostly) stopped working > > >> > > >> after some debugging i found (again) exchanges which had bindings to > > queues, but these bindings didnt forward any msgs. > > >> Wrote a small script to detect these broken bindings and will now check > > if this is "reproducible" > > >> > > >> then I will try "durable queues" and "durable queues with replication" > > to see if this helps. Even if I would expect > > >> rabbitmq should be able to handle this without these "hidden broken > > bindings" > > >> > > >> This just FYI. > > >> > > >> Fabian > > From jonas.schaefer at cloudandheat.com Tue Aug 18 12:08:42 2020 From: jonas.schaefer at cloudandheat.com (Jonas =?ISO-8859-1?Q?Sch=E4fer?=) Date: Tue, 18 Aug 2020 14:08:42 +0200 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: Message-ID: <6613245.ccrTHCtBl7@antares> Hi Mohammed and all, On Montag, 17. August 2020 14:01:55 CEST Mohammed Naser wrote: > Over the past few days, we were troubleshooting an issue that ended up > having a root cause where keepalived has somehow ended up active in > two different L3 agents. We've yet to find the root cause of how this > happened but removing it and adding it resolved the issue for us. We’ve also seen that behaviour occasionally. The root cause is also unclear for us (so we would’ve love to hear about that). We have anecdotal evidence that a rabbitmq failure was involved, although that makes no sense to me personally. Other causes may be incorrectly cleaned-up namespaces (for example, when you kill or hard-restart the l3 agent, the namespaces will stay around, possibly with the IP address assigned; the keepalived on the other l3 agents will not see the VRRP advertisments anymore and will ALSO assign the IP address. This will also be rectified by a restart always and may require manual namespace cleanup with a tool, a node reboot or an agent disable/enable cycle.). > As we work on improving our monitoring, we wanted to implement > something that gets us the info of # of active routers to check if > there's a router that has >1 active L3 agent but it's hard because > hitting the /l3-agents endpoint on _every_ single router hurts a lot > on performance. > > Is there something else that we can watch which might be more > productive? FYI -- this all goes in the open and will end up inside > the openstack-exporter: > https://github.com/openstack-exporter/openstack-exporter and the Helm > charts will end up with the alerts: > https://github.com/openstack-exporter/helm-charts While I don’t think it fits in your openstack-exporter design, we are currently using the attached script (which we also hereby publish under the terms of the Apache 2.0 license [1]). (Sorry, I lack the time to cleanly publish it somewhere right now.) It checks the state files maintained by the L3 agent conglomerate and exports metrics about the master-ness of the routers as prometheus metrics. Note that this is slightly dangerous since the router IDs are high-cardinality and using that as a label value in Prometheus is discouraged; you may not want to do this in a public cloud setting. 
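Since attachments do not always survive the list archive, here is a minimal sketch of the idea (an illustration only, not the attached script, which handles more edge cases). It assumes the default neutron state path, /var/lib/neutron/ha_confs, where the L3 agent keeps a <router_id>/state file containing "master" or "backup", and it uses the prometheus_client library:

import glob
import os
import time

from prometheus_client import Gauge, start_http_server

STATE_GLOB = "/var/lib/neutron/ha_confs/*/state"

# 1 if keepalived on this node reports "master" for the router, else 0.
# The router ID label is the high-cardinality part mentioned above.
ROUTER_MASTER = Gauge(
    "os_l3_router_master",
    "HA router master state as seen by the local L3 agent",
    ["router_id"],
)

def scrape():
    for state_file in glob.glob(STATE_GLOB):
        router_id = os.path.basename(os.path.dirname(state_file))
        try:
            with open(state_file) as f:
                state = f.read().strip()
        except OSError:
            continue  # router being created or torn down, skip this round
        ROUTER_MASTER.labels(router_id=router_id).set(1 if state == "master" else 0)

if __name__ == "__main__":
    start_http_server(9127)  # arbitrary port for this sketch
    while True:
        scrape()
        time.sleep(30)

You run something like this next to each L3 agent and let Prometheus scrape it; the alert rule then only has to check that the sum of the gauge across all agents is exactly 1 for every router.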
Either way: This allows us to alert on routers where there is not exactly one master state. Downside is that this requires the thing to run locally on the l3 agent nodes. Upside is that it is very efficient, and will also show the master state in some cases where the router was not cleaned up properly (e.g. because the l3 agent and its keepaliveds were killed). kind regards, Jonas [1]: http://www.apache.org/licenses/LICENSE-2.0 -- Jonas Schäfer DevOps Engineer Cloud&Heat Technologies GmbH Königsbrücker Straße 96 | 01099 Dresden +49 351 479 367 37 jonas.schaefer at cloudandheat.com | www.cloudandheat.com New Service: Managed Kubernetes designed for AI & ML https://managed-kubernetes.cloudandheat.com/ Commercial Register: District Court Dresden Register Number: HRB 30549 VAT ID No.: DE281093504 Managing Director: Nicolas Röhrs Authorized signatory: Dr. Marius Feldmann Authorized signatory: Kristina Rübenkamp -------------- next part -------------- A non-text attachment was scrubbed... Name: os_l3_router_exporter.py Type: text/x-python3 Size: 1780 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part. URL: From amy at demarco.com Tue Aug 18 13:14:28 2020 From: amy at demarco.com (Amy Marrich) Date: Tue, 18 Aug 2020 08:14:28 -0500 Subject: GHC Mentors Needed for OpenStack Message-ID: Grace Hopper Conference is going virtual this year and once again OpenStack is participating as one of the Open Source Day projects. We are hoping to do some peer programming (aka mentees shadowing folks while they work through a patch) as part of the day. Mentors receive a full conference pass and AnitaB.org membership. Please check out the requirements {0} and apply (1) by August 19, 2020. We are also figuring a way for more folks to be able to mentor, so if you'd like to help but aren't interested in the conference please reach out to me or Victoria(vkmc) by email or on IRC. Thanks and apologies for the short deadline though I can probably get let additions in:) Amy (spotz) 0- Grace Hopper mentorship requirements: https://ghc.anitab.org/get-involved/volunteer/committee-members-and-scholarship-reviewers-2 1- Grace Hopper mentorship application: https://ghc.anitab.org/get-involved/volunteer/committee-members-and-scholarship-reviewers-2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jasowang at redhat.com Tue Aug 18 03:24:30 2020 From: jasowang at redhat.com (Jason Wang) Date: Tue, 18 Aug 2020 11:24:30 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200814051601.GD15344@joy-OptiPlex-7040> References: <20200804183503.39f56516.cohuck@redhat.com> <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> Message-ID: On 2020/8/14 下午1:16, Yan Zhao wrote: > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: >> On 2020/8/10 下午3:46, Yan Zhao wrote: >>>> driver is it handled by? 
>>> It looks that the devlink is for network device specific, and in >>> devlink.h, it says >>> include/uapi/linux/devlink.h - Network physical device Netlink >>> interface, >> >> Actually not, I think there used to have some discussion last year and the >> conclusion is to remove this comment. >> >> It supports IB and probably vDPA in the future. >> > hmm... sorry, I didn't find the referred discussion. only below discussion > regarding to why to add devlink. > > https://www.mail-archive.com/netdev at vger.kernel.org/msg95801.html > >This doesn't seem to be too much related to networking? Why can't something > >like this be in sysfs? > > It is related to networking quite bit. There has been couple of > iteration of this, including sysfs and configfs implementations. There > has been a consensus reached that this should be done by netlink. I > believe netlink is really the best for this purpose. Sysfs is not a good > idea See the discussion here: https://patchwork.ozlabs.org/project/netdev/patch/20191115223355.1277139-1-jeffrey.t.kirsher at intel.com/ > > https://www.mail-archive.com/netdev at vger.kernel.org/msg96102.html > >there is already a way to change eth/ib via > >echo 'eth' > /sys/bus/pci/drivers/mlx4_core/0000:02:00.0/mlx4_port1 > > > >sounds like this is another way to achieve the same? > > It is. However the current way is driver-specific, not correct. > For mlx5, we need the same, it cannot be done in this way. Do devlink is > the correct way to go. > > https://lwn.net/Articles/674867/ > There a is need for some userspace API that would allow to expose things > that are not directly related to any device class like net_device of > ib_device, but rather chip-wide/switch-ASIC-wide stuff. > > Use cases: > 1) get/set of port type (Ethernet/InfiniBand) > 2) monitoring of hardware messages to and from chip > 3) setting up port splitters - split port into multiple ones and squash again, > enables usage of splitter cable > 4) setting up shared buffers - shared among multiple ports within one chip > > > > we actually can also retrieve the same information through sysfs, .e.g > > |- [path to device] > |--- migration > | |--- self > | | |---device_api > | | |---mdev_type > | | |---software_version > | | |---device_id > | | |---aggregator > | |--- compatible > | | |---device_api > | | |---mdev_type > | | |---software_version > | | |---device_id > | | |---aggregator > Yes but: - You need one file per attribute (one syscall for one attribute) - Attribute is coupled with kobject All of above seems unnecessary. Another point, as we discussed in another thread, it's really hard to make sure the above API work for all types of devices and frameworks. So having a vendor specific API looks much better. > >>> I feel like it's not very appropriate for a GPU driver to use >>> this interface. Is that right? >> >> I think not though most of the users are switch or ethernet devices. It >> doesn't prevent you from inventing new abstractions. > so need to patch devlink core and the userspace devlink tool? > e.g. devlink migration It quite flexible, you can extend devlink, invent your own or let mgmt to establish devlink directly. > >> Note that devlink is based on netlink, netlink has been widely used by >> various subsystems other than networking. > the advantage of netlink I see is that it can monitor device status and > notify upper layer that migration database needs to get updated. I may miss something, but why this is needed? 
From device point of view, the following capability should be sufficient to support live migration: - set/get device state - report dirty page tracking - set/get capability > But not sure whether openstack would like to use this capability. > As Sean said, it's heavy for openstack. it's heavy for vendor driver > as well :) Well, it depends several factors. Just counting LOCs, sysfs based attributes is not lightweight. Thanks > > And devlink monitor now listens the notification and dumps the state > changes. If we want to use it, need to let it forward the notification > and dumped info to openstack, right? > > Thanks > Yan > From antonios.dimtsoudis at cloud.ionos.com Tue Aug 18 08:24:36 2020 From: antonios.dimtsoudis at cloud.ionos.com (Antonios Dimtsoudis) Date: Tue, 18 Aug 2020 10:24:36 +0200 Subject: [monasca] Setup Monasca from scratch Message-ID: <5e457dae-dc7c-3693-dc34-e622c2cd40f8@cloud.ionos.com> Hi all, i am trying to set up Monasca from scratch. Is there a good introduction / point to start of you would recommend? Thanks in advance, Antonios. From berrange at redhat.com Tue Aug 18 08:55:27 2020 From: berrange at redhat.com (Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?=) Date: Tue, 18 Aug 2020 09:55:27 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> Message-ID: <20200818085527.GB20215@redhat.com> On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > driver is it handled by? > > > > It looks that the devlink is for network device specific, and in > > > > devlink.h, it says > > > > include/uapi/linux/devlink.h - Network physical device Netlink > > > > interface, > > > > > > Actually not, I think there used to have some discussion last year and the > > > conclusion is to remove this comment. > > > > > > It supports IB and probably vDPA in the future. > > > > > hmm... sorry, I didn't find the referred discussion. only below discussion > > regarding to why to add devlink. > > > > https://www.mail-archive.com/netdev at vger.kernel.org/msg95801.html > > >This doesn't seem to be too much related to networking? Why can't something > > >like this be in sysfs? > > > > It is related to networking quite bit. There has been couple of > > iteration of this, including sysfs and configfs implementations. There > > has been a consensus reached that this should be done by netlink. I > > believe netlink is really the best for this purpose. Sysfs is not a good > > idea > > > See the discussion here: > > https://patchwork.ozlabs.org/project/netdev/patch/20191115223355.1277139-1-jeffrey.t.kirsher at intel.com/ > > > > > > https://www.mail-archive.com/netdev at vger.kernel.org/msg96102.html > > >there is already a way to change eth/ib via > > >echo 'eth' > /sys/bus/pci/drivers/mlx4_core/0000:02:00.0/mlx4_port1 > > > > > >sounds like this is another way to achieve the same? > > > > It is. However the current way is driver-specific, not correct. > > For mlx5, we need the same, it cannot be done in this way. Do devlink is > > the correct way to go. 
> > > > https://lwn.net/Articles/674867/ > > There a is need for some userspace API that would allow to expose things > > that are not directly related to any device class like net_device of > > ib_device, but rather chip-wide/switch-ASIC-wide stuff. > > > > Use cases: > > 1) get/set of port type (Ethernet/InfiniBand) > > 2) monitoring of hardware messages to and from chip > > 3) setting up port splitters - split port into multiple ones and squash again, > > enables usage of splitter cable > > 4) setting up shared buffers - shared among multiple ports within one chip > > > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > |- [path to device] > > |--- migration > > | |--- self > > | | |---device_api > > | | |---mdev_type > > | | |---software_version > > | | |---device_id > > | | |---aggregator > > | |--- compatible > > | | |---device_api > > | | |---mdev_type > > | | |---software_version > > | | |---device_id > > | | |---aggregator > > > > Yes but: > > - You need one file per attribute (one syscall for one attribute) > - Attribute is coupled with kobject > > All of above seems unnecessary. > > Another point, as we discussed in another thread, it's really hard to make > sure the above API work for all types of devices and frameworks. So having a > vendor specific API looks much better. >From the POV of userspace mgmt apps doing device compat checking / migration, we certainly do NOT want to use different vendor specific APIs. We want to have an API that can be used / controlled in a standard manner across vendors. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| From jasowang at redhat.com Tue Aug 18 09:01:51 2020 From: jasowang at redhat.com (Jason Wang) Date: Tue, 18 Aug 2020 17:01:51 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818085527.GB20215@redhat.com> References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> Message-ID: <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> An HTML attachment was scrubbed... URL: From cohuck at redhat.com Tue Aug 18 09:06:17 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Tue, 18 Aug 2020 11:06:17 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818085527.GB20215@redhat.com> References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> Message-ID: <20200818110617.05def37c.cohuck@redhat.com> On Tue, 18 Aug 2020 09:55:27 +0100 Daniel P. Berrangé wrote: > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > Another point, as we discussed in another thread, it's really hard to make > > sure the above API work for all types of devices and frameworks. So having a > > vendor specific API looks much better. 
> > From the POV of userspace mgmt apps doing device compat checking / migration, > we certainly do NOT want to use different vendor specific APIs. We want to > have an API that can be used / controlled in a standard manner across vendors. As we certainly will need to have different things to check for different device types and vendor drivers, would it still be fine to have differing (say) attributes, as long as they are presented (and can be discovered) in a standardized way? (See e.g. what I came up with for vfio-ccw in a different branch of this thread.) E.g. version= .type_specific_value0= .type_specific_value1= .vendor_driver_specific_value0= with a type or vendor driver having some kind of get_supported_attributes method? From berrange at redhat.com Tue Aug 18 09:16:28 2020 From: berrange at redhat.com (Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?=) Date: Tue, 18 Aug 2020 10:16:28 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> References: <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> Message-ID: <20200818091628.GC20215@redhat.com> Your mail came through as HTML-only so all the quoting and attribution is mangled / lost now :-( On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > On 2020/8/10 下午3:46, Yan Zhao wrote: > we actually can also retrieve the same information through sysfs, .e.g > > |- [path to device] > |--- migration > | |--- self > | | |---device_api > | | |---mdev_type > | | |---software_version > | | |---device_id > | | |---aggregator > | |--- compatible > | | |---device_api > | | |---mdev_type > | | |---software_version > | | |---device_id > | | |---aggregator > > > Yes but: > > - You need one file per attribute (one syscall for one attribute) > - Attribute is coupled with kobject > > All of above seems unnecessary. > > Another point, as we discussed in another thread, it's really hard to make > sure the above API work for all types of devices and frameworks. So having a > vendor specific API looks much better. > > From the POV of userspace mgmt apps doing device compat checking / migration, > we certainly do NOT want to use different vendor specific APIs. We want to > have an API that can be used / controlled in a standard manner across vendors. > > Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a > long debate on sysfs vs devlink). So if we go with sysfs, at least two > APIs needs to be supported ... NB, I was not questioning devlink vs sysfs directly. If devlink is related to netlink, I can't say I'm enthusiastic as IMKE sysfs is easier to deal with. I don't know enough about devlink to have much of an opinion though. The key point was that I don't want the userspace APIs we need to deal with to be vendor specific. What I care about is that we have a *standard* userspace API for performing device compatibility checking / state migration, for use by QEMU/libvirt/ OpenStack, such that we can write code without countless vendor specific code paths. 
If there is vendor specific stuff on the side, that's fine as we can ignore that, but the core functionality for device compat / migration needs to be standardized. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| From berrange at redhat.com Tue Aug 18 09:24:33 2020 From: berrange at redhat.com (Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?=) Date: Tue, 18 Aug 2020 10:24:33 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818110617.05def37c.cohuck@redhat.com> References: <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <20200818110617.05def37c.cohuck@redhat.com> Message-ID: <20200818092433.GD20215@redhat.com> On Tue, Aug 18, 2020 at 11:06:17AM +0200, Cornelia Huck wrote: > On Tue, 18 Aug 2020 09:55:27 +0100 > Daniel P. Berrangé wrote: > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > Another point, as we discussed in another thread, it's really hard to make > > > sure the above API work for all types of devices and frameworks. So having a > > > vendor specific API looks much better. > > > > From the POV of userspace mgmt apps doing device compat checking / migration, > > we certainly do NOT want to use different vendor specific APIs. We want to > > have an API that can be used / controlled in a standard manner across vendors. > > As we certainly will need to have different things to check for > different device types and vendor drivers, would it still be fine to > have differing (say) attributes, as long as they are presented (and can > be discovered) in a standardized way? Yes, the control API and algorithm to deal with the problem needs to have standardization, but the data passed in/out of the APIs can vary. Essentially the key is that vendors should be able to create devices at the kernel, and those devices should "just work" with the existing generic userspace migration / compat checking code, without needing extra vendor specific logic to be added. Note, I'm not saying that the userspace decisions would be perfectly optimal based on generic code. They might be making a simplified decision that while functionally safe, is not the ideal solution. Adding vendor specific code might be able to optimize the userspace decisions, but that should be considered just optimization, not a core must have for any opertion. 
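As an illustration of how small that generic core can be: the following sketch assumes the per-device sysfs layout proposed earlier in this thread (migration/self and migration/compatible with device_api, mdev_type, software_version attributes) and a "same major, destination minor >= source minor" version rule. Both are still under discussion, so treat the paths, attribute names and matching rule purely as placeholders:

import os

ATTRS = ("device_api", "mdev_type", "software_version", "device_id")

def read_attrs(path):
    # Read whichever of the proposed attributes exist under 'path'.
    out = {}
    for name in ATTRS:
        attr = os.path.join(path, name)
        if os.path.exists(attr):
            with open(attr) as f:
                out[name] = f.read().strip()
    return out

def version_ok(src, dst):
    # Assumed rule: versions are "major.minor[.bugfix]", same major required,
    # destination minor must be >= source minor.
    s_major, s_minor = (int(x) for x in src.split(".")[:2])
    d_major, d_minor = (int(x) for x in dst.split(".")[:2])
    return s_major == d_major and d_minor >= s_minor

def devices_compatible(src_dev, dst_dev):
    # src_dev / dst_dev are sysfs paths of the source and destination devices.
    src = read_attrs(os.path.join(src_dev, "migration", "self"))
    dst = read_attrs(os.path.join(dst_dev, "migration", "compatible"))
    if src.get("device_api") != dst.get("device_api"):
        return False
    if src.get("mdev_type") != dst.get("mdev_type"):
        return False
    return version_ok(src.get("software_version", "0.0"),
                      dst.get("software_version", "0.0"))

Vendor specific optimizations can then be layered on top of such a check without the core logic having to know anything about them.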
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| From parav at nvidia.com Tue Aug 18 09:32:55 2020 From: parav at nvidia.com (Parav Pandit) Date: Tue, 18 Aug 2020 09:32:55 +0000 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> Message-ID: Hi Jason, From: Jason Wang Sent: Tuesday, August 18, 2020 2:32 PM On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: On 2020/8/14 下午1:16, Yan Zhao wrote: On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: On 2020/8/10 下午3:46, Yan Zhao wrote: driver is it handled by? It looks that the devlink is for network device specific, and in devlink.h, it says include/uapi/linux/devlink.h - Network physical device Netlink interface, Actually not, I think there used to have some discussion last year and the conclusion is to remove this comment. [...] > Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a long debate on sysfs vs devlink). So if we go with sysfs, at least two APIs needs to be supported ... We had internal discussion and proposal on this topic. I wanted Eli Cohen to be back from vacation on Wed 8/19, but since this is active discussion right now, I will share the thoughts anyway. Here are the initial round of thoughts and proposal. User requirements: --------------------------- 1. User might want to create one or more vdpa devices per PCI PF/VF/SF. 2. User might want to create one or more vdpa devices of type net/blk or other type. 3. User needs to look and dump at the health of the queues for debug purpose. 4. During vdpa net device creation time, user may have to provide a MAC address and/or VLAN. 5. User should be able to set/query some of the attributes for debug/compatibility check 6. When user wants to create vdpa device, it needs to know which device supports creation. 7. User should be able to see the queue statistics of doorbells, wqes etc regardless of class type To address above requirements, there is a need of vendor agnostic tool, so that user can create/config/delete vdpa device(s) regardless of the vendor. Hence, We should have a tool that lets user do it. Examples: ------------- (a) List parent devices which supports creating vdpa devices. It also shows which class types supported by this parent device. In below command two parent devices support vdpa device creation. First is PCI VF whose bdf is 03.00:5. Second is PCI SF whose name is mlx5_sf.1 $ vdpa list pd pci/0000:03.00:5 class_supports net vdpa virtbus/mlx5_sf.1 class_supports net (b) Now add a vdpa device and show the device. $ vdpa dev add pci/0000:03.00:5 type net $ vdpa dev show vdpa0 at pci/0000:03.00:5 type net state inactive maxqueues 8 curqueues 4 (c) vdpa dev show features vdpa0 iommu platform version 1 (d) dump vdpa statistics $ vdpa dev stats show vdpa0 kickdoorbells 10 wqes 100 (e) Now delete a vdpa device previously created. 
$ vdpa dev del vdpa0 Design overview: ----------------------- 1. Above example tool runs over netlink socket interface. 2. This enables users to return meaningful error strings in addition to code so that user can be more informed. Often this is missing in ioctl()/configfs/sysfs interfaces. 3. This tool over netlink enables syscaller tests to be more usable like other subsystems to keep kernel robust 4. This provides vendor agnostic view of all vdpa capable parent and vdpa devices. 5. Each driver which supports vdpa device creation, registers the parent device along with supported classes. FAQs: -------- 1. Why not using devlink? Ans: Because as vdpa echo system grows, devlink will fall short of extending vdpa specific params, attributes, stats. 2. Why not use sysfs? Ans: (a) Because running syscaller infrastructure can run well over netlink sockets like it runs for several subsystem. (b) it lacks the ability to return error messages. Doing via kernel log is just doesn't work. (c) Why not using some ioctl()? It will reinvent the wheel of netlink that has TLV formats for several attributes. 3. Why not configs? It follows same limitation as that of sysfs. Low level design and driver APIS: -------------------------------------------- Will post once we discuss this further. From cohuck at redhat.com Tue Aug 18 09:36:52 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Tue, 18 Aug 2020 11:36:52 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818091628.GC20215@redhat.com> References: <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> Message-ID: <20200818113652.5d81a392.cohuck@redhat.com> On Tue, 18 Aug 2020 10:16:28 +0100 Daniel P. Berrangé wrote: > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > we actually can also retrieve the same information through sysfs, .e.g > > > > |- [path to device] > > |--- migration > > | |--- self > > | | |---device_api > > | | |---mdev_type > > | | |---software_version > > | | |---device_id > > | | |---aggregator > > | |--- compatible > > | | |---device_api > > | | |---mdev_type > > | | |---software_version > > | | |---device_id > > | | |---aggregator > > > > > > Yes but: > > > > - You need one file per attribute (one syscall for one attribute) > > - Attribute is coupled with kobject Is that really that bad? You have the device with an embedded kobject anyway, and you can just put things into an attribute group? [Also, I think that self/compatible split in the example makes things needlessly complex. Shouldn't semantic versioning and matching already cover nearly everything? I would expect very few cases that are more complex than that. Maybe the aggregation stuff, but I don't think we need that self/compatible split for that, either.] > > > > All of above seems unnecessary. > > > > Another point, as we discussed in another thread, it's really hard to make > > sure the above API work for all types of devices and frameworks. 
So having a > > vendor specific API looks much better. > > > > From the POV of userspace mgmt apps doing device compat checking / migration, > > we certainly do NOT want to use different vendor specific APIs. We want to > > have an API that can be used / controlled in a standard manner across vendors. > > > > Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a > > long debate on sysfs vs devlink). So if we go with sysfs, at least two > > APIs needs to be supported ... > > NB, I was not questioning devlink vs sysfs directly. If devlink is related > to netlink, I can't say I'm enthusiastic as IMKE sysfs is easier to deal > with. I don't know enough about devlink to have much of an opinion though. > The key point was that I don't want the userspace APIs we need to deal with > to be vendor specific. From what I've seen of devlink, it seems quite nice; but I understand why sysfs might be easier to deal with (especially as there's likely already a lot of code using it.) I understand that some users would like devlink because it is already widely used for network drivers (and some others), but I don't think the majority of devices used with vfio are network (although certainly a lot of them are.) > > What I care about is that we have a *standard* userspace API for performing > device compatibility checking / state migration, for use by QEMU/libvirt/ > OpenStack, such that we can write code without countless vendor specific > code paths. > > If there is vendor specific stuff on the side, that's fine as we can ignore > that, but the core functionality for device compat / migration needs to be > standardized. To summarize: - choose one of sysfs or devlink - have a common interface, with a standardized way to add vendor-specific attributes ? From cohuck at redhat.com Tue Aug 18 09:38:55 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Tue, 18 Aug 2020 11:38:55 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818092433.GD20215@redhat.com> References: <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <20200818110617.05def37c.cohuck@redhat.com> <20200818092433.GD20215@redhat.com> Message-ID: <20200818113855.647938c0.cohuck@redhat.com> On Tue, 18 Aug 2020 10:24:33 +0100 Daniel P. Berrangé wrote: > On Tue, Aug 18, 2020 at 11:06:17AM +0200, Cornelia Huck wrote: > > On Tue, 18 Aug 2020 09:55:27 +0100 > > Daniel P. Berrangé wrote: > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > Another point, as we discussed in another thread, it's really hard to make > > > > sure the above API work for all types of devices and frameworks. So having a > > > > vendor specific API looks much better. > > > > > > From the POV of userspace mgmt apps doing device compat checking / migration, > > > we certainly do NOT want to use different vendor specific APIs. We want to > > > have an API that can be used / controlled in a standard manner across vendors. > > > > As we certainly will need to have different things to check for > > different device types and vendor drivers, would it still be fine to > > have differing (say) attributes, as long as they are presented (and can > > be discovered) in a standardized way? 
> > Yes, the control API and algorithm to deal with the problem needs to > have standardization, but the data passed in/out of the APIs can vary. > > Essentially the key is that vendors should be able to create devices > at the kernel, and those devices should "just work" with the existing > generic userspace migration / compat checking code, without needing > extra vendor specific logic to be added. > > Note, I'm not saying that the userspace decisions would be perfectly > optimal based on generic code. They might be making a simplified > decision that while functionally safe, is not the ideal solution. > Adding vendor specific code might be able to optimize the userspace > decisions, but that should be considered just optimization, not a > core must have for any opertion. Yes, that sounds reasonable. From parav at nvidia.com Tue Aug 18 09:39:24 2020 From: parav at nvidia.com (Parav Pandit) Date: Tue, 18 Aug 2020 09:39:24 +0000 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818113652.5d81a392.cohuck@redhat.com> References: <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> Message-ID: Hi Cornelia, > From: Cornelia Huck > Sent: Tuesday, August 18, 2020 3:07 PM > To: Daniel P. Berrangé > Cc: Jason Wang ; Yan Zhao > ; kvm at vger.kernel.org; libvir-list at redhat.com; > qemu-devel at nongnu.org; Kirti Wankhede ; > eauger at redhat.com; xin-ran.wang at intel.com; corbet at lwn.net; openstack- > discuss at lists.openstack.org; shaohe.feng at intel.com; kevin.tian at intel.com; > Parav Pandit ; jian-feng.ding at intel.com; > dgilbert at redhat.com; zhenyuw at linux.intel.com; hejie.xu at intel.com; > bao.yumeng at zte.com.cn; Alex Williamson ; > eskultet at redhat.com; smooney at redhat.com; intel-gvt- > dev at lists.freedesktop.org; Jiri Pirko ; > dinechin at redhat.com; devel at ovirt.org > Subject: Re: device compatibility interface for live migration with assigned > devices > > On Tue, 18 Aug 2020 10:16:28 +0100 > Daniel P. Berrangé wrote: > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > we actually can also retrieve the same information through sysfs, > > > .e.g > > > > > > |- [path to device] > > > |--- migration > > > | |--- self > > > | | |---device_api > > > | | |---mdev_type > > > | | |---software_version > > > | | |---device_id > > > | | |---aggregator > > > | |--- compatible > > > | | |---device_api > > > | | |---mdev_type > > > | | |---software_version > > > | | |---device_id > > > | | |---aggregator > > > > > > > > > Yes but: > > > > > > - You need one file per attribute (one syscall for one attribute) > > > - Attribute is coupled with kobject > > Is that really that bad? You have the device with an embedded kobject > anyway, and you can just put things into an attribute group? > > [Also, I think that self/compatible split in the example makes things > needlessly complex. 
Shouldn't semantic versioning and matching already > cover nearly everything? I would expect very few cases that are more > complex than that. Maybe the aggregation stuff, but I don't think we need > that self/compatible split for that, either.] > > > > > > > All of above seems unnecessary. > > > > > > Another point, as we discussed in another thread, it's really hard > > > to make sure the above API work for all types of devices and > > > frameworks. So having a vendor specific API looks much better. > > > > > > From the POV of userspace mgmt apps doing device compat checking / > > > migration, we certainly do NOT want to use different vendor > > > specific APIs. We want to have an API that can be used / controlled in a > standard manner across vendors. > > > > > > Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a > > > long debate on sysfs vs devlink). So if we go with sysfs, at least two > > > APIs needs to be supported ... > > > > NB, I was not questioning devlink vs sysfs directly. If devlink is > > related to netlink, I can't say I'm enthusiastic as IMKE sysfs is > > easier to deal with. I don't know enough about devlink to have much of an > opinion though. > > The key point was that I don't want the userspace APIs we need to deal > > with to be vendor specific. > > From what I've seen of devlink, it seems quite nice; but I understand why > sysfs might be easier to deal with (especially as there's likely already a lot of > code using it.) > > I understand that some users would like devlink because it is already widely > used for network drivers (and some others), but I don't think the majority of > devices used with vfio are network (although certainly a lot of them are.) > > > > > What I care about is that we have a *standard* userspace API for > > performing device compatibility checking / state migration, for use by > > QEMU/libvirt/ OpenStack, such that we can write code without countless > > vendor specific code paths. > > > > If there is vendor specific stuff on the side, that's fine as we can > > ignore that, but the core functionality for device compat / migration > > needs to be standardized. > > To summarize: > - choose one of sysfs or devlink > - have a common interface, with a standardized way to add > vendor-specific attributes > ? Please refer to my previous email which has more example and details. From dbengt at redhat.com Tue Aug 18 10:19:35 2020 From: dbengt at redhat.com (Daniel Bengtsson) Date: Tue, 18 Aug 2020 12:19:35 +0200 Subject: Can't fetch from opendev. In-Reply-To: <20200817143703.c5rh3eqcl3ihxy4m@yuggoth.org> References: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> <20200817143703.c5rh3eqcl3ihxy4m@yuggoth.org> Message-ID: <6590e740-00f1-ee60-ac00-5872039e0cb0@redhat.com> On 8/17/20 4:37 PM, Jeremy Stanley wrote: > [keeping Daniel in Cc as he doesn't appear to be subscribed] I will check why. I don't understand the problem. > What command(s) did you run and what error message is Git giving > you? That paste doesn't look like an error, just a trace of the > internal operations which were performed. I try only to do a fetch on this remote. I have no explicit error. But the fetch blocks indefinitely. > Are you and your colleague both connecting from the same network? > Possibly the same corporate network or the same VPN?I'm not sure to understand what is the problem with the vpn. But yes we used the same one. 
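For reference, the fetch that hangs is just a plain fetch against the opendev remote. Rerunning it with Git's tracing switched on, something along these lines (assuming a reasonably recent Git that honours these variables):

GIT_TRACE=1 GIT_CURL_VERBOSE=1 git fetch https://opendev.org/openstack/tripleo-heat-templates

should at least show whether it stalls while connecting or during the pack transfer.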
From emilien at redhat.com Tue Aug 18 14:28:06 2020 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 18 Aug 2020 10:28:06 -0400 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo Message-ID: Hi people, If you don't know Takashi yet, he has been involved in the Puppet OpenStack project and helped *a lot* in its maintenance (and by maintenance I mean not-funny-work). When our community was getting smaller and smaller, he joined us and our review velocity went back to eleven. He became a core maintainer very quickly and we're glad to have him onboard. He's also been involved in taking care of puppet-tripleo for a few months and I believe he has more than enough knowledge of the module to provide core reviews and be part of the core maintainer group. I also noticed his amount of contribution (bug fixes, improvements, reviews, etc) in other TripleO repos and I'm confident he'll make his way to core in TripleO at some point. For now I would like to propose him to be core in puppet-tripleo. As usual, any feedback is welcome, but in the meantime I want to thank Takashi for his work in TripleO and we're super happy to have new contributors! Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From abishop at redhat.com Tue Aug 18 14:37:44 2020 From: abishop at redhat.com (Alan Bishop) Date: Tue, 18 Aug 2020 07:37:44 -0700 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: Message-ID: On Tue, Aug 18, 2020 at 7:34 AM Emilien Macchi wrote: > Hi people, > > If you don't know Takashi yet, he has been involved in the Puppet > OpenStack project and helped *a lot* in its maintenance (and by maintenance > I mean not-funny-work). When our community was getting smaller and smaller, > he joined us and our review velicity went back to eleven. He became a core > maintainer very quickly and we're glad to have him onboard. > > He's also been involved in taking care of puppet-tripleo for a few months > and I believe he has more than enough knowledge on the module to provide > core reviews and be part of the core maintainer group. I also noticed his > amount of contribution (bug fixes, improvements, reviews, etc) in other > TripleO repos and I'm confident he'll make his road to be core in TripleO > at some point. For now I would like him to propose him to be core in > puppet-tripleo. > > As usual, any feedback is welcome but in the meantime I want to thank > Takashi for his work in TripleO and we're super happy to have new > contributors! > Big +1 from me! > Thanks, > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Tue Aug 18 14:46:55 2020 From: aschultz at redhat.com (Alex Schultz) Date: Tue, 18 Aug 2020 08:46:55 -0600 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: Message-ID: +1 On Tue, Aug 18, 2020 at 8:38 AM Emilien Macchi wrote: > > Hi people, > > If you don't know Takashi yet, he has been involved in the Puppet OpenStack project and helped *a lot* in its maintenance (and by maintenance I mean not-funny-work). When our community was getting smaller and smaller, he joined us and our review velicity went back to eleven. He became a core maintainer very quickly and we're glad to have him onboard.
> > He's also been involved in taking care of puppet-tripleo for a few months and I believe he has more than enough knowledge on the module to provide core reviews and be part of the core maintainer group. I also noticed his amount of contribution (bug fixes, improvements, reviews, etc) in other TripleO repos and I'm confident he'll make his road to be core in TripleO at some point. For now I would like him to propose him to be core in puppet-tripleo. > > As usual, any feedback is welcome but in the meantime I want to thank Takashi for his work in TripleO and we're super happy to have new contributors! > > Thanks, > -- > Emilien Macchi From moreira.belmiro.email.lists at gmail.com Tue Aug 18 14:49:37 2020 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Tue, 18 Aug 2020 16:49:37 +0200 Subject: [nova][ops] Live migration and CPU features Message-ID: Hi, in our infrastructure we have always compute nodes that need a hardware intervention and as a consequence they are rebooted, bringing a new kernel, kvm, ... In order to have a good compromise between performance and flexibility (live migration) we have been using "host-model" for the "cpu_mode" configuration of our service VMs. We didn't expect to have CPU compatibility issues because we have the same hardware type per cell. The problem is that when a compute node is rebooted the instance domain is recreated with the new cpu features that were introduced because of the reboot (using centOS). If there are new CPU features exposed, this basically blocks live migration to all the non rebooted compute nodes (those cpu features are not exposed, yet). The nova-scheduler doesn't know about them when scheduling the live migration destination. I wonder how other operators are solving this issue. I don't like stopping OS upgrades. What I'm considering is to define a "custom" cpu_mode for each hardware type. I would appreciate your comments and learn how you are solving this problem. Belmiro -------------- next part -------------- An HTML attachment was scrubbed... URL: From amuller at redhat.com Tue Aug 18 14:48:23 2020 From: amuller at redhat.com (Assaf Muller) Date: Tue, 18 Aug 2020 10:48:23 -0400 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: <6613245.ccrTHCtBl7@antares> References: <6613245.ccrTHCtBl7@antares> Message-ID: On Tue, Aug 18, 2020 at 8:12 AM Jonas Schäfer wrote: > > Hi Mohammed and all, > > On Montag, 17. August 2020 14:01:55 CEST Mohammed Naser wrote: > > Over the past few days, we were troubleshooting an issue that ended up > > having a root cause where keepalived has somehow ended up active in > > two different L3 agents. We've yet to find the root cause of how this > > happened but removing it and adding it resolved the issue for us. > > We’ve also seen that behaviour occasionally. The root cause is also unclear > for us (so we would’ve love to hear about that). Insert shameless plug for the Neutron OVN backend. One of it's advantages is that it's L3 HA architecture is cleaner and more scalable (this is coming from the dude that wrote the L3 HA code we're all suffering from =D). The ML2/OVS L3 HA architecture has it's issues - I've seen it work at 100's of customer sites at scale, so I don't want to knock it too much, but just a day ago I got an internal customer ticket about keepalived falling over on a particular router that has 200 floating IPs. It works but it's not perfect. I'm sure the OVN implementation isn't either but it's simply cleaner and has less moving parts. 
It uses BFD to monitor the tunnel endpoints, so failover is faster too. Plus, it doesn't use keepalived. > We have anecdotal evidence > that a rabbitmq failure was involved, although that makes no sense to me > personally. Other causes may be incorrectly cleaned-up namespaces (for > example, when you kill or hard-restart the l3 agent, the namespaces will stay > around, possibly with the IP address assigned; the keepalived on the other l3 > agents will not see the VRRP advertisments anymore and will ALSO assign the IP > address. This will also be rectified by a restart always and may require > manual namespace cleanup with a tool, a node reboot or an agent disable/enable > cycle.). > > > As we work on improving our monitoring, we wanted to implement > > something that gets us the info of # of active routers to check if > > there's a router that has >1 active L3 agent but it's hard because > > hitting the /l3-agents endpoint on _every_ single router hurts a lot > > on performance. > > > > Is there something else that we can watch which might be more > > productive? FYI -- this all goes in the open and will end up inside > > the openstack-exporter: > > https://github.com/openstack-exporter/openstack-exporter and the Helm > > charts will end up with the alerts: > > https://github.com/openstack-exporter/helm-charts > > While I don’t think it fits in your openstack-exporter design, we are > currently using the attached script (which we also hereby publish under the > terms of the Apache 2.0 license [1]). (Sorry, I lack the time to cleanly > publish it somewhere right now.) > > It checks the state files maintained by the L3 agent conglomerate and exports > metrics about the master-ness of the routers as prometheus metrics. > > Note that this is slightly dangerous since the router IDs are high-cardinality > and using that as a label value in Prometheus is discouraged; you may not want > to do this in a public cloud setting. > > Either way: This allows us to alert on routers where there is not exactly one > master state. Downside is that this requires the thing to run locally on the > l3 agent nodes. Upside is that it is very efficient, and will also show the > master state in some cases where the router was not cleaned up properly (e.g. > because the l3 agent and its keepaliveds were killed). > > kind regards, > Jonas > > [1]: http://www.apache.org/licenses/LICENSE-2.0 > -- > Jonas Schäfer > DevOps Engineer > > Cloud&Heat Technologies GmbH > Königsbrücker Straße 96 | 01099 Dresden > +49 351 479 367 37 > jonas.schaefer at cloudandheat.com | www.cloudandheat.com > > New Service: > Managed Kubernetes designed for AI & ML > https://managed-kubernetes.cloudandheat.com/ > > Commercial Register: District Court Dresden > Register Number: HRB 30549 > VAT ID No.: DE281093504 > Managing Director: Nicolas Röhrs > Authorized signatory: Dr. Marius Feldmann > Authorized signatory: Kristina Rübenkamp From beagles at redhat.com Tue Aug 18 14:53:03 2020 From: beagles at redhat.com (Brent Eagles) Date: Tue, 18 Aug 2020 12:23:03 -0230 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: Message-ID: +1 On Tue, Aug 18, 2020 at 12:01 PM Emilien Macchi wrote: > Hi people, > > If you don't know Takashi yet, he has been involved in the Puppet > OpenStack project and helped *a lot* in its maintenance (and by maintenance > I mean not-funny-work). When our community was getting smaller and smaller, > he joined us and our review velicity went back to eleven. 
He became a core > maintainer very quickly and we're glad to have him onboard. > > He's also been involved in taking care of puppet-tripleo for a few months > and I believe he has more than enough knowledge on the module to provide > core reviews and be part of the core maintainer group. I also noticed his > amount of contribution (bug fixes, improvements, reviews, etc) in other > TripleO repos and I'm confident he'll make his road to be core in TripleO > at some point. For now I would like him to propose him to be core in > puppet-tripleo. > > As usual, any feedback is welcome but in the meantime I want to thank > Takashi for his work in TripleO and we're super happy to have new > contributors! > > Thanks, > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Aug 18 15:00:52 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 18 Aug 2020 17:00:52 +0200 Subject: [neutron][gate] verbose q-svc log files and e-r indexing In-Reply-To: <20200818103323.wq5upyjn4nzsqhx7@skaplons-mac> References: <20200818103323.wq5upyjn4nzsqhx7@skaplons-mac> Message-ID: <20200818150052.u4xkjsptejikwcny@skaplons-mac> Hi, I proposed patch [1] which seems that decreased size of the neutron-server log a bit - see [2] but it's still about 40M :/ [1] https://review.opendev.org/#/c/730879/ [2] https://48dcf568cd222acfbfb6-11d92d8452a346ca231ad13d26a55a7d.ssl.cf2.rackcdn.com/746714/1/check/tempest-full-py3/5c1399c/controller/logs/ On Tue, Aug 18, 2020 at 12:33:23PM +0200, Slawek Kaplonski wrote: > Hi, > > I opened LP for that [1] and I will propose some fix for it ASAP. > > On Mon, Aug 17, 2020 at 10:50:15AM -0700, melanie witt wrote: > > Hi all, > > > > Recently we've noticed elastic search indexing is behind 115 hours [1] and we looked for abnormally large log files being generated in the gate. > > > > We found that the q-svc log is very large, one example being 71.6M [2]. There is a lot of Time-Cost profiling output in the log, like this: > > > > Aug 17 14:22:23.210076 ubuntu-bionic-ovh-bhs1-0019298855 neutron-server[5168]: DEBUG neutron_lib.utils.helpers [req-75719db1-4abf-4500-bb0a-6d24e82cd4fd req-d88e7052-7da9-4bc9-8b35-5730ae76dcad service neutron] Time-cost: call 48e628cc-8c3a-408d-a36f-b219524480e0 function apply_funcs start {{(pid=5554) wrapper /usr/local/lib/python3.6/dist-packages/neutron_lib/utils/helpers.py:218}} > > > > We saw that there was a recent-ish change to remove some of the profiling output [3] but it was only for the get_objects method. > > > > Looking at the total number of lines in the file vs the number of lines without apply_funcs Time-Cost output: > > > > $ wc -l screen-q-svc.txt > > 186387 screen-q-svc.txt > > > > $ grep -v "function apply_funcs" screen-q-svc.txt|wc -l > > 102593 > > > > Would it be possible to remove this profiling output from the gate log to give elastic search indexing a better chance at keeping up? Or is there something else I've missed that could be made less verbose in the logging? > > > > Thanks for your help. 
> > > > Cheers, > > -melanie > > > > [1] http://status.openstack.org/elastic-recheck > > [2] https://b6ba3b9af8fd7de57099-18aa39cea11f738aa67ebd6bc9fb5e4c.ssl.cf2.rackcdn.com/744958/4/check/tempest-integrated-compute/4421bf9/controller/logs/screen-q-svc.txt > > [3] https://review.opendev.org/741540 > > > > [1] https://bugs.launchpad.net/neutron/+bug/1892017 > > -- > Slawek Kaplonski > Principal software engineer > Red Hat -- Slawek Kaplonski Principal software engineer Red Hat From luis.ramirez at opencloud.es Tue Aug 18 15:01:09 2020 From: luis.ramirez at opencloud.es (Luis Ramirez) Date: Tue, 18 Aug 2020 17:01:09 +0200 Subject: [nova][ops] Live migration and CPU features In-Reply-To: References: Message-ID: Hi, Try to choose a custom cpu_model that fits into your infra. This should be the best approach to avoid this kind of problem. If the performance is not an issue for the tenants, KVM64 should be a good election. Br, Luis Rmz Blockchain, DevOps & Open Source Cloud Solutions Architect ---------------------------------------- Founder & CEO OpenCloud.es luis.ramirez at opencloud.es Skype ID: d.overload Hangouts: luis.ramirez at opencloud.es [image: ] +34 911 950 123 / [image: ]+39 392 1289553 / [image: ]+49 152 26917722 / Česká republika: +420 774 274 882 ----------------------------------------------------- El mar., 18 ago. 2020 a las 16:55, Belmiro Moreira (< moreira.belmiro.email.lists at gmail.com>) escribió: > Hi, > in our infrastructure we have always compute nodes that need a hardware > intervention and as a consequence they are rebooted, bringing a new kernel, > kvm, ... > > In order to have a good compromise between performance and flexibility > (live migration) we have been using "host-model" for the "cpu_mode" > configuration of our service VMs. We didn't expect to have CPU > compatibility issues because we have the same hardware type per cell. > > The problem is that when a compute node is rebooted the instance domain is > recreated with the new cpu features that were introduced because of the > reboot (using centOS). > > If there are new CPU features exposed, this basically blocks live > migration to all the non rebooted compute nodes (those cpu features are not > exposed, yet). The nova-scheduler doesn't know about them when scheduling > the live migration destination. > > I wonder how other operators are solving this issue. > I don't like stopping OS upgrades. > What I'm considering is to define a "custom" cpu_mode for each hardware > type. > > I would appreciate your comments and learn how you are solving this > problem. > > Belmiro > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Tue Aug 18 15:01:35 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Tue, 18 Aug 2020 17:01:35 +0200 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: Message-ID: <1c51af8b-50ed-ccc7-61bf-3569cbc81d43@redhat.com> On 8/18/20 4:37 PM, Alan Bishop wrote: > > > On Tue, Aug 18, 2020 at 7:34 AM Emilien Macchi > wrote: > > Hi people, > > If you don't know Takashi yet, he has been involved in the Puppet > OpenStack project and helped *a lot* in its maintenance (and by > maintenance I mean not-funny-work). When our community was getting > smaller and smaller, he joined us and our review velicity went back > to eleven. He became a core maintainer very quickly and we're glad > to have him onboard. 
> > He's also been involved in taking care of puppet-tripleo for a few > months and I believe he has more than enough knowledge on the module > to provide core reviews and be part of the core maintainer group. I > also noticed his amount of contribution (bug fixes, improvements, > reviews, etc) in other TripleO repos and I'm confident he'll make > his road to be core in TripleO at some point. For now I would like > him to propose him to be core in puppet-tripleo. > > As usual, any feedback is welcome but in the meantime I want to > thank Takashi for his work in TripleO and we're super happy to have > new contributors! > > > Big +1 from me! +1 > > > Thanks, > -- > Emilien Macchi > -- Best regards, Bogdan Dobrelya, Irc #bogdando From dev.faz at gmail.com Tue Aug 18 15:06:47 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Tue, 18 Aug 2020 17:06:47 +0200 Subject: [nova][ops] Live migration and CPU features In-Reply-To: References: Message-ID: Hi, We are using the "custom"-way. But this does not protect you from all issues. We had problems with a new cpu-generation not (jet) detected correctly in an libvirt-version. So libvirt failed back to the "desktop"-cpu of this newer generation, but didnt support/detect some features => blocked live-migration. Fabian Am Di., 18. Aug. 2020 um 16:54 Uhr schrieb Belmiro Moreira : > > Hi, > in our infrastructure we have always compute nodes that need a hardware intervention and as a consequence they are rebooted, bringing a new kernel, kvm, ... > > In order to have a good compromise between performance and flexibility (live migration) we have been using "host-model" for the "cpu_mode" configuration of our service VMs. We didn't expect to have CPU compatibility issues because we have the same hardware type per cell. > > The problem is that when a compute node is rebooted the instance domain is recreated with the new cpu features that were introduced because of the reboot (using centOS). > > If there are new CPU features exposed, this basically blocks live migration to all the non rebooted compute nodes (those cpu features are not exposed, yet). The nova-scheduler doesn't know about them when scheduling the live migration destination. > > I wonder how other operators are solving this issue. > I don't like stopping OS upgrades. > What I'm considering is to define a "custom" cpu_mode for each hardware type. > > I would appreciate your comments and learn how you are solving this problem. > > Belmiro > From smooney at redhat.com Tue Aug 18 15:11:45 2020 From: smooney at redhat.com (Sean Mooney) Date: Tue, 18 Aug 2020 16:11:45 +0100 Subject: [nova][ops] Live migration and CPU features In-Reply-To: References: Message-ID: <10be83e71171f752a926af614d4541ab77d385e8.camel@redhat.com> On Tue, 2020-08-18 at 17:01 +0200, Luis Ramirez wrote: > Hi, > > Try to choose a custom cpu_model that fits into your infra. This should be > the best approach to avoid this kind of problem. If the performance is not > an issue for the tenants, KVM64 should be a good election. you should neve use kvm64 in production it is not maintained for security vulnerablity e.g. it is never updated with any fo the feature flag to mitigate security issue like specter ectra. its perfect for ci and test where you dont contol the underlying cloud and are using nested virt. its also semi resonable for nested vms but its not a good choice for the host. 
you should either use host-passthough and segreate your host using aggreates or other means to ensure live migration capavlity or use a custom model. host model is a good default provided you upgrade all host at the same time and you are ok with the feature set changing. host model has a 1 way migration proablem where it possible to migrate form old host to new but not new to old if the vm is hard rebooted in between. so when using host model we still recommend segrationg host by cpu generation to avoid that. > > Br, > Luis Rmz > Blockchain, DevOps & Open Source Cloud Solutions Architect > ---------------------------------------- > Founder & CEO > OpenCloud.es > luis.ramirez at opencloud.es > Skype ID: d.overload > Hangouts: luis.ramirez at opencloud.es > [image: ] +34 911 950 123 / [image: ]+39 392 1289553 / [image: ]+49 152 > 26917722 / Česká republika: +420 774 274 882 > ----------------------------------------------------- > > > El mar., 18 ago. 2020 a las 16:55, Belmiro Moreira (< > moreira.belmiro.email.lists at gmail.com>) escribió: > > > Hi, > > in our infrastructure we have always compute nodes that need a hardware > > intervention and as a consequence they are rebooted, bringing a new kernel, > > kvm, ... > > > > In order to have a good compromise between performance and flexibility > > (live migration) we have been using "host-model" for the "cpu_mode" > > configuration of our service VMs. We didn't expect to have CPU > > compatibility issues because we have the same hardware type per cell. > > > > The problem is that when a compute node is rebooted the instance domain is > > recreated with the new cpu features that were introduced because of the > > reboot (using centOS). > > > > If there are new CPU features exposed, this basically blocks live > > migration to all the non rebooted compute nodes (those cpu features are not > > exposed, yet). The nova-scheduler doesn't know about them when scheduling > > the live migration destination. > > > > I wonder how other operators are solving this issue. > > I don't like stopping OS upgrades. > > What I'm considering is to define a "custom" cpu_mode for each hardware > > type. > > > > I would appreciate your comments and learn how you are solving this > > problem. > > > > Belmiro > > > > From smooney at redhat.com Tue Aug 18 15:16:17 2020 From: smooney at redhat.com (Sean Mooney) Date: Tue, 18 Aug 2020 16:16:17 +0100 Subject: [nova][ops] Live migration and CPU features In-Reply-To: References: Message-ID: <44347504ff7308a6c3b4155060c778fad368a002.camel@redhat.com> On Tue, 2020-08-18 at 17:06 +0200, Fabian Zimmermann wrote: > Hi, > > We are using the "custom"-way. But this does not protect you from all issues. > > We had problems with a new cpu-generation not (jet) detected correctly > in an libvirt-version. So libvirt failed back to the "desktop"-cpu of > this newer generation, but didnt support/detect some features => > blocked live-migration. yes that is common when using really new hardware. having previouly worked at intel and hitting this often that one of the reason i tend to default to host-passthouh and recommend using AZ or aggreate to segreatate the cloud for live migration. in the case where your libvirt does not know about the new cpus your best approch is to use the newest server cpu model that it know about and then if you really need the new fature you can try to add theem using the config options but that is effectivly the same as using host-passhtough which is why i default to that as a workaround instead. 
> > Fabian > > Am Di., 18. Aug. 2020 um 16:54 Uhr schrieb Belmiro Moreira > : > > > > Hi, > > in our infrastructure we have always compute nodes that need a hardware intervention and as a consequence they are > > rebooted, bringing a new kernel, kvm, ... > > > > In order to have a good compromise between performance and flexibility (live migration) we have been using "host- > > model" for the "cpu_mode" configuration of our service VMs. We didn't expect to have CPU compatibility issues > > because we have the same hardware type per cell. > > > > The problem is that when a compute node is rebooted the instance domain is recreated with the new cpu features that > > were introduced because of the reboot (using centOS). > > > > If there are new CPU features exposed, this basically blocks live migration to all the non rebooted compute nodes > > (those cpu features are not exposed, yet). The nova-scheduler doesn't know about them when scheduling the live > > migration destination. > > > > I wonder how other operators are solving this issue. > > I don't like stopping OS upgrades. > > What I'm considering is to define a "custom" cpu_mode for each hardware type. > > > > I would appreciate your comments and learn how you are solving this problem. > > > > Belmiro > > > > From fungi at yuggoth.org Tue Aug 18 15:24:14 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 18 Aug 2020 15:24:14 +0000 Subject: Can't fetch from opendev. In-Reply-To: <6590e740-00f1-ee60-ac00-5872039e0cb0@redhat.com> References: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> <20200817143703.c5rh3eqcl3ihxy4m@yuggoth.org> <6590e740-00f1-ee60-ac00-5872039e0cb0@redhat.com> Message-ID: <20200818152414.s5srmotngy7a7w7r@yuggoth.org> On 2020-08-18 12:19:35 +0200 (+0200), Daniel Bengtsson wrote: [...] > I try only to do a fetch on this remote. I have no explicit error. > But the fetch blocks indefinitely. [...] Thanks, that's an important detail. So you're running this: git fetch https://opendev.org/openstack/tripleo-heat-templates and it just hangs indefinitely and never returns an error? This makes me suspect a routing problem. I've seen it most often when users have broken IPv6 routing locally. If you're using Git 2.16 or later, it provides the option of specifying IPv4 or IPv6 on the command line. To test this, add a -4 after the "fetch" like: git fetch -4 https://opendev.org/openstack/tripleo-heat-templates One reason I suspect this might be the problem is that GitHub is IPv4-only, so if you have something black-holing or blocking traffic for global IPv6 routes, then that could cause the behavior you're observing. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From emilien at redhat.com Tue Aug 18 16:38:03 2020 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 18 Aug 2020 12:38:03 -0400 Subject: [tripleo] Help needed to write a Third Party integration guide Message-ID: Hi people, We have been requested several times (for good reasons) about how to write out-of-tree files for TripleO integration (Heat templates, Ansible roles, Container Images layouts, etc). For example, Dell has an external repository ( https://github.com/dell/tripleo-powerflex) where they have pretty much all they need to install their services in out-of-tree fashion (I'm sure there are more examples, this one is just the most recent in my knowledge). 
This model is recommended for the third party services that aren't part of TripleO but still want to be integrated with it. This usually fits when the service can't be maintained by the TripleO team but there is a desire from outside of the community to maintain some integration (e.g. vendors). We haven't done a good job at providing a full end to end guide on how to achieve this and very often asked people to just do it. I propose that we work on this guide together and today I'm gathering for volunteers who have knowledge on that field or are interested to learn about it and contribute it back directly into a new guide, hosted on tripleo-docs repo. https://bugs.launchpad.net/tripleo/+bug/1892072 This will probably involve a bunch of linking to existing docs but also a good opportunity to update what is outdated in our content and provide more information where needed. Thanks for letting us know if you're interested to be actively contributing into that effort, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Tue Aug 18 16:43:54 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 18 Aug 2020 09:43:54 -0700 Subject: GHC Mentors Needed for OpenStack In-Reply-To: References: Message-ID: Looks like applications are closed already? -Kendall (diablo_rojo) On Tue, Aug 18, 2020 at 6:16 AM Amy Marrich wrote: > Grace Hopper Conference is going virtual this year and once again > OpenStack is participating as one of the Open Source Day projects. We are > hoping to do some peer programming (aka mentees shadowing folks while they > work through a patch) as part of the day. Mentors receive a full > conference pass and AnitaB.org membership. Please check out the > requirements > {0} > and apply > (1) > by August 19, 2020. > > We are also figuring a way for more folks to be able to mentor, so if > you'd like to help but aren't interested in the conference please reach out > to me or Victoria(vkmc) by email or on IRC. > > Thanks and apologies for the short deadline though I can probably get let > additions in:) > > Amy (spotz) > > 0- Grace Hopper mentorship requirements: > https://ghc.anitab.org/get-involved/volunteer/committee-members-and-scholarship-reviewers-2 > > 1- Grace Hopper mentorship application: > https://ghc.anitab.org/get-involved/volunteer/committee-members-and-scholarship-reviewers-2 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Tue Aug 18 16:55:55 2020 From: amy at demarco.com (Amy Marrich) Date: Tue, 18 Aug 2020 11:55:55 -0500 Subject: GHC Mentors Needed for OpenStack In-Reply-To: References: Message-ID: We're working on being allowed to give them a list as the date we originally had changed. Amy (spotz) On Tue, Aug 18, 2020 at 11:44 AM Kendall Nelson wrote: > Looks like applications are closed already? > > -Kendall (diablo_rojo) > > On Tue, Aug 18, 2020 at 6:16 AM Amy Marrich wrote: > >> Grace Hopper Conference is going virtual this year and once again >> OpenStack is participating as one of the Open Source Day projects. We are >> hoping to do some peer programming (aka mentees shadowing folks while they >> work through a patch) as part of the day. Mentors receive a full >> conference pass and AnitaB.org membership. Please check out the >> requirements >> {0} >> and apply >> (1) >> by August 19, 2020. 
>> >> We are also figuring a way for more folks to be able to mentor, so if >> you'd like to help but aren't interested in the conference please reach out >> to me or Victoria(vkmc) by email or on IRC. >> >> Thanks and apologies for the short deadline though I can probably get let >> additions in:) >> >> Amy (spotz) >> >> 0- Grace Hopper mentorship requirements: >> https://ghc.anitab.org/get-involved/volunteer/committee-members-and-scholarship-reviewers-2 >> >> 1- Grace Hopper mentorship application: >> https://ghc.anitab.org/get-involved/volunteer/committee-members-and-scholarship-reviewers-2 >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Tue Aug 18 16:58:29 2020 From: amy at demarco.com (Amy Marrich) Date: Tue, 18 Aug 2020 11:58:29 -0500 Subject: [tripleo] Help needed to write a Third Party integration guide In-Reply-To: References: Message-ID: Emilien, I can definitely help with QA and editing of this if not the actual writing. I'm in the process of making an up to date how to install a virtual cluster, so have been doing some basic installs over and over but nothing more advanced. Thanks, Amy (spotz) On Tue, Aug 18, 2020 at 11:41 AM Emilien Macchi wrote: > Hi people, > > We have been requested several times (for good reasons) about how to write > out-of-tree files for TripleO integration (Heat templates, Ansible roles, > Container Images layouts, etc). > For example, Dell has an external repository ( > https://github.com/dell/tripleo-powerflex) where they have pretty much > all they need to install their services in out-of-tree fashion (I'm sure > there are more examples, this one is just the most recent in my knowledge). > This model is recommended for the third party services that aren't part of > TripleO but still want to be integrated with it. > This usually fits when the service can't be maintained by the TripleO team > but there is a desire from outside of the community to maintain some > integration (e.g. vendors). > > We haven't done a good job at providing a full end to end guide on how to > achieve this and very often asked people to just do it. I propose that we > work on this guide together and today I'm gathering for volunteers who have > knowledge on that field or are interested to learn about it and contribute > it back directly into a new guide, hosted on tripleo-docs repo. > > https://bugs.launchpad.net/tripleo/+bug/1892072 > > This will probably involve a bunch of linking to existing docs but also a > good opportunity to update what is outdated in our content and provide more > information where needed. > > Thanks for letting us know if you're interested to be actively > contributing into that effort, > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stig at stackhpc.com Tue Aug 18 20:51:11 2020 From: stig at stackhpc.com (Stig Telfer) Date: Tue, 18 Aug 2020 21:51:11 +0100 Subject: [monasca] Setup Monasca from scratch In-Reply-To: <5e457dae-dc7c-3693-dc34-e622c2cd40f8@cloud.ionos.com> References: <5e457dae-dc7c-3693-dc34-e622c2cd40f8@cloud.ionos.com> Message-ID: Hey Antonios - This will depend a good deal on the method you're using for deploying OpenStack. I've used the Kolla-Ansible documentation for Monasca [https://docs.openstack.org/kolla-ansible/ussuri/reference/logging-and-monitoring/monasca-guide.html] and found it helpful for getting started. I'm sure there are other guides out there too. 
If you get stuck I also recommend trying the #openstack-monasca IRC channel. Cheers, Stig > On 18 Aug 2020, at 09:24, Antonios Dimtsoudis wrote: > > Hi all, > > i am trying to set up Monasca from scratch. Is there a good introduction / point to start of you would recommend? > > Thanks in advance, > > Antonios. > > From stig at stackhpc.com Tue Aug 18 20:55:13 2020 From: stig at stackhpc.com (Stig Telfer) Date: Tue, 18 Aug 2020 21:55:13 +0100 Subject: [scientific-sig] IRC meeting starting shortly Message-ID: Hi All - We have a Scientific SIG IRC meeting starting shortly in channel #openstack-meeting. Everyone is welcome. This week's agenda is here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_August_18th_2020 We'd like to cover some upcoming (virtual) events - Supercomputing 2020 and the OpenStack PTG. Cheers, Stig From melwittt at gmail.com Tue Aug 18 21:10:40 2020 From: melwittt at gmail.com (melanie witt) Date: Tue, 18 Aug 2020 14:10:40 -0700 Subject: [neutron][gate] verbose q-svc log files and e-r indexing In-Reply-To: <20200818150052.u4xkjsptejikwcny@skaplons-mac> References: <20200818103323.wq5upyjn4nzsqhx7@skaplons-mac> <20200818150052.u4xkjsptejikwcny@skaplons-mac> Message-ID: <62e4fcd2-0f7a-a7d3-7692-3ad9a05c8399@gmail.com> On 8/18/20 08:00, Slawek Kaplonski wrote: > Hi, > > I proposed patch [1] which seems that decreased size of the neutron-server log > a bit - see [2] but it's still about 40M :/ > > [1] https://review.opendev.org/#/c/730879/ > [2] https://48dcf568cd222acfbfb6-11d92d8452a346ca231ad13d26a55a7d.ssl.cf2.rackcdn.com/746714/1/check/tempest-full-py3/5c1399c/controller/logs/ Thanks for jumping in to help, Slawek! Indeed your proposed patch improves things from 60M-70M => 40M (good!). With your patch applied, the most frequent potential log message I see now is like this: Aug 18 14:40:21.294549 ubuntu-bionic-rax-iad-0019321276 neutron-server[5829]: DEBUG neutron_lib.callbacks.manager [None req-eadfbe92-eaee-4e3e-a5c0-f18aa8ba9772 None None] Notify callbacks ['neutron.services.segments.db._update_segment_host_mapping_for_agent-8764691834039', 'neutron.plugins.ml2.plugin.Ml2Plugin._retry_binding_revived_agents-4033733'] for agent, after_update {{(pid=6206) _notify_loop /opt/stack/neutron-lib/neutron_lib/callbacks/manager.py:193}} with the line count difference being with and without: $ wc -l "screen-q-svc.txt" 102493 screen-q-svc.txt $ grep -v "neutron_lib.callbacks.manager" "screen-q-svc.txt" |wc -l 83261 so I suppose we could predict a decrease in file size of about 40M => 32M if we were able to remove the neutron_lib.callbacks.manager output. But I'm not sure whether that's a critical debugging element or not. -melanie From christophe.sauthier at objectif-libre.com Tue Aug 18 21:20:39 2020 From: christophe.sauthier at objectif-libre.com (Christophe Sauthier) Date: Tue, 18 Aug 2020 17:20:39 -0400 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Hello everyone Sorry it took me a few days to answer that thread. First of all I am REALLY REALLY happy to see that a few persones are stepping up to continue to work on Cloudkitty. The situation is, like usually, a chaining of events (and honestly Thomas it is absolutely not related to the sale of Objectif Libre by Linkbynet). In late 2019 we tried to push hard to organize a community around Cloudkitty. 
We have tried to organise a few calls with some users, explaining to them the next challenges that the project will be facing and how we could all work on that. As is the case for many projects, we had little/no feedback... By early 2020 we had some turnover in the company (once again not related to the sale) and we started to organise ourselves to continue our ongoing work on Cloudkitty like we have been doing since the beginning of the project, which I started some years ago... And then the COVID crisis arrived, and like many companies in the world we had to change our priorities... During the end of summer (before holidays..) we started to organize again internally to continue that work. So it is great news that a community is rising, and we will be really happy to work with the rest of it to continue to improve Cloudkitty, especially since, like Thomas said, "It does the job" :) Christophe On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez wrote: > Thomas Goirand wrote: > > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: > >> Thanks, Pierre for helping with this. > >> > >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) < justin.ferrieu at objectif-libre.com>) > >> but I am not sure if he got any response back. > > No response so far, but they may all be in company summer vacation. > > > The end of the very good maintenance of Cloudkitty matched the date when > > objectif libre was sold to Linkbynet. Maybe the new owner don't care enough? > > > > This is very disappointing as I've been using it for some time already, > > and that I was satisfied by it (ie: it does the job...), and especially > > that latest releases are able to scale correctly. > > > > I very much would love if Pierre Riteau was successful in taking over. > > Good luck Pierre! I'll try to help whenever I can and if I'm not too busy. > > Given the volunteers (Pierre, Rafael, Luis) I would support the TC using > its unholy powers to add extra core reviewers to cloudkitty. > > If the current PTL comes back, I'm sure they will appreciate the help, > and can always fix/revert things before Victoria release. > > -- > Thierry Carrez (ttx) > > -- ---- Christophe Sauthier Directeur Général Objectif Libre : Au service de votre Cloud +33 (0) 6 16 98 63 96 | christophe.sauthier at objectif-libre.com https://www.objectif-libre.com | @objectiflibre Recevez la Pause Cloud Et DevOps : https://olib.re/abo-pause -------------- next part -------------- An HTML attachment was scrubbed... URL: From iwienand at redhat.com Tue Aug 18 23:52:47 2020 From: iwienand at redhat.com (Ian Wienand) Date: Wed, 19 Aug 2020 09:52:47 +1000 Subject: [simplification] Making ask.openstack.org read-only In-Reply-To: References: Message-ID: <20200818235247.GA341779@fedora19.localdomain> On Tue, Aug 18, 2020 at 12:44:43PM +0200, Thierry Carrez wrote: > I think it's time to pull the plug, make ask.openstack.org read-only (so > that links to old answers are not lost) and redirect users to the > mailing-list and the "OpenStack" tag on StackOverflow. I picked > StackOverflow since it seems to have the most openstack questions (2,574 on > SO, 76 on SuperUser and 430 on ServerFault). I agree that this is the most pragmatic approach. > Thoughts, comments? *If* we were to restore it now, it looks like the 0.11 branch comes with an upstream Dockerfile [1]; there's lots of examples now in system-config of similar container-based production sites and this could fit in.
This makes it significantly easier than trying to build up everything it requires from scratch, and if upstream keep their container compatible (a big if...) theoretically less work to keep updated. But despite the self-hosting story being better in 2020, I agree the ROI isn't there. -i [1] https://github.com/ASKBOT/askbot-devel/blob/0.11.x/Dockerfile From fungi at yuggoth.org Wed Aug 19 00:03:59 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 19 Aug 2020 00:03:59 +0000 Subject: [simplification] Making ask.openstack.org read-only In-Reply-To: <20200818235247.GA341779@fedora19.localdomain> References: <20200818235247.GA341779@fedora19.localdomain> Message-ID: <20200819000359.mhz43jvop5vtcgct@yuggoth.org> On 2020-08-19 09:52:47 +1000 (+1000), Ian Wienand wrote: [...] > *If* we were to restore it now, it looks like 0.11 branch comes with > an upstream Dockerfile [1]; there's lots of examples now in > system-config of similar container-based production sites and this > could fit in. > > This makes it significantly easier than trying to build up everything > it requires from scratch, and if upstream keep their container > compatible (a big if...) theoretically less work to keep updated. [...] Which also brings up another point: right now we're running it on Ubuntu Xenial (16.04 LTS) which is scheduled to reach EOL early next year, and the tooling we're using to deploy it isn't going to work on newer Ubuntu releases. Even keeping it up in a read-only state is timebound to how long we can safely keep its server online. If we switch ask.openstack.org to read-only now, I would still plan to turn it off entirely on or before April 1, 2021. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From johnsomor at gmail.com Wed Aug 19 00:35:05 2020 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 18 Aug 2020 17:35:05 -0700 Subject: [simplification] Making ask.openstack.org read-only In-Reply-To: <20200819000359.mhz43jvop5vtcgct@yuggoth.org> References: <20200818235247.GA341779@fedora19.localdomain> <20200819000359.mhz43jvop5vtcgct@yuggoth.org> Message-ID: Yes! ask.openstack.org is no fun to attempt to be helpful on (see e-mail notification issues, etc.). I would like to ask that we put together some sort of guide and/or guidence for how to use stack overflow efficiently for OpenStack questions. I.e. some well known or defined tags that we recommend people use when asking questions. This would be similar to the tags we use for the openstack discuss list. I see that there is already a trend for "openstack-nova" "openstack-horizon", etc. This works for me. This way we can setup notifications for these tags and be much more efficient at getting people answers. Thanks Thierry for moving this forward! Michael On Tue, Aug 18, 2020 at 5:10 PM Jeremy Stanley wrote: > > On 2020-08-19 09:52:47 +1000 (+1000), Ian Wienand wrote: > [...] > > *If* we were to restore it now, it looks like 0.11 branch comes with > > an upstream Dockerfile [1]; there's lots of examples now in > > system-config of similar container-based production sites and this > > could fit in. > > > > This makes it significantly easier than trying to build up everything > > it requires from scratch, and if upstream keep their container > > compatible (a big if...) theoretically less work to keep updated. > [...] 
> > Which also brings up another point: right now we're running it on > Ubuntu Xenial (16.04 LTS) which is scheduled to reach EOL early next > year, and the tooling we're using to deploy it isn't going to work > on newer Ubuntu releases. Even keeping it up in a read-only state is > timebound to how long we can safely keep its server online. If we > switch ask.openstack.org to read-only now, I would still plan to > turn it off entirely on or before April 1, 2021. > -- > Jeremy Stanley From sam47priya at gmail.com Wed Aug 19 01:45:59 2020 From: sam47priya at gmail.com (Sam P) Date: Wed, 19 Aug 2020 10:45:59 +0900 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: Message-ID: Hi All, In past few weeks I was not able to manage time to properly maintain the project. Really sorry for that. If you would like to help out, I will add you as core member to project and we can discuss how to proceed. If there are no objections, I will add the following members to the core team. suzhengwei Jegor van Opdorp Radosław Piliszek --- Regards, Sampath On Mon, Aug 17, 2020 at 11:13 PM Jegor van Opdorp wrote: > > We're also using masakari and willing to help maintain it! > ________________________________ > From: Mark Goddard > Sent: Monday, August 17, 2020 12:12 PM > To: Jegor van Opdorp > Subject: Fwd: [tc][masakari] Project aliveness (was: [masakari] Meetings) > > ---------- Forwarded message --------- > From: Radosław Piliszek > Date: Fri, 14 Aug 2020 at 08:53 > Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) > To: openstack-discuss > Cc: Sampath Priyankara (samP) , Tushar Patil > (tpatil) > > > Hi, > > it's been a month since I wrote the original (quoted) email, so I > retry it with CC to the PTL and a recently (this year) active core. > > I see there have been no meetings and neither Masakari IRC channel nor > review queues have been getting much attention during that time > period. > I am, therefore, offering my help to maintain the project. > > Regarding the original topic, I would opt for running Masakari > meetings during the time I proposed so that interested parties could > join and I know there is at least some interest based on recent IRC > activity (i.e. there exist people who want to use and discuss Masakari > - apart from me that is :-) ). > > -yoctozepto > > > On Mon, Jul 13, 2020 at 9:53 PM Radosław Piliszek > wrote: > > > > Hello Fellow cloud-HA-seekers, > > > > I wanted to attend Masakari meetings but I found the current schedule unfit. > > Is there a chance to change the schedule? The day is fine but a shift > > by +3 hours would be nice. > > > > Anyhow, I wanted to discuss [1]. I've already proposed a change > > implementing it and looking forward to positive reviews. :-) That > > said, please reply on the change directly, or mail me or catch me on > > IRC, whichever option sounds best to you. > > > > [1] https://blueprints.launchpad.net/masakari/+spec/customisable-ha-enabled-instance-metadata-key > > > > -yoctozepto From mnaser at vexxhost.com Wed Aug 19 02:30:08 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 18 Aug 2020 22:30:08 -0400 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: <6613245.ccrTHCtBl7@antares> Message-ID: On Tue, Aug 18, 2020 at 10:53 AM Assaf Muller wrote: > > On Tue, Aug 18, 2020 at 8:12 AM Jonas Schäfer > wrote: > > > > Hi Mohammed and all, > > > > On Montag, 17. 
August 2020 14:01:55 CEST Mohammed Naser wrote: > > > Over the past few days, we were troubleshooting an issue that ended up > > > having a root cause where keepalived has somehow ended up active in > > > two different L3 agents. We've yet to find the root cause of how this > > > happened but removing it and adding it resolved the issue for us. > > > > We’ve also seen that behaviour occasionally. The root cause is also unclear > > for us (so we would’ve love to hear about that). > > Insert shameless plug for the Neutron OVN backend. One of it's > advantages is that it's L3 HA architecture is cleaner and more > scalable (this is coming from the dude that wrote the L3 HA code we're > all suffering from =D). The ML2/OVS L3 HA architecture has it's issues > - I've seen it work at 100's of customer sites at scale, so I don't > want to knock it too much, but just a day ago I got an internal > customer ticket about keepalived falling over on a particular router > that has 200 floating IPs. It works but it's not perfect. I'm sure the > OVN implementation isn't either but it's simply cleaner and has less > moving parts. It uses BFD to monitor the tunnel endpoints, so failover > is faster too. Plus, it doesn't use keepalived. > OVN is something we're looking at and we're very excited about, unfortunately, there seems to be a bunch of gaps in documentation right now as well as a lot of the migration scripts to OVN are TripleO-y. So it'll take time to get us there, but yes, OVN simplifies this greatly > > We have anecdotal evidence > > that a rabbitmq failure was involved, although that makes no sense to me > > personally. Other causes may be incorrectly cleaned-up namespaces (for > > example, when you kill or hard-restart the l3 agent, the namespaces will stay > > around, possibly with the IP address assigned; the keepalived on the other l3 > > agents will not see the VRRP advertisments anymore and will ALSO assign the IP > > address. This will also be rectified by a restart always and may require > > manual namespace cleanup with a tool, a node reboot or an agent disable/enable > > cycle.). > > > > > As we work on improving our monitoring, we wanted to implement > > > something that gets us the info of # of active routers to check if > > > there's a router that has >1 active L3 agent but it's hard because > > > hitting the /l3-agents endpoint on _every_ single router hurts a lot > > > on performance. > > > > > > Is there something else that we can watch which might be more > > > productive? FYI -- this all goes in the open and will end up inside > > > the openstack-exporter: > > > https://github.com/openstack-exporter/openstack-exporter and the Helm > > > charts will end up with the alerts: > > > https://github.com/openstack-exporter/helm-charts > > > > While I don’t think it fits in your openstack-exporter design, we are > > currently using the attached script (which we also hereby publish under the > > terms of the Apache 2.0 license [1]). (Sorry, I lack the time to cleanly > > publish it somewhere right now.) > > > > It checks the state files maintained by the L3 agent conglomerate and exports > > metrics about the master-ness of the routers as prometheus metrics. > > > > Note that this is slightly dangerous since the router IDs are high-cardinality > > and using that as a label value in Prometheus is discouraged; you may not want > > to do this in a public cloud setting. > > > > Either way: This allows us to alert on routers where there is not exactly one > > master state. 
Downside is that this requires the thing to run locally on the > > l3 agent nodes. Upside is that it is very efficient, and will also show the > > master state in some cases where the router was not cleaned up properly (e.g. > > because the l3 agent and its keepaliveds were killed). > > kind regards, > > Jonas > > > > [1]: http://www.apache.org/licenses/LICENSE-2.0 > > -- > > Jonas Schäfer > > DevOps Engineer > > > > Cloud&Heat Technologies GmbH > > Königsbrücker Straße 96 | 01099 Dresden > > +49 351 479 367 37 > > jonas.schaefer at cloudandheat.com | www.cloudandheat.com > > > > New Service: > > Managed Kubernetes designed for AI & ML > > https://managed-kubernetes.cloudandheat.com/ > > > > Commercial Register: District Court Dresden > > Register Number: HRB 30549 > > VAT ID No.: DE281093504 > > Managing Director: Nicolas Röhrs > > Authorized signatory: Dr. Marius Feldmann > > Authorized signatory: Kristina Rübenkamp > > -- Mohammed Naser VEXXHOST, Inc. From mnaser at vexxhost.com Wed Aug 19 02:31:17 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 18 Aug 2020 22:31:17 -0400 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: <6613245.ccrTHCtBl7@antares> References: <6613245.ccrTHCtBl7@antares> Message-ID: On Tue, Aug 18, 2020 at 8:12 AM Jonas Schäfer wrote: > > Hi Mohammed and all, > > On Montag, 17. August 2020 14:01:55 CEST Mohammed Naser wrote: > > Over the past few days, we were troubleshooting an issue that ended up > > having a root cause where keepalived has somehow ended up active in > > two different L3 agents. We've yet to find the root cause of how this > > happened but removing it and adding it resolved the issue for us. > > We’ve also seen that behaviour occasionally. The root cause is also unclear > for us (so we would’ve love to hear about that). We have anecdotal evidence > that a rabbitmq failure was involved, although that makes no sense to me > personally. Other causes may be incorrectly cleaned-up namespaces (for > example, when you kill or hard-restart the l3 agent, the namespaces will stay > around, possibly with the IP address assigned; the keepalived on the other l3 > agents will not see the VRRP advertisments anymore and will ALSO assign the IP > address. This will also be rectified by a restart always and may require > manual namespace cleanup with a tool, a node reboot or an agent disable/enable > cycle.). > > > As we work on improving our monitoring, we wanted to implement > > something that gets us the info of # of active routers to check if > > there's a router that has >1 active L3 agent but it's hard because > > hitting the /l3-agents endpoint on _every_ single router hurts a lot > > on performance. > > > > Is there something else that we can watch which might be more > > productive? FYI -- this all goes in the open and will end up inside > > the openstack-exporter: > > https://github.com/openstack-exporter/openstack-exporter and the Helm > > charts will end up with the alerts: > > https://github.com/openstack-exporter/helm-charts > > While I don’t think it fits in your openstack-exporter design, we are > currently using the attached script (which we also hereby publish under the > terms of the Apache 2.0 license [1]). (Sorry, I lack the time to cleanly > publish it somewhere right now.) > > It checks the state files maintained by the L3 agent conglomerate and exports > metrics about the master-ness of the routers as prometheus metrics. 
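The script referenced above was attached to the original message and is not reproduced here. Purely as an illustration of the approach being described, a minimal exporter could look like the following sketch; it assumes the default layout in which the L3 agent writes each HA router's state to /var/lib/neutron/ha_confs/<router-id>/state as "master" or "backup", and the metric name and listen port are invented for the example.

    #!/usr/bin/env python3
    # Illustrative sketch only -- not the script attached to the original mail.
    # Exposes a per-router gauge based on the L3 agent's keepalived state files.
    import os
    import time

    from prometheus_client import Gauge, start_http_server

    HA_CONFS = "/var/lib/neutron/ha_confs"  # assumed default ha_confs_path
    ROUTER_MASTER = Gauge(
        "neutron_l3_router_master",  # hypothetical metric name
        "1 if this L3 agent currently holds the master state for the router",
        ["router_id"],
    )

    def scrape():
        for router_id in os.listdir(HA_CONFS):
            state_file = os.path.join(HA_CONFS, router_id, "state")
            try:
                with open(state_file) as f:
                    state = f.read().strip()
            except OSError:
                continue  # not an HA router directory, or state not written yet
            ROUTER_MASTER.labels(router_id=router_id).set(1 if state == "master" else 0)

    if __name__ == "__main__":
        start_http_server(9300)  # hypothetical port
        while True:
            scrape()
            time.sleep(30)

Summing that gauge per router across all L3 agents then lets an alert fire whenever the total is not exactly one, which is the condition discussed in this thread.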
> > Note that this is slightly dangerous since the router IDs are high-cardinality > and using that as a label value in Prometheus is discouraged; you may not want > to do this in a public cloud setting. > > Either way: This allows us to alert on routers where there is not exactly one > master state. Downside is that this requires the thing to run locally on the > l3 agent nodes. Upside is that it is very efficient, and will also show the > master state in some cases where the router was not cleaned up properly (e.g. > because the l3 agent and its keepaliveds were killed). This seems sweet. Let me go over the code. I might package this up into something consumable and host it inside OpenDev, if that's okay with you? > kind regards, > Jonas > > [1]: http://www.apache.org/licenses/LICENSE-2.0 > -- > Jonas Schäfer > DevOps Engineer > > Cloud&Heat Technologies GmbH > Königsbrücker Straße 96 | 01099 Dresden > +49 351 479 367 37 > jonas.schaefer at cloudandheat.com | www.cloudandheat.com > > New Service: > Managed Kubernetes designed for AI & ML > https://managed-kubernetes.cloudandheat.com/ > > Commercial Register: District Court Dresden > Register Number: HRB 30549 > VAT ID No.: DE281093504 > Managing Director: Nicolas Röhrs > Authorized signatory: Dr. Marius Feldmann > Authorized signatory: Kristina Rübenkamp -- Mohammed Naser VEXXHOST, Inc. From dev.faz at gmail.com Wed Aug 19 04:23:38 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Wed, 19 Aug 2020 06:23:38 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: Message-ID: Hi, if nobody complains I also would like to request core status to help getting the project further. Fabian Zimmermann Sam P schrieb am Mi., 19. Aug. 2020, 03:50: > Hi All, > In past few weeks I was not able to manage time to properly maintain > the project. > Really sorry for that. If you would like to help out, I will add you > as core member to project and we can discuss how to proceed. > > If there are no objections, I will add the following members to the core > team. > suzhengwei > Jegor van Opdorp > Radosław Piliszek > > --- Regards, > Sampath > > On Mon, Aug 17, 2020 at 11:13 PM Jegor van Opdorp > wrote: > > > > We're also using masakari and willing to help maintain it! > > ________________________________ > > From: Mark Goddard > > Sent: Monday, August 17, 2020 12:12 PM > > To: Jegor van Opdorp > > Subject: Fwd: [tc][masakari] Project aliveness (was: [masakari] Meetings) > > > > ---------- Forwarded message --------- > > From: Radosław Piliszek > > Date: Fri, 14 Aug 2020 at 08:53 > > Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) > > To: openstack-discuss > > Cc: Sampath Priyankara (samP) , Tushar Patil > > (tpatil) > > > > > > Hi, > > > > it's been a month since I wrote the original (quoted) email, so I > > retry it with CC to the PTL and a recently (this year) active core. > > > > I see there have been no meetings and neither Masakari IRC channel nor > > review queues have been getting much attention during that time > > period. > > I am, therefore, offering my help to maintain the project. > > > > Regarding the original topic, I would opt for running Masakari > > meetings during the time I proposed so that interested parties could > > join and I know there is at least some interest based on recent IRC > > activity (i.e. there exist people who want to use and discuss Masakari > > - apart from me that is :-) ). 
> > > > -yoctozepto > > > > > > On Mon, Jul 13, 2020 at 9:53 PM Radosław Piliszek > > wrote: > > > > > > Hello Fellow cloud-HA-seekers, > > > > > > I wanted to attend Masakari meetings but I found the current schedule > unfit. > > > Is there a chance to change the schedule? The day is fine but a shift > > > by +3 hours would be nice. > > > > > > Anyhow, I wanted to discuss [1]. I've already proposed a change > > > implementing it and looking forward to positive reviews. :-) That > > > said, please reply on the change directly, or mail me or catch me on > > > IRC, whichever option sounds best to you. > > > > > > [1] > https://blueprints.launchpad.net/masakari/+spec/customisable-ha-enabled-instance-metadata-key > > > > > > -yoctozepto > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From reza.b2008 at gmail.com Wed Aug 19 05:42:20 2020 From: reza.b2008 at gmail.com (Reza Bakhshayeshi) Date: Wed, 19 Aug 2020 10:12:20 +0430 Subject: VM doesn't have internet - OpenStack Ussuri with OVN networking In-Reply-To: References: Message-ID: The problem was solved. It was due to the underlying macvtap bridge. On Sat, 15 Aug 2020 at 17:38, Reza Bakhshayeshi wrote: > Hi all, > > I've set up OpenStack Ussuri with OVN networking manually, VMs can ping > each other through an internal network. I've created a provider network > with valid IP subnet, and my problem is VMs don't have internet access > before and after assigning floating IP. > I've encountered the same problem on TripleO (with dvr), and I just wanted > to investigate the problem by manual installation (without HA and DVR), but > the same happened. > Everything seems working properly, I can't see any error in logs, here is > agent list output: > > [root at controller ~]# openstack network agent list > > +--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+-------------------------------+ > | ID | Agent Type | > Host | Availability Zone | Alive | State | Binary > | > > +--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+-------------------------------+ > | 1ade76ae-6caf-4942-8df3-e3bc39d2f12d | OVN Controller Gateway agent | > controller.localdomain | n/a | :-) | UP | ovn-controller > | > | 484f123f-5935-44ce-aee7-4102271d9f11 | OVN Controller agent | > compute.localdomain | n/a | :-) | UP | ovn-controller > | > | 01235c13-4f32-4c4f-8cf6-e4b8d59a438a | OVN Metadata agent | > compute.localdomain | n/a | :-) | UP | > networking-ovn-metadata-agent | > > +--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+-------------------------------+ > > On the controller I got br-ex with a valid IP address. here is the > external-ids table on controller and compute node: > > [root at controller ~]# ovs-vsctl get Open_vSwitch . external-ids > {hostname=controller.localdomain, ovn-bridge=br-int, > ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="10.0.0.11", > ovn-encap-type=geneve, ovn-remote="tcp:10.0.0.11:6642", > rundir="/var/run/openvswitch", > system-id="1ade76ae-6caf-4942-8df3-e3bc39d2f12d"} > > [root at compute ~]# ovs-vsctl get Open_vSwitch . 
external-ids > {hostname=compute.localdomain, ovn-bridge=br-int, > ovn-encap-ip="10.0.0.31", ovn-encap-type=geneve, ovn-remote="tcp: > 10.0.0.11:6642", rundir="/var/run/openvswitch", > system-id="484f123f-5935-44ce-aee7-4102271d9f11"} > > and I have: > > [root at controller ~]# ovn-nbctl show > switch 72fd5c08-6852-4d7e-b9b4-7e0a1ccdd976 > (neutron-b8c66c3d-f47a-42a5-bd2d-c40c435c0376) (aka net01) > port cf99f43b-0a18-4b91-9ca5-b6ed3f86d994 > type: localport > addresses: ["fa:16:3e:d0:df:82 192.168.0.100"] > port 4268f511-bee3-4da0-8835-b9a8664101c4 > addresses: ["fa:16:3e:35:f2:02 192.168.0.135"] > port 846919e8-cde5-4ba3-b003-0c06e73676ed > type: router > router-port: lrp-846919e8-cde5-4ba3-b003-0c06e73676ed > switch bb22224e-e1d1-4bb2-b57e-1058e9fc33a7 > (neutron-9614546f-b216-4554-9bfe-e8d6bb11d927) (aka provider) > port 2f05c7bc-ad0f-4a41-bbd8-5fef1f5bfd2c > type: localport > addresses: ["fa:16:3e:17:7b:5b X.X.X.X"] > port provnet-9614546f-b216-4554-9bfe-e8d6bb11d927 > type: localnet > addresses: ["unknown"] > port 23fcdc9d-2d11-40c9-881e-c78e871a3314 > type: router > router-port: lrp-23fcdc9d-2d11-40c9-881e-c78e871a3314 > router 0bd35585-b0a3-4c8f-b71b-cb87c9fad060 > (neutron-8cdcd0d2-752c-4130-87bb-d2b7af803ec9) (aka router01) > port lrp-846919e8-cde5-4ba3-b003-0c06e73676ed > mac: "fa:16:3e:4d:c3:f9" > networks: ["192.168.0.1/24"] > port lrp-23fcdc9d-2d11-40c9-881e-c78e871a3314 > mac: "fa:16:3e:94:89:8e" > networks: ["X.X.X.X/22"] > gateway chassis: [1ade76ae-6caf-4942-8df3-e3bc39d2f12d > 484f123f-5935-44ce-aee7-4102271d9f11] > nat 8ef6167a-bc28-4caf-8af5-d0bf12a62545 > external ip: " X.X.X.X " > logical ip: "192.168.0.135" > type: "dnat_and_snat" > nat ba32ab93-3d2b-4199-b634-802f0f438338 > external ip: " X.X.X.X " > logical ip: "192.168.0.0/24" > type: "snat" > > I replaced valid IPs with X.X.X.X > > Any suggestion would be grateful. > Regards, > Reza > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonas.schaefer at cloudandheat.com Wed Aug 19 05:58:16 2020 From: jonas.schaefer at cloudandheat.com (Jonas =?ISO-8859-1?Q?Sch=E4fer?=) Date: Wed, 19 Aug 2020 07:58:16 +0200 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: <6613245.ccrTHCtBl7@antares> Message-ID: <5669200.AjuPLuGbex@antares> On Mittwoch, 19. August 2020 04:31:17 CEST you wrote: > This seems sweet. Let me go over the code. I might package this up > into something > consumable and host it inside OpenDev, if that's okay with you? Yes sure. I would’ve proposed it for x/osops-tools-contrib myself, but unfortunately I’m very short on time to work on this right now. So thanks for taking this on. kind regards, -- Jonas Schäfer DevOps Engineer Cloud&Heat Technologies GmbH Königsbrücker Straße 96 | 01099 Dresden +49 351 479 367 37 jonas.schaefer at cloudandheat.com | www.cloudandheat.com New Service: Managed Kubernetes designed for AI & ML https://managed-kubernetes.cloudandheat.com/ Commercial Register: District Court Dresden Register Number: HRB 30549 VAT ID No.: DE281093504 Managing Director: Nicolas Röhrs Authorized signatory: Dr. Marius Feldmann Authorized signatory: Kristina Rübenkamp -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part. 
URL: From radoslaw.piliszek at gmail.com Wed Aug 19 07:35:22 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 19 Aug 2020 09:35:22 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: Message-ID: Hello Sampath, I'm really glad you are doing well! Also, thanks for approving the Train release. :-) Please let me know how we should proceed with the meetings. I can start them on Tuesdays at 7 AM UTC. And since the Masakari own channel is quite a peaceful one, I would suggest to run them there directly. What are your thoughts? :-) Kind regards, -yoctozepto On Wed, Aug 19, 2020 at 3:49 AM Sam P wrote: > > Hi All, > In past few weeks I was not able to manage time to properly maintain > the project. > Really sorry for that. If you would like to help out, I will add you > as core member to project and we can discuss how to proceed. > > If there are no objections, I will add the following members to the core team. > suzhengwei > Jegor van Opdorp > Radosław Piliszek > > --- Regards, > Sampath > > On Mon, Aug 17, 2020 at 11:13 PM Jegor van Opdorp wrote: > > > > We're also using masakari and willing to help maintain it! > > ________________________________ > > From: Mark Goddard > > Sent: Monday, August 17, 2020 12:12 PM > > To: Jegor van Opdorp > > Subject: Fwd: [tc][masakari] Project aliveness (was: [masakari] Meetings) > > > > ---------- Forwarded message --------- > > From: Radosław Piliszek > > Date: Fri, 14 Aug 2020 at 08:53 > > Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) > > To: openstack-discuss > > Cc: Sampath Priyankara (samP) , Tushar Patil > > (tpatil) > > > > > > Hi, > > > > it's been a month since I wrote the original (quoted) email, so I > > retry it with CC to the PTL and a recently (this year) active core. > > > > I see there have been no meetings and neither Masakari IRC channel nor > > review queues have been getting much attention during that time > > period. > > I am, therefore, offering my help to maintain the project. > > > > Regarding the original topic, I would opt for running Masakari > > meetings during the time I proposed so that interested parties could > > join and I know there is at least some interest based on recent IRC > > activity (i.e. there exist people who want to use and discuss Masakari > > - apart from me that is :-) ). > > > > -yoctozepto > > > > > > On Mon, Jul 13, 2020 at 9:53 PM Radosław Piliszek > > wrote: > > > > > > Hello Fellow cloud-HA-seekers, > > > > > > I wanted to attend Masakari meetings but I found the current schedule unfit. > > > Is there a chance to change the schedule? The day is fine but a shift > > > by +3 hours would be nice. > > > > > > Anyhow, I wanted to discuss [1]. I've already proposed a change > > > implementing it and looking forward to positive reviews. :-) That > > > said, please reply on the change directly, or mail me or catch me on > > > IRC, whichever option sounds best to you. 
> > > > > > [1] https://blueprints.launchpad.net/masakari/+spec/customisable-ha-enabled-instance-metadata-key > > > > > > -yoctozepto > From hemant.sonawane at itera.io Wed Aug 19 07:52:40 2020 From: hemant.sonawane at itera.io (Hemant Sonawane) Date: Wed, 19 Aug 2020 09:52:40 +0200 Subject: [openstack-helm] openstack-helm-images release stable/ussuri loci images building issue Message-ID: Hello, I am trying to build a *loci openstack-helm-images release stable/ussuri *but there is an issue with each image I am building which could be related to the python version and pip package but I am really not sure about it. I also tried to update the python version for each package in requirements repo but it didn't help. Is there any way to upgrade the python version and pip as well in requirements? or does anybody know how to resolve this issue while building openstack-helm-images? I have attached logs for your ready reference. Help will be much appreciated thanks :) -- Thanks and Regards, Hemant Sonawane -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ERROR: Package 'magnum' requires a different Python: 2.7.17 not in '>=3.6' ERROR: Package 'senlin' requires a different Python: 2.7.17 not in '>=3.6' ERROR: Package 'ironic' requires a different Python: 2.7.17 not in '>=3.6' ERROR: Package 'openstack-heat' requires a different Python: 2.7.17 not in '>=3.6' ERROR: Package 'keystone' requires a different Python: 2.7.17 not in '>=3.6' ERROR: Package 'glance' requires a different Python: 2.7.17 not in '>=3.6' ERROR: Package 'neutron' requires a different Python: 2.7.17 not in '>=3.6' ERROR: Package 'cinder' requires a different Python: 2.7.17 not in '>=3.6' ERROR: Could not find a version that satisfies the requirement scandir; python_version < "3.5" (from pathlib2===2.3.5->-c /tmp/wheels/upper-constraints.txt (line 509)) (from versions: none) ERROR: No matching distribution found for scandir; python_version < "3.5" (from pathlib2===2.3.5->-c /tmp/wheels/upper-constraints.txt (line 509)) ERROR: Package 'nova' requires a different Python: 2.7.17 not in '>=3.6' From pierre at stackhpc.com Wed Aug 19 08:22:08 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 19 Aug 2020 10:22:08 +0200 Subject: [cloudkitty][rdo] Broken cloudkitty RPMs on CentOS8 Message-ID: Hello, This issue was discovered on Kolla train-centos8 images, but I assume it applies to both Train and Ussuri for CentOS 8 in general, not just to Kolla. CloudKitty became timezone-aware in Train: https://review.opendev.org/#/c/669192/ This code references tz.UTC from the dateutil library. However, this was added only in dateutil 2.7.0, while train-centos8 Kolla images use package python3-dateutil-2.6.1-6.el8.noarch.rpm, causing the error captured at the end of this message [2]. I submitted a patch to raise the minimum requirement for dateutil in cloudkitty: https://review.opendev.org/#/c/742477/ However, how are those requirements taken into consideration when packaging OpenStack in RDO? RDO packages for CentOS7 provide python2-dateutil-2.8.0-1.el7.noarch.rpm, but there is no such package in the CentOS8 repository. Would it be better to just remove the use of tz.UTC? I believe we could use dateutil.tz.tzutc() instead. 
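For reference, tz.UTC was only added in python-dateutil 2.7.0 and is simply a module-level tzutc() instance, so the two spellings are interchangeable here; a change along these lines (a sketch of the idea, not the actual CloudKitty patch) would keep working with the dateutil 2.6.1 shipped in CentOS 8:

    # Sketch only -- not the actual cloudkitty change.
    # tz.UTC exists only in python-dateutil >= 2.7.0; tz.tzutc() returns an
    # equivalent UTC tzinfo object and is also available in older releases.
    from dateutil import tz

    _UTC = tz.tzutc()  # instead of tz.UTC

    def to_utc(dt):
        """Convert an aware datetime to UTC (illustrative helper)."""
        return dt.astimezone(_UTC)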
Thanks, Pierre Riteau (priteau) [1] http://mirror.centos.org/centos/8/cloud/x86_64/openstack-train/Packages/p/ [2] Error trace below: 2020-07-22 16:33:11.207 26 ERROR wsme.api [req-3c49884a-1412-42bb-a57e-e7c731360148 ef450a969a2945928d3ade785eaae860 19df0f36ede14c29be9ca476222f8ba9 default - -] Server-side error: "module 'dateutil.tz' has no attribute 'UTC'". Detail: Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/wsmeext/pecan.py", line 85, in callfunction result = f(self, *args, **kwargs) File "/usr/lib/python3.6/site-packages/cloudkitty/api/v1/controllers/storage.py", line 71, in get_all paginate=False) File "/usr/lib/python3.6/site-packages/cloudkitty/storage/v2/influx.py", line 311, in retrieve begin, end = self._check_begin_end(begin, end) File "/usr/lib/python3.6/site-packages/cloudkitty/storage/v2/influx.py", line 271, in _check_begin_end end = tzutils.get_next_month() File "/usr/lib/python3.6/site-packages/cloudkitty/tzutils.py", line 150, in get_next_month return add_delta(start, datetime.timedelta(days=month_days)) File "/usr/lib/python3.6/site-packages/cloudkitty/tzutils.py", line 104, in add_delta return utc_to_local(local_to_utc(dt, naive=True) + delta) File "/usr/lib/python3.6/site-packages/cloudkitty/tzutils.py", line 52, in local_to_utc output = dt.astimezone(tz.UTC) AttributeError: module 'dateutil.tz' has no attribute 'UTC' From arnaud.morin at gmail.com Wed Aug 19 09:21:30 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Wed, 19 Aug 2020 09:21:30 +0000 Subject: [nova][ops] Live migration and CPU features In-Reply-To: <44347504ff7308a6c3b4155060c778fad368a002.camel@redhat.com> References: <44347504ff7308a6c3b4155060c778fad368a002.camel@redhat.com> Message-ID: <20200819092130.GX31915@sync> Hello, We have the same kind of issue. To help mitigate it, we do segregation and also use cpu_mode=custom, so we can use a model which is close to our hardware (cpu_model=Haswell-noTSX) and add extra_flags when needed. This is painful. Cheers, -- Arnaud Morin On 18.08.20 - 16:16, Sean Mooney wrote: > On Tue, 2020-08-18 at 17:06 +0200, Fabian Zimmermann wrote: > > Hi, > > > > We are using the "custom"-way. But this does not protect you from all issues. > > > > We had problems with a new cpu-generation not (jet) detected correctly > > in an libvirt-version. So libvirt failed back to the "desktop"-cpu of > > this newer generation, but didnt support/detect some features => > > blocked live-migration. > yes that is common when using really new hardware. having previouly worked > at intel and hitting this often that one of the reason i tend to default to host-passthouh > and recommend using AZ or aggreate to segreatate the cloud for live migration. > > in the case where your libvirt does not know about the new cpus your best approch is to use the > newest server cpu model that it know about and then if you really need the new fature you can try > to add theem using the config options but that is effectivly the same as using host-passhtough > which is why i default to that as a workaround instead. > > > > > Fabian > > > > Am Di., 18. Aug. 2020 um 16:54 Uhr schrieb Belmiro Moreira > > : > > > > > > Hi, > > > in our infrastructure we have always compute nodes that need a hardware intervention and as a consequence they are > > > rebooted, bringing a new kernel, kvm, ... > > > > > > In order to have a good compromise between performance and flexibility (live migration) we have been using "host- > > > model" for the "cpu_mode" configuration of our service VMs. 
We didn't expect to have CPU compatibility issues > > > because we have the same hardware type per cell. > > > > > > The problem is that when a compute node is rebooted the instance domain is recreated with the new cpu features that > > > were introduced because of the reboot (using centOS). > > > > > > If there are new CPU features exposed, this basically blocks live migration to all the non rebooted compute nodes > > > (those cpu features are not exposed, yet). The nova-scheduler doesn't know about them when scheduling the live > > > migration destination. > > > > > > I wonder how other operators are solving this issue. > > > I don't like stopping OS upgrades. > > > What I'm considering is to define a "custom" cpu_mode for each hardware type. > > > > > > I would appreciate your comments and learn how you are solving this problem. > > > > > > Belmiro > > > > > > > > > From pierre at stackhpc.com Wed Aug 19 09:34:47 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 19 Aug 2020 11:34:47 +0200 Subject: [cloudkitty][tc] Cloudkitty abandoned? In-Reply-To: References: <173c942a17b.dfe050d2111458.180813585646259079@ghanshyammann.com> Message-ID: Hello Christophe, Good to hear that Objectif Libre is still planning to be involved in the project. The existing core reviewer team is still in place. Do let us know if new contributors could be granted core reviewer privileges. Best wishes, Pierre Riteau (priteau) On Tue, 18 Aug 2020 at 23:21, Christophe Sauthier wrote: > > Hello everyone > > Sorry it took me a few days to answer that thread. > > First of all I am REALLY REALLY happy to see that a few persones are stepping up to continue to work on Cloudkitty. > > The situation is, like usually, a chaining of events (and honestly Thomas it is absolutely not related to the sale of Objectif Libre by Linkbynet). > In late 2019 we tried to push hard to organize a community around Cloudkitty. We have tried to organise a few call with some users explaining them the next challenges that the project will be facing and how we could all work on that. Like it is the case for many projects we had little/no feedback... > By early 2020 we had some turn over in the company (once again not related to the sale) and we have started to organise ourself to continue our ongoing on CLoudkitty like we are doing since the beginning of the project, that I have started some years ago... And then the COVID crisis arrived, and like many compagny in the world we had to change our priorities... > During the end of summer (before holidays..) we started to organize again internally to continue that work. So it is a great news that a community is rising, and we will be really happy to work with the rest of it to continue to improve Cloudkitty, especially since like Thomas said "It does the job" :) > > Christophe > > On Tue, Aug 11, 2020 at 6:16 AM Thierry Carrez wrote: >> >> Thomas Goirand wrote: >> > On 8/7/20 4:10 PM, Ghanshyam Mann wrote: >> >> Thanks, Pierre for helping with this. >> >> >> >> ttx has reached out to PTL (Justin Ferrieu (jferrieu) ) >> >> but I am not sure if he got any response back. >> >> No response so far, but they may all be in company summer vacation. >> >> > The end of the very good maintenance of Cloudkitty matched the date when >> > objectif libre was sold to Linkbynet. Maybe the new owner don't care enough? 
>> > >> > This is very disappointing as I've been using it for some time already, >> > and that I was satisfied by it (ie: it does the job...), and especially >> > that latest releases are able to scale correctly. >> > >> > I very much would love if Pierre Riteau was successful in taking over. >> > Good luck Pierre! I'll try to help whenever I can and if I'm not too busy. >> >> Given the volunteers (Pierre, Rafael, Luis) I would support the TC using >> its unholy powers to add extra core reviewers to cloudkitty. >> >> If the current PTL comes back, I'm sure they will appreciate the help, >> and can always fix/revert things before Victoria release. >> >> -- >> Thierry Carrez (ttx) >> > > > -- > > ---- > Christophe Sauthier > Directeur Général > > Objectif Libre : Au service de votre Cloud > > +33 (0) 6 16 98 63 96 | christophe.sauthier at objectif-libre.com > > https://www.objectif-libre.com | @objectiflibre > Recevez la Pause Cloud Et DevOps : https://olib.re/abo-pause From skaplons at redhat.com Wed Aug 19 10:40:29 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 19 Aug 2020 12:40:29 +0200 Subject: [neutron] CI meeting cancelled Message-ID: <20200819104029.u5qsqv36tbovritk@skaplons-mac> Hi, I have today some internal meeting in the same time as Neutron CI meeting is. Also, some of the team members who are usually attending this meeting are on pto this week so lets cancel it. If You see any CI related issue, please open LP and ping me on IRC. -- Slawek Kaplonski Principal software engineer Red Hat From ekultails at gmail.com Wed Aug 19 13:15:06 2020 From: ekultails at gmail.com (Luke Short) Date: Wed, 19 Aug 2020 09:15:06 -0400 Subject: [tripleo][ci] container pulls failing In-Reply-To: References: Message-ID: Hey folks, All of the latest patches to address this have been merged in but we are still seeing this error randomly in CI jobs that involve an Undercloud or Standalone node. As far as I can tell, the error is appearing less often than before but it is still present making merging new patches difficult. I would be happy to help work towards other possible solutions however I am unsure where to start from here. Any help would be greatly appreciated. Sincerely, Luke Short On Wed, Aug 5, 2020 at 12:26 PM Wesley Hayutin wrote: > > > On Wed, Jul 29, 2020 at 4:48 PM Wesley Hayutin > wrote: > >> >> >> On Wed, Jul 29, 2020 at 4:33 PM Alex Schultz wrote: >> >>> On Wed, Jul 29, 2020 at 7:13 AM Wesley Hayutin >>> wrote: >>> > >>> > >>> > >>> > On Wed, Jul 29, 2020 at 2:25 AM Bogdan Dobrelya >>> wrote: >>> >> >>> >> On 7/28/20 6:09 PM, Wesley Hayutin wrote: >>> >> > >>> >> > >>> >> > On Tue, Jul 28, 2020 at 7:24 AM Emilien Macchi >> >> > > wrote: >>> >> > >>> >> > >>> >> > >>> >> > On Tue, Jul 28, 2020 at 9:20 AM Alex Schultz < >>> aschultz at redhat.com >>> >> > > wrote: >>> >> > >>> >> > On Tue, Jul 28, 2020 at 7:13 AM Emilien Macchi >>> >> > > wrote: >>> >> > > >>> >> > > >>> >> > > >>> >> > > On Mon, Jul 27, 2020 at 5:27 PM Wesley Hayutin >>> >> > > wrote: >>> >> > >> >>> >> > >> FYI... >>> >> > >> >>> >> > >> If you find your jobs are failing with an error similar >>> to >>> >> > [1], you have been rate limited by docker.io < >>> http://docker.io> >>> >> > via the upstream mirror system and have hit [2]. I've been >>> >> > discussing the issue w/ upstream infra, rdo-infra and a few >>> CI >>> >> > engineers. 
>>> >> > >> >>> >> > >> There are a few ways to mitigate the issue however I >>> don't >>> >> > see any of the options being completed very quickly so I'm >>> >> > asking for your patience while this issue is socialized and >>> >> > resolved. >>> >> > >> >>> >> > >> For full transparency we're considering the following >>> options. >>> >> > >> >>> >> > >> 1. move off of docker.io to quay.io >>> >> > >>> >> > > >>> >> > > >>> >> > > quay.io also has API rate limit: >>> >> > > https://docs.quay.io/issues/429.html >>> >> > > >>> >> > > Now I'm not sure about how many requests per seconds one >>> can >>> >> > do vs the other but this would need to be checked with the >>> quay >>> >> > team before changing anything. >>> >> > > Also quay.io had its big downtimes as >>> well, >>> >> > SLA needs to be considered. >>> >> > > >>> >> > >> 2. local container builds for each job in master, >>> possibly >>> >> > ussuri >>> >> > > >>> >> > > >>> >> > > Not convinced. >>> >> > > You can look at CI logs: >>> >> > > - pulling / updating / pushing container images from >>> >> > docker.io to local registry takes ~10 >>> min on >>> >> > standalone (OVH) >>> >> > > - building containers from scratch with updated repos and >>> >> > pushing them to local registry takes ~29 min on standalone >>> (OVH). >>> >> > > >>> >> > >> >>> >> > >> 3. parent child jobs upstream where rpms and containers >>> will >>> >> > be build and host artifacts for the child jobs >>> >> > > >>> >> > > >>> >> > > Yes, we need to investigate that. >>> >> > > >>> >> > >> >>> >> > >> 4. remove some portion of the upstream jobs to lower the >>> >> > impact we have on 3rd party infrastructure. >>> >> > > >>> >> > > >>> >> > > I'm not sure I understand this one, maybe you can give an >>> >> > example of what could be removed? >>> >> > >>> >> > We need to re-evaulate our use of scenarios (e.g. we have >>> two >>> >> > scenario010's both are non-voting). There's a reason we >>> >> > historically >>> >> > didn't want to add more jobs because of these types of >>> resource >>> >> > constraints. I think we've added new jobs recently and >>> likely >>> >> > need to >>> >> > reduce what we run. Additionally we might want to look into >>> reducing >>> >> > what we run on stable branches as well. >>> >> > >>> >> > >>> >> > Oh... removing jobs (I thought we would remove some steps of >>> the jobs). >>> >> > Yes big +1, this should be a continuous goal when working on >>> CI, and >>> >> > always evaluating what we need vs what we run now. >>> >> > >>> >> > We should look at: >>> >> > 1) services deployed in scenarios that aren't worth testing >>> (e.g. >>> >> > deprecated or unused things) (and deprecate the unused things) >>> >> > 2) jobs themselves (I don't have any example beside scenario010 >>> but >>> >> > I'm sure there are more). >>> >> > -- >>> >> > Emilien Macchi >>> >> > >>> >> > >>> >> > Thanks Alex, Emilien >>> >> > >>> >> > +1 to reviewing the catalog and adjusting things on an ongoing >>> basis. >>> >> > >>> >> > All.. it looks like the issues with docker.io >>> were >>> >> > more of a flare up than a change in docker.io >>> policy >>> >> > or infrastructure [2]. The flare up started on July 27 8am utc and >>> >> > ended on July 27 17:00 utc, see screenshots. >>> >> >>> >> The numbers of image prepare workers and its exponential fallback >>> >> intervals should be also adjusted. I've analysed the log snippet [0] >>> for >>> >> the connection reset counts by workers versus the times the rate >>> >> limiting was triggered. 
See the details in the reported bug [1]. >>> >> >>> >> tl;dr -- for an example 5 sec interval 03:55:31,379 - 03:55:36,110: >>> >> >>> >> Conn Reset Counts by a Worker PID: >>> >> 3 58412 >>> >> 2 58413 >>> >> 3 58415 >>> >> 3 58417 >>> >> >>> >> which seems too much of (workers*reconnects) and triggers rate >>> limiting >>> >> immediately. >>> >> >>> >> [0] >>> >> >>> https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log >>> >> >>> >> [1] https://bugs.launchpad.net/tripleo/+bug/1889372 >>> >> >>> >> -- >>> >> Best regards, >>> >> Bogdan Dobrelya, >>> >> Irc #bogdando >>> >> >>> > >>> > FYI.. >>> > >>> > The issue w/ "too many requests" is back. Expect delays and failures >>> in attempting to merge your patches upstream across all branches. The >>> issue is being tracked as a critical issue. >>> >>> Working with the infra folks and we have identified the authorization >>> header as causing issues when we're rediected from docker.io to >>> cloudflare. I'll throw up a patch tomorrow to handle this case which >>> should improve our usage of the cache. It needs some testing against >>> other registries to ensure that we don't break authenticated fetching >>> of resources. >>> >>> Thanks Alex! >> > > > FYI.. we have been revisited by the container pull issue, "too many > requests". > Alex has some fresh patches on it: > https://review.opendev.org/#/q/status:open+project:openstack/tripleo-common+topic:bug/1889122 > > expect trouble in check and gate: > > http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22429%20Client%20Error%3A%20Too%20Many%20Requests%20for%20url%3A%5C%22%20AND%20voting%3A1 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjeanner at redhat.com Wed Aug 19 13:19:49 2020 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Wed, 19 Aug 2020 15:19:49 +0200 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: Message-ID: <9c141d62-44db-a1c7-db44-894443d2576f@redhat.com> +1 - and a big thanks to Takashi for his hard work on puppet integration! On 8/18/20 4:28 PM, Emilien Macchi wrote: > Hi people, > > If you don't know Takashi yet, he has been involved in the Puppet > OpenStack project and helped *a lot* in its maintenance (and by > maintenance I mean not-funny-work). When our community was getting > smaller and smaller, he joined us and our review velicity went back to > eleven. He became a core maintainer very quickly and we're glad to have > him onboard. > > He's also been involved in taking care of puppet-tripleo for a few > months and I believe he has more than enough knowledge on the module to > provide core reviews and be part of the core maintainer group. I also > noticed his amount of contribution (bug fixes, improvements, reviews, > etc) in other TripleO repos and I'm confident he'll make his road to be > core in TripleO at some point. For now I would like him to propose him > to be core in puppet-tripleo. > > As usual, any feedback is welcome but in the meantime I want to thank > Takashi for his work in TripleO and we're super happy to have new > contributors! > > Thanks, > -- > Emilien Macchi -- Cédric Jeanneret (He/Him/His) Sr. 
Software Engineer - OpenStack Platform Deployment Framework TC Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From aschultz at redhat.com Wed Aug 19 13:23:18 2020 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 19 Aug 2020 07:23:18 -0600 Subject: [tripleo][ci] container pulls failing In-Reply-To: References: Message-ID: On Wed, Aug 19, 2020 at 7:15 AM Luke Short wrote: > > Hey folks, > > All of the latest patches to address this have been merged in but we are still seeing this error randomly in CI jobs that involve an Undercloud or Standalone node. As far as I can tell, the error is appearing less often than before but it is still present making merging new patches difficult. I would be happy to help work towards other possible solutions however I am unsure where to start from here. Any help would be greatly appreciated. > I'm looking at this today but from what I can tell the problem is likely caused by a reduced anonymous query quota from docker.io and our usage of the upstream mirrors. Because the mirrors essentially funnel all requests through a single IP we're hitting limits faster than if we didn't use the mirrors. Due to the nature of the requests, the metadata queries don't get cached due to the authorization header but are subject to the rate limiting. Additionally we're querying the registry to determine which containers we need to update in CI because we limit our updates to a certain set of containers as part of the CI jobs. So there are likely a few different steps forward on this and we can do a few of these together. 1) stop using mirrors (not ideal but likely makes this go away). Alternatively switch stable branches off the mirrors due to a reduced number of executions and leave mirrors configured on master only (or vice versa). 2) reduce the number of jobs 3) stop querying the registry for the update filters (i'm looking into this today) and use the information in tripleo-common first. 4) build containers always instead of fetching from docker.io Thanks, -Alex > Sincerely, > Luke Short > > On Wed, Aug 5, 2020 at 12:26 PM Wesley Hayutin wrote: >> >> >> >> On Wed, Jul 29, 2020 at 4:48 PM Wesley Hayutin wrote: >>> >>> >>> >>> On Wed, Jul 29, 2020 at 4:33 PM Alex Schultz wrote: >>>> >>>> On Wed, Jul 29, 2020 at 7:13 AM Wesley Hayutin wrote: >>>> > >>>> > >>>> > >>>> > On Wed, Jul 29, 2020 at 2:25 AM Bogdan Dobrelya wrote: >>>> >> >>>> >> On 7/28/20 6:09 PM, Wesley Hayutin wrote: >>>> >> > >>>> >> > >>>> >> > On Tue, Jul 28, 2020 at 7:24 AM Emilien Macchi >>> >> > > wrote: >>>> >> > >>>> >> > >>>> >> > >>>> >> > On Tue, Jul 28, 2020 at 9:20 AM Alex Schultz >>> >> > > wrote: >>>> >> > >>>> >> > On Tue, Jul 28, 2020 at 7:13 AM Emilien Macchi >>>> >> > > wrote: >>>> >> > > >>>> >> > > >>>> >> > > >>>> >> > > On Mon, Jul 27, 2020 at 5:27 PM Wesley Hayutin >>>> >> > > wrote: >>>> >> > >> >>>> >> > >> FYI... >>>> >> > >> >>>> >> > >> If you find your jobs are failing with an error similar to >>>> >> > [1], you have been rate limited by docker.io >>>> >> > via the upstream mirror system and have hit [2]. I've been >>>> >> > discussing the issue w/ upstream infra, rdo-infra and a few CI >>>> >> > engineers. 
>>>> >> > >> >>>> >> > >> There are a few ways to mitigate the issue however I don't >>>> >> > see any of the options being completed very quickly so I'm >>>> >> > asking for your patience while this issue is socialized and >>>> >> > resolved. >>>> >> > >> >>>> >> > >> For full transparency we're considering the following options. >>>> >> > >> >>>> >> > >> 1. move off of docker.io to quay.io >>>> >> > >>>> >> > > >>>> >> > > >>>> >> > > quay.io also has API rate limit: >>>> >> > > https://docs.quay.io/issues/429.html >>>> >> > > >>>> >> > > Now I'm not sure about how many requests per seconds one can >>>> >> > do vs the other but this would need to be checked with the quay >>>> >> > team before changing anything. >>>> >> > > Also quay.io had its big downtimes as well, >>>> >> > SLA needs to be considered. >>>> >> > > >>>> >> > >> 2. local container builds for each job in master, possibly >>>> >> > ussuri >>>> >> > > >>>> >> > > >>>> >> > > Not convinced. >>>> >> > > You can look at CI logs: >>>> >> > > - pulling / updating / pushing container images from >>>> >> > docker.io to local registry takes ~10 min on >>>> >> > standalone (OVH) >>>> >> > > - building containers from scratch with updated repos and >>>> >> > pushing them to local registry takes ~29 min on standalone (OVH). >>>> >> > > >>>> >> > >> >>>> >> > >> 3. parent child jobs upstream where rpms and containers will >>>> >> > be build and host artifacts for the child jobs >>>> >> > > >>>> >> > > >>>> >> > > Yes, we need to investigate that. >>>> >> > > >>>> >> > >> >>>> >> > >> 4. remove some portion of the upstream jobs to lower the >>>> >> > impact we have on 3rd party infrastructure. >>>> >> > > >>>> >> > > >>>> >> > > I'm not sure I understand this one, maybe you can give an >>>> >> > example of what could be removed? >>>> >> > >>>> >> > We need to re-evaulate our use of scenarios (e.g. we have two >>>> >> > scenario010's both are non-voting). There's a reason we >>>> >> > historically >>>> >> > didn't want to add more jobs because of these types of resource >>>> >> > constraints. I think we've added new jobs recently and likely >>>> >> > need to >>>> >> > reduce what we run. Additionally we might want to look into reducing >>>> >> > what we run on stable branches as well. >>>> >> > >>>> >> > >>>> >> > Oh... removing jobs (I thought we would remove some steps of the jobs). >>>> >> > Yes big +1, this should be a continuous goal when working on CI, and >>>> >> > always evaluating what we need vs what we run now. >>>> >> > >>>> >> > We should look at: >>>> >> > 1) services deployed in scenarios that aren't worth testing (e.g. >>>> >> > deprecated or unused things) (and deprecate the unused things) >>>> >> > 2) jobs themselves (I don't have any example beside scenario010 but >>>> >> > I'm sure there are more). >>>> >> > -- >>>> >> > Emilien Macchi >>>> >> > >>>> >> > >>>> >> > Thanks Alex, Emilien >>>> >> > >>>> >> > +1 to reviewing the catalog and adjusting things on an ongoing basis. >>>> >> > >>>> >> > All.. it looks like the issues with docker.io were >>>> >> > more of a flare up than a change in docker.io policy >>>> >> > or infrastructure [2]. The flare up started on July 27 8am utc and >>>> >> > ended on July 27 17:00 utc, see screenshots. >>>> >> >>>> >> The numbers of image prepare workers and its exponential fallback >>>> >> intervals should be also adjusted. I've analysed the log snippet [0] for >>>> >> the connection reset counts by workers versus the times the rate >>>> >> limiting was triggered. 
See the details in the reported bug [1]. >>>> >> >>>> >> tl;dr -- for an example 5 sec interval 03:55:31,379 - 03:55:36,110: >>>> >> >>>> >> Conn Reset Counts by a Worker PID: >>>> >> 3 58412 >>>> >> 2 58413 >>>> >> 3 58415 >>>> >> 3 58417 >>>> >> >>>> >> which seems too much of (workers*reconnects) and triggers rate limiting >>>> >> immediately. >>>> >> >>>> >> [0] >>>> >> https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log >>>> >> >>>> >> [1] https://bugs.launchpad.net/tripleo/+bug/1889372 >>>> >> >>>> >> -- >>>> >> Best regards, >>>> >> Bogdan Dobrelya, >>>> >> Irc #bogdando >>>> >> >>>> > >>>> > FYI.. >>>> > >>>> > The issue w/ "too many requests" is back. Expect delays and failures in attempting to merge your patches upstream across all branches. The issue is being tracked as a critical issue. >>>> >>>> Working with the infra folks and we have identified the authorization >>>> header as causing issues when we're rediected from docker.io to >>>> cloudflare. I'll throw up a patch tomorrow to handle this case which >>>> should improve our usage of the cache. It needs some testing against >>>> other registries to ensure that we don't break authenticated fetching >>>> of resources. >>>> >>> Thanks Alex! >> >> >> >> FYI.. we have been revisited by the container pull issue, "too many requests". >> Alex has some fresh patches on it: https://review.opendev.org/#/q/status:open+project:openstack/tripleo-common+topic:bug/1889122 >> >> expect trouble in check and gate: >> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22429%20Client%20Error%3A%20Too%20Many%20Requests%20for%20url%3A%5C%22%20AND%20voting%3A1 >> From eblock at nde.ag Wed Aug 19 13:36:16 2020 From: eblock at nde.ag (Eugen Block) Date: Wed, 19 Aug 2020 13:36:16 +0000 Subject: [neutron] Disable dhcp drop rule Message-ID: <20200819133616.Horde.zhXC_mhe4RdzjbP4Shl1M45@webmail.nde.ag> Hi *, we recently upgraded our Ocata Cloud to Train and also switched from linuxbridge to openvswitch. One of our instances within the cloud works as DHCP server and to make that work we had to comment the respective part in this file on the compute node the instance was running on: /usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_firewall.py Now we tried the same in /usr/lib/python3.6/site-packages/neutron/agent/linux/openvswitch_firewall/firewall.py /usr/lib/python3.6/site-packages/neutron/agent/linux/iptables_firewall.py but restarting openstack-neutron-openvswitch-agent.service didn't drop that rule, the DHCP reply didn't get through. To continue with our work we just dropped it manually, so we get by, but since there have been a couple of years between Ocata and Train, is there any smoother or better way to achieve this? This seems to be a reoccuring request but I couldn't find any updates on this topic. Maybe someone here can shed some light? Is there more to change than those two files I mentioned? Any pointers are highly appreciated! 
Best regards, Eugen From ramishra at redhat.com Wed Aug 19 13:37:29 2020 From: ramishra at redhat.com (Rabi Mishra) Date: Wed, 19 Aug 2020 19:07:29 +0530 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: Message-ID: +1 On Tue, Aug 18, 2020 at 8:03 PM Emilien Macchi wrote: > Hi people, > > If you don't know Takashi yet, he has been involved in the Puppet > OpenStack project and helped *a lot* in its maintenance (and by maintenance > I mean not-funny-work). When our community was getting smaller and smaller, > he joined us and our review velicity went back to eleven. He became a core > maintainer very quickly and we're glad to have him onboard. > > He's also been involved in taking care of puppet-tripleo for a few months > and I believe he has more than enough knowledge on the module to provide > core reviews and be part of the core maintainer group. I also noticed his > amount of contribution (bug fixes, improvements, reviews, etc) in other > TripleO repos and I'm confident he'll make his road to be core in TripleO > at some point. For now I would like him to propose him to be core in > puppet-tripleo. > > As usual, any feedback is welcome but in the meantime I want to thank > Takashi for his work in TripleO and we're super happy to have new > contributors! > > Thanks, > -- > Emilien Macchi > -- Regards, Rabi Mishra -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjeanner at redhat.com Wed Aug 19 13:40:08 2020 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Wed, 19 Aug 2020 15:40:08 +0200 Subject: [tripleo][ci] container pulls failing In-Reply-To: References: Message-ID: On 8/19/20 3:23 PM, Alex Schultz wrote: > On Wed, Aug 19, 2020 at 7:15 AM Luke Short wrote: >> >> Hey folks, >> >> All of the latest patches to address this have been merged in but we are still seeing this error randomly in CI jobs that involve an Undercloud or Standalone node. As far as I can tell, the error is appearing less often than before but it is still present making merging new patches difficult. I would be happy to help work towards other possible solutions however I am unsure where to start from here. Any help would be greatly appreciated. >> > > I'm looking at this today but from what I can tell the problem is > likely caused by a reduced anonymous query quota from docker.io and > our usage of the upstream mirrors. Because the mirrors essentially > funnel all requests through a single IP we're hitting limits faster > than if we didn't use the mirrors. Due to the nature of the requests, > the metadata queries don't get cached due to the authorization header > but are subject to the rate limiting. Additionally we're querying the > registry to determine which containers we need to update in CI because > we limit our updates to a certain set of containers as part of the CI > jobs. > > So there are likely a few different steps forward on this and we can > do a few of these together. > > 1) stop using mirrors (not ideal but likely makes this go away). > Alternatively switch stable branches off the mirrors due to a reduced > number of executions and leave mirrors configured on master only (or > vice versa). might be good, but it might lead to some other issues - docker might want to rate-limit on container owner. I wouldn't be surprised if they go that way in the future. Could be OK as a first "unlocking step". But we should consider 2) and 3). 
> 2) reduce the number of jobs always a good thing to do, +1 > 3) stop querying the registry for the update filters (i'm looking into > this today) and use the information in tripleo-common first. +1 - thanks for looking into it! > 4) build containers always instead of fetching from docker.io meh... last resort, if really nothing else works... It's time consuming and will lead to other issues within the CI (job timeout and the like), wouldn't it? > > Thanks, > -Alex > > > >> Sincerely, >> Luke Short >> >> On Wed, Aug 5, 2020 at 12:26 PM Wesley Hayutin wrote: >>> >>> >>> >>> On Wed, Jul 29, 2020 at 4:48 PM Wesley Hayutin wrote: >>>> >>>> >>>> >>>> On Wed, Jul 29, 2020 at 4:33 PM Alex Schultz wrote: >>>>> >>>>> On Wed, Jul 29, 2020 at 7:13 AM Wesley Hayutin wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Jul 29, 2020 at 2:25 AM Bogdan Dobrelya wrote: >>>>>>> >>>>>>> On 7/28/20 6:09 PM, Wesley Hayutin wrote: >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jul 28, 2020 at 7:24 AM Emilien Macchi >>>>>>> > wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jul 28, 2020 at 9:20 AM Alex Schultz >>>>>>> > wrote: >>>>>>>> >>>>>>>> On Tue, Jul 28, 2020 at 7:13 AM Emilien Macchi >>>>>>>> > wrote: >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > On Mon, Jul 27, 2020 at 5:27 PM Wesley Hayutin >>>>>>>> > wrote: >>>>>>>> >> >>>>>>>> >> FYI... >>>>>>>> >> >>>>>>>> >> If you find your jobs are failing with an error similar to >>>>>>>> [1], you have been rate limited by docker.io >>>>>>>> via the upstream mirror system and have hit [2]. I've been >>>>>>>> discussing the issue w/ upstream infra, rdo-infra and a few CI >>>>>>>> engineers. >>>>>>>> >> >>>>>>>> >> There are a few ways to mitigate the issue however I don't >>>>>>>> see any of the options being completed very quickly so I'm >>>>>>>> asking for your patience while this issue is socialized and >>>>>>>> resolved. >>>>>>>> >> >>>>>>>> >> For full transparency we're considering the following options. >>>>>>>> >> >>>>>>>> >> 1. move off of docker.io to quay.io >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> > quay.io also has API rate limit: >>>>>>>> > https://docs.quay.io/issues/429.html >>>>>>>> > >>>>>>>> > Now I'm not sure about how many requests per seconds one can >>>>>>>> do vs the other but this would need to be checked with the quay >>>>>>>> team before changing anything. >>>>>>>> > Also quay.io had its big downtimes as well, >>>>>>>> SLA needs to be considered. >>>>>>>> > >>>>>>>> >> 2. local container builds for each job in master, possibly >>>>>>>> ussuri >>>>>>>> > >>>>>>>> > >>>>>>>> > Not convinced. >>>>>>>> > You can look at CI logs: >>>>>>>> > - pulling / updating / pushing container images from >>>>>>>> docker.io to local registry takes ~10 min on >>>>>>>> standalone (OVH) >>>>>>>> > - building containers from scratch with updated repos and >>>>>>>> pushing them to local registry takes ~29 min on standalone (OVH). >>>>>>>> > >>>>>>>> >> >>>>>>>> >> 3. parent child jobs upstream where rpms and containers will >>>>>>>> be build and host artifacts for the child jobs >>>>>>>> > >>>>>>>> > >>>>>>>> > Yes, we need to investigate that. >>>>>>>> > >>>>>>>> >> >>>>>>>> >> 4. remove some portion of the upstream jobs to lower the >>>>>>>> impact we have on 3rd party infrastructure. >>>>>>>> > >>>>>>>> > >>>>>>>> > I'm not sure I understand this one, maybe you can give an >>>>>>>> example of what could be removed? >>>>>>>> >>>>>>>> We need to re-evaulate our use of scenarios (e.g. we have two >>>>>>>> scenario010's both are non-voting). 
There's a reason we >>>>>>>> historically >>>>>>>> didn't want to add more jobs because of these types of resource >>>>>>>> constraints. I think we've added new jobs recently and likely >>>>>>>> need to >>>>>>>> reduce what we run. Additionally we might want to look into reducing >>>>>>>> what we run on stable branches as well. >>>>>>>> >>>>>>>> >>>>>>>> Oh... removing jobs (I thought we would remove some steps of the jobs). >>>>>>>> Yes big +1, this should be a continuous goal when working on CI, and >>>>>>>> always evaluating what we need vs what we run now. >>>>>>>> >>>>>>>> We should look at: >>>>>>>> 1) services deployed in scenarios that aren't worth testing (e.g. >>>>>>>> deprecated or unused things) (and deprecate the unused things) >>>>>>>> 2) jobs themselves (I don't have any example beside scenario010 but >>>>>>>> I'm sure there are more). >>>>>>>> -- >>>>>>>> Emilien Macchi >>>>>>>> >>>>>>>> >>>>>>>> Thanks Alex, Emilien >>>>>>>> >>>>>>>> +1 to reviewing the catalog and adjusting things on an ongoing basis. >>>>>>>> >>>>>>>> All.. it looks like the issues with docker.io were >>>>>>>> more of a flare up than a change in docker.io policy >>>>>>>> or infrastructure [2]. The flare up started on July 27 8am utc and >>>>>>>> ended on July 27 17:00 utc, see screenshots. >>>>>>> >>>>>>> The numbers of image prepare workers and its exponential fallback >>>>>>> intervals should be also adjusted. I've analysed the log snippet [0] for >>>>>>> the connection reset counts by workers versus the times the rate >>>>>>> limiting was triggered. See the details in the reported bug [1]. >>>>>>> >>>>>>> tl;dr -- for an example 5 sec interval 03:55:31,379 - 03:55:36,110: >>>>>>> >>>>>>> Conn Reset Counts by a Worker PID: >>>>>>> 3 58412 >>>>>>> 2 58413 >>>>>>> 3 58415 >>>>>>> 3 58417 >>>>>>> >>>>>>> which seems too much of (workers*reconnects) and triggers rate limiting >>>>>>> immediately. >>>>>>> >>>>>>> [0] >>>>>>> https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log >>>>>>> >>>>>>> [1] https://bugs.launchpad.net/tripleo/+bug/1889372 >>>>>>> >>>>>>> -- >>>>>>> Best regards, >>>>>>> Bogdan Dobrelya, >>>>>>> Irc #bogdando >>>>>>> >>>>>> >>>>>> FYI.. >>>>>> >>>>>> The issue w/ "too many requests" is back. Expect delays and failures in attempting to merge your patches upstream across all branches. The issue is being tracked as a critical issue. >>>>> >>>>> Working with the infra folks and we have identified the authorization >>>>> header as causing issues when we're rediected from docker.io to >>>>> cloudflare. I'll throw up a patch tomorrow to handle this case which >>>>> should improve our usage of the cache. It needs some testing against >>>>> other registries to ensure that we don't break authenticated fetching >>>>> of resources. >>>>> >>>> Thanks Alex! >>> >>> >>> >>> FYI.. we have been revisited by the container pull issue, "too many requests". >>> Alex has some fresh patches on it: https://review.opendev.org/#/q/status:open+project:openstack/tripleo-common+topic:bug/1889122 >>> >>> expect trouble in check and gate: >>> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22429%20Client%20Error%3A%20Too%20Many%20Requests%20for%20url%3A%5C%22%20AND%20voting%3A1 >>> > > -- Cédric Jeanneret (He/Him/His) Sr. 
Software Engineer - OpenStack Platform Deployment Framework TC Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From bdobreli at redhat.com Wed Aug 19 13:53:00 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 19 Aug 2020 15:53:00 +0200 Subject: [tripleo][ci] container pulls failing In-Reply-To: References: Message-ID: <75d5fd38-f0bb-01eb-54a4-bfc3f0c42474@redhat.com> On 8/19/20 3:23 PM, Alex Schultz wrote: > On Wed, Aug 19, 2020 at 7:15 AM Luke Short wrote: >> >> Hey folks, >> >> All of the latest patches to address this have been merged in but we are still seeing this error randomly in CI jobs that involve an Undercloud or Standalone node. As far as I can tell, the error is appearing less often than before but it is still present making merging new patches difficult. I would be happy to help work towards other possible solutions however I am unsure where to start from here. Any help would be greatly appreciated. >> > > I'm looking at this today but from what I can tell the problem is > likely caused by a reduced anonymous query quota from docker.io and > our usage of the upstream mirrors. Because the mirrors essentially > funnel all requests through a single IP we're hitting limits faster > than if we didn't use the mirrors. Due to the nature of the requests, > the metadata queries don't get cached due to the authorization header > but are subject to the rate limiting. Additionally we're querying the > registry to determine which containers we need to update in CI because > we limit our updates to a certain set of containers as part of the CI > jobs. > > So there are likely a few different steps forward on this and we can > do a few of these together. > > 1) stop using mirrors (not ideal but likely makes this go away). > Alternatively switch stable branches off the mirrors due to a reduced > number of executions and leave mirrors configured on master only (or > vice versa). Also, the stable/(N-1) branch could use quay.io, while master keeps using docker.io (assuming containers for that N-1 release will be hosted there instead of the dockerhub) > 2) reduce the number of jobs > 3) stop querying the registry for the update filters (i'm looking into > this today) and use the information in tripleo-common first. > 4) build containers always instead of fetching from docker.io > > Thanks, > -Alex > > > >> Sincerely, >> Luke Short >> >> On Wed, Aug 5, 2020 at 12:26 PM Wesley Hayutin wrote: >>> >>> >>> >>> On Wed, Jul 29, 2020 at 4:48 PM Wesley Hayutin wrote: >>>> >>>> >>>> >>>> On Wed, Jul 29, 2020 at 4:33 PM Alex Schultz wrote: >>>>> >>>>> On Wed, Jul 29, 2020 at 7:13 AM Wesley Hayutin wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Jul 29, 2020 at 2:25 AM Bogdan Dobrelya wrote: >>>>>>> >>>>>>> On 7/28/20 6:09 PM, Wesley Hayutin wrote: >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jul 28, 2020 at 7:24 AM Emilien Macchi >>>>>>> > wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jul 28, 2020 at 9:20 AM Alex Schultz >>>>>>> > wrote: >>>>>>>> >>>>>>>> On Tue, Jul 28, 2020 at 7:13 AM Emilien Macchi >>>>>>>> > wrote: >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > On Mon, Jul 27, 2020 at 5:27 PM Wesley Hayutin >>>>>>>> > wrote: >>>>>>>> >> >>>>>>>> >> FYI... 
>>>>>>>> >> >>>>>>>> >> If you find your jobs are failing with an error similar to >>>>>>>> [1], you have been rate limited by docker.io >>>>>>>> via the upstream mirror system and have hit [2]. I've been >>>>>>>> discussing the issue w/ upstream infra, rdo-infra and a few CI >>>>>>>> engineers. >>>>>>>> >> >>>>>>>> >> There are a few ways to mitigate the issue however I don't >>>>>>>> see any of the options being completed very quickly so I'm >>>>>>>> asking for your patience while this issue is socialized and >>>>>>>> resolved. >>>>>>>> >> >>>>>>>> >> For full transparency we're considering the following options. >>>>>>>> >> >>>>>>>> >> 1. move off of docker.io to quay.io >>>>>>>> >>>>>>>> > >>>>>>>> > >>>>>>>> > quay.io also has API rate limit: >>>>>>>> > https://docs.quay.io/issues/429.html >>>>>>>> > >>>>>>>> > Now I'm not sure about how many requests per seconds one can >>>>>>>> do vs the other but this would need to be checked with the quay >>>>>>>> team before changing anything. >>>>>>>> > Also quay.io had its big downtimes as well, >>>>>>>> SLA needs to be considered. >>>>>>>> > >>>>>>>> >> 2. local container builds for each job in master, possibly >>>>>>>> ussuri >>>>>>>> > >>>>>>>> > >>>>>>>> > Not convinced. >>>>>>>> > You can look at CI logs: >>>>>>>> > - pulling / updating / pushing container images from >>>>>>>> docker.io to local registry takes ~10 min on >>>>>>>> standalone (OVH) >>>>>>>> > - building containers from scratch with updated repos and >>>>>>>> pushing them to local registry takes ~29 min on standalone (OVH). >>>>>>>> > >>>>>>>> >> >>>>>>>> >> 3. parent child jobs upstream where rpms and containers will >>>>>>>> be build and host artifacts for the child jobs >>>>>>>> > >>>>>>>> > >>>>>>>> > Yes, we need to investigate that. >>>>>>>> > >>>>>>>> >> >>>>>>>> >> 4. remove some portion of the upstream jobs to lower the >>>>>>>> impact we have on 3rd party infrastructure. >>>>>>>> > >>>>>>>> > >>>>>>>> > I'm not sure I understand this one, maybe you can give an >>>>>>>> example of what could be removed? >>>>>>>> >>>>>>>> We need to re-evaulate our use of scenarios (e.g. we have two >>>>>>>> scenario010's both are non-voting). There's a reason we >>>>>>>> historically >>>>>>>> didn't want to add more jobs because of these types of resource >>>>>>>> constraints. I think we've added new jobs recently and likely >>>>>>>> need to >>>>>>>> reduce what we run. Additionally we might want to look into reducing >>>>>>>> what we run on stable branches as well. >>>>>>>> >>>>>>>> >>>>>>>> Oh... removing jobs (I thought we would remove some steps of the jobs). >>>>>>>> Yes big +1, this should be a continuous goal when working on CI, and >>>>>>>> always evaluating what we need vs what we run now. >>>>>>>> >>>>>>>> We should look at: >>>>>>>> 1) services deployed in scenarios that aren't worth testing (e.g. >>>>>>>> deprecated or unused things) (and deprecate the unused things) >>>>>>>> 2) jobs themselves (I don't have any example beside scenario010 but >>>>>>>> I'm sure there are more). >>>>>>>> -- >>>>>>>> Emilien Macchi >>>>>>>> >>>>>>>> >>>>>>>> Thanks Alex, Emilien >>>>>>>> >>>>>>>> +1 to reviewing the catalog and adjusting things on an ongoing basis. >>>>>>>> >>>>>>>> All.. it looks like the issues with docker.io were >>>>>>>> more of a flare up than a change in docker.io policy >>>>>>>> or infrastructure [2]. The flare up started on July 27 8am utc and >>>>>>>> ended on July 27 17:00 utc, see screenshots. 
>>>>>>> >>>>>>> The numbers of image prepare workers and its exponential fallback >>>>>>> intervals should be also adjusted. I've analysed the log snippet [0] for >>>>>>> the connection reset counts by workers versus the times the rate >>>>>>> limiting was triggered. See the details in the reported bug [1]. >>>>>>> >>>>>>> tl;dr -- for an example 5 sec interval 03:55:31,379 - 03:55:36,110: >>>>>>> >>>>>>> Conn Reset Counts by a Worker PID: >>>>>>> 3 58412 >>>>>>> 2 58413 >>>>>>> 3 58415 >>>>>>> 3 58417 >>>>>>> >>>>>>> which seems too much of (workers*reconnects) and triggers rate limiting >>>>>>> immediately. >>>>>>> >>>>>>> [0] >>>>>>> https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log >>>>>>> >>>>>>> [1] https://bugs.launchpad.net/tripleo/+bug/1889372 >>>>>>> >>>>>>> -- >>>>>>> Best regards, >>>>>>> Bogdan Dobrelya, >>>>>>> Irc #bogdando >>>>>>> >>>>>> >>>>>> FYI.. >>>>>> >>>>>> The issue w/ "too many requests" is back. Expect delays and failures in attempting to merge your patches upstream across all branches. The issue is being tracked as a critical issue. >>>>> >>>>> Working with the infra folks and we have identified the authorization >>>>> header as causing issues when we're rediected from docker.io to >>>>> cloudflare. I'll throw up a patch tomorrow to handle this case which >>>>> should improve our usage of the cache. It needs some testing against >>>>> other registries to ensure that we don't break authenticated fetching >>>>> of resources. >>>>> >>>> Thanks Alex! >>> >>> >>> >>> FYI.. we have been revisited by the container pull issue, "too many requests". >>> Alex has some fresh patches on it: https://review.opendev.org/#/q/status:open+project:openstack/tripleo-common+topic:bug/1889122 >>> >>> expect trouble in check and gate: >>> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22429%20Client%20Error%3A%20Too%20Many%20Requests%20for%20url%3A%5C%22%20AND%20voting%3A1 >>> > -- Best regards, Bogdan Dobrelya, Irc #bogdando From aschultz at redhat.com Wed Aug 19 13:55:09 2020 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 19 Aug 2020 07:55:09 -0600 Subject: [tripleo][ci] container pulls failing In-Reply-To: <75d5fd38-f0bb-01eb-54a4-bfc3f0c42474@redhat.com> References: <75d5fd38-f0bb-01eb-54a4-bfc3f0c42474@redhat.com> Message-ID: On Wed, Aug 19, 2020 at 7:53 AM Bogdan Dobrelya wrote: > > On 8/19/20 3:23 PM, Alex Schultz wrote: > > On Wed, Aug 19, 2020 at 7:15 AM Luke Short wrote: > >> > >> Hey folks, > >> > >> All of the latest patches to address this have been merged in but we are still seeing this error randomly in CI jobs that involve an Undercloud or Standalone node. As far as I can tell, the error is appearing less often than before but it is still present making merging new patches difficult. I would be happy to help work towards other possible solutions however I am unsure where to start from here. Any help would be greatly appreciated. > >> > > > > I'm looking at this today but from what I can tell the problem is > > likely caused by a reduced anonymous query quota from docker.io and > > our usage of the upstream mirrors. Because the mirrors essentially > > funnel all requests through a single IP we're hitting limits faster > > than if we didn't use the mirrors. 
Due to the nature of the requests, > > the metadata queries don't get cached due to the authorization header > > but are subject to the rate limiting. Additionally we're querying the > > registry to determine which containers we need to update in CI because > > we limit our updates to a certain set of containers as part of the CI > > jobs. > > > > So there are likely a few different steps forward on this and we can > > do a few of these together. > > > > 1) stop using mirrors (not ideal but likely makes this go away). > > Alternatively switch stable branches off the mirrors due to a reduced > > number of executions and leave mirrors configured on master only (or > > vice versa). > > Also, the stable/(N-1) branch could use quay.io, while master keeps > using docker.io (assuming containers for that N-1 release will be hosted > there instead of the dockerhub) > quay has its own limits and likely will suffer from a similar problem. > > 2) reduce the number of jobs > > 3) stop querying the registry for the update filters (i'm looking into > > this today) and use the information in tripleo-common first. > > 4) build containers always instead of fetching from docker.io > > > > Thanks, > > -Alex > > > > > > > >> Sincerely, > >> Luke Short > >> > >> On Wed, Aug 5, 2020 at 12:26 PM Wesley Hayutin wrote: > >>> > >>> > >>> > >>> On Wed, Jul 29, 2020 at 4:48 PM Wesley Hayutin wrote: > >>>> > >>>> > >>>> > >>>> On Wed, Jul 29, 2020 at 4:33 PM Alex Schultz wrote: > >>>>> > >>>>> On Wed, Jul 29, 2020 at 7:13 AM Wesley Hayutin wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Wed, Jul 29, 2020 at 2:25 AM Bogdan Dobrelya wrote: > >>>>>>> > >>>>>>> On 7/28/20 6:09 PM, Wesley Hayutin wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> On Tue, Jul 28, 2020 at 7:24 AM Emilien Macchi >>>>>>>> > wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On Tue, Jul 28, 2020 at 9:20 AM Alex Schultz >>>>>>>> > wrote: > >>>>>>>> > >>>>>>>> On Tue, Jul 28, 2020 at 7:13 AM Emilien Macchi > >>>>>>>> > wrote: > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > On Mon, Jul 27, 2020 at 5:27 PM Wesley Hayutin > >>>>>>>> > wrote: > >>>>>>>> >> > >>>>>>>> >> FYI... > >>>>>>>> >> > >>>>>>>> >> If you find your jobs are failing with an error similar to > >>>>>>>> [1], you have been rate limited by docker.io > >>>>>>>> via the upstream mirror system and have hit [2]. I've been > >>>>>>>> discussing the issue w/ upstream infra, rdo-infra and a few CI > >>>>>>>> engineers. > >>>>>>>> >> > >>>>>>>> >> There are a few ways to mitigate the issue however I don't > >>>>>>>> see any of the options being completed very quickly so I'm > >>>>>>>> asking for your patience while this issue is socialized and > >>>>>>>> resolved. > >>>>>>>> >> > >>>>>>>> >> For full transparency we're considering the following options. > >>>>>>>> >> > >>>>>>>> >> 1. move off of docker.io to quay.io > >>>>>>>> > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > quay.io also has API rate limit: > >>>>>>>> > https://docs.quay.io/issues/429.html > >>>>>>>> > > >>>>>>>> > Now I'm not sure about how many requests per seconds one can > >>>>>>>> do vs the other but this would need to be checked with the quay > >>>>>>>> team before changing anything. > >>>>>>>> > Also quay.io had its big downtimes as well, > >>>>>>>> SLA needs to be considered. > >>>>>>>> > > >>>>>>>> >> 2. local container builds for each job in master, possibly > >>>>>>>> ussuri > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > Not convinced. 
> >>>>>>>> > You can look at CI logs: > >>>>>>>> > - pulling / updating / pushing container images from > >>>>>>>> docker.io to local registry takes ~10 min on > >>>>>>>> standalone (OVH) > >>>>>>>> > - building containers from scratch with updated repos and > >>>>>>>> pushing them to local registry takes ~29 min on standalone (OVH). > >>>>>>>> > > >>>>>>>> >> > >>>>>>>> >> 3. parent child jobs upstream where rpms and containers will > >>>>>>>> be build and host artifacts for the child jobs > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > Yes, we need to investigate that. > >>>>>>>> > > >>>>>>>> >> > >>>>>>>> >> 4. remove some portion of the upstream jobs to lower the > >>>>>>>> impact we have on 3rd party infrastructure. > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > I'm not sure I understand this one, maybe you can give an > >>>>>>>> example of what could be removed? > >>>>>>>> > >>>>>>>> We need to re-evaulate our use of scenarios (e.g. we have two > >>>>>>>> scenario010's both are non-voting). There's a reason we > >>>>>>>> historically > >>>>>>>> didn't want to add more jobs because of these types of resource > >>>>>>>> constraints. I think we've added new jobs recently and likely > >>>>>>>> need to > >>>>>>>> reduce what we run. Additionally we might want to look into reducing > >>>>>>>> what we run on stable branches as well. > >>>>>>>> > >>>>>>>> > >>>>>>>> Oh... removing jobs (I thought we would remove some steps of the jobs). > >>>>>>>> Yes big +1, this should be a continuous goal when working on CI, and > >>>>>>>> always evaluating what we need vs what we run now. > >>>>>>>> > >>>>>>>> We should look at: > >>>>>>>> 1) services deployed in scenarios that aren't worth testing (e.g. > >>>>>>>> deprecated or unused things) (and deprecate the unused things) > >>>>>>>> 2) jobs themselves (I don't have any example beside scenario010 but > >>>>>>>> I'm sure there are more). > >>>>>>>> -- > >>>>>>>> Emilien Macchi > >>>>>>>> > >>>>>>>> > >>>>>>>> Thanks Alex, Emilien > >>>>>>>> > >>>>>>>> +1 to reviewing the catalog and adjusting things on an ongoing basis. > >>>>>>>> > >>>>>>>> All.. it looks like the issues with docker.io were > >>>>>>>> more of a flare up than a change in docker.io policy > >>>>>>>> or infrastructure [2]. The flare up started on July 27 8am utc and > >>>>>>>> ended on July 27 17:00 utc, see screenshots. > >>>>>>> > >>>>>>> The numbers of image prepare workers and its exponential fallback > >>>>>>> intervals should be also adjusted. I've analysed the log snippet [0] for > >>>>>>> the connection reset counts by workers versus the times the rate > >>>>>>> limiting was triggered. See the details in the reported bug [1]. > >>>>>>> > >>>>>>> tl;dr -- for an example 5 sec interval 03:55:31,379 - 03:55:36,110: > >>>>>>> > >>>>>>> Conn Reset Counts by a Worker PID: > >>>>>>> 3 58412 > >>>>>>> 2 58413 > >>>>>>> 3 58415 > >>>>>>> 3 58417 > >>>>>>> > >>>>>>> which seems too much of (workers*reconnects) and triggers rate limiting > >>>>>>> immediately. > >>>>>>> > >>>>>>> [0] > >>>>>>> https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log > >>>>>>> > >>>>>>> [1] https://bugs.launchpad.net/tripleo/+bug/1889372 > >>>>>>> > >>>>>>> -- > >>>>>>> Best regards, > >>>>>>> Bogdan Dobrelya, > >>>>>>> Irc #bogdando > >>>>>>> > >>>>>> > >>>>>> FYI.. > >>>>>> > >>>>>> The issue w/ "too many requests" is back. 
Expect delays and failures in attempting to merge your patches upstream across all branches. The issue is being tracked as a critical issue. > >>>>> > >>>>> Working with the infra folks and we have identified the authorization > >>>>> header as causing issues when we're rediected from docker.io to > >>>>> cloudflare. I'll throw up a patch tomorrow to handle this case which > >>>>> should improve our usage of the cache. It needs some testing against > >>>>> other registries to ensure that we don't break authenticated fetching > >>>>> of resources. > >>>>> > >>>> Thanks Alex! > >>> > >>> > >>> > >>> FYI.. we have been revisited by the container pull issue, "too many requests". > >>> Alex has some fresh patches on it: https://review.opendev.org/#/q/status:open+project:openstack/tripleo-common+topic:bug/1889122 > >>> > >>> expect trouble in check and gate: > >>> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22429%20Client%20Error%3A%20Too%20Many%20Requests%20for%20url%3A%5C%22%20AND%20voting%3A1 > >>> > > > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > From bdobreli at redhat.com Wed Aug 19 14:31:10 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 19 Aug 2020 16:31:10 +0200 Subject: [tripleo][ci] container pulls failing In-Reply-To: References: <75d5fd38-f0bb-01eb-54a4-bfc3f0c42474@redhat.com> Message-ID: <6ac5db1c-4750-de2d-627f-edc68e3da516@redhat.com> On 8/19/20 3:55 PM, Alex Schultz wrote: > On Wed, Aug 19, 2020 at 7:53 AM Bogdan Dobrelya wrote: >> >> On 8/19/20 3:23 PM, Alex Schultz wrote: >>> On Wed, Aug 19, 2020 at 7:15 AM Luke Short wrote: >>>> >>>> Hey folks, >>>> >>>> All of the latest patches to address this have been merged in but we are still seeing this error randomly in CI jobs that involve an Undercloud or Standalone node. As far as I can tell, the error is appearing less often than before but it is still present making merging new patches difficult. I would be happy to help work towards other possible solutions however I am unsure where to start from here. Any help would be greatly appreciated. >>>> >>> >>> I'm looking at this today but from what I can tell the problem is >>> likely caused by a reduced anonymous query quota from docker.io and >>> our usage of the upstream mirrors. Because the mirrors essentially >>> funnel all requests through a single IP we're hitting limits faster >>> than if we didn't use the mirrors. Due to the nature of the requests, >>> the metadata queries don't get cached due to the authorization header >>> but are subject to the rate limiting. Additionally we're querying the >>> registry to determine which containers we need to update in CI because >>> we limit our updates to a certain set of containers as part of the CI >>> jobs. >>> >>> So there are likely a few different steps forward on this and we can >>> do a few of these together. >>> >>> 1) stop using mirrors (not ideal but likely makes this go away). >>> Alternatively switch stable branches off the mirrors due to a reduced >>> number of executions and leave mirrors configured on master only (or >>> vice versa). >> >> Also, the stable/(N-1) branch could use quay.io, while master keeps >> using docker.io (assuming containers for that N-1 release will be hosted >> there instead of the dockerhub) >> > > quay has its own limits and likely will suffer from a similar problem. Right. But dropped numbers of total requests sent to each registry could end up with less often rate limiting by either of two. 
> >>> 2) reduce the number of jobs >>> 3) stop querying the registry for the update filters (i'm looking into >>> this today) and use the information in tripleo-common first. >>> 4) build containers always instead of fetching from docker.io There may be a middle-ground solution. Building it only once for each patchset executed in TripleO Zuul pipelines. Transient images, like [0], that can have TTL and self-expire should be used for that purpose. [0] https://idbs-engineering.com/containers/2019/08/27/auto-expiry-quayio-tags.html That would require the zuul jobs with dependencies passing ansible variables to each other, by the execution results. Can that be done? Pretty much like we have it already set in TripleO for tox jobs as a dependency for standalone/multinode jobs. But adding an extra step to prepare such a transient pack of the container images (only to be used for that patchset) and push it to a quay registry hosted elsewhere by TripleO devops folks. Then the jobs that have that dependency met can use those transient images via an ansible variable passed for the jobs. Auto expiration solves the space/lifecycle requirements for the cloud that will be hosting that registry. >>> >>> Thanks, >>> -Alex >>> >>> >>> >>>> Sincerely, >>>> Luke Short >>>> >>>> On Wed, Aug 5, 2020 at 12:26 PM Wesley Hayutin wrote: >>>>> >>>>> >>>>> >>>>> On Wed, Jul 29, 2020 at 4:48 PM Wesley Hayutin wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Jul 29, 2020 at 4:33 PM Alex Schultz wrote: >>>>>>> >>>>>>> On Wed, Jul 29, 2020 at 7:13 AM Wesley Hayutin wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Jul 29, 2020 at 2:25 AM Bogdan Dobrelya wrote: >>>>>>>>> >>>>>>>>> On 7/28/20 6:09 PM, Wesley Hayutin wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Jul 28, 2020 at 7:24 AM Emilien Macchi >>>>>>>>> > wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Jul 28, 2020 at 9:20 AM Alex Schultz >>>>>>>>> > wrote: >>>>>>>>>> >>>>>>>>>> On Tue, Jul 28, 2020 at 7:13 AM Emilien Macchi >>>>>>>>>> > wrote: >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > On Mon, Jul 27, 2020 at 5:27 PM Wesley Hayutin >>>>>>>>>> > wrote: >>>>>>>>>> >> >>>>>>>>>> >> FYI... >>>>>>>>>> >> >>>>>>>>>> >> If you find your jobs are failing with an error similar to >>>>>>>>>> [1], you have been rate limited by docker.io >>>>>>>>>> via the upstream mirror system and have hit [2]. I've been >>>>>>>>>> discussing the issue w/ upstream infra, rdo-infra and a few CI >>>>>>>>>> engineers. >>>>>>>>>> >> >>>>>>>>>> >> There are a few ways to mitigate the issue however I don't >>>>>>>>>> see any of the options being completed very quickly so I'm >>>>>>>>>> asking for your patience while this issue is socialized and >>>>>>>>>> resolved. >>>>>>>>>> >> >>>>>>>>>> >> For full transparency we're considering the following options. >>>>>>>>>> >> >>>>>>>>>> >> 1. move off of docker.io to quay.io >>>>>>>>>> >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > quay.io also has API rate limit: >>>>>>>>>> > https://docs.quay.io/issues/429.html >>>>>>>>>> > >>>>>>>>>> > Now I'm not sure about how many requests per seconds one can >>>>>>>>>> do vs the other but this would need to be checked with the quay >>>>>>>>>> team before changing anything. >>>>>>>>>> > Also quay.io had its big downtimes as well, >>>>>>>>>> SLA needs to be considered. >>>>>>>>>> > >>>>>>>>>> >> 2. local container builds for each job in master, possibly >>>>>>>>>> ussuri >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > Not convinced. 
>>>>>>>>>> > You can look at CI logs: >>>>>>>>>> > - pulling / updating / pushing container images from >>>>>>>>>> docker.io to local registry takes ~10 min on >>>>>>>>>> standalone (OVH) >>>>>>>>>> > - building containers from scratch with updated repos and >>>>>>>>>> pushing them to local registry takes ~29 min on standalone (OVH). >>>>>>>>>> > >>>>>>>>>> >> >>>>>>>>>> >> 3. parent child jobs upstream where rpms and containers will >>>>>>>>>> be build and host artifacts for the child jobs >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > Yes, we need to investigate that. >>>>>>>>>> > >>>>>>>>>> >> >>>>>>>>>> >> 4. remove some portion of the upstream jobs to lower the >>>>>>>>>> impact we have on 3rd party infrastructure. >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > I'm not sure I understand this one, maybe you can give an >>>>>>>>>> example of what could be removed? >>>>>>>>>> >>>>>>>>>> We need to re-evaulate our use of scenarios (e.g. we have two >>>>>>>>>> scenario010's both are non-voting). There's a reason we >>>>>>>>>> historically >>>>>>>>>> didn't want to add more jobs because of these types of resource >>>>>>>>>> constraints. I think we've added new jobs recently and likely >>>>>>>>>> need to >>>>>>>>>> reduce what we run. Additionally we might want to look into reducing >>>>>>>>>> what we run on stable branches as well. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Oh... removing jobs (I thought we would remove some steps of the jobs). >>>>>>>>>> Yes big +1, this should be a continuous goal when working on CI, and >>>>>>>>>> always evaluating what we need vs what we run now. >>>>>>>>>> >>>>>>>>>> We should look at: >>>>>>>>>> 1) services deployed in scenarios that aren't worth testing (e.g. >>>>>>>>>> deprecated or unused things) (and deprecate the unused things) >>>>>>>>>> 2) jobs themselves (I don't have any example beside scenario010 but >>>>>>>>>> I'm sure there are more). >>>>>>>>>> -- >>>>>>>>>> Emilien Macchi >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks Alex, Emilien >>>>>>>>>> >>>>>>>>>> +1 to reviewing the catalog and adjusting things on an ongoing basis. >>>>>>>>>> >>>>>>>>>> All.. it looks like the issues with docker.io were >>>>>>>>>> more of a flare up than a change in docker.io policy >>>>>>>>>> or infrastructure [2]. The flare up started on July 27 8am utc and >>>>>>>>>> ended on July 27 17:00 utc, see screenshots. >>>>>>>>> >>>>>>>>> The numbers of image prepare workers and its exponential fallback >>>>>>>>> intervals should be also adjusted. I've analysed the log snippet [0] for >>>>>>>>> the connection reset counts by workers versus the times the rate >>>>>>>>> limiting was triggered. See the details in the reported bug [1]. >>>>>>>>> >>>>>>>>> tl;dr -- for an example 5 sec interval 03:55:31,379 - 03:55:36,110: >>>>>>>>> >>>>>>>>> Conn Reset Counts by a Worker PID: >>>>>>>>> 3 58412 >>>>>>>>> 2 58413 >>>>>>>>> 3 58415 >>>>>>>>> 3 58417 >>>>>>>>> >>>>>>>>> which seems too much of (workers*reconnects) and triggers rate limiting >>>>>>>>> immediately. >>>>>>>>> >>>>>>>>> [0] >>>>>>>>> https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log >>>>>>>>> >>>>>>>>> [1] https://bugs.launchpad.net/tripleo/+bug/1889372 >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Best regards, >>>>>>>>> Bogdan Dobrelya, >>>>>>>>> Irc #bogdando >>>>>>>>> >>>>>>>> >>>>>>>> FYI.. >>>>>>>> >>>>>>>> The issue w/ "too many requests" is back. 
Expect delays and failures in attempting to merge your patches upstream across all branches. The issue is being tracked as a critical issue. >>>>>>> >>>>>>> Working with the infra folks and we have identified the authorization >>>>>>> header as causing issues when we're rediected from docker.io to >>>>>>> cloudflare. I'll throw up a patch tomorrow to handle this case which >>>>>>> should improve our usage of the cache. It needs some testing against >>>>>>> other registries to ensure that we don't break authenticated fetching >>>>>>> of resources. >>>>>>> >>>>>> Thanks Alex! >>>>> >>>>> >>>>> >>>>> FYI.. we have been revisited by the container pull issue, "too many requests". >>>>> Alex has some fresh patches on it: https://review.opendev.org/#/q/status:open+project:openstack/tripleo-common+topic:bug/1889122 >>>>> >>>>> expect trouble in check and gate: >>>>> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22429%20Client%20Error%3A%20Too%20Many%20Requests%20for%20url%3A%5C%22%20AND%20voting%3A1 >>>>> >>> >> >> >> -- >> Best regards, >> Bogdan Dobrelya, >> Irc #bogdando >> > -- Best regards, Bogdan Dobrelya, Irc #bogdando From bdobreli at redhat.com Wed Aug 19 14:34:34 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 19 Aug 2020 16:34:34 +0200 Subject: [tripleo][ci] container pulls failing In-Reply-To: <6ac5db1c-4750-de2d-627f-edc68e3da516@redhat.com> References: <75d5fd38-f0bb-01eb-54a4-bfc3f0c42474@redhat.com> <6ac5db1c-4750-de2d-627f-edc68e3da516@redhat.com> Message-ID: On 8/19/20 4:31 PM, Bogdan Dobrelya wrote: > On 8/19/20 3:55 PM, Alex Schultz wrote: >> On Wed, Aug 19, 2020 at 7:53 AM Bogdan Dobrelya >> wrote: >>> >>> On 8/19/20 3:23 PM, Alex Schultz wrote: >>>> On Wed, Aug 19, 2020 at 7:15 AM Luke Short wrote: >>>>> >>>>> Hey folks, >>>>> >>>>> All of the latest patches to address this have been merged in but >>>>> we are still seeing this error randomly in CI jobs that involve an >>>>> Undercloud or Standalone node. As far as I can tell, the error is >>>>> appearing less often than before but it is still present making >>>>> merging new patches difficult. I would be happy to help work >>>>> towards other possible solutions however I am unsure where to start >>>>> from here. Any help would be greatly appreciated. >>>>> >>>> >>>> I'm looking at this today but from what I can tell the problem is >>>> likely caused by a reduced anonymous query quota from docker.io and >>>> our usage of the upstream mirrors.  Because the mirrors essentially >>>> funnel all requests through a single IP we're hitting limits faster >>>> than if we didn't use the mirrors. Due to the nature of the requests, >>>> the metadata queries don't get cached due to the authorization header >>>> but are subject to the rate limiting.  Additionally we're querying the >>>> registry to determine which containers we need to update in CI because >>>> we limit our updates to a certain set of containers as part of the CI >>>> jobs. >>>> >>>> So there are likely a few different steps forward on this and we can >>>> do a few of these together. >>>> >>>> 1) stop using mirrors (not ideal but likely makes this go away). >>>> Alternatively switch stable branches off the mirrors due to a reduced >>>> number of executions and leave mirrors configured on master only (or >>>> vice versa). 
>>> >>> Also, the stable/(N-1) branch could use quay.io, while master keeps >>> using docker.io (assuming containers for that N-1 release will be hosted >>> there instead of the dockerhub) >>> >> >> quay has its own limits and likely will suffer from a similar problem. > > Right. But dropped numbers of total requests sent to each registry could > end up with less often rate limiting by either of two. > >> >>>> 2) reduce the number of jobs >>>> 3) stop querying the registry for the update filters (i'm looking into >>>> this today) and use the information in tripleo-common first. >>>> 4) build containers always instead of fetching from docker.io > > There may be a middle-ground solution. Building it only once for each > patchset executed in TripleO Zuul pipelines. Transient images, like [0], > that can have TTL and self-expire should be used for that purpose. > > [0] > https://idbs-engineering.com/containers/2019/08/27/auto-expiry-quayio-tags.html > > > That would require the zuul jobs with dependencies passing ansible > variables to each other, by the execution results. Can that be done? ...or even simpler than that, predictable names can be created for those transient images, like /_ > > Pretty much like we have it already set in TripleO for tox jobs as a > dependency for standalone/multinode jobs. But adding an extra step to > prepare such a transient pack of the container images (only to be used > for that patchset) and push it to a quay registry hosted elsewhere by > TripleO devops folks. > > Then the jobs that have that dependency met can use those transient > images via an ansible variable passed for the jobs. Auto expiration > solves the space/lifecycle requirements for the cloud that will be > hosting that registry. > >>>> >>>> Thanks, >>>> -Alex >>>> >>>> >>>> >>>>> Sincerely, >>>>>       Luke Short >>>>> >>>>> On Wed, Aug 5, 2020 at 12:26 PM Wesley Hayutin >>>>> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Jul 29, 2020 at 4:48 PM Wesley Hayutin >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Wed, Jul 29, 2020 at 4:33 PM Alex Schultz >>>>>>> wrote: >>>>>>>> >>>>>>>> On Wed, Jul 29, 2020 at 7:13 AM Wesley Hayutin >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Jul 29, 2020 at 2:25 AM Bogdan Dobrelya >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On 7/28/20 6:09 PM, Wesley Hayutin wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Jul 28, 2020 at 7:24 AM Emilien Macchi >>>>>>>>>>> >>>>>>>>>> > wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>       On Tue, Jul 28, 2020 at 9:20 AM Alex Schultz >>>>>>>>>>> >>>>>>>>>>       > wrote: >>>>>>>>>>> >>>>>>>>>>>           On Tue, Jul 28, 2020 at 7:13 AM Emilien Macchi >>>>>>>>>>>           > >>>>>>>>>>> wrote: >>>>>>>>>>>            > >>>>>>>>>>>            > >>>>>>>>>>>            > >>>>>>>>>>>            > On Mon, Jul 27, 2020 at 5:27 PM Wesley Hayutin >>>>>>>>>>>           > >>>>>>>>>>> wrote: >>>>>>>>>>>            >> >>>>>>>>>>>            >> FYI... >>>>>>>>>>>            >> >>>>>>>>>>>            >> If you find your jobs are failing with an error >>>>>>>>>>> similar to >>>>>>>>>>>           [1], you have been rate limited by docker.io >>>>>>>>>>> >>>>>>>>>>>           via the upstream mirror system and have hit [2]. >>>>>>>>>>> I've been >>>>>>>>>>>           discussing the issue w/ upstream infra, rdo-infra >>>>>>>>>>> and a few CI >>>>>>>>>>>           engineers. 
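To make that idea concrete, here is a rough sketch of how a predictable, self-expiring per-patchset image could be produced. It assumes a change/patchset-based naming scheme; the registry, image name and environment variable names are made up for illustration, and TripleO's real builds go through kolla/buildah rather than a bare docker CLI.

# Rough sketch only -- names, variables and registry are hypothetical.
import os
import subprocess

registry = "quay.example.org/tripleo-transient"     # hypothetical registry
change = os.environ.get("ZUUL_CHANGE", "741228")    # illustrative variable
patchset = os.environ.get("ZUUL_PATCHSET", "6")     # illustrative variable

# Predictable per-patchset tag, e.g. .../nova-api:741228_6
tag = f"{registry}/nova-api:{change}_{patchset}"

# quay.io expires tags whose image carries the quay.expires-after label
# (the auto-expiry mechanism from link [0] above), so the transient image
# cleans itself up without extra tooling.
subprocess.run(
    ["docker", "build", "--label", "quay.expires-after=48h", "-t", tag, "."],
    check=True,
)
subprocess.run(["docker", "push", tag], check=True)

With a naming scheme like that, jobs depending on the build job only need the change/patchset pair to reconstruct the same tag, rather than passing artifacts or variables between jobs.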
>>>>>>>>>>>            >> >>>>>>>>>>>            >> There are a few ways to mitigate the issue >>>>>>>>>>> however I don't >>>>>>>>>>>           see any of the options being completed very quickly >>>>>>>>>>> so I'm >>>>>>>>>>>           asking for your patience while this issue is >>>>>>>>>>> socialized and >>>>>>>>>>>           resolved. >>>>>>>>>>>            >> >>>>>>>>>>>            >> For full transparency we're considering the >>>>>>>>>>> following options. >>>>>>>>>>>            >> >>>>>>>>>>>            >> 1. move off of docker.io to >>>>>>>>>>> quay.io >>>>>>>>>>>           >>>>>>>>>>>            > >>>>>>>>>>>            > >>>>>>>>>>>            > quay.io also has API rate limit: >>>>>>>>>>>            > https://docs.quay.io/issues/429.html >>>>>>>>>>>            > >>>>>>>>>>>            > Now I'm not sure about how many requests per >>>>>>>>>>> seconds one can >>>>>>>>>>>           do vs the other but this would need to be checked >>>>>>>>>>> with the quay >>>>>>>>>>>           team before changing anything. >>>>>>>>>>>            > Also quay.io had its big >>>>>>>>>>> downtimes as well, >>>>>>>>>>>           SLA needs to be considered. >>>>>>>>>>>            > >>>>>>>>>>>            >> 2. local container builds for each job in >>>>>>>>>>> master, possibly >>>>>>>>>>>           ussuri >>>>>>>>>>>            > >>>>>>>>>>>            > >>>>>>>>>>>            > Not convinced. >>>>>>>>>>>            > You can look at CI logs: >>>>>>>>>>>            > - pulling / updating / pushing container images >>>>>>>>>>> from >>>>>>>>>>>           docker.io to local registry >>>>>>>>>>> takes ~10 min on >>>>>>>>>>>           standalone (OVH) >>>>>>>>>>>            > - building containers from scratch with updated >>>>>>>>>>> repos and >>>>>>>>>>>           pushing them to local registry takes ~29 min on >>>>>>>>>>> standalone (OVH). >>>>>>>>>>>            > >>>>>>>>>>>            >> >>>>>>>>>>>            >> 3. parent child jobs upstream where rpms and >>>>>>>>>>> containers will >>>>>>>>>>>           be build and host artifacts for the child jobs >>>>>>>>>>>            > >>>>>>>>>>>            > >>>>>>>>>>>            > Yes, we need to investigate that. >>>>>>>>>>>            > >>>>>>>>>>>            >> >>>>>>>>>>>            >> 4. remove some portion of the upstream jobs to >>>>>>>>>>> lower the >>>>>>>>>>>           impact we have on 3rd party infrastructure. >>>>>>>>>>>            > >>>>>>>>>>>            > >>>>>>>>>>>            > I'm not sure I understand this one, maybe you >>>>>>>>>>> can give an >>>>>>>>>>>           example of what could be removed? >>>>>>>>>>> >>>>>>>>>>>           We need to re-evaulate our use of scenarios (e.g. >>>>>>>>>>> we have two >>>>>>>>>>>           scenario010's both are non-voting).  There's a >>>>>>>>>>> reason we >>>>>>>>>>>           historically >>>>>>>>>>>           didn't want to add more jobs because of these types >>>>>>>>>>> of resource >>>>>>>>>>>           constraints.  I think we've added new jobs recently >>>>>>>>>>> and likely >>>>>>>>>>>           need to >>>>>>>>>>>           reduce what we run. Additionally we might want to >>>>>>>>>>> look into reducing >>>>>>>>>>>           what we run on stable branches as well. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>       Oh... removing jobs (I thought we would remove some >>>>>>>>>>> steps of the jobs). >>>>>>>>>>>       Yes big +1, this should be a continuous goal when >>>>>>>>>>> working on CI, and >>>>>>>>>>>       always evaluating what we need vs what we run now. 
>>>>>>>>>>> >>>>>>>>>>>       We should look at: >>>>>>>>>>>       1) services deployed in scenarios that aren't worth >>>>>>>>>>> testing (e.g. >>>>>>>>>>>       deprecated or unused things) (and deprecate the unused >>>>>>>>>>> things) >>>>>>>>>>>       2) jobs themselves (I don't have any example beside >>>>>>>>>>> scenario010 but >>>>>>>>>>>       I'm sure there are more). >>>>>>>>>>>       -- >>>>>>>>>>>       Emilien Macchi >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks Alex, Emilien >>>>>>>>>>> >>>>>>>>>>> +1 to reviewing the catalog and adjusting things on an >>>>>>>>>>> ongoing basis. >>>>>>>>>>> >>>>>>>>>>> All.. it looks like the issues with docker.io >>>>>>>>>>> were >>>>>>>>>>> more of a flare up than a change in docker.io >>>>>>>>>>> policy >>>>>>>>>>> or infrastructure [2].  The flare up started on July 27 8am >>>>>>>>>>> utc and >>>>>>>>>>> ended on July 27 17:00 utc, see screenshots. >>>>>>>>>> >>>>>>>>>> The numbers of image prepare workers and its exponential fallback >>>>>>>>>> intervals should be also adjusted. I've analysed the log >>>>>>>>>> snippet [0] for >>>>>>>>>> the connection reset counts by workers versus the times the rate >>>>>>>>>> limiting was triggered. See the details in the reported bug [1]. >>>>>>>>>> >>>>>>>>>> tl;dr -- for an example 5 sec interval 03:55:31,379 - >>>>>>>>>> 03:55:36,110: >>>>>>>>>> >>>>>>>>>> Conn Reset Counts by a Worker PID: >>>>>>>>>>          3 58412 >>>>>>>>>>          2 58413 >>>>>>>>>>          3 58415 >>>>>>>>>>          3 58417 >>>>>>>>>> >>>>>>>>>> which seems too much of (workers*reconnects) and triggers rate >>>>>>>>>> limiting >>>>>>>>>> immediately. >>>>>>>>>> >>>>>>>>>> [0] >>>>>>>>>> https://13b475d7469ed7126ee9-28d4ad440f46f2186fe3f98464e57890.ssl.cf1.rackcdn.com/741228/6/check/tripleo-ci-centos-8-undercloud-containers/8e47836/logs/undercloud/var/log/tripleo-container-image-prepare.log >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> [1] https://bugs.launchpad.net/tripleo/+bug/1889372 >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Best regards, >>>>>>>>>> Bogdan Dobrelya, >>>>>>>>>> Irc #bogdando >>>>>>>>>> >>>>>>>>> >>>>>>>>> FYI.. >>>>>>>>> >>>>>>>>> The issue w/ "too many requests" is back.  Expect delays and >>>>>>>>> failures in attempting to merge your patches upstream across >>>>>>>>> all branches.   The issue is being tracked as a critical issue. >>>>>>>> >>>>>>>> Working with the infra folks and we have identified the >>>>>>>> authorization >>>>>>>> header as causing issues when we're rediected from docker.io to >>>>>>>> cloudflare. I'll throw up a patch tomorrow to handle this case >>>>>>>> which >>>>>>>> should improve our usage of the cache.  It needs some testing >>>>>>>> against >>>>>>>> other registries to ensure that we don't break authenticated >>>>>>>> fetching >>>>>>>> of resources. >>>>>>>> >>>>>>> Thanks Alex! >>>>>> >>>>>> >>>>>> >>>>>> FYI.. we have been revisited by the container pull issue, "too >>>>>> many requests". 
>>>>>> Alex has some fresh patches on it: >>>>>> https://review.opendev.org/#/q/status:open+project:openstack/tripleo-common+topic:bug/1889122 >>>>>> >>>>>> >>>>>> expect trouble in check and gate: >>>>>> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22429%20Client%20Error%3A%20Too%20Many%20Requests%20for%20url%3A%5C%22%20AND%20voting%3A1 >>>>>> >>>>>> >>>> >>> >>> >>> -- >>> Best regards, >>> Bogdan Dobrelya, >>> Irc #bogdando >>> >> > > -- Best regards, Bogdan Dobrelya, Irc #bogdando From tobias.urdin at binero.com Wed Aug 19 14:38:42 2020 From: tobias.urdin at binero.com (Tobias Urdin) Date: Wed, 19 Aug 2020 14:38:42 +0000 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: , Message-ID: <1597847922905.32607@binero.com> ?Big +1 from an outsider :)) Best regards Tobias ________________________________ From: Rabi Mishra Sent: Wednesday, August 19, 2020 3:37 PM To: Emilien Macchi Cc: openstack-discuss Subject: Re: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo +1 On Tue, Aug 18, 2020 at 8:03 PM Emilien Macchi > wrote: Hi people, If you don't know Takashi yet, he has been involved in the Puppet OpenStack project and helped *a lot* in its maintenance (and by maintenance I mean not-funny-work). When our community was getting smaller and smaller, he joined us and our review velicity went back to eleven. He became a core maintainer very quickly and we're glad to have him onboard. He's also been involved in taking care of puppet-tripleo for a few months and I believe he has more than enough knowledge on the module to provide core reviews and be part of the core maintainer group. I also noticed his amount of contribution (bug fixes, improvements, reviews, etc) in other TripleO repos and I'm confident he'll make his road to be core in TripleO at some point. For now I would like him to propose him to be core in puppet-tripleo. As usual, any feedback is welcome but in the meantime I want to thank Takashi for his work in TripleO and we're super happy to have new contributors! Thanks, -- Emilien Macchi -- Regards, Rabi Mishra -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Aug 19 14:52:17 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 19 Aug 2020 09:52:17 -0500 Subject: [simplification] Making ask.openstack.org read-only In-Reply-To: References: <20200818235247.GA341779@fedora19.localdomain> <20200819000359.mhz43jvop5vtcgct@yuggoth.org> Message-ID: <1740734d7ec.ed463568528549.2297830654288026424@ghanshyammann.com> ---- On Tue, 18 Aug 2020 19:35:05 -0500 Michael Johnson wrote ---- > Yes! ask.openstack.org is no fun to attempt to be helpful on (see > e-mail notification issues, etc.). > +1 on making it RO and redirect users to StackOverflow or ML(fast response).. > I would like to ask that we put together some sort of guide and/or > guidence for how to use stack overflow efficiently for OpenStack > questions. I.e. some well known or defined tags that we recommend > people use when asking questions. This would be similar to the tags we > use for the openstack discuss list. > > I see that there is already a trend for "openstack-nova" > "openstack-horizon", etc. This works for me. In FC SIG, we check a set of tags for new contributors in ask.o.o [1] which we can switch to do in StackOverflow. Similarly, we can start monitoring the popular tags for project/area-specific. 
[1] https://wiki.openstack.org/wiki/First_Contact_SIG#Biweekly_Homework -gmann > > This way we can setup notifications for these tags and be much more > efficient at getting people answers. > > Thanks Thierry for moving this forward! > > Michael > > On Tue, Aug 18, 2020 at 5:10 PM Jeremy Stanley wrote: > > > > On 2020-08-19 09:52:47 +1000 (+1000), Ian Wienand wrote: > > [...] > > > *If* we were to restore it now, it looks like 0.11 branch comes with > > > an upstream Dockerfile [1]; there's lots of examples now in > > > system-config of similar container-based production sites and this > > > could fit in. > > > > > > This makes it significantly easier than trying to build up everything > > > it requires from scratch, and if upstream keep their container > > > compatible (a big if...) theoretically less work to keep updated. > > [...] > > > > Which also brings up another point: right now we're running it on > > Ubuntu Xenial (16.04 LTS) which is scheduled to reach EOL early next > > year, and the tooling we're using to deploy it isn't going to work > > on newer Ubuntu releases. Even keeping it up in a read-only state is > > timebound to how long we can safely keep its server online. If we > > switch ask.openstack.org to read-only now, I would still plan to > > turn it off entirely on or before April 1, 2021. > > -- > > Jeremy Stanley > > From fungi at yuggoth.org Wed Aug 19 14:53:52 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 19 Aug 2020 14:53:52 +0000 Subject: [tripleo][ci] container pulls failing In-Reply-To: References: Message-ID: <20200819145352.ezxr6kvvpsq3tgui@yuggoth.org> On 2020-08-19 15:40:08 +0200 (+0200), Cédric Jeanneret wrote: > On 8/19/20 3:23 PM, Alex Schultz wrote: [...] > > 1) stop using mirrors (not ideal but likely makes this go away). > > Alternatively switch stable branches off the mirrors due to a reduced > > number of executions and leave mirrors configured on master only (or > > vice versa). > > might be good, but it might lead to some other issues - docker might > want to rate-limit on container owner. I wouldn't be surprised if they > go that way in the future. Could be OK as a first "unlocking step". [...] Be aware that there is another side effect: right now the images are being served from a cache within the same environment as the test nodes, and instead your jobs will begin fetching them over the Internet. This may mean longer average job run time, and a higher percentage of download failures due to network hiccups (whether these will be of a greater frequency than the API rate limit blocking, it's hard to guess). It also necessarily means significantly more bandwidth utilization for our resource donors, particularly as TripleO consumes far more job resources than any other project already. I wonder if there's a middle ground: finding a way to use the cache for fetching images, but connecting straight to Dockerhub when you're querying metadata? It sounds like the metadata requests represent a majority of the actual Dockerhub API calls anyway, and can't be cached regardless. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From aschultz at redhat.com Wed Aug 19 15:14:27 2020 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 19 Aug 2020 09:14:27 -0600 Subject: [tripleo][ci] container pulls failing In-Reply-To: <20200819145352.ezxr6kvvpsq3tgui@yuggoth.org> References: <20200819145352.ezxr6kvvpsq3tgui@yuggoth.org> Message-ID: On Wed, Aug 19, 2020 at 8:59 AM Jeremy Stanley wrote: > > On 2020-08-19 15:40:08 +0200 (+0200), Cédric Jeanneret wrote: > > On 8/19/20 3:23 PM, Alex Schultz wrote: > [...] > > > 1) stop using mirrors (not ideal but likely makes this go away). > > > Alternatively switch stable branches off the mirrors due to a reduced > > > number of executions and leave mirrors configured on master only (or > > > vice versa). > > > > might be good, but it might lead to some other issues - docker might > > want to rate-limit on container owner. I wouldn't be surprised if they > > go that way in the future. Could be OK as a first "unlocking step". > [...] > > Be aware that there is another side effect: right now the images are > being served from a cache within the same environment as the test > nodes, and instead your jobs will begin fetching them over the > Internet. This may mean longer average job run time, and a higher > percentage of download failures due to network hiccups (whether > these will be of a greater frequency than the API rate limit > blocking, it's hard to guess). It also necessarily means > significantly more bandwidth utilization for our resource donors, > particularly as TripleO consumes far more job resources than any > other project already. > Yea I know so we're trying to find a solution that doesn't make it worse. It would be great if we could have any visibility into the cache hit ratio/requests going through these mirrors to know if we have changes that are improving things or making it worse. > I wonder if there's a middle ground: finding a way to use the cache > for fetching images, but connecting straight to Dockerhub when > you're querying metadata? It sounds like the metadata requests > represent a majority of the actual Dockerhub API calls anyway, and > can't be cached regardless. Maybe, but at the moment i'm working on not even doing the requests at all which would be better. Next i'll look into that but the mirror config is handled before we even start requesting things > -- > Jeremy Stanley From fungi at yuggoth.org Wed Aug 19 15:45:40 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 19 Aug 2020 15:45:40 +0000 Subject: [tripleo][ci]i[infra] container pulls failing In-Reply-To: References: <20200819145352.ezxr6kvvpsq3tgui@yuggoth.org> Message-ID: <20200819154540.es5nru4xmzj637rx@yuggoth.org> On 2020-08-19 09:14:27 -0600 (-0600), Alex Schultz wrote: [...] > It would be great if we could have any visibility into the cache > hit ratio/requests going through these mirrors to know if we have > changes that are improving things or making it worse. [...] Normally we avoid publishing raw Web server logs to protect the privacy of our users, but in this case we might make an exception because the mirrors are only intended for use by our public Zuul jobs and Nodepool image builds. It's worth bringing up with the rest of the team, for sure. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Wed Aug 19 16:27:01 2020 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 19 Aug 2020 11:27:01 -0500 Subject: [neutron] Disable dhcp drop rule In-Reply-To: <20200819133616.Horde.zhXC_mhe4RdzjbP4Shl1M45@webmail.nde.ag> References: <20200819133616.Horde.zhXC_mhe4RdzjbP4Shl1M45@webmail.nde.ag> Message-ID: <4ea4eb17-0373-e1ab-6f45-c35cb67723e0@nemebean.com> On 8/19/20 8:36 AM, Eugen Block wrote: > Hi *, > > we recently upgraded our Ocata Cloud to Train and also switched from > linuxbridge to openvswitch. > > One of our instances within the cloud works as DHCP server and to make > that work we had to comment the respective part in this file on the > compute node the instance was running on: > > /usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_firewall.py > > > Now we tried the same in > > /usr/lib/python3.6/site-packages/neutron/agent/linux/openvswitch_firewall/firewall.py > > /usr/lib/python3.6/site-packages/neutron/agent/linux/iptables_firewall.py > > but restarting openstack-neutron-openvswitch-agent.service didn't drop > that rule, the DHCP reply didn't get through. To continue with our work > we just dropped it manually, so we get by, but since there have been a > couple of years between Ocata and Train, is there any smoother or better > way to achieve this? This seems to be a reoccuring request but I > couldn't find any updates on this topic. Maybe someone here can shed > some light? Is there more to change than those two files I mentioned? You might try disabling port-security on the instance's port. That's what we use in OVB to allow a DHCP server in an instance now. neutron port-update [port-id] --port_security_enabled=False That will drop all port security for that instance, not just the DHCP rule, but on the other hand it leaves the DHCP rule in place for any instances you don't want running DHCP servers. > > Any pointers are highly appreciated! > > Best regards, > Eugen > > From eblock at nde.ag Wed Aug 19 16:42:11 2020 From: eblock at nde.ag (Eugen Block) Date: Wed, 19 Aug 2020 16:42:11 +0000 Subject: [neutron] Disable dhcp drop rule In-Reply-To: <4ea4eb17-0373-e1ab-6f45-c35cb67723e0@nemebean.com> References: <20200819133616.Horde.zhXC_mhe4RdzjbP4Shl1M45@webmail.nde.ag> <4ea4eb17-0373-e1ab-6f45-c35cb67723e0@nemebean.com> Message-ID: <20200819164211.Horde.jx_dhmZz16BL7k9bIumarOA@webmail.nde.ag> That sounds promising, thank you! I had noticed that option but didn’t have a chance to look closer into it. I’ll try that tomorrow. Thanks for the tip! Zitat von Ben Nemec : > On 8/19/20 8:36 AM, Eugen Block wrote: >> Hi *, >> >> we recently upgraded our Ocata Cloud to Train and also switched >> from linuxbridge to openvswitch. >> >> One of our instances within the cloud works as DHCP server and to >> make that work we had to comment the respective part in this file >> on the compute node the instance was running on: >> >> /usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_firewall.py >> >> >> Now we tried the same in >> >> /usr/lib/python3.6/site-packages/neutron/agent/linux/openvswitch_firewall/firewall.py >> /usr/lib/python3.6/site-packages/neutron/agent/linux/iptables_firewall.py >> >> but restarting openstack-neutron-openvswitch-agent.service didn't >> drop that rule, the DHCP reply didn't get through. 
To continue with >> our work we just dropped it manually, so we get by, but since there >> have been a couple of years between Ocata and Train, is there any >> smoother or better way to achieve this? This seems to be a >> reoccuring request but I couldn't find any updates on this topic. >> Maybe someone here can shed some light? Is there more to change >> than those two files I mentioned? > > You might try disabling port-security on the instance's port. That's > what we use in OVB to allow a DHCP server in an instance now. > > neutron port-update [port-id] --port_security_enabled=False > > That will drop all port security for that instance, not just the > DHCP rule, but on the other hand it leaves the DHCP rule in place > for any instances you don't want running DHCP servers. > >> >> Any pointers are highly appreciated! >> >> Best regards, >> Eugen >> >> From rosmaita.fossdev at gmail.com Wed Aug 19 16:49:53 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 19 Aug 2020 12:49:53 -0400 Subject: [cinder][ops] "Berlin" 2020 Virtual Forum - Cinder brainstorming Message-ID: <8f233106-a48a-af66-6aa4-42316fd3a669@gmail.com> This message is aimed at anyone with an interest in the OpenStack Block Storage Service, whether as an operator, a user, or a developer. Like all the other teams, Cinder would like to get feedback from operators and users about the current state of the software, get some ideas about what should be in the next release, and have some strategic discussion about The Future. So if you have some ideas you'd like to be considered, feel free to propose a topic: https://etherpad.opendev.org/p/2020-Wallaby-cinder-brainstorming You only need to add a sentence or two describing your topic and it doesn't have to be very polished, so if you have an idea, just go to the etherpad and slap it down now while you're thinking about it. The deadline for proposals to the Foundation is 14 September, so if you could get your idea down on the etherpad before the Cinder weekly meeting on Wednesday 9 September 14:00 UTC, that will give the Cinder team time to look them over. thanks! brian From amuller at redhat.com Wed Aug 19 16:50:04 2020 From: amuller at redhat.com (Assaf Muller) Date: Wed, 19 Aug 2020 12:50:04 -0400 Subject: [neutron][ops] API for viewing HA router states In-Reply-To: References: <6613245.ccrTHCtBl7@antares> Message-ID: On Tue, Aug 18, 2020 at 10:30 PM Mohammed Naser wrote: > > On Tue, Aug 18, 2020 at 10:53 AM Assaf Muller wrote: > > > > On Tue, Aug 18, 2020 at 8:12 AM Jonas Schäfer > > wrote: > > > > > > Hi Mohammed and all, > > > > > > On Montag, 17. August 2020 14:01:55 CEST Mohammed Naser wrote: > > > > Over the past few days, we were troubleshooting an issue that ended up > > > > having a root cause where keepalived has somehow ended up active in > > > > two different L3 agents. We've yet to find the root cause of how this > > > > happened but removing it and adding it resolved the issue for us. > > > > > > We’ve also seen that behaviour occasionally. The root cause is also unclear > > > for us (so we would’ve love to hear about that). > > > > Insert shameless plug for the Neutron OVN backend. One of it's > > advantages is that it's L3 HA architecture is cleaner and more > > scalable (this is coming from the dude that wrote the L3 HA code we're > > all suffering from =D). 
The ML2/OVS L3 HA architecture has it's issues > > - I've seen it work at 100's of customer sites at scale, so I don't > > want to knock it too much, but just a day ago I got an internal > > customer ticket about keepalived falling over on a particular router > > that has 200 floating IPs. It works but it's not perfect. I'm sure the > > OVN implementation isn't either but it's simply cleaner and has less > > moving parts. It uses BFD to monitor the tunnel endpoints, so failover > > is faster too. Plus, it doesn't use keepalived. > > > > OVN is something we're looking at and we're very excited about, > unfortunately, there seems to be a bunch of gaps in documentation Can you elaborate? If you can write down a list of gaps we can address that. > right now as well as a lot of the migration scripts to OVN are > TripleO-y. > > So it'll take time to get us there, but yes, OVN simplifies this greatly > > > > We have anecdotal evidence > > > that a rabbitmq failure was involved, although that makes no sense to me > > > personally. Other causes may be incorrectly cleaned-up namespaces (for > > > example, when you kill or hard-restart the l3 agent, the namespaces will stay > > > around, possibly with the IP address assigned; the keepalived on the other l3 > > > agents will not see the VRRP advertisments anymore and will ALSO assign the IP > > > address. This will also be rectified by a restart always and may require > > > manual namespace cleanup with a tool, a node reboot or an agent disable/enable > > > cycle.). > > > > > > > As we work on improving our monitoring, we wanted to implement > > > > something that gets us the info of # of active routers to check if > > > > there's a router that has >1 active L3 agent but it's hard because > > > > hitting the /l3-agents endpoint on _every_ single router hurts a lot > > > > on performance. > > > > > > > > Is there something else that we can watch which might be more > > > > productive? FYI -- this all goes in the open and will end up inside > > > > the openstack-exporter: > > > > https://github.com/openstack-exporter/openstack-exporter and the Helm > > > > charts will end up with the alerts: > > > > https://github.com/openstack-exporter/helm-charts > > > > > > While I don’t think it fits in your openstack-exporter design, we are > > > currently using the attached script (which we also hereby publish under the > > > terms of the Apache 2.0 license [1]). (Sorry, I lack the time to cleanly > > > publish it somewhere right now.) > > > > > > It checks the state files maintained by the L3 agent conglomerate and exports > > > metrics about the master-ness of the routers as prometheus metrics. > > > > > > Note that this is slightly dangerous since the router IDs are high-cardinality > > > and using that as a label value in Prometheus is discouraged; you may not want > > > to do this in a public cloud setting. > > > > > > Either way: This allows us to alert on routers where there is not exactly one > > > master state. Downside is that this requires the thing to run locally on the > > > l3 agent nodes. Upside is that it is very efficient, and will also show the > > > master state in some cases where the router was not cleaned up properly (e.g. > > > because the l3 agent and its keepaliveds were killed). 
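The script itself is not included in this archive, but a minimal sketch of the approach Jonas describes might look like the following. It assumes the L3 agent writes each HA router's keepalived state to a file named "state" under /var/lib/neutron/ha_confs/<router_id>/ (the directory comes from the agent's state_path and can differ per deployment), that the file contains values such as "master" or "backup", and that the prometheus_client library is installed; it is an illustration of the idea, not the script published above.

# l3_ha_state_exporter.py -- minimal sketch, not the script referenced above.
# Assumptions: state files live at /var/lib/neutron/ha_confs/<router_id>/state
# and contain "master", "backup" or similar; prometheus_client is available.
import os
import time
from prometheus_client import Gauge, start_http_server

HA_CONF_DIR = "/var/lib/neutron/ha_confs"  # assumed default state_path layout
EXPORTER_PORT = 9103                       # arbitrary port for this sketch

ROUTER_MASTER = Gauge(
    "neutron_l3_ha_router_master",
    "1 if keepalived on this node reports master for the router, else 0",
    ["router_id"],  # router_id labels are high-cardinality, as cautioned above
)

def scrape():
    if not os.path.isdir(HA_CONF_DIR):
        return
    for router_id in os.listdir(HA_CONF_DIR):
        state_file = os.path.join(HA_CONF_DIR, router_id, "state")
        try:
            with open(state_file) as handle:
                state = handle.read().strip().lower()
        except OSError:
            continue  # router directory without a readable state file
        ROUTER_MASTER.labels(router_id=router_id).set(1 if state == "master" else 0)

if __name__ == "__main__":
    start_http_server(EXPORTER_PORT)
    while True:
        scrape()
        time.sleep(30)

Summing neutron_l3_ha_router_master per router across all network nodes and alerting when the result is not exactly 1 reproduces the "exactly one master" check described above.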
> > > kind regards, > > > Jonas > > > > > > [1]: http://www.apache.org/licenses/LICENSE-2.0 > > > -- > > > Jonas Schäfer > > > DevOps Engineer > > > > > > Cloud&Heat Technologies GmbH > > > Königsbrücker Straße 96 | 01099 Dresden > > > +49 351 479 367 37 > > > jonas.schaefer at cloudandheat.com | www.cloudandheat.com > > > > > > New Service: > > > Managed Kubernetes designed for AI & ML > > > https://managed-kubernetes.cloudandheat.com/ > > > > > > Commercial Register: District Court Dresden > > > Register Number: HRB 30549 > > > VAT ID No.: DE281093504 > > > Managing Director: Nicolas Röhrs > > > Authorized signatory: Dr. Marius Feldmann > > > Authorized signatory: Kristina Rübenkamp > > > > > > > -- > Mohammed Naser > VEXXHOST, Inc. > From ildiko.vancsa at gmail.com Wed Aug 19 17:12:26 2020 From: ildiko.vancsa at gmail.com (Ildiko Vancsa) Date: Wed, 19 Aug 2020 19:12:26 +0200 Subject: [upstream-institute] Virtual training mentor sign-up and planning In-Reply-To: References: Message-ID: Hi, It is a friendly reminder to please sign up on the wiki[1] if you are interested in participating in the virtual version of the Upstream Institute training as a mentor. We will start planning soon to ensure that we have the format and the materials adjusted to the new circumstances. Please let me know if you have any questions. Thanks. Ildikó [1] https://wiki.openstack.org/wiki/OpenStack_Upstream_Institute_Occasions#Virtual_Training.2C_2020 > On Aug 10, 2020, at 14:31, Ildiko Vancsa wrote: > > Hi mentors, > > I’m reaching out to you as the next Open Infrastructure Summit is approaching quickly so it is time to start planning for the next OpenStack Upstream Institute. > > As the next event will be virtual we will need to re-think the training format and experience to make sure our audience gets the most out of it. > > I created a new entry on our training occasions wiki page here: https://wiki.openstack.org/wiki/OpenStack_Upstream_Institute_Occasions#Virtual_Training.2C_2020 > > Please __sign up on the wiki__ if you would like to participate in the preparations and running the virtual training. > > As it is still vacation season I think we can target the last week of August or first week of September to have the first prep meeting and can collect ideas here or discuss them on the #openstack-upstream-institute IRC channel on Freenode in the meantime. > > Please let me know if you have any questions or need any help with signing up on the wiki. > > Thanks and Best Regards, > Ildikó > > From jasowang at redhat.com Wed Aug 19 02:38:13 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 19 Aug 2020 10:38:13 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818091628.GC20215@redhat.com> References: <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> Message-ID: <5aea4ae6-e8c8-1120-453d-20a78cee6b20@redhat.com> On 2020/8/18 下午5:16, Daniel P. Berrangé wrote: > Your mail came through as HTML-only so all the quoting and attribution > is mangled / lost now :-( My bad, sorry. > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: >> On 2020/8/18 下午4:55, Daniel P. 
Berrangé wrote: >> >> On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: >> >> On 2020/8/14 下午1:16, Yan Zhao wrote: >> >> On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: >> >> On 2020/8/10 下午3:46, Yan Zhao wrote: >> we actually can also retrieve the same information through sysfs, .e.g >> >> |- [path to device] >> |--- migration >> | |--- self >> | | |---device_api >> | | |---mdev_type >> | | |---software_version >> | | |---device_id >> | | |---aggregator >> | |--- compatible >> | | |---device_api >> | | |---mdev_type >> | | |---software_version >> | | |---device_id >> | | |---aggregator >> >> >> Yes but: >> >> - You need one file per attribute (one syscall for one attribute) >> - Attribute is coupled with kobject >> >> All of above seems unnecessary. >> >> Another point, as we discussed in another thread, it's really hard to make >> sure the above API work for all types of devices and frameworks. So having a >> vendor specific API looks much better. >> >> From the POV of userspace mgmt apps doing device compat checking / migration, >> we certainly do NOT want to use different vendor specific APIs. We want to >> have an API that can be used / controlled in a standard manner across vendors. >> >> Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a >> long debate on sysfs vs devlink). So if we go with sysfs, at least two >> APIs needs to be supported ... > NB, I was not questioning devlink vs sysfs directly. If devlink is related > to netlink, I can't say I'm enthusiastic as IMKE sysfs is easier to deal > with. I don't know enough about devlink to have much of an opinion though. > The key point was that I don't want the userspace APIs we need to deal with > to be vendor specific. > > What I care about is that we have a *standard* userspace API for performing > device compatibility checking / state migration, for use by QEMU/libvirt/ > OpenStack, such that we can write code without countless vendor specific > code paths. > > If there is vendor specific stuff on the side, that's fine as we can ignore > that, but the core functionality for device compat / migration needs to be > standardized. Ok, I agree with you. Thanks > > Regards, > Daniel From jasowang at redhat.com Wed Aug 19 02:45:57 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 19 Aug 2020 10:45:57 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> Message-ID: <934c8d2a-a34e-6c68-0e53-5de2a8f49d19@redhat.com> On 2020/8/18 下午5:32, Parav Pandit wrote: > Hi Jason, > > From: Jason Wang > Sent: Tuesday, August 18, 2020 2:32 PM > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > On 2020/8/14 下午1:16, Yan Zhao wrote: > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > On 2020/8/10 下午3:46, Yan Zhao wrote: > driver is it handled by? 
> It looks that the devlink is for network device specific, and in > devlink.h, it says > include/uapi/linux/devlink.h - Network physical device Netlink > interface, > Actually not, I think there used to have some discussion last year and the > conclusion is to remove this comment. > > [...] > >> Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a long debate on sysfs vs devlink). So if we go with sysfs, at least two APIs needs to be supported ... > We had internal discussion and proposal on this topic. > I wanted Eli Cohen to be back from vacation on Wed 8/19, but since this is active discussion right now, I will share the thoughts anyway. > > Here are the initial round of thoughts and proposal. > > User requirements: > --------------------------- > 1. User might want to create one or more vdpa devices per PCI PF/VF/SF. > 2. User might want to create one or more vdpa devices of type net/blk or other type. > 3. User needs to look and dump at the health of the queues for debug purpose. > 4. During vdpa net device creation time, user may have to provide a MAC address and/or VLAN. > 5. User should be able to set/query some of the attributes for debug/compatibility check > 6. When user wants to create vdpa device, it needs to know which device supports creation. > 7. User should be able to see the queue statistics of doorbells, wqes etc regardless of class type Note that wqes is probably not something common in all of the vendors. > > To address above requirements, there is a need of vendor agnostic tool, so that user can create/config/delete vdpa device(s) regardless of the vendor. > > Hence, > We should have a tool that lets user do it. > > Examples: > ------------- > (a) List parent devices which supports creating vdpa devices. > It also shows which class types supported by this parent device. > In below command two parent devices support vdpa device creation. > First is PCI VF whose bdf is 03.00:5. > Second is PCI SF whose name is mlx5_sf.1 > > $ vdpa list pd What did "pd" mean? > pci/0000:03.00:5 > class_supports > net vdpa > virtbus/mlx5_sf.1 So creating mlx5_sf.1 is the charge of devlink? > class_supports > net > > (b) Now add a vdpa device and show the device. > $ vdpa dev add pci/0000:03.00:5 type net So if you want to create devices types other than vdpa on pci/0000:03.00:5 it needs some synchronization with devlink? > $ vdpa dev show > vdpa0 at pci/0000:03.00:5 type net state inactive maxqueues 8 curqueues 4 > > (c) vdpa dev show features vdpa0 > iommu platform > version 1 > > (d) dump vdpa statistics > $ vdpa dev stats show vdpa0 > kickdoorbells 10 > wqes 100 > > (e) Now delete a vdpa device previously created. > $ vdpa dev del vdpa0 > > Design overview: > ----------------------- > 1. Above example tool runs over netlink socket interface. > 2. This enables users to return meaningful error strings in addition to code so that user can be more informed. > Often this is missing in ioctl()/configfs/sysfs interfaces. > 3. This tool over netlink enables syscaller tests to be more usable like other subsystems to keep kernel robust > 4. This provides vendor agnostic view of all vdpa capable parent and vdpa devices. > > 5. Each driver which supports vdpa device creation, registers the parent device along with supported classes. > > FAQs: > -------- > 1. Why not using devlink? > Ans: Because as vdpa echo system grows, devlink will fall short of extending vdpa specific params, attributes, stats. 
This should be fine but it's still not clear to me the difference between a vdpa netlink and a vdpa object in devlink. Thanks > > 2. Why not use sysfs? > Ans: > (a) Because running syscaller infrastructure can run well over netlink sockets like it runs for several subsystem. > (b) it lacks the ability to return error messages. Doing via kernel log is just doesn't work. > (c) Why not using some ioctl()? It will reinvent the wheel of netlink that has TLV formats for several attributes. > > 3. Why not configs? > It follows same limitation as that of sysfs. > > Low level design and driver APIS: > -------------------------------------------- > Will post once we discuss this further. From jasowang at redhat.com Wed Aug 19 02:54:07 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 19 Aug 2020 10:54:07 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818113652.5d81a392.cohuck@redhat.com> References: <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> Message-ID: On 2020/8/18 下午5:36, Cornelia Huck wrote: > On Tue, 18 Aug 2020 10:16:28 +0100 > Daniel P. Berrangé wrote: > >> On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: >>> On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: >>> >>> On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: >>> >>> On 2020/8/14 下午1:16, Yan Zhao wrote: >>> >>> On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: >>> >>> On 2020/8/10 下午3:46, Yan Zhao wrote: >>> we actually can also retrieve the same information through sysfs, .e.g >>> >>> |- [path to device] >>> |--- migration >>> | |--- self >>> | | |---device_api >>> | | |---mdev_type >>> | | |---software_version >>> | | |---device_id >>> | | |---aggregator >>> | |--- compatible >>> | | |---device_api >>> | | |---mdev_type >>> | | |---software_version >>> | | |---device_id >>> | | |---aggregator >>> >>> >>> Yes but: >>> >>> - You need one file per attribute (one syscall for one attribute) >>> - Attribute is coupled with kobject > Is that really that bad? You have the device with an embedded kobject > anyway, and you can just put things into an attribute group? Yes, but all of this could be done via devlink(netlink) as well with low overhead. > > [Also, I think that self/compatible split in the example makes things > needlessly complex. Shouldn't semantic versioning and matching already > cover nearly everything? That's my question as well. E.g for virtio, versioning may not even work, some of features are negotiated independently: Source features: A, B, C Dest features: A, B, C, E We just need to make sure the dest features is a superset of source then all set. > I would expect very few cases that are more > complex than that. Maybe the aggregation stuff, but I don't think we > need that self/compatible split for that, either.] > >>> All of above seems unnecessary. >>> >>> Another point, as we discussed in another thread, it's really hard to make >>> sure the above API work for all types of devices and frameworks. So having a >>> vendor specific API looks much better. >>> >>> From the POV of userspace mgmt apps doing device compat checking / migration, >>> we certainly do NOT want to use different vendor specific APIs. 
We want to >>> have an API that can be used / controlled in a standard manner across vendors. >>> >>> Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a >>> long debate on sysfs vs devlink). So if we go with sysfs, at least two >>> APIs needs to be supported ... >> NB, I was not questioning devlink vs sysfs directly. If devlink is related >> to netlink, I can't say I'm enthusiastic as IMKE sysfs is easier to deal >> with. I don't know enough about devlink to have much of an opinion though. >> The key point was that I don't want the userspace APIs we need to deal with >> to be vendor specific. > From what I've seen of devlink, it seems quite nice; but I understand > why sysfs might be easier to deal with (especially as there's likely > already a lot of code using it.) > > I understand that some users would like devlink because it is already > widely used for network drivers (and some others), but I don't think > the majority of devices used with vfio are network (although certainly > a lot of them are.) Note that though devlink could be popular only in network devices, netlink is widely used by a lot of subsystesm (e.g SCSI). Thanks > >> What I care about is that we have a *standard* userspace API for performing >> device compatibility checking / state migration, for use by QEMU/libvirt/ >> OpenStack, such that we can write code without countless vendor specific >> code paths. >> >> If there is vendor specific stuff on the side, that's fine as we can ignore >> that, but the core functionality for device compat / migration needs to be >> standardized. > To summarize: > - choose one of sysfs or devlink > - have a common interface, with a standardized way to add > vendor-specific attributes > ? From yan.y.zhao at intel.com Wed Aug 19 03:30:35 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Wed, 19 Aug 2020 11:30:35 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> Message-ID: <20200819033035.GA21172@joy-OptiPlex-7040> On Tue, Aug 18, 2020 at 09:39:24AM +0000, Parav Pandit wrote: > Hi Cornelia, > > > From: Cornelia Huck > > Sent: Tuesday, August 18, 2020 3:07 PM > > To: Daniel P. Berrangé > > Cc: Jason Wang ; Yan Zhao > > ; kvm at vger.kernel.org; libvir-list at redhat.com; > > qemu-devel at nongnu.org; Kirti Wankhede ; > > eauger at redhat.com; xin-ran.wang at intel.com; corbet at lwn.net; openstack- > > discuss at lists.openstack.org; shaohe.feng at intel.com; kevin.tian at intel.com; > > Parav Pandit ; jian-feng.ding at intel.com; > > dgilbert at redhat.com; zhenyuw at linux.intel.com; hejie.xu at intel.com; > > bao.yumeng at zte.com.cn; Alex Williamson ; > > eskultet at redhat.com; smooney at redhat.com; intel-gvt- > > dev at lists.freedesktop.org; Jiri Pirko ; > > dinechin at redhat.com; devel at ovirt.org > > Subject: Re: device compatibility interface for live migration with assigned > > devices > > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > Daniel P. Berrangé wrote: > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > On 2020/8/18 下午4:55, Daniel P. 
Berrangé wrote: > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > > we actually can also retrieve the same information through sysfs, > > > > .e.g > > > > > > > > |- [path to device] > > > > |--- migration > > > > | |--- self > > > > | | |---device_api > > > > | | |---mdev_type > > > > | | |---software_version > > > > | | |---device_id > > > > | | |---aggregator > > > > | |--- compatible > > > > | | |---device_api > > > > | | |---mdev_type > > > > | | |---software_version > > > > | | |---device_id > > > > | | |---aggregator > > > > > > > > > > > > Yes but: > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > - Attribute is coupled with kobject > > > > Is that really that bad? You have the device with an embedded kobject > > anyway, and you can just put things into an attribute group? > > > > [Also, I think that self/compatible split in the example makes things > > needlessly complex. Shouldn't semantic versioning and matching already > > cover nearly everything? I would expect very few cases that are more > > complex than that. Maybe the aggregation stuff, but I don't think we need > > that self/compatible split for that, either.] > > > > > > > > > > All of above seems unnecessary. > > > > > > > > Another point, as we discussed in another thread, it's really hard > > > > to make sure the above API work for all types of devices and > > > > frameworks. So having a vendor specific API looks much better. > > > > > > > > From the POV of userspace mgmt apps doing device compat checking / > > > > migration, we certainly do NOT want to use different vendor > > > > specific APIs. We want to have an API that can be used / controlled in a > > standard manner across vendors. > > > > > > > > Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a > > > > long debate on sysfs vs devlink). So if we go with sysfs, at least two > > > > APIs needs to be supported ... > > > > > > NB, I was not questioning devlink vs sysfs directly. If devlink is > > > related to netlink, I can't say I'm enthusiastic as IMKE sysfs is > > > easier to deal with. I don't know enough about devlink to have much of an > > opinion though. > > > The key point was that I don't want the userspace APIs we need to deal > > > with to be vendor specific. > > > > From what I've seen of devlink, it seems quite nice; but I understand why > > sysfs might be easier to deal with (especially as there's likely already a lot of > > code using it.) > > > > I understand that some users would like devlink because it is already widely > > used for network drivers (and some others), but I don't think the majority of > > devices used with vfio are network (although certainly a lot of them are.) > > > > > > > > What I care about is that we have a *standard* userspace API for > > > performing device compatibility checking / state migration, for use by > > > QEMU/libvirt/ OpenStack, such that we can write code without countless > > > vendor specific code paths. > > > > > > If there is vendor specific stuff on the side, that's fine as we can > > > ignore that, but the core functionality for device compat / migration > > > needs to be standardized. 
> > > > To summarize: > > - choose one of sysfs or devlink > > - have a common interface, with a standardized way to add > > vendor-specific attributes > > ? > > Please refer to my previous email which has more example and details. hi Parav, the example is based on a new vdpa tool running over netlink, not based on devlink, right? For vfio migration compatibility, we have to deal with both mdev and physical pci devices, I don't think it's a good idea to write a new tool for it, given we are able to retrieve the same info from sysfs and there's already an mdevctl from Alex (https://github.com/mdevctl/mdevctl). hi All, could we decide that sysfs is the interface that every VFIO vendor driver needs to provide in order to support vfio live migration, otherwise the userspace management tool would not list the device into the compatible list? if that's true, let's move to the standardizing of the sysfs interface. (1) content common part: (must) - software_version: (in major.minor.bugfix scheme) - device_api: vfio-pci or vfio-ccw ... - type: mdev type for mdev device or a signature for physical device which is a counterpart for mdev type. device api specific part: (must) - pci id: pci id of mdev parent device or pci id of physical pci device (device_api is vfio-pci) - subchannel_type (device_api is vfio-ccw) vendor driver specific part: (optional) - aggregator - chpid_type - remote_url NOTE: vendors are free to add attributes in this part with a restriction that this attribute is able to be configured with the same name in sysfs too. e.g. for aggregator, there must be a sysfs attribute in device node /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, so that the userspace tool is able to configure the target device according to source device's aggregator attribute. (2) where and structure proposal 1: |- [path to device] |--- migration | |--- self | | |-software_version | | |-device_api | | |-type | | |-[pci_id or subchannel_type] | | |- | |--- compatible | | |-software_version | | |-device_api | | |-type | | |-[pci_id or subchannel_type] | | |- multiple compatible is allowed. attributes should be ASCII text files, preferably with only one value per file. proposal 2: use bin_attribute. |- [path to device] |--- migration | |--- self | |--- compatible so we can continue use multiline format. e.g. cat compatible software_version=0.1.0 device_api=vfio_pci type=i915-GVTg_V5_{val1:int:1,2,4,8} pci_id=80865963 aggregator={val1}/2 Thanks Yan From parav at nvidia.com Wed Aug 19 05:26:58 2020 From: parav at nvidia.com (Parav Pandit) Date: Wed, 19 Aug 2020 05:26:58 +0000 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <934c8d2a-a34e-6c68-0e53-5de2a8f49d19@redhat.com> References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <934c8d2a-a34e-6c68-0e53-5de2a8f49d19@redhat.com> Message-ID: > From: Jason Wang > Sent: Wednesday, August 19, 2020 8:16 AM > On 2020/8/18 下午5:32, Parav Pandit wrote: > > Hi Jason, > > > > From: Jason Wang > > Sent: Tuesday, August 18, 2020 2:32 PM > > > > > > On 2020/8/18 下午4:55, Daniel P. 
Berrangé wrote: > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > driver is it handled by? > > It looks that the devlink is for network device specific, and in > > devlink.h, it says include/uapi/linux/devlink.h - Network physical > > device Netlink interface, Actually not, I think there used to have > > some discussion last year and the conclusion is to remove this > > comment. > > > > [...] > > > >> Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a long > debate on sysfs vs devlink). So if we go with sysfs, at least two APIs needs to be > supported ... > > We had internal discussion and proposal on this topic. > > I wanted Eli Cohen to be back from vacation on Wed 8/19, but since this is > active discussion right now, I will share the thoughts anyway. > > > > Here are the initial round of thoughts and proposal. > > > > User requirements: > > --------------------------- > > 1. User might want to create one or more vdpa devices per PCI PF/VF/SF. > > 2. User might want to create one or more vdpa devices of type net/blk or > other type. > > 3. User needs to look and dump at the health of the queues for debug purpose. > > 4. During vdpa net device creation time, user may have to provide a MAC > address and/or VLAN. > > 5. User should be able to set/query some of the attributes for > > debug/compatibility check 6. When user wants to create vdpa device, it needs > to know which device supports creation. > > 7. User should be able to see the queue statistics of doorbells, wqes > > etc regardless of class type > > > Note that wqes is probably not something common in all of the vendors. Yes. I virtq descriptors stats is better to monitor the virtqueues. > > > > > > To address above requirements, there is a need of vendor agnostic tool, so > that user can create/config/delete vdpa device(s) regardless of the vendor. > > > > Hence, > > We should have a tool that lets user do it. > > > > Examples: > > ------------- > > (a) List parent devices which supports creating vdpa devices. > > It also shows which class types supported by this parent device. > > In below command two parent devices support vdpa device creation. > > First is PCI VF whose bdf is 03.00:5. > > Second is PCI SF whose name is mlx5_sf.1 > > > > $ vdpa list pd > > > What did "pd" mean? > Parent device which support creation of one or more vdpa devices. In a system there can be multiple parent devices which may be support vdpa creation. User should be able to know which devices support it, and when user creates a vdpa device, it tells which parent device to use for creation as done in below vdpa dev add example. > > > pci/0000:03.00:5 > > class_supports > > net vdpa > > virtbus/mlx5_sf.1 > > > So creating mlx5_sf.1 is the charge of devlink? > Yes. But here vdpa tool is working at the parent device identifier {bus+name} instead of devlink identifier. > > > class_supports > > net > > > > (b) Now add a vdpa device and show the device. > > $ vdpa dev add pci/0000:03.00:5 type net > > > So if you want to create devices types other than vdpa on > pci/0000:03.00:5 it needs some synchronization with devlink? Please refer to FAQ-1, a new tool is not linked to devlink because vdpa will evolve with time and devlink will fall short. So no, it doesn't need any synchronization with devlink. As long as parent device exist, user can create it. 
All synchronization will be within drivers/vdpa/vdpa.c This user interface is exposed via new netlink family by doing genl_register_family() with new name "vdpa" in drivers/vdpa/vdpa.c. > > > > $ vdpa dev show > > vdpa0 at pci/0000:03.00:5 type net state inactive maxqueues 8 curqueues 4 > > > > (c) vdpa dev show features vdpa0 > > iommu platform > > version 1 > > > > (d) dump vdpa statistics > > $ vdpa dev stats show vdpa0 > > kickdoorbells 10 > > wqes 100 > > > > (e) Now delete a vdpa device previously created. > > $ vdpa dev del vdpa0 > > > > Design overview: > > ----------------------- > > 1. Above example tool runs over netlink socket interface. > > 2. This enables users to return meaningful error strings in addition to code so > that user can be more informed. > > Often this is missing in ioctl()/configfs/sysfs interfaces. > > 3. This tool over netlink enables syscaller tests to be more usable like other > subsystems to keep kernel robust > > 4. This provides vendor agnostic view of all vdpa capable parent and vdpa > devices. > > > > 5. Each driver which supports vdpa device creation, registers the parent device > along with supported classes. > > > > FAQs: > > -------- > > 1. Why not using devlink? > > Ans: Because as vdpa echo system grows, devlink will fall short of extending > vdpa specific params, attributes, stats. > > > This should be fine but it's still not clear to me the difference > between a vdpa netlink and a vdpa object in devlink. > The difference is a vdpa specific tool work at the parent device level. It is likely more appropriate to because it can self-contain everything needed to create/delete devices, view/set features, stats. Trying to put that in devlink will fall short as devlink doesn’t have vdpa definitions. Typically when a class/device subsystem grows, its own tool is wiser like iproute2/ip, iproute2/tc, iproute2/rdma. From parav at nvidia.com Wed Aug 19 05:58:12 2020 From: parav at nvidia.com (Parav Pandit) Date: Wed, 19 Aug 2020 05:58:12 +0000 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200819033035.GA21172@joy-OptiPlex-7040> References: <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> Message-ID: > From: Yan Zhao > Sent: Wednesday, August 19, 2020 9:01 AM > On Tue, Aug 18, 2020 at 09:39:24AM +0000, Parav Pandit wrote: > > Please refer to my previous email which has more example and details. > hi Parav, > the example is based on a new vdpa tool running over netlink, not based on > devlink, right? Right. > For vfio migration compatibility, we have to deal with both mdev and physical > pci devices, I don't think it's a good idea to write a new tool for it, given we are > able to retrieve the same info from sysfs and there's already an mdevctl from mdev attribute should be visible in the mdev's sysfs tree. I do not propose to write a new mdev tool over netlink. I am sorry if I implied that with my suggestion of vdpa tool. If underlying device is vdpa, mdev might be able to understand vdpa device and query from it and populate in mdev sysfs tree. The vdpa tool I propose is usable even without mdevs. vdpa tool's role is to create one or more vdpa devices and place on the "vdpa" bus which is the lowest layer here. 
Additionally this tool let user query virtqueue stats, db stats. When a user creates vdpa net device, user may need to configure features of the vdpa device such as VIRTIO_NET_F_MAC, default VIRTIO_NET_F_MTU. These are vdpa level features, attributes. Mdev is layer above it. > Alex > (https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub. > com%2Fmdevctl%2Fmdevctl&data=02%7C01%7Cparav%40nvidia.com%7C > 0c2691d430304f5ea11308d843f2d84e%7C43083d15727340c1b7db39efd9ccc17 > a%7C0%7C0%7C637334057571911357&sdata=KxH7PwxmKyy9JODut8BWr > LQyOBylW00%2Fyzc4rEvjUvA%3D&reserved=0). > Sorry for above link mangling. Our mail server is still transitioning due to company acquisition. I am less familiar on below points to comment. > hi All, > could we decide that sysfs is the interface that every VFIO vendor driver needs > to provide in order to support vfio live migration, otherwise the userspace > management tool would not list the device into the compatible list? > > if that's true, let's move to the standardizing of the sysfs interface. > (1) content > common part: (must) > - software_version: (in major.minor.bugfix scheme) > - device_api: vfio-pci or vfio-ccw ... > - type: mdev type for mdev device or > a signature for physical device which is a counterpart for > mdev type. > > device api specific part: (must) > - pci id: pci id of mdev parent device or pci id of physical pci > device (device_api is vfio-pci) > - subchannel_type (device_api is vfio-ccw) > > vendor driver specific part: (optional) > - aggregator > - chpid_type > - remote_url > > NOTE: vendors are free to add attributes in this part with a restriction that this > attribute is able to be configured with the same name in sysfs too. e.g. > for aggregator, there must be a sysfs attribute in device node > /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180- > 078a62063ab1/intel_vgpu/aggregator, > so that the userspace tool is able to configure the target device according to > source device's aggregator attribute. > > > (2) where and structure > proposal 1: > |- [path to device] > |--- migration > | |--- self > | | |-software_version > | | |-device_api > | | |-type > | | |-[pci_id or subchannel_type] > | | |- > | |--- compatible > | | |-software_version > | | |-device_api > | | |-type > | | |-[pci_id or subchannel_type] > | | |- > multiple compatible is allowed. > attributes should be ASCII text files, preferably with only one value per file. > > > proposal 2: use bin_attribute. > |- [path to device] > |--- migration > | |--- self > | |--- compatible > > so we can continue use multiline format. e.g. 
> cat compatible > software_version=0.1.0 > device_api=vfio_pci > type=i915-GVTg_V5_{val1:int:1,2,4,8} > pci_id=80865963 > aggregator={val1}/2 > > Thanks > Yan From jasowang at redhat.com Wed Aug 19 06:48:34 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 19 Aug 2020 14:48:34 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <934c8d2a-a34e-6c68-0e53-5de2a8f49d19@redhat.com> Message-ID: <115147a9-3d8c-aa95-c43d-251a321ac152@redhat.com> On 2020/8/19 下午1:26, Parav Pandit wrote: > >> From: Jason Wang >> Sent: Wednesday, August 19, 2020 8:16 AM > >> On 2020/8/18 下午5:32, Parav Pandit wrote: >>> Hi Jason, >>> >>> From: Jason Wang >>> Sent: Tuesday, August 18, 2020 2:32 PM >>> >>> >>> On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: >>> On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: >>> On 2020/8/14 下午1:16, Yan Zhao wrote: >>> On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: >>> On 2020/8/10 下午3:46, Yan Zhao wrote: >>> driver is it handled by? >>> It looks that the devlink is for network device specific, and in >>> devlink.h, it says include/uapi/linux/devlink.h - Network physical >>> device Netlink interface, Actually not, I think there used to have >>> some discussion last year and the conclusion is to remove this >>> comment. >>> >>> [...] >>> >>>> Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a long >> debate on sysfs vs devlink). So if we go with sysfs, at least two APIs needs to be >> supported ... >>> We had internal discussion and proposal on this topic. >>> I wanted Eli Cohen to be back from vacation on Wed 8/19, but since this is >> active discussion right now, I will share the thoughts anyway. >>> Here are the initial round of thoughts and proposal. >>> >>> User requirements: >>> --------------------------- >>> 1. User might want to create one or more vdpa devices per PCI PF/VF/SF. >>> 2. User might want to create one or more vdpa devices of type net/blk or >> other type. >>> 3. User needs to look and dump at the health of the queues for debug purpose. >>> 4. During vdpa net device creation time, user may have to provide a MAC >> address and/or VLAN. >>> 5. User should be able to set/query some of the attributes for >>> debug/compatibility check 6. When user wants to create vdpa device, it needs >> to know which device supports creation. >>> 7. User should be able to see the queue statistics of doorbells, wqes >>> etc regardless of class type >> >> Note that wqes is probably not something common in all of the vendors. > Yes. I virtq descriptors stats is better to monitor the virtqueues. > >> >>> To address above requirements, there is a need of vendor agnostic tool, so >> that user can create/config/delete vdpa device(s) regardless of the vendor. >>> Hence, >>> We should have a tool that lets user do it. >>> >>> Examples: >>> ------------- >>> (a) List parent devices which supports creating vdpa devices. >>> It also shows which class types supported by this parent device. >>> In below command two parent devices support vdpa device creation. >>> First is PCI VF whose bdf is 03.00:5. 
>>> Second is PCI SF whose name is mlx5_sf.1 >>> >>> $ vdpa list pd >> >> What did "pd" mean? >> > Parent device which support creation of one or more vdpa devices. > In a system there can be multiple parent devices which may be support vdpa creation. > User should be able to know which devices support it, and when user creates a vdpa device, it tells which parent device to use for creation as done in below vdpa dev add example. >>> pci/0000:03.00:5 >>> class_supports >>> net vdpa >>> virtbus/mlx5_sf.1 >> >> So creating mlx5_sf.1 is the charge of devlink? >> > Yes. > But here vdpa tool is working at the parent device identifier {bus+name} instead of devlink identifier. > > >>> class_supports >>> net >>> >>> (b) Now add a vdpa device and show the device. >>> $ vdpa dev add pci/0000:03.00:5 type net >> >> So if you want to create devices types other than vdpa on >> pci/0000:03.00:5 it needs some synchronization with devlink? > Please refer to FAQ-1, a new tool is not linked to devlink because vdpa will evolve with time and devlink will fall short. > So no, it doesn't need any synchronization with devlink. > As long as parent device exist, user can create it. > All synchronization will be within drivers/vdpa/vdpa.c > This user interface is exposed via new netlink family by doing genl_register_family() with new name "vdpa" in drivers/vdpa/vdpa.c. Just to make sure I understand here. Consider we had virtbus/mlx5_sf.1. Process A want to create a vDPA instance on top of it but Process B want to create a IB instance. Then I think some synchronization is needed at at least parent device level? > >> >>> $ vdpa dev show >>> vdpa0 at pci/0000:03.00:5 type net state inactive maxqueues 8 curqueues 4 >>> >>> (c) vdpa dev show features vdpa0 >>> iommu platform >>> version 1 >>> >>> (d) dump vdpa statistics >>> $ vdpa dev stats show vdpa0 >>> kickdoorbells 10 >>> wqes 100 >>> >>> (e) Now delete a vdpa device previously created. >>> $ vdpa dev del vdpa0 >>> >>> Design overview: >>> ----------------------- >>> 1. Above example tool runs over netlink socket interface. >>> 2. This enables users to return meaningful error strings in addition to code so >> that user can be more informed. >>> Often this is missing in ioctl()/configfs/sysfs interfaces. >>> 3. This tool over netlink enables syscaller tests to be more usable like other >> subsystems to keep kernel robust >>> 4. This provides vendor agnostic view of all vdpa capable parent and vdpa >> devices. >>> 5. Each driver which supports vdpa device creation, registers the parent device >> along with supported classes. >>> FAQs: >>> -------- >>> 1. Why not using devlink? >>> Ans: Because as vdpa echo system grows, devlink will fall short of extending >> vdpa specific params, attributes, stats. >> >> >> This should be fine but it's still not clear to me the difference >> between a vdpa netlink and a vdpa object in devlink. >> > The difference is a vdpa specific tool work at the parent device level. > It is likely more appropriate to because it can self-contain everything needed to create/delete devices, view/set features, stats. > Trying to put that in devlink will fall short as devlink doesn’t have vdpa definitions. > Typically when a class/device subsystem grows, its own tool is wiser like iproute2/ip, iproute2/tc, iproute2/rdma. Ok, I see. 
Thanks From parav at nvidia.com Wed Aug 19 06:53:03 2020 From: parav at nvidia.com (Parav Pandit) Date: Wed, 19 Aug 2020 06:53:03 +0000 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <115147a9-3d8c-aa95-c43d-251a321ac152@redhat.com> References: <20200805021654.GB30485@joy-OptiPlex-7040> <2624b12f-3788-7e2b-2cb7-93534960bcb7@redhat.com> <20200805075647.GB2177@nanopsycho> <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <934c8d2a-a34e-6c68-0e53-5de2a8f49d19@redhat.com> <115147a9-3d8c-aa95-c43d-251a321ac152@redhat.com> Message-ID: > From: Jason Wang > Sent: Wednesday, August 19, 2020 12:19 PM > > > On 2020/8/19 下午1:26, Parav Pandit wrote: > > > >> From: Jason Wang > >> Sent: Wednesday, August 19, 2020 8:16 AM > > > >> On 2020/8/18 下午5:32, Parav Pandit wrote: > >>> Hi Jason, > >>> > >>> From: Jason Wang > >>> Sent: Tuesday, August 18, 2020 2:32 PM > >>> > >>> > >>> On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > >>> On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > >>> On 2020/8/14 下午1:16, Yan Zhao wrote: > >>> On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > >>> On 2020/8/10 下午3:46, Yan Zhao wrote: > >>> driver is it handled by? > >>> It looks that the devlink is for network device specific, and in > >>> devlink.h, it says include/uapi/linux/devlink.h - Network physical > >>> device Netlink interface, Actually not, I think there used to have > >>> some discussion last year and the conclusion is to remove this > >>> comment. > >>> > >>> [...] > >>> > >>>> Yes, but it could be hard. E.g vDPA will chose to use devlink > >>>> (there's a long > >> debate on sysfs vs devlink). So if we go with sysfs, at least two > >> APIs needs to be supported ... > >>> We had internal discussion and proposal on this topic. > >>> I wanted Eli Cohen to be back from vacation on Wed 8/19, but since > >>> this is > >> active discussion right now, I will share the thoughts anyway. > >>> Here are the initial round of thoughts and proposal. > >>> > >>> User requirements: > >>> --------------------------- > >>> 1. User might want to create one or more vdpa devices per PCI PF/VF/SF. > >>> 2. User might want to create one or more vdpa devices of type > >>> net/blk or > >> other type. > >>> 3. User needs to look and dump at the health of the queues for debug > purpose. > >>> 4. During vdpa net device creation time, user may have to provide a > >>> MAC > >> address and/or VLAN. > >>> 5. User should be able to set/query some of the attributes for > >>> debug/compatibility check 6. When user wants to create vdpa device, > >>> it needs > >> to know which device supports creation. > >>> 7. User should be able to see the queue statistics of doorbells, > >>> wqes etc regardless of class type > >> > >> Note that wqes is probably not something common in all of the vendors. > > Yes. I virtq descriptors stats is better to monitor the virtqueues. > > > >> > >>> To address above requirements, there is a need of vendor agnostic > >>> tool, so > >> that user can create/config/delete vdpa device(s) regardless of the vendor. > >>> Hence, > >>> We should have a tool that lets user do it. > >>> > >>> Examples: > >>> ------------- > >>> (a) List parent devices which supports creating vdpa devices. > >>> It also shows which class types supported by this parent device. 
> >>> In below command two parent devices support vdpa device creation. > >>> First is PCI VF whose bdf is 03.00:5. > >>> Second is PCI SF whose name is mlx5_sf.1 > >>> > >>> $ vdpa list pd > >> > >> What did "pd" mean? > >> > > Parent device which support creation of one or more vdpa devices. > > In a system there can be multiple parent devices which may be support vdpa > creation. > > User should be able to know which devices support it, and when user creates a > vdpa device, it tells which parent device to use for creation as done in below > vdpa dev add example. > >>> pci/0000:03.00:5 > >>> class_supports > >>> net vdpa > >>> virtbus/mlx5_sf.1 > >> > >> So creating mlx5_sf.1 is the charge of devlink? > >> > > Yes. > > But here vdpa tool is working at the parent device identifier {bus+name} > instead of devlink identifier. > > > > > >>> class_supports > >>> net > >>> > >>> (b) Now add a vdpa device and show the device. > >>> $ vdpa dev add pci/0000:03.00:5 type net > >> > >> So if you want to create devices types other than vdpa on > >> pci/0000:03.00:5 it needs some synchronization with devlink? > > Please refer to FAQ-1, a new tool is not linked to devlink because vdpa will > evolve with time and devlink will fall short. > > So no, it doesn't need any synchronization with devlink. > > As long as parent device exist, user can create it. > > All synchronization will be within drivers/vdpa/vdpa.c This user > > interface is exposed via new netlink family by doing genl_register_family() with > new name "vdpa" in drivers/vdpa/vdpa.c. > > > Just to make sure I understand here. > > Consider we had virtbus/mlx5_sf.1. Process A want to create a vDPA instance on > top of it but Process B want to create a IB instance. Then I think some > synchronization is needed at at least parent device level? Likely but rdma device will be created either through $ rdma link add command. Or auto created by driver because there is only one without much configuration. While vdpa device(s) for virtbus/mlx5_sf.1 will be created through vdpa subsystem. And vdpa's synchronization will be contained within drivers/vdpa/vdpa.c > > > > > >> > >>> $ vdpa dev show > >>> vdpa0 at pci/0000:03.00:5 type net state inactive maxqueues 8 curqueues > >>> 4 > >>> > >>> (c) vdpa dev show features vdpa0 > >>> iommu platform > >>> version 1 > >>> > >>> (d) dump vdpa statistics > >>> $ vdpa dev stats show vdpa0 > >>> kickdoorbells 10 > >>> wqes 100 > >>> > >>> (e) Now delete a vdpa device previously created. > >>> $ vdpa dev del vdpa0 > >>> > >>> Design overview: > >>> ----------------------- > >>> 1. Above example tool runs over netlink socket interface. > >>> 2. This enables users to return meaningful error strings in addition > >>> to code so > >> that user can be more informed. > >>> Often this is missing in ioctl()/configfs/sysfs interfaces. > >>> 3. This tool over netlink enables syscaller tests to be more usable > >>> like other > >> subsystems to keep kernel robust > >>> 4. This provides vendor agnostic view of all vdpa capable parent and > >>> vdpa > >> devices. > >>> 5. Each driver which supports vdpa device creation, registers the > >>> parent device > >> along with supported classes. > >>> FAQs: > >>> -------- > >>> 1. Why not using devlink? > >>> Ans: Because as vdpa echo system grows, devlink will fall short of > >>> extending > >> vdpa specific params, attributes, stats. > >> > >> > >> This should be fine but it's still not clear to me the difference > >> between a vdpa netlink and a vdpa object in devlink. 
> >> > > The difference is a vdpa specific tool work at the parent device level. > > It is likely more appropriate to because it can self-contain everything needed > to create/delete devices, view/set features, stats. > > Trying to put that in devlink will fall short as devlink doesn’t have vdpa > definitions. > > Typically when a class/device subsystem grows, its own tool is wiser like > iproute2/ip, iproute2/tc, iproute2/rdma. > > > Ok, I see. > > Thanks > From jasowang at redhat.com Wed Aug 19 06:57:34 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 19 Aug 2020 14:57:34 +0800 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: <20200819033035.GA21172@joy-OptiPlex-7040> References: <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> Message-ID: On 2020/8/19 上午11:30, Yan Zhao wrote: > hi All, > could we decide that sysfs is the interface that every VFIO vendor driver > needs to provide in order to support vfio live migration, otherwise the > userspace management tool would not list the device into the compatible > list? > > if that's true, let's move to the standardizing of the sysfs interface. > (1) content > common part: (must) > - software_version: (in major.minor.bugfix scheme) This can not work for devices whose features can be negotiated/advertised independently. (E.g virtio devices) > - device_api: vfio-pci or vfio-ccw ... > - type: mdev type for mdev device or > a signature for physical device which is a counterpart for > mdev type. > > device api specific part: (must) > - pci id: pci id of mdev parent device or pci id of physical pci > device (device_api is vfio-pci)API here. So this assumes a PCI device which is probably not true. > - subchannel_type (device_api is vfio-ccw) > > vendor driver specific part: (optional) > - aggregator > - chpid_type > - remote_url For "remote_url", just wonder if it's better to integrate or reuse the existing NVME management interface instead of duplicating it here. Otherwise it could be a burden for mgmt to learn. E.g vendor A may use "remote_url" but vendor B may use a different attribute. > > NOTE: vendors are free to add attributes in this part with a > restriction that this attribute is able to be configured with the same > name in sysfs too. e.g. Sysfs works well for common attributes belongs to a class, but I'm not sure it can work well for device/vendor specific attributes. Does this mean mgmt need to iterate all the attributes in both src and dst? > for aggregator, there must be a sysfs attribute in device node > /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, > so that the userspace tool is able to configure the target device > according to source device's aggregator attribute. > > > (2) where and structure > proposal 1: > |- [path to device] > |--- migration > | |--- self > | | |-software_version > | | |-device_api > | | |-type > | | |-[pci_id or subchannel_type] > | | |- > | |--- compatible > | | |-software_version > | | |-device_api > | | |-type > | | |-[pci_id or subchannel_type] > | | |- > multiple compatible is allowed. > attributes should be ASCII text files, preferably with only one value > per file. > > > proposal 2: use bin_attribute. 
> |- [path to device] > |--- migration > | |--- self > | |--- compatible > > so we can continue use multiline format. e.g. > cat compatible > software_version=0.1.0 > device_api=vfio_pci > type=i915-GVTg_V5_{val1:int:1,2,4,8} > pci_id=80865963 > aggregator={val1}/2 So basically two questions: - how hard to standardize sysfs API for dealing with compatibility check (to make it work for most types of devices) - how hard for the mgmt to learn with a vendor specific attributes (vs existing management API) Thanks > > Thanks > Yan From yan.y.zhao at intel.com Wed Aug 19 06:59:51 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Wed, 19 Aug 2020 14:59:51 +0800 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> Message-ID: <20200819065951.GB21172@joy-OptiPlex-7040> On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: > > On 2020/8/19 上午11:30, Yan Zhao wrote: > > hi All, > > could we decide that sysfs is the interface that every VFIO vendor driver > > needs to provide in order to support vfio live migration, otherwise the > > userspace management tool would not list the device into the compatible > > list? > > > > if that's true, let's move to the standardizing of the sysfs interface. > > (1) content > > common part: (must) > > - software_version: (in major.minor.bugfix scheme) > > > This can not work for devices whose features can be negotiated/advertised > independently. (E.g virtio devices) > sorry, I don't understand here, why virtio devices need to use vfio interface? I think this thread is discussing about vfio related devices. > > > - device_api: vfio-pci or vfio-ccw ... > > - type: mdev type for mdev device or > > a signature for physical device which is a counterpart for > > mdev type. > > > > device api specific part: (must) > > - pci id: pci id of mdev parent device or pci id of physical pci > > device (device_api is vfio-pci)API here. > > > So this assumes a PCI device which is probably not true. > for device_api of vfio-pci, why it's not true? for vfio-ccw, it's subchannel_type. > > > - subchannel_type (device_api is vfio-ccw) > > vendor driver specific part: (optional) > > - aggregator > > - chpid_type > > - remote_url > > > For "remote_url", just wonder if it's better to integrate or reuse the > existing NVME management interface instead of duplicating it here. Otherwise > it could be a burden for mgmt to learn. E.g vendor A may use "remote_url" > but vendor B may use a different attribute. > it's vendor driver specific. vendor specific attributes are inevitable, and that's why we are discussing here of a way to standardizing of it. our goal is that mgmt can use it without understanding the meaning of vendor specific attributes. > > > > > NOTE: vendors are free to add attributes in this part with a > > restriction that this attribute is able to be configured with the same > > name in sysfs too. e.g. > > > Sysfs works well for common attributes belongs to a class, but I'm not sure > it can work well for device/vendor specific attributes. Does this mean mgmt > need to iterate all the attributes in both src and dst? > no. just attributes under migration directory. 
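(Purely as an illustration of how a management tool could consume proposal 1 above -- this is only a sketch with assumed details, not part of the proposal itself: the sysfs paths, the containment check and the "same major, destination minor not older than source" rule for software_version are placeholder assumptions, and wildcard values such as i915-GVTg_V5_{val1:int:1,2,4,8} or multiple compatible sets are left out.)

import os

def read_attrs(path):
    # one ASCII value per file, as in proposal 1
    attrs = {}
    for name in os.listdir(path):
        with open(os.path.join(path, name)) as f:
            attrs[name] = f.read().strip()
    return attrs

def version_compatible(src, dst):
    # assumed rule: same major, destination minor not older than source
    src_major, src_minor = (int(x) for x in src.split('.')[:2])
    dst_major, dst_minor = (int(x) for x in dst.split('.')[:2])
    return src_major == dst_major and dst_minor >= src_minor

def is_compatible(src_dev, dst_dev):
    src = read_attrs(os.path.join(src_dev, 'migration', 'self'))
    dst = read_attrs(os.path.join(dst_dev, 'migration', 'compatible'))
    for name, value in src.items():
        if name == 'software_version':
            if not version_compatible(value, dst.get(name, '0.0.0')):
                return False
        elif dst.get(name) != value:
            # every other attribute, including vendor specific ones, is
            # treated as an opaque string that must appear in the
            # target's compatible set
            return False
    return True

# e.g. is_compatible('/sys/bus/mdev/devices/<src-uuid>',
#                    '/sys/bus/mdev/devices/<dst-uuid>')

In other words, the tool only reads the files under the migration directory on both sides and compares them as opaque strings, with software_version as the one attribute it needs to interpret.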
> > > for aggregator, there must be a sysfs attribute in device node > > /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, > > so that the userspace tool is able to configure the target device > > according to source device's aggregator attribute. > > > > > > (2) where and structure > > proposal 1: > > |- [path to device] > > |--- migration > > | |--- self > > | | |-software_version > > | | |-device_api > > | | |-type > > | | |-[pci_id or subchannel_type] > > | | |- > > | |--- compatible > > | | |-software_version > > | | |-device_api > > | | |-type > > | | |-[pci_id or subchannel_type] > > | | |- > > multiple compatible is allowed. > > attributes should be ASCII text files, preferably with only one value > > per file. > > > > > > proposal 2: use bin_attribute. > > |- [path to device] > > |--- migration > > | |--- self > > | |--- compatible > > > > so we can continue use multiline format. e.g. > > cat compatible > > software_version=0.1.0 > > device_api=vfio_pci > > type=i915-GVTg_V5_{val1:int:1,2,4,8} > > pci_id=80865963 > > aggregator={val1}/2 > > > So basically two questions: > > - how hard to standardize sysfs API for dealing with compatibility check (to > make it work for most types of devices) sorry, I just know we are in the process of standardizing of it :) > - how hard for the mgmt to learn with a vendor specific attributes (vs > existing management API) what is existing management API? Thanks From jasowang at redhat.com Wed Aug 19 07:39:50 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 19 Aug 2020 15:39:50 +0800 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: <20200819065951.GB21172@joy-OptiPlex-7040> References: <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819065951.GB21172@joy-OptiPlex-7040> Message-ID: On 2020/8/19 下午2:59, Yan Zhao wrote: > On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: >> On 2020/8/19 上午11:30, Yan Zhao wrote: >>> hi All, >>> could we decide that sysfs is the interface that every VFIO vendor driver >>> needs to provide in order to support vfio live migration, otherwise the >>> userspace management tool would not list the device into the compatible >>> list? >>> >>> if that's true, let's move to the standardizing of the sysfs interface. >>> (1) content >>> common part: (must) >>> - software_version: (in major.minor.bugfix scheme) >> >> This can not work for devices whose features can be negotiated/advertised >> independently. (E.g virtio devices) >> > sorry, I don't understand here, why virtio devices need to use vfio interface? I don't see any reason that virtio devices can't be used by VFIO. Do you? Actually, virtio devices have been used by VFIO for many years: - passthrough a hardware virtio devices to userspace(VM) drivers - using virtio PMD inside guest > I think this thread is discussing about vfio related devices. > >>> - device_api: vfio-pci or vfio-ccw ... >>> - type: mdev type for mdev device or >>> a signature for physical device which is a counterpart for >>> mdev type. >>> >>> device api specific part: (must) >>> - pci id: pci id of mdev parent device or pci id of physical pci >>> device (device_api is vfio-pci)API here. >> >> So this assumes a PCI device which is probably not true. 
>> > for device_api of vfio-pci, why it's not true? > > for vfio-ccw, it's subchannel_type. Ok but having two different attributes for the same file is not good idea. How mgmt know there will be a 3rd type? > >>> - subchannel_type (device_api is vfio-ccw) >>> vendor driver specific part: (optional) >>> - aggregator >>> - chpid_type >>> - remote_url >> >> For "remote_url", just wonder if it's better to integrate or reuse the >> existing NVME management interface instead of duplicating it here. Otherwise >> it could be a burden for mgmt to learn. E.g vendor A may use "remote_url" >> but vendor B may use a different attribute. >> > it's vendor driver specific. > vendor specific attributes are inevitable, and that's why we are > discussing here of a way to standardizing of it. Well, then you will end up with a very long list to discuss. E.g for networking devices, you will have "mac", "v(x)lan" and a lot of other. Note that "remote_url" is not vendor specific but NVME (class/subsystem) specific. The point is that if vendor/class specific part is unavoidable, why not making all of the attributes vendor specific? > our goal is that mgmt can use it without understanding the meaning of vendor > specific attributes. I'm not sure this is the correct design of uAPI. Is there something similar in the existing uAPIs? And it might be hard to work for virtio devices. > >>> NOTE: vendors are free to add attributes in this part with a >>> restriction that this attribute is able to be configured with the same >>> name in sysfs too. e.g. >> >> Sysfs works well for common attributes belongs to a class, but I'm not sure >> it can work well for device/vendor specific attributes. Does this mean mgmt >> need to iterate all the attributes in both src and dst? >> > no. just attributes under migration directory. > >>> for aggregator, there must be a sysfs attribute in device node >>> /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, >>> so that the userspace tool is able to configure the target device >>> according to source device's aggregator attribute. >>> >>> >>> (2) where and structure >>> proposal 1: >>> |- [path to device] >>> |--- migration >>> | |--- self >>> | | |-software_version >>> | | |-device_api >>> | | |-type >>> | | |-[pci_id or subchannel_type] >>> | | |- >>> | |--- compatible >>> | | |-software_version >>> | | |-device_api >>> | | |-type >>> | | |-[pci_id or subchannel_type] >>> | | |- >>> multiple compatible is allowed. >>> attributes should be ASCII text files, preferably with only one value >>> per file. >>> >>> >>> proposal 2: use bin_attribute. >>> |- [path to device] >>> |--- migration >>> | |--- self >>> | |--- compatible >>> >>> so we can continue use multiline format. e.g. >>> cat compatible >>> software_version=0.1.0 >>> device_api=vfio_pci >>> type=i915-GVTg_V5_{val1:int:1,2,4,8} >>> pci_id=80865963 >>> aggregator={val1}/2 >> >> So basically two questions: >> >> - how hard to standardize sysfs API for dealing with compatibility check (to >> make it work for most types of devices) > sorry, I just know we are in the process of standardizing of it :) It's not easy. As I said, the current design can't work for virtio devices and it's not hard to find other examples. I remember some Intel devices have bitmask based capability registers. > >> - how hard for the mgmt to learn with a vendor specific attributes (vs >> existing management API) > what is existing management API? It depends on the type of devices. 
E.g for NVME, we've already had one (/sys/kernel/config/nvme)? Thanks > > Thanks > From yan.y.zhao at intel.com Wed Aug 19 08:13:39 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Wed, 19 Aug 2020 16:13:39 +0800 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819065951.GB21172@joy-OptiPlex-7040> Message-ID: <20200819081338.GC21172@joy-OptiPlex-7040> On Wed, Aug 19, 2020 at 03:39:50PM +0800, Jason Wang wrote: > > On 2020/8/19 下午2:59, Yan Zhao wrote: > > On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: > > > On 2020/8/19 上午11:30, Yan Zhao wrote: > > > > hi All, > > > > could we decide that sysfs is the interface that every VFIO vendor driver > > > > needs to provide in order to support vfio live migration, otherwise the > > > > userspace management tool would not list the device into the compatible > > > > list? > > > > > > > > if that's true, let's move to the standardizing of the sysfs interface. > > > > (1) content > > > > common part: (must) > > > > - software_version: (in major.minor.bugfix scheme) > > > > > > This can not work for devices whose features can be negotiated/advertised > > > independently. (E.g virtio devices) > > > > > sorry, I don't understand here, why virtio devices need to use vfio interface? > > > I don't see any reason that virtio devices can't be used by VFIO. Do you? > > Actually, virtio devices have been used by VFIO for many years: > > - passthrough a hardware virtio devices to userspace(VM) drivers > - using virtio PMD inside guest > So, what's different for it vs passing through a physical hardware via VFIO? even though the features are negotiated dynamically, could you explain why it would cause software_version not work? > > > I think this thread is discussing about vfio related devices. > > > > > > - device_api: vfio-pci or vfio-ccw ... > > > > - type: mdev type for mdev device or > > > > a signature for physical device which is a counterpart for > > > > mdev type. > > > > > > > > device api specific part: (must) > > > > - pci id: pci id of mdev parent device or pci id of physical pci > > > > device (device_api is vfio-pci)API here. > > > > > > So this assumes a PCI device which is probably not true. > > > > > for device_api of vfio-pci, why it's not true? > > > > for vfio-ccw, it's subchannel_type. > > > Ok but having two different attributes for the same file is not good idea. > How mgmt know there will be a 3rd type? that's why some attributes need to be common. e.g. device_api: it's common because mgmt need to know it's a pci device or a ccw device. and the api type is already defined vfio.h. (The field is agreed by and actually suggested by Alex in previous mail) type: mdev_type for mdev. if mgmt does not understand it, it would not be able to create one compatible mdev device. software_version: mgmt can compare the major and minor if it understands this fields. > > > > > > > > - subchannel_type (device_api is vfio-ccw) > > > > vendor driver specific part: (optional) > > > > - aggregator > > > > - chpid_type > > > > - remote_url > > > > > > For "remote_url", just wonder if it's better to integrate or reuse the > > > existing NVME management interface instead of duplicating it here. Otherwise > > > it could be a burden for mgmt to learn. 
E.g vendor A may use "remote_url" > > > but vendor B may use a different attribute. > > > > > it's vendor driver specific. > > vendor specific attributes are inevitable, and that's why we are > > discussing here of a way to standardizing of it. > > > Well, then you will end up with a very long list to discuss. E.g for > networking devices, you will have "mac", "v(x)lan" and a lot of other. > > Note that "remote_url" is not vendor specific but NVME (class/subsystem) > specific. > yes, it's just NVMe specific. I added it as an example to show what is vendor specific. if one attribute is vendor specific across all vendors, then it's not vendor specific, it's already common attribute, right? > The point is that if vendor/class specific part is unavoidable, why not > making all of the attributes vendor specific? > some parts need to be common, as I listed above. > > > our goal is that mgmt can use it without understanding the meaning of vendor > > specific attributes. > > > I'm not sure this is the correct design of uAPI. Is there something similar > in the existing uAPIs? > > And it might be hard to work for virtio devices. > > > > > > > > NOTE: vendors are free to add attributes in this part with a > > > > restriction that this attribute is able to be configured with the same > > > > name in sysfs too. e.g. > > > > > > Sysfs works well for common attributes belongs to a class, but I'm not sure > > > it can work well for device/vendor specific attributes. Does this mean mgmt > > > need to iterate all the attributes in both src and dst? > > > > > no. just attributes under migration directory. > > > > > > for aggregator, there must be a sysfs attribute in device node > > > > /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, > > > > so that the userspace tool is able to configure the target device > > > > according to source device's aggregator attribute. > > > > > > > > > > > > (2) where and structure > > > > proposal 1: > > > > |- [path to device] > > > > |--- migration > > > > | |--- self > > > > | | |-software_version > > > > | | |-device_api > > > > | | |-type > > > > | | |-[pci_id or subchannel_type] > > > > | | |- > > > > | |--- compatible > > > > | | |-software_version > > > > | | |-device_api > > > > | | |-type > > > > | | |-[pci_id or subchannel_type] > > > > | | |- > > > > multiple compatible is allowed. > > > > attributes should be ASCII text files, preferably with only one value > > > > per file. > > > > > > > > > > > > proposal 2: use bin_attribute. > > > > |- [path to device] > > > > |--- migration > > > > | |--- self > > > > | |--- compatible > > > > > > > > so we can continue use multiline format. e.g. > > > > cat compatible > > > > software_version=0.1.0 > > > > device_api=vfio_pci > > > > type=i915-GVTg_V5_{val1:int:1,2,4,8} > > > > pci_id=80865963 > > > > aggregator={val1}/2 > > > > > > So basically two questions: > > > > > > - how hard to standardize sysfs API for dealing with compatibility check (to > > > make it work for most types of devices) > > sorry, I just know we are in the process of standardizing of it :) > > > It's not easy. As I said, the current design can't work for virtio devices > and it's not hard to find other examples. I remember some Intel devices have > bitmask based capability registers. > some Intel devices have bitmask based capability registers. so what? we have defined pci_id to identify the devices. even two different devices have equal PCI IDs, we still allow them to add vendor specific fields. e.g. 
for QAT, they can add alg_set to identify hardware supported algorithms. > > > > > > - how hard for the mgmt to learn with a vendor specific attributes (vs > > > existing management API) > > what is existing management API? > > > It depends on the type of devices. E.g for NVME, we've already had one > (/sys/kernel/config/nvme)? > if the device is binding to vfio or vfio-mdev, I believe this interface is not there. Thanks Yan From jasowang at redhat.com Wed Aug 19 09:28:38 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 19 Aug 2020 17:28:38 +0800 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: <20200819081338.GC21172@joy-OptiPlex-7040> References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819065951.GB21172@joy-OptiPlex-7040> <20200819081338.GC21172@joy-OptiPlex-7040> Message-ID: On 2020/8/19 下午4:13, Yan Zhao wrote: > On Wed, Aug 19, 2020 at 03:39:50PM +0800, Jason Wang wrote: >> On 2020/8/19 下午2:59, Yan Zhao wrote: >>> On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: >>>> On 2020/8/19 上午11:30, Yan Zhao wrote: >>>>> hi All, >>>>> could we decide that sysfs is the interface that every VFIO vendor driver >>>>> needs to provide in order to support vfio live migration, otherwise the >>>>> userspace management tool would not list the device into the compatible >>>>> list? >>>>> >>>>> if that's true, let's move to the standardizing of the sysfs interface. >>>>> (1) content >>>>> common part: (must) >>>>> - software_version: (in major.minor.bugfix scheme) >>>> This can not work for devices whose features can be negotiated/advertised >>>> independently. (E.g virtio devices) >>>> >>> sorry, I don't understand here, why virtio devices need to use vfio interface? >> >> I don't see any reason that virtio devices can't be used by VFIO. Do you? >> >> Actually, virtio devices have been used by VFIO for many years: >> >> - passthrough a hardware virtio devices to userspace(VM) drivers >> - using virtio PMD inside guest >> > So, what's different for it vs passing through a physical hardware via VFIO? The difference is in the guest, the device could be either real hardware or emulated ones. > even though the features are negotiated dynamically, could you explain > why it would cause software_version not work? Virtio device 1 supports feature A, B, C Virtio device 2 supports feature B, C, D So you can't migrate a guest from device 1 to device 2. And it's impossible to model the features with versions. > > >>> I think this thread is discussing about vfio related devices. >>> >>>>> - device_api: vfio-pci or vfio-ccw ... >>>>> - type: mdev type for mdev device or >>>>> a signature for physical device which is a counterpart for >>>>> mdev type. >>>>> >>>>> device api specific part: (must) >>>>> - pci id: pci id of mdev parent device or pci id of physical pci >>>>> device (device_api is vfio-pci)API here. >>>> So this assumes a PCI device which is probably not true. >>>> >>> for device_api of vfio-pci, why it's not true? >>> >>> for vfio-ccw, it's subchannel_type. >> >> Ok but having two different attributes for the same file is not good idea. >> How mgmt know there will be a 3rd type? > that's why some attributes need to be common. e.g. > device_api: it's common because mgmt need to know it's a pci device or a > ccw device. 
and the api type is already defined vfio.h. > (The field is agreed by and actually suggested by Alex in previous mail) > type: mdev_type for mdev. if mgmt does not understand it, it would not > be able to create one compatible mdev device. > software_version: mgmt can compare the major and minor if it understands > this fields. I think it would be helpful if you can describe how mgmt is expected to work step by step with the proposed sysfs API. This can help people to understand. Thanks for the patience. Since sysfs is uABI, when accepted, we need support it forever. That's why we need to be careful. >> >>>>> - subchannel_type (device_api is vfio-ccw) >>>>> vendor driver specific part: (optional) >>>>> - aggregator >>>>> - chpid_type >>>>> - remote_url >>>> For "remote_url", just wonder if it's better to integrate or reuse the >>>> existing NVME management interface instead of duplicating it here. Otherwise >>>> it could be a burden for mgmt to learn. E.g vendor A may use "remote_url" >>>> but vendor B may use a different attribute. >>>> >>> it's vendor driver specific. >>> vendor specific attributes are inevitable, and that's why we are >>> discussing here of a way to standardizing of it. >> >> Well, then you will end up with a very long list to discuss. E.g for >> networking devices, you will have "mac", "v(x)lan" and a lot of other. >> >> Note that "remote_url" is not vendor specific but NVME (class/subsystem) >> specific. >> > yes, it's just NVMe specific. I added it as an example to show what is > vendor specific. > if one attribute is vendor specific across all vendors, then it's not vendor specific, > it's already common attribute, right? It's common but the issue is about naming and mgmt overhead. Unless you have a unified API per class (NVME, ethernet, etc), you can't prevent vendor from using another name instead of "remote_url". > >> The point is that if vendor/class specific part is unavoidable, why not >> making all of the attributes vendor specific? >> > some parts need to be common, as I listed above. This is hard, unless VFIO knows the type of device (e.g it's a NVME or networking device). > >>> our goal is that mgmt can use it without understanding the meaning of vendor >>> specific attributes. >> >> I'm not sure this is the correct design of uAPI. Is there something similar >> in the existing uAPIs? >> >> And it might be hard to work for virtio devices. >> >> >>>>> NOTE: vendors are free to add attributes in this part with a >>>>> restriction that this attribute is able to be configured with the same >>>>> name in sysfs too. e.g. >>>> Sysfs works well for common attributes belongs to a class, but I'm not sure >>>> it can work well for device/vendor specific attributes. Does this mean mgmt >>>> need to iterate all the attributes in both src and dst? >>>> >>> no. just attributes under migration directory. >>> >>>>> for aggregator, there must be a sysfs attribute in device node >>>>> /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, >>>>> so that the userspace tool is able to configure the target device >>>>> according to source device's aggregator attribute. 
>>>>> >>>>> >>>>> (2) where and structure >>>>> proposal 1: >>>>> |- [path to device] >>>>> |--- migration >>>>> | |--- self >>>>> | | |-software_version >>>>> | | |-device_api >>>>> | | |-type >>>>> | | |-[pci_id or subchannel_type] >>>>> | | |- >>>>> | |--- compatible >>>>> | | |-software_version >>>>> | | |-device_api >>>>> | | |-type >>>>> | | |-[pci_id or subchannel_type] >>>>> | | |- >>>>> multiple compatible is allowed. >>>>> attributes should be ASCII text files, preferably with only one value >>>>> per file. >>>>> >>>>> >>>>> proposal 2: use bin_attribute. >>>>> |- [path to device] >>>>> |--- migration >>>>> | |--- self >>>>> | |--- compatible >>>>> >>>>> so we can continue use multiline format. e.g. >>>>> cat compatible >>>>> software_version=0.1.0 >>>>> device_api=vfio_pci >>>>> type=i915-GVTg_V5_{val1:int:1,2,4,8} >>>>> pci_id=80865963 >>>>> aggregator={val1}/2 >>>> So basically two questions: >>>> >>>> - how hard to standardize sysfs API for dealing with compatibility check (to >>>> make it work for most types of devices) >>> sorry, I just know we are in the process of standardizing of it :) >> >> It's not easy. As I said, the current design can't work for virtio devices >> and it's not hard to find other examples. I remember some Intel devices have >> bitmask based capability registers. >> > some Intel devices have bitmask based capability registers. > so what? You should at least make the proposed API working for your(Intel) own devices. > we have defined pci_id to identify the devices. > even two different devices have equal PCI IDs, we still allow them to > add vendor specific fields. e.g. > for QAT, they can add alg_set to identify hardware supported algorithms. Well, the point is to make sure the API not work only for some specific devices. If we agree with this, we need try to seek what is missed instead. > >>>> - how hard for the mgmt to learn with a vendor specific attributes (vs >>>> existing management API) >>> what is existing management API? >> >> It depends on the type of devices. E.g for NVME, we've already had one >> (/sys/kernel/config/nvme)? >> > if the device is binding to vfio or vfio-mdev, I believe this interface > is not there. So you want to duplicate some APIs with existing NVME ones? Thanks > > > Thanks > Yan > From jasowang at redhat.com Wed Aug 19 09:41:39 2020 From: jasowang at redhat.com (Jason Wang) Date: Wed, 19 Aug 2020 17:41:39 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> Message-ID: On 2020/8/19 下午1:58, Parav Pandit wrote: > >> From: Yan Zhao >> Sent: Wednesday, August 19, 2020 9:01 AM >> On Tue, Aug 18, 2020 at 09:39:24AM +0000, Parav Pandit wrote: >>> Please refer to my previous email which has more example and details. >> hi Parav, >> the example is based on a new vdpa tool running over netlink, not based on >> devlink, right? > Right. > >> For vfio migration compatibility, we have to deal with both mdev and physical >> pci devices, I don't think it's a good idea to write a new tool for it, given we are >> able to retrieve the same info from sysfs and there's already an mdevctl from > mdev attribute should be visible in the mdev's sysfs tree. 
> I do not propose to write a new mdev tool over netlink. I am sorry if I implied that with my suggestion of vdpa tool. > > If underlying device is vdpa, mdev might be able to understand vdpa device and query from it and populate in mdev sysfs tree. Note that vdpa is bus independent so it can't work now and the support of mdev on top of vDPA have been rejected (and duplicated with vhost-vDPA). Thanks > > The vdpa tool I propose is usable even without mdevs. > vdpa tool's role is to create one or more vdpa devices and place on the "vdpa" bus which is the lowest layer here. > Additionally this tool let user query virtqueue stats, db stats. > When a user creates vdpa net device, user may need to configure features of the vdpa device such as VIRTIO_NET_F_MAC, default VIRTIO_NET_F_MTU. > These are vdpa level features, attributes. Mdev is layer above it. > >> Alex >> (https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub. >> com%2Fmdevctl%2Fmdevctl&data=02%7C01%7Cparav%40nvidia.com%7C >> 0c2691d430304f5ea11308d843f2d84e%7C43083d15727340c1b7db39efd9ccc17 >> a%7C0%7C0%7C637334057571911357&sdata=KxH7PwxmKyy9JODut8BWr >> LQyOBylW00%2Fyzc4rEvjUvA%3D&reserved=0). >> > Sorry for above link mangling. Our mail server is still transitioning due to company acquisition. > > I am less familiar on below points to comment. > >> hi All, >> could we decide that sysfs is the interface that every VFIO vendor driver needs >> to provide in order to support vfio live migration, otherwise the userspace >> management tool would not list the device into the compatible list? >> >> if that's true, let's move to the standardizing of the sysfs interface. >> (1) content >> common part: (must) >> - software_version: (in major.minor.bugfix scheme) >> - device_api: vfio-pci or vfio-ccw ... >> - type: mdev type for mdev device or >> a signature for physical device which is a counterpart for >> mdev type. >> >> device api specific part: (must) >> - pci id: pci id of mdev parent device or pci id of physical pci >> device (device_api is vfio-pci) >> - subchannel_type (device_api is vfio-ccw) >> >> vendor driver specific part: (optional) >> - aggregator >> - chpid_type >> - remote_url >> >> NOTE: vendors are free to add attributes in this part with a restriction that this >> attribute is able to be configured with the same name in sysfs too. e.g. >> for aggregator, there must be a sysfs attribute in device node >> /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180- >> 078a62063ab1/intel_vgpu/aggregator, >> so that the userspace tool is able to configure the target device according to >> source device's aggregator attribute. >> >> >> (2) where and structure >> proposal 1: >> |- [path to device] >> |--- migration >> | |--- self >> | | |-software_version >> | | |-device_api >> | | |-type >> | | |-[pci_id or subchannel_type] >> | | |- >> | |--- compatible >> | | |-software_version >> | | |-device_api >> | | |-type >> | | |-[pci_id or subchannel_type] >> | | |- >> multiple compatible is allowed. >> attributes should be ASCII text files, preferably with only one value per file. >> >> >> proposal 2: use bin_attribute. >> |- [path to device] >> |--- migration >> | |--- self >> | |--- compatible >> >> so we can continue use multiline format. e.g. 
>> cat compatible >> software_version=0.1.0 >> device_api=vfio_pci >> type=i915-GVTg_V5_{val1:int:1,2,4,8} >> pci_id=80865963 >> aggregator={val1}/2 >> >> Thanks >> Yan From harishkumarivaturi at gmail.com Wed Aug 19 17:13:14 2020 From: harishkumarivaturi at gmail.com (HARISH KUMAR Ivaturi) Date: Wed, 19 Aug 2020 19:13:14 +0200 Subject: OpenStack with Nginx Message-ID: Hi I am Harish Kumar, Master Student at BTH, Karlskrona, Sweden. I am working on my Master thesis at BTH and my thesis topic is Performance evaluation of OpenStack with HTTP/3. I have successfully built curl and nginx with HTTP/3 support and I am performing some commands using curl for generating tokens so i could access the services of OpenStack. OpenStack relies with the Apache web server and I could not get any results using Nginx HTTP/3 . I would like to ask if there is any official documentation on OpenStack relying with Nginx?, I have searched in the internet reg. this info but could not get any, I would like to use nginx instead of apache web server , so I could get some results by performing curl and commands and nginx web server (with http/3 support). Please let me know and if there is any content please share with me. I hope you have understood this. It would be helpful for my Master Thesis. BR Harish Kumar -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.williamson at redhat.com Wed Aug 19 17:50:21 2020 From: alex.williamson at redhat.com (Alex Williamson) Date: Wed, 19 Aug 2020 11:50:21 -0600 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200819033035.GA21172@joy-OptiPlex-7040> References: <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> Message-ID: <20200819115021.004427a3@x1.home> On Wed, 19 Aug 2020 11:30:35 +0800 Yan Zhao wrote: > On Tue, Aug 18, 2020 at 09:39:24AM +0000, Parav Pandit wrote: > > Hi Cornelia, > > > > > From: Cornelia Huck > > > Sent: Tuesday, August 18, 2020 3:07 PM > > > To: Daniel P. Berrangé > > > Cc: Jason Wang ; Yan Zhao > > > ; kvm at vger.kernel.org; libvir-list at redhat.com; > > > qemu-devel at nongnu.org; Kirti Wankhede ; > > > eauger at redhat.com; xin-ran.wang at intel.com; corbet at lwn.net; openstack- > > > discuss at lists.openstack.org; shaohe.feng at intel.com; kevin.tian at intel.com; > > > Parav Pandit ; jian-feng.ding at intel.com; > > > dgilbert at redhat.com; zhenyuw at linux.intel.com; hejie.xu at intel.com; > > > bao.yumeng at zte.com.cn; Alex Williamson ; > > > eskultet at redhat.com; smooney at redhat.com; intel-gvt- > > > dev at lists.freedesktop.org; Jiri Pirko ; > > > dinechin at redhat.com; devel at ovirt.org > > > Subject: Re: device compatibility interface for live migration with assigned > > > devices > > > > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > > Daniel P. Berrangé wrote: > > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > > On 2020/8/18 下午4:55, Daniel P. 
Berrangé wrote: > > > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > > > > we actually can also retrieve the same information through sysfs, > > > > > .e.g > > > > > > > > > > |- [path to device] > > > > > |--- migration > > > > > | |--- self > > > > > | | |---device_api > > > > > | | |---mdev_type > > > > > | | |---software_version > > > > > | | |---device_id > > > > > | | |---aggregator > > > > > | |--- compatible > > > > > | | |---device_api > > > > > | | |---mdev_type > > > > > | | |---software_version > > > > > | | |---device_id > > > > > | | |---aggregator > > > > > > > > > > > > > > > Yes but: > > > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > > - Attribute is coupled with kobject > > > > > > Is that really that bad? You have the device with an embedded kobject > > > anyway, and you can just put things into an attribute group? > > > > > > [Also, I think that self/compatible split in the example makes things > > > needlessly complex. Shouldn't semantic versioning and matching already > > > cover nearly everything? I would expect very few cases that are more > > > complex than that. Maybe the aggregation stuff, but I don't think we need > > > that self/compatible split for that, either.] > > > > > > > > > > > > > All of above seems unnecessary. > > > > > > > > > > Another point, as we discussed in another thread, it's really hard > > > > > to make sure the above API work for all types of devices and > > > > > frameworks. So having a vendor specific API looks much better. > > > > > > > > > > From the POV of userspace mgmt apps doing device compat checking / > > > > > migration, we certainly do NOT want to use different vendor > > > > > specific APIs. We want to have an API that can be used / controlled in a > > > standard manner across vendors. > > > > > > > > > > Yes, but it could be hard. E.g vDPA will chose to use devlink (there's a > > > > > long debate on sysfs vs devlink). So if we go with sysfs, at least two > > > > > APIs needs to be supported ... > > > > > > > > NB, I was not questioning devlink vs sysfs directly. If devlink is > > > > related to netlink, I can't say I'm enthusiastic as IMKE sysfs is > > > > easier to deal with. I don't know enough about devlink to have much of an > > > opinion though. > > > > The key point was that I don't want the userspace APIs we need to deal > > > > with to be vendor specific. > > > > > > From what I've seen of devlink, it seems quite nice; but I understand why > > > sysfs might be easier to deal with (especially as there's likely already a lot of > > > code using it.) > > > > > > I understand that some users would like devlink because it is already widely > > > used for network drivers (and some others), but I don't think the majority of > > > devices used with vfio are network (although certainly a lot of them are.) > > > > > > > > > > > What I care about is that we have a *standard* userspace API for > > > > performing device compatibility checking / state migration, for use by > > > > QEMU/libvirt/ OpenStack, such that we can write code without countless > > > > vendor specific code paths. 
> > > > > > > > If there is vendor specific stuff on the side, that's fine as we can > > > > ignore that, but the core functionality for device compat / migration > > > > needs to be standardized. > > > > > > To summarize: > > > - choose one of sysfs or devlink > > > - have a common interface, with a standardized way to add > > > vendor-specific attributes > > > ? > > > > Please refer to my previous email which has more example and details. > hi Parav, > the example is based on a new vdpa tool running over netlink, not based > on devlink, right? > For vfio migration compatibility, we have to deal with both mdev and physical > pci devices, I don't think it's a good idea to write a new tool for it, given > we are able to retrieve the same info from sysfs and there's already an > mdevctl from Alex (https://github.com/mdevctl/mdevctl). > > hi All, > could we decide that sysfs is the interface that every VFIO vendor driver > needs to provide in order to support vfio live migration, otherwise the > userspace management tool would not list the device into the compatible > list? > > if that's true, let's move to the standardizing of the sysfs interface. > (1) content > common part: (must) > - software_version: (in major.minor.bugfix scheme) > - device_api: vfio-pci or vfio-ccw ... > - type: mdev type for mdev device or > a signature for physical device which is a counterpart for > mdev type. > > device api specific part: (must) > - pci id: pci id of mdev parent device or pci id of physical pci > device (device_api is vfio-pci) As noted previously, the parent PCI ID should not matter for an mdev device, if a vendor has a dependency on matching the parent device PCI ID, that's a vendor specific restriction. An mdev device can also expose a vfio-pci device API without the parent device being PCI. For a physical PCI device, shouldn't the PCI ID be encompassed in the signature? Thanks, Alex > - subchannel_type (device_api is vfio-ccw) > > vendor driver specific part: (optional) > - aggregator > - chpid_type > - remote_url > > NOTE: vendors are free to add attributes in this part with a > restriction that this attribute is able to be configured with the same > name in sysfs too. e.g. > for aggregator, there must be a sysfs attribute in device node > /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, > so that the userspace tool is able to configure the target device > according to source device's aggregator attribute. > > > (2) where and structure > proposal 1: > |- [path to device] > |--- migration > | |--- self > | | |-software_version > | | |-device_api > | | |-type > | | |-[pci_id or subchannel_type] > | | |- > | |--- compatible > | | |-software_version > | | |-device_api > | | |-type > | | |-[pci_id or subchannel_type] > | | |- > multiple compatible is allowed. > attributes should be ASCII text files, preferably with only one value > per file. > > > proposal 2: use bin_attribute. > |- [path to device] > |--- migration > | |--- self > | |--- compatible > > so we can continue use multiline format. e.g. 
> cat compatible > software_version=0.1.0 > device_api=vfio_pci > type=i915-GVTg_V5_{val1:int:1,2,4,8} > pci_id=80865963 > aggregator={val1}/2 > > Thanks > Yan > From sean.mcginnis at gmx.com Wed Aug 19 18:32:40 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 19 Aug 2020 13:32:40 -0500 Subject: OpenStack with Nginx In-Reply-To: References: Message-ID: On 8/19/20 12:13 PM, HARISH KUMAR Ivaturi wrote: > Hi > I am Harish Kumar, Master Student at BTH, Karlskrona, Sweden. I am > working on my Master thesis at BTH and my thesis topic is Performance > evaluation of OpenStack with HTTP/3. Welcome Harish! That should be interesting to see the results of your evaluation. I hope you will share that with the community once you complete your research. > > I have successfully built curl and nginx with HTTP/3 support and I am > performing some commands using curl for generating tokens so i could > access the services of OpenStack. > OpenStack relies with the Apache web server and I could not get any > results using Nginx HTTP/3 . I would like to ask if there is any > official documentation on OpenStack relying with Nginx?, I have > searched in the internet reg. this info but could not get any, I would > like to use nginx instead of apache web server , so I could get some > results by performing curl and commands and nginx web server (with > http/3 support). Please let me know and if there is any content please > share with me. I hope you have understood this. It would be helpful > for my Master Thesis. > I haven't really done anything with HTTP/3, but from what I understand, it just changes the transport to use QUIC. So that should be pretty transparent as far as the OpenStack services are concerned. We don't have any documentation that I know of. Unless someone has done some of their own testing and has some notes they can share. I think the main thing here would be just setting up nginx to use the uWSGI apps rather than Apache. This seems like a promising article that walks through configuring nginx: https://www.digitalocean.com/community/tutorials/how-to-serve-flask-applications-with-uwsgi-and-nginx-on-ubuntu-20-04 That specifically references flask, so just keep in mind that most OpenStack services do not use that part of the tutorial. Cinder has some old notes from when we were first looking at running behind Apache. Those can be found here: https://docs.openstack.org/cinder/latest/contributor/api.apache.html But you may just need to look at the existing Apache configuration and figure out what to change to do the equivalent under nginx. Good luck! Sean From whayutin at redhat.com Thu Aug 20 02:08:48 2020 From: whayutin at redhat.com (Wesley Hayutin) Date: Wed, 19 Aug 2020 20:08:48 -0600 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: <1597847922905.32607@binero.com> References: <1597847922905.32607@binero.com> Message-ID: On Wed, Aug 19, 2020 at 8:40 AM Tobias Urdin wrote: > Big +1 from an outsider :)) > > > Best regards > > Tobias > > > ------------------------------ > *From:* Rabi Mishra > *Sent:* Wednesday, August 19, 2020 3:37 PM > *To:* Emilien Macchi > *Cc:* openstack-discuss > *Subject:* Re: [tripleo] Proposing Takashi Kajinami to be core on > puppet-tripleo > > +1 > > On Tue, Aug 18, 2020 at 8:03 PM Emilien Macchi wrote: > >> Hi people, >> >> If you don't know Takashi yet, he has been involved in the Puppet >> OpenStack project and helped *a lot* in its maintenance (and by maintenance >> I mean not-funny-work). 
When our community was getting smaller and smaller, >> he joined us and our review velicity went back to eleven. He became a core >> maintainer very quickly and we're glad to have him onboard. >> >> He's also been involved in taking care of puppet-tripleo for a few months >> and I believe he has more than enough knowledge on the module to provide >> core reviews and be part of the core maintainer group. I also noticed his >> amount of contribution (bug fixes, improvements, reviews, etc) in other >> TripleO repos and I'm confident he'll make his road to be core in TripleO >> at some point. For now I would like him to propose him to be core in >> puppet-tripleo. >> >> As usual, any feedback is welcome but in the meantime I want to thank >> Takashi for his work in TripleO and we're super happy to have new >> contributors! >> >> Thanks, >> -- >> Emilien Macchi >> > > > -- > Regards, > Rabi Mishra > > +1, thanks for your contributions Takashi! -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Thu Aug 20 08:22:06 2020 From: eblock at nde.ag (Eugen Block) Date: Thu, 20 Aug 2020 08:22:06 +0000 Subject: [neutron] Disable dhcp drop rule In-Reply-To: <20200819164211.Horde.jx_dhmZz16BL7k9bIumarOA@webmail.nde.ag> References: <20200819133616.Horde.zhXC_mhe4RdzjbP4Shl1M45@webmail.nde.ag> <4ea4eb17-0373-e1ab-6f45-c35cb67723e0@nemebean.com> <20200819164211.Horde.jx_dhmZz16BL7k9bIumarOA@webmail.nde.ag> Message-ID: <20200820082206.Horde.cXRYpICP4lCwZzX-6gHKj-q@webmail.nde.ag> Hi, just a quick follow-up on this: disabling port_security only on the specified port works as expected. Although this is still not an optimal solution we can live with it for now. Thanks again and best regards, Eugen Zitat von Eugen Block : > That sounds promising, thank you! I had noticed that option but > didn’t have a chance to look closer into it. > I’ll try that tomorrow. > > Thanks for the tip! > > Zitat von Ben Nemec : > >> On 8/19/20 8:36 AM, Eugen Block wrote: >>> Hi *, >>> >>> we recently upgraded our Ocata Cloud to Train and also switched >>> from linuxbridge to openvswitch. >>> >>> One of our instances within the cloud works as DHCP server and to >>> make that work we had to comment the respective part in this file >>> on the compute node the instance was running on: >>> >>> /usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_firewall.py >>> >>> >>> Now we tried the same in >>> >>> /usr/lib/python3.6/site-packages/neutron/agent/linux/openvswitch_firewall/firewall.py >>> /usr/lib/python3.6/site-packages/neutron/agent/linux/iptables_firewall.py >>> >>> but restarting openstack-neutron-openvswitch-agent.service didn't >>> drop that rule, the DHCP reply didn't get through. To continue >>> with our work we just dropped it manually, so we get by, but since >>> there have been a couple of years between Ocata and Train, is >>> there any smoother or better way to achieve this? This seems to be >>> a reoccuring request but I couldn't find any updates on this >>> topic. Maybe someone here can shed some light? Is there more to >>> change than those two files I mentioned? >> >> You might try disabling port-security on the instance's port. >> That's what we use in OVB to allow a DHCP server in an instance now. >> >> neutron port-update [port-id] --port_security_enabled=False >> >> That will drop all port security for that instance, not just the >> DHCP rule, but on the other hand it leaves the DHCP rule in place >> for any instances you don't want running DHCP servers. 
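(Side note: the same toggle should also be exposed through the unified client, i.e. "openstack port set --disable-port-security <port-id>", with "openstack port set --enable-port-security <port-id>" to turn it back on -- worth double-checking against the python-openstackclient version in use.)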
>> >>> >>> Any pointers are highly appreciated! >>> >>> Best regards, >>> Eugen >>> >>> From dtantsur at redhat.com Thu Aug 20 08:54:17 2020 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Thu, 20 Aug 2020 10:54:17 +0200 Subject: [ironic] RFC: deprecate the iSCSI deploy interface? Message-ID: Hi all, Side note for those lacking context: this proposal concerns deprecating one of the ironic deploy interfaces detailed in https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. It does not affect the boot-from-iSCSI feature. I would like to propose deprecating and removing the 'iscsi' deploy interface over the course of the next 2 cycles. The reasons are: 1) The iSCSI deploy is a source of occasional cryptic bugs when a target cannot be discovered or mounted properly. 2) Its security is questionable: I don't think we even use authentication. 3) Operators confusion: right now we default to the iSCSI deploy but pretty much direct everyone who cares about scalability or security to the 'direct' deploy. 4) Cost of maintenance: our feature set is growing, our team - not so much. iscsi_deploy.py is 800 lines of code that can be removed, and some dependencies that can be dropped as well. As far as I can remember, we've kept the iSCSI deploy for two reasons: 1) The direct deploy used to require Glance with Swift backend. The recently added [agent]image_download_source option allows caching and serving images via the ironic's HTTP server, eliminating this problem. I guess we'll have to switch to 'http' by default for this option to keep the out-of-box experience. 2) Memory footprint of the direct deploy. With the raw images streaming we no longer have to cache the downloaded images in the agent memory, removing this problem as well (I'm not even sure how much of a problem it is in 2020, even my phone has 4GiB of RAM). If this proposal is accepted, I suggest to execute it as follows: Victoria release: 1) Put an early deprecation warning in the release notes. 2) Announce the future change of the default value for [agent]image_download_source. W release: 3) Change [agent]image_download_source to 'http' by default. 4) Remove iscsi from the default enabled_deploy_interfaces and move it to the back of the supported list (effectively making direct deploy the default). X release: 5) Remove the iscsi deploy code from both ironic and IPA. Thoughts, opinions, suggestions? Dmitry -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjeanner at redhat.com Thu Aug 20 12:22:59 2020 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Thu, 20 Aug 2020 14:22:59 +0200 Subject: [tripleo] Moving tripleo-ansible-inventory script to tripleo-common? In-Reply-To: References: <0e91db84-723b-d14b-d654-fdc74a0a42eb@redhat.com> Message-ID: <0290a47a-4e84-76a1-e3aa-b0993191f6ff@redhat.com> On 8/18/20 10:03 AM, Cédric Jeanneret wrote: > > > On 8/18/20 9:53 AM, Rabi Mishra wrote: >> >> >> On Tue, Aug 18, 2020 at 1:07 PM Cédric Jeanneret > > wrote: >> >> Hello there! >> >> I'm wondering if we could move the "tripleo-ansible-inventory" script >> from the tripleo-validations repo to tripleo-common. >> >> >> TBH, I don't know the history, but it would be better if we remove all >> scripts from tripleo-common and use it just as a utility library (now >> that Mistral is gone). Most of the existing scripts probably have an >> existing command in tripleoclient. We can implement  missing ones >> including "tripleo-ansible-inventory" in python-tripleoclient. 
hm, we can't really replace it imho, since it's used as a "dynamic inventory" for ansible directly. The best thing we can probably do is: - add "--os-cloud" option support[1] - move this script to tripleoclient (I agree with you regarding tripleo-common) Once this is done, *maybe* we can move things to tripleoclient itself, but we'll need to do something in order to keep that script in place anyway... [1] Thanks Mathieu :) https://review.opendev.org/747140 > > would probably be better to implement it directly in tripleoclient imho. > In any cases, it has nothing to do in tripleo-validations... > > I can't connect to launchpad, they are having some auth issue, I can't > create an RFE there :(. > >> >> >> The main motivation here is to make things consistent: >> - that script calls content from tripleo-common, nothing from >> tripleo-validations. >> - that script isn't only for the validations, so it makes more sense to >> install it via tripleo-common >> - in fact, we should probably push that inventory thing as an `openstack >> tripleo' sub-command, but that's another story >> >> So, is there any opposition to this proposal? >> >> Cheers, >> >> C. >> >> >> -- >> Cédric Jeanneret (He/Him/His) >> Sr. Software Engineer - OpenStack Platform >> Deployment Framework TC >> Red Hat EMEA >> https://www.redhat.com/ >> >> >> >> -- >> Regards, >> Rabi Mishra >> > -- Cédric Jeanneret (He/Him/His) Sr. Software Engineer - OpenStack Platform Deployment Framework TC Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From root.mch at gmail.com Thu Aug 20 12:46:25 2020 From: root.mch at gmail.com (=?UTF-8?Q?=C4=B0zzettin_Erdem?=) Date: Thu, 20 Aug 2020 15:46:25 +0300 Subject: [MURANO] Murano Class error when try to deploy WordPress APP Message-ID: Hello everyone, WordPress needs Mysql, HTTP and Zabbix Server/Agent. These apps run individually with succes but when I try to deploy WordPress App on Murano it gives the error about Apache HTTP that mentioned below. How can I fix this? Do you have any suggestions? Error: http://paste.openstack.org/show/796980/ http://paste.openstack.org/show/796983/ (cont.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Rajini.Karthik at Dell.com Thu Aug 20 13:25:09 2020 From: Rajini.Karthik at Dell.com (Karthik, Rajini) Date: Thu, 20 Aug 2020 13:25:09 +0000 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: <1597847922905.32607@binero.com> Message-ID: +1 . Rajini From: Wesley Hayutin Sent: Wednesday, August 19, 2020 9:09 PM To: openstack-discuss Cc: Emilien Macchi Subject: Re: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo [EXTERNAL EMAIL] On Wed, Aug 19, 2020 at 8:40 AM Tobias Urdin > wrote: Big +1 from an outsider :)) Best regards Tobias ________________________________ From: Rabi Mishra > Sent: Wednesday, August 19, 2020 3:37 PM To: Emilien Macchi Cc: openstack-discuss Subject: Re: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo +1 On Tue, Aug 18, 2020 at 8:03 PM Emilien Macchi > wrote: Hi people, If you don't know Takashi yet, he has been involved in the Puppet OpenStack project and helped *a lot* in its maintenance (and by maintenance I mean not-funny-work). When our community was getting smaller and smaller, he joined us and our review velicity went back to eleven. 
He became a core maintainer very quickly and we're glad to have him onboard. He's also been involved in taking care of puppet-tripleo for a few months and I believe he has more than enough knowledge on the module to provide core reviews and be part of the core maintainer group. I also noticed his amount of contribution (bug fixes, improvements, reviews, etc) in other TripleO repos and I'm confident he'll make his road to be core in TripleO at some point. For now I would like him to propose him to be core in puppet-tripleo. As usual, any feedback is welcome but in the meantime I want to thank Takashi for his work in TripleO and we're super happy to have new contributors! Thanks, -- Emilien Macchi -- Regards, Rabi Mishra +1, thanks for your contributions Takashi! -------------- next part -------------- An HTML attachment was scrubbed... URL: From yan.y.zhao at intel.com Thu Aug 20 00:18:10 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Thu, 20 Aug 2020 08:18:10 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200819115021.004427a3@x1.home> References: <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819115021.004427a3@x1.home> Message-ID: <20200820001810.GD21172@joy-OptiPlex-7040> On Wed, Aug 19, 2020 at 11:50:21AM -0600, Alex Williamson wrote: <...> > > > > > What I care about is that we have a *standard* userspace API for > > > > > performing device compatibility checking / state migration, for use by > > > > > QEMU/libvirt/ OpenStack, such that we can write code without countless > > > > > vendor specific code paths. > > > > > > > > > > If there is vendor specific stuff on the side, that's fine as we can > > > > > ignore that, but the core functionality for device compat / migration > > > > > needs to be standardized. > > > > > > > > To summarize: > > > > - choose one of sysfs or devlink > > > > - have a common interface, with a standardized way to add > > > > vendor-specific attributes > > > > ? > > > > > > Please refer to my previous email which has more example and details. > > hi Parav, > > the example is based on a new vdpa tool running over netlink, not based > > on devlink, right? > > For vfio migration compatibility, we have to deal with both mdev and physical > > pci devices, I don't think it's a good idea to write a new tool for it, given > > we are able to retrieve the same info from sysfs and there's already an > > mdevctl from Alex (https://github.com/mdevctl/mdevctl). > > > > hi All, > > could we decide that sysfs is the interface that every VFIO vendor driver > > needs to provide in order to support vfio live migration, otherwise the > > userspace management tool would not list the device into the compatible > > list? > > > > if that's true, let's move to the standardizing of the sysfs interface. > > (1) content > > common part: (must) > > - software_version: (in major.minor.bugfix scheme) > > - device_api: vfio-pci or vfio-ccw ... > > - type: mdev type for mdev device or > > a signature for physical device which is a counterpart for > > mdev type. 
> > > > device api specific part: (must) > > - pci id: pci id of mdev parent device or pci id of physical pci > > device (device_api is vfio-pci) > > As noted previously, the parent PCI ID should not matter for an mdev > device, if a vendor has a dependency on matching the parent device PCI > ID, that's a vendor specific restriction. An mdev device can also > expose a vfio-pci device API without the parent device being PCI. For > a physical PCI device, shouldn't the PCI ID be encompassed in the > signature? Thanks, > you are right. I need to put the PCI ID as a vendor specific field. I didn't do that because I wanted all fields in vendor specific to be configurable by management tools, so they can configure the target device according to the value of a vendor specific field even they don't know the meaning of the field. But maybe they can just ignore the field when they can't find a matching writable field to configure the target. Thanks Yan > > - subchannel_type (device_api is vfio-ccw) > > > > vendor driver specific part: (optional) > > - aggregator > > - chpid_type > > - remote_url > > > > NOTE: vendors are free to add attributes in this part with a > > restriction that this attribute is able to be configured with the same > > name in sysfs too. e.g. > > for aggregator, there must be a sysfs attribute in device node > > /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, > > so that the userspace tool is able to configure the target device > > according to source device's aggregator attribute. > > > > > > (2) where and structure > > proposal 1: > > |- [path to device] > > |--- migration > > | |--- self > > | | |-software_version > > | | |-device_api > > | | |-type > > | | |-[pci_id or subchannel_type] > > | | |- > > | |--- compatible > > | | |-software_version > > | | |-device_api > > | | |-type > > | | |-[pci_id or subchannel_type] > > | | |- > > multiple compatible is allowed. > > attributes should be ASCII text files, preferably with only one value > > per file. > > > > > > proposal 2: use bin_attribute. > > |- [path to device] > > |--- migration > > | |--- self > > | |--- compatible > > > > so we can continue use multiline format. e.g. > > cat compatible > > software_version=0.1.0 > > device_api=vfio_pci > > type=i915-GVTg_V5_{val1:int:1,2,4,8} > > pci_id=80865963 > > aggregator={val1}/2 > > > > Thanks > > Yan > > > From yan.y.zhao at intel.com Thu Aug 20 00:39:22 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Thu, 20 Aug 2020 08:39:22 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200818113652.5d81a392.cohuck@redhat.com> References: <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> Message-ID: <20200820003922.GE21172@joy-OptiPlex-7040> On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > On Tue, 18 Aug 2020 10:16:28 +0100 > Daniel P. Berrangé wrote: > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > On 2020/8/18 下午4:55, Daniel P. 
Berrangé wrote: > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > |- [path to device] > > > |--- migration > > > | |--- self > > > | | |---device_api > > > | | |---mdev_type > > > | | |---software_version > > > | | |---device_id > > > | | |---aggregator > > > | |--- compatible > > > | | |---device_api > > > | | |---mdev_type > > > | | |---software_version > > > | | |---device_id > > > | | |---aggregator > > > > > > > > > Yes but: > > > > > > - You need one file per attribute (one syscall for one attribute) > > > - Attribute is coupled with kobject > > Is that really that bad? You have the device with an embedded kobject > anyway, and you can just put things into an attribute group? > > [Also, I think that self/compatible split in the example makes things > needlessly complex. Shouldn't semantic versioning and matching already > cover nearly everything? I would expect very few cases that are more > complex than that. Maybe the aggregation stuff, but I don't think we > need that self/compatible split for that, either.] Hi Cornelia, The reason I want to declare compatible list of attributes is that sometimes it's not a simple 1:1 matching of source attributes and target attributes as I demonstrated below, source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), (mdev_type i915-GVTg_V5_8 + aggregator 4) and aggragator may be just one of such examples that 1:1 matching does not fit. So, we explicitly list out self/compatible attributes, and management tools only need to check if self attributes is contained compatible attributes. or do you mean only compatible list is enough, and the management tools need to find out self list by themselves? But I think provide a self list is easier for management tools. Thanks Yan From smooney at redhat.com Thu Aug 20 01:29:07 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 20 Aug 2020 02:29:07 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200820003922.GE21172@joy-OptiPlex-7040> References: <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> Message-ID: <242591bb809b68c618f62fdc93d4f8ae7b146b6d.camel@redhat.com> On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote: > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > On Tue, 18 Aug 2020 10:16:28 +0100 > > Daniel P. Berrangé wrote: > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > On 2020/8/18 下午4:55, Daniel P. 
Berrangé wrote: > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > |- [path to device] > > > > |--- migration > > > > | |--- self > > > > | | |---device_api > > > > | | |---mdev_type > > > > | | |---software_version > > > > | | |---device_id > > > > | | |---aggregator > > > > | |--- compatible > > > > | | |---device_api > > > > | | |---mdev_type > > > > | | |---software_version > > > > | | |---device_id > > > > | | |---aggregator > > > > > > > > > > > > Yes but: > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > - Attribute is coupled with kobject > > > > Is that really that bad? You have the device with an embedded kobject > > anyway, and you can just put things into an attribute group? > > > > [Also, I think that self/compatible split in the example makes things > > needlessly complex. Shouldn't semantic versioning and matching already > > cover nearly everything? I would expect very few cases that are more > > complex than that. Maybe the aggregation stuff, but I don't think we > > need that self/compatible split for that, either.] > > Hi Cornelia, > > The reason I want to declare compatible list of attributes is that > sometimes it's not a simple 1:1 matching of source attributes and target attributes > as I demonstrated below, > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), > (mdev_type i915-GVTg_V5_8 + aggregator 4) the way you are doing the nameing is till really confusing by the way if this has not already been merged in the kernel can you chagne the mdev so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1 instead of half the device currently you need to deived the aggratod by the number at the end of the mdev type to figure out how much of the phsicial device is being used with is a very unfridly api convention the way aggrator are being proposed in general is not really someting i like but i thin this at least is something that should be able to correct. with the complexity in the mdev type name + aggrator i suspect that this will never be support in openstack nova directly requireing integration via cyborg unless we can pre partion the device in to mdevs staicaly and just ignore this. this is way to vendor sepecif to integrate into something like openstack in nova unless we can guarentee taht how aggreator work will be portable across vendors genericly. > > and aggragator may be just one of such examples that 1:1 matching does not > fit. for openstack nova i dont see us support anything beyond the 1:1 case where the mdev type does not change. i woudl really prefer if there was just one mdev type that repsented the minimal allcatable unit and the aggragaotr where used to create compostions of that. i.e instad of i915-GVTg_V5_2 beign half the device, have 1 mdev type i915-GVTg and if the device support 8 of them then we can aggrate 4 of i915-GVTg if you want to have muplie mdev type to model the different amoutn of the resouce e.g. 
i915-GVTg_small i915-GVTg_large that is totlaly fine too or even i915-GVTg_4 indcating it sis 4 of i915-GVTg failing that i would just expose an mdev type per composable resouce and allow us to compose them a the user level with some other construct mudeling a attament to the device. e.g. create composed mdev or somethig that is an aggreateion of multiple sub resouces each of which is an mdev. so kind of like how bond port work. we would create an mdev for each of the sub resouces and then create a bond or aggrated mdev by reference the other mdevs by uuid then attach only the aggreated mdev to the instance. the current aggrator syntax and sematic however make me rather uncofrotable when i think about orchestating vms on top of it even to boot them let alone migrate them. > > So, we explicitly list out self/compatible attributes, and management > tools only need to check if self attributes is contained compatible > attributes. > > or do you mean only compatible list is enough, and the management tools > need to find out self list by themselves? > But I think provide a self list is easier for management tools. > > Thanks > Yan > From alex.williamson at redhat.com Thu Aug 20 03:13:45 2020 From: alex.williamson at redhat.com (Alex Williamson) Date: Wed, 19 Aug 2020 21:13:45 -0600 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200820001810.GD21172@joy-OptiPlex-7040> References: <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819115021.004427a3@x1.home> <20200820001810.GD21172@joy-OptiPlex-7040> Message-ID: <20200819211345.0d9daf03@x1.home> On Thu, 20 Aug 2020 08:18:10 +0800 Yan Zhao wrote: > On Wed, Aug 19, 2020 at 11:50:21AM -0600, Alex Williamson wrote: > <...> > > > > > > What I care about is that we have a *standard* userspace API for > > > > > > performing device compatibility checking / state migration, for use by > > > > > > QEMU/libvirt/ OpenStack, such that we can write code without countless > > > > > > vendor specific code paths. > > > > > > > > > > > > If there is vendor specific stuff on the side, that's fine as we can > > > > > > ignore that, but the core functionality for device compat / migration > > > > > > needs to be standardized. > > > > > > > > > > To summarize: > > > > > - choose one of sysfs or devlink > > > > > - have a common interface, with a standardized way to add > > > > > vendor-specific attributes > > > > > ? > > > > > > > > Please refer to my previous email which has more example and details. > > > hi Parav, > > > the example is based on a new vdpa tool running over netlink, not based > > > on devlink, right? > > > For vfio migration compatibility, we have to deal with both mdev and physical > > > pci devices, I don't think it's a good idea to write a new tool for it, given > > > we are able to retrieve the same info from sysfs and there's already an > > > mdevctl from Alex (https://github.com/mdevctl/mdevctl). > > > > > > hi All, > > > could we decide that sysfs is the interface that every VFIO vendor driver > > > needs to provide in order to support vfio live migration, otherwise the > > > userspace management tool would not list the device into the compatible > > > list? > > > > > > if that's true, let's move to the standardizing of the sysfs interface. 
> > > (1) content > > > common part: (must) > > > - software_version: (in major.minor.bugfix scheme) > > > - device_api: vfio-pci or vfio-ccw ... > > > - type: mdev type for mdev device or > > > a signature for physical device which is a counterpart for > > > mdev type. > > > > > > device api specific part: (must) > > > - pci id: pci id of mdev parent device or pci id of physical pci > > > device (device_api is vfio-pci) > > > > As noted previously, the parent PCI ID should not matter for an mdev > > device, if a vendor has a dependency on matching the parent device PCI > > ID, that's a vendor specific restriction. An mdev device can also > > expose a vfio-pci device API without the parent device being PCI. For > > a physical PCI device, shouldn't the PCI ID be encompassed in the > > signature? Thanks, > > > you are right. I need to put the PCI ID as a vendor specific field. > I didn't do that because I wanted all fields in vendor specific to be > configurable by management tools, so they can configure the target device > according to the value of a vendor specific field even they don't know > the meaning of the field. > But maybe they can just ignore the field when they can't find a matching > writable field to configure the target. If fields can be ignored, what's the point of reporting them? Seems it's no longer a requirement. Thanks, Alex > > > - subchannel_type (device_api is vfio-ccw) > > > > > > vendor driver specific part: (optional) > > > - aggregator > > > - chpid_type > > > - remote_url > > > > > > NOTE: vendors are free to add attributes in this part with a > > > restriction that this attribute is able to be configured with the same > > > name in sysfs too. e.g. > > > for aggregator, there must be a sysfs attribute in device node > > > /sys/devices/pci0000:00/0000:00:02.0/882cc4da-dede-11e7-9180-078a62063ab1/intel_vgpu/aggregator, > > > so that the userspace tool is able to configure the target device > > > according to source device's aggregator attribute. > > > > > > > > > (2) where and structure > > > proposal 1: > > > |- [path to device] > > > |--- migration > > > | |--- self > > > | | |-software_version > > > | | |-device_api > > > | | |-type > > > | | |-[pci_id or subchannel_type] > > > | | |- > > > | |--- compatible > > > | | |-software_version > > > | | |-device_api > > > | | |-type > > > | | |-[pci_id or subchannel_type] > > > | | |- > > > multiple compatible is allowed. > > > attributes should be ASCII text files, preferably with only one value > > > per file. > > > > > > > > > proposal 2: use bin_attribute. > > > |- [path to device] > > > |--- migration > > > | |--- self > > > | |--- compatible > > > > > > so we can continue use multiline format. e.g. 
> > > cat compatible > > > software_version=0.1.0 > > > device_api=vfio_pci > > > type=i915-GVTg_V5_{val1:int:1,2,4,8} > > > pci_id=80865963 > > > aggregator={val1}/2 > > > > > > Thanks > > > Yan > > > > > > From alex.williamson at redhat.com Thu Aug 20 03:22:34 2020 From: alex.williamson at redhat.com (Alex Williamson) Date: Wed, 19 Aug 2020 21:22:34 -0600 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200820003922.GE21172@joy-OptiPlex-7040> References: <20200805093338.GC30485@joy-OptiPlex-7040> <20200805105319.GF2177@nanopsycho> <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> Message-ID: <20200819212234.223667b3@x1.home> On Thu, 20 Aug 2020 08:39:22 +0800 Yan Zhao wrote: > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > On Tue, 18 Aug 2020 10:16:28 +0100 > > Daniel P. Berrangé wrote: > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > |- [path to device] > > > > |--- migration > > > > | |--- self > > > > | | |---device_api > > > > | | |---mdev_type > > > > | | |---software_version > > > > | | |---device_id > > > > | | |---aggregator > > > > | |--- compatible > > > > | | |---device_api > > > > | | |---mdev_type > > > > | | |---software_version > > > > | | |---device_id > > > > | | |---aggregator > > > > > > > > > > > > Yes but: > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > - Attribute is coupled with kobject > > > > Is that really that bad? You have the device with an embedded kobject > > anyway, and you can just put things into an attribute group? > > > > [Also, I think that self/compatible split in the example makes things > > needlessly complex. Shouldn't semantic versioning and matching already > > cover nearly everything? I would expect very few cases that are more > > complex than that. Maybe the aggregation stuff, but I don't think we > > need that self/compatible split for that, either.] > Hi Cornelia, > > The reason I want to declare compatible list of attributes is that > sometimes it's not a simple 1:1 matching of source attributes and target attributes > as I demonstrated below, > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), > (mdev_type i915-GVTg_V5_8 + aggregator 4) > > and aggragator may be just one of such examples that 1:1 matching does not > fit. If you're suggesting that we need a new 'compatible' set for every aggregation, haven't we lost the purpose of aggregation? For example, rather than having N mdev types to represent all the possible aggregation values, we have a single mdev type with N compatible migration entries, one for each possible aggregation value. BTW, how do we have multiple compatible directories? compatible0001, compatible0002? 
Thanks, Alex From yan.y.zhao at intel.com Thu Aug 20 03:09:51 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Thu, 20 Aug 2020 11:09:51 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200819211345.0d9daf03@x1.home> References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819115021.004427a3@x1.home> <20200820001810.GD21172@joy-OptiPlex-7040> <20200819211345.0d9daf03@x1.home> Message-ID: <20200820030951.GA24121@joy-OptiPlex-7040> On Wed, Aug 19, 2020 at 09:13:45PM -0600, Alex Williamson wrote: > On Thu, 20 Aug 2020 08:18:10 +0800 > Yan Zhao wrote: > > > On Wed, Aug 19, 2020 at 11:50:21AM -0600, Alex Williamson wrote: > > <...> > > > > > > > What I care about is that we have a *standard* userspace API for > > > > > > > performing device compatibility checking / state migration, for use by > > > > > > > QEMU/libvirt/ OpenStack, such that we can write code without countless > > > > > > > vendor specific code paths. > > > > > > > > > > > > > > If there is vendor specific stuff on the side, that's fine as we can > > > > > > > ignore that, but the core functionality for device compat / migration > > > > > > > needs to be standardized. > > > > > > > > > > > > To summarize: > > > > > > - choose one of sysfs or devlink > > > > > > - have a common interface, with a standardized way to add > > > > > > vendor-specific attributes > > > > > > ? > > > > > > > > > > Please refer to my previous email which has more example and details. > > > > hi Parav, > > > > the example is based on a new vdpa tool running over netlink, not based > > > > on devlink, right? > > > > For vfio migration compatibility, we have to deal with both mdev and physical > > > > pci devices, I don't think it's a good idea to write a new tool for it, given > > > > we are able to retrieve the same info from sysfs and there's already an > > > > mdevctl from Alex (https://github.com/mdevctl/mdevctl). > > > > > > > > hi All, > > > > could we decide that sysfs is the interface that every VFIO vendor driver > > > > needs to provide in order to support vfio live migration, otherwise the > > > > userspace management tool would not list the device into the compatible > > > > list? > > > > > > > > if that's true, let's move to the standardizing of the sysfs interface. > > > > (1) content > > > > common part: (must) > > > > - software_version: (in major.minor.bugfix scheme) > > > > - device_api: vfio-pci or vfio-ccw ... > > > > - type: mdev type for mdev device or > > > > a signature for physical device which is a counterpart for > > > > mdev type. > > > > > > > > device api specific part: (must) > > > > - pci id: pci id of mdev parent device or pci id of physical pci > > > > device (device_api is vfio-pci) > > > > > > As noted previously, the parent PCI ID should not matter for an mdev > > > device, if a vendor has a dependency on matching the parent device PCI > > > ID, that's a vendor specific restriction. An mdev device can also > > > expose a vfio-pci device API without the parent device being PCI. For > > > a physical PCI device, shouldn't the PCI ID be encompassed in the > > > signature? Thanks, > > > > > you are right. I need to put the PCI ID as a vendor specific field. 
> > I didn't do that because I wanted all fields in vendor specific to be > > configurable by management tools, so they can configure the target device > > according to the value of a vendor specific field even they don't know > > the meaning of the field. > > But maybe they can just ignore the field when they can't find a matching > > writable field to configure the target. > > > If fields can be ignored, what's the point of reporting them? Seems > it's no longer a requirement. Thanks, > sorry about the confusion. I mean this condition: about to migrate, openstack searches if there are existing matching MDEVs, if yes, i.e. all common/vendor specific fields match, then just create a VM with the matching target MDEV. (in this condition, the PCI ID field is not ignored); if not, openstack tries to create one MDEV according to mdev_type, and configures MDEV according to the vendor specific attributes. as PCI ID is not a configurable field, it just ignore the field. Thanks Yan From yan.y.zhao at intel.com Thu Aug 20 03:16:21 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Thu, 20 Aug 2020 11:16:21 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200819212234.223667b3@x1.home> References: <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <20200819212234.223667b3@x1.home> Message-ID: <20200820031621.GA24997@joy-OptiPlex-7040> On Wed, Aug 19, 2020 at 09:22:34PM -0600, Alex Williamson wrote: > On Thu, 20 Aug 2020 08:39:22 +0800 > Yan Zhao wrote: > > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > > Daniel P. Berrangé wrote: > > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > > > |- [path to device] > > > > > |--- migration > > > > > | |--- self > > > > > | | |---device_api > > > > > | | |---mdev_type > > > > > | | |---software_version > > > > > | | |---device_id > > > > > | | |---aggregator > > > > > | |--- compatible > > > > > | | |---device_api > > > > > | | |---mdev_type > > > > > | | |---software_version > > > > > | | |---device_id > > > > > | | |---aggregator > > > > > > > > > > > > > > > Yes but: > > > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > > - Attribute is coupled with kobject > > > > > > Is that really that bad? You have the device with an embedded kobject > > > anyway, and you can just put things into an attribute group? > > > > > > [Also, I think that self/compatible split in the example makes things > > > needlessly complex. Shouldn't semantic versioning and matching already > > > cover nearly everything? I would expect very few cases that are more > > > complex than that. Maybe the aggregation stuff, but I don't think we > > > need that self/compatible split for that, either.] 
> > Hi Cornelia, > > > > The reason I want to declare compatible list of attributes is that > > sometimes it's not a simple 1:1 matching of source attributes and target attributes > > as I demonstrated below, > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), > > (mdev_type i915-GVTg_V5_8 + aggregator 4) > > > > and aggragator may be just one of such examples that 1:1 matching does not > > fit. > > If you're suggesting that we need a new 'compatible' set for every > aggregation, haven't we lost the purpose of aggregation? For example, > rather than having N mdev types to represent all the possible > aggregation values, we have a single mdev type with N compatible > migration entries, one for each possible aggregation value. BTW, how do > we have multiple compatible directories? compatible0001, > compatible0002? Thanks, > do you think the bin_attribute I proposed yesterday good? Then we can have a single compatible with a variable in the mdev_type and aggregator. mdev_type=i915-GVTg_V5_{val1:int:2,4,8} aggregator={val1}/2 Thanks Yan From yan.y.zhao at intel.com Thu Aug 20 04:01:16 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Thu, 20 Aug 2020 12:01:16 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <242591bb809b68c618f62fdc93d4f8ae7b146b6d.camel@redhat.com> References: <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <242591bb809b68c618f62fdc93d4f8ae7b146b6d.camel@redhat.com> Message-ID: <20200820040116.GB24121@joy-OptiPlex-7040> On Thu, Aug 20, 2020 at 02:29:07AM +0100, Sean Mooney wrote: > On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote: > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > > Daniel P. Berrangé wrote: > > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > > > |- [path to device] > > > > > |--- migration > > > > > | |--- self > > > > > | | |---device_api > > > > > | | |---mdev_type > > > > > | | |---software_version > > > > > | | |---device_id > > > > > | | |---aggregator > > > > > | |--- compatible > > > > > | | |---device_api > > > > > | | |---mdev_type > > > > > | | |---software_version > > > > > | | |---device_id > > > > > | | |---aggregator > > > > > > > > > > > > > > > Yes but: > > > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > > - Attribute is coupled with kobject > > > > > > Is that really that bad? You have the device with an embedded kobject > > > anyway, and you can just put things into an attribute group? > > > > > > [Also, I think that self/compatible split in the example makes things > > > needlessly complex. Shouldn't semantic versioning and matching already > > > cover nearly everything? 
I would expect very few cases that are more > > > complex than that. Maybe the aggregation stuff, but I don't think we > > > need that self/compatible split for that, either.] > > > > Hi Cornelia, > > > > The reason I want to declare compatible list of attributes is that > > sometimes it's not a simple 1:1 matching of source attributes and target attributes > > as I demonstrated below, > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), > > (mdev_type i915-GVTg_V5_8 + aggregator 4) > the way you are doing the nameing is till really confusing by the way > if this has not already been merged in the kernel can you chagne the mdev > so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1 instead of half the device > > currently you need to deived the aggratod by the number at the end of the mdev type to figure out > how much of the phsicial device is being used with is a very unfridly api convention > > the way aggrator are being proposed in general is not really someting i like but i thin this at least > is something that should be able to correct. > > with the complexity in the mdev type name + aggrator i suspect that this will never be support > in openstack nova directly requireing integration via cyborg unless we can pre partion the > device in to mdevs staicaly and just ignore this. > > this is way to vendor sepecif to integrate into something like openstack in nova unless we can guarentee > taht how aggreator work will be portable across vendors genericly. > > > > > and aggragator may be just one of such examples that 1:1 matching does not > > fit. > for openstack nova i dont see us support anything beyond the 1:1 case where the mdev type does not change. > hi Sean, I understand it's hard for openstack. but 1:N is always meaningful. e.g. if source device 1 has cap A, it is compatible to device 2: cap A, device 3: cap A+B, device 4: cap A+B+C .... to allow openstack to detect it correctly, in compatible list of device 2, we would say compatible cap is A; device 3, compatible cap is A or A+B; device 4, compatible cap is A or A+B, or A+B+C; then if openstack finds device A's self cap A is contained in compatible cap of device 2/3/4, it can migrate device 1 to device 2,3,4. conversely, device 1's compatible cap is only A, so it is able to migrate device 2 to device 1, and it is not able to migrate device 3/4 to device 1. Thanks Yan > i woudl really prefer if there was just one mdev type that repsented the minimal allcatable unit and the > aggragaotr where used to create compostions of that. i.e instad of i915-GVTg_V5_2 beign half the device, > have 1 mdev type i915-GVTg and if the device support 8 of them then we can aggrate 4 of i915-GVTg > > if you want to have muplie mdev type to model the different amoutn of the resouce e.g. i915-GVTg_small i915-GVTg_large > that is totlaly fine too or even i915-GVTg_4 indcating it sis 4 of i915-GVTg > > failing that i would just expose an mdev type per composable resouce and allow us to compose them a the user level with > some other construct mudeling a attament to the device. e.g. create composed mdev or somethig that is an aggreateion of > multiple sub resouces each of which is an mdev. so kind of like how bond port work. we would create an mdev for each of > the sub resouces and then create a bond or aggrated mdev by reference the other mdevs by uuid then attach only the > aggreated mdev to the instance. 
> > the current aggrator syntax and sematic however make me rather uncofrotable when i think about orchestating vms on top > of it even to boot them let alone migrate them. > > > > So, we explicitly list out self/compatible attributes, and management > > tools only need to check if self attributes is contained compatible > > attributes. > > > > or do you mean only compatible list is enough, and the management tools > > need to find out self list by themselves? > > But I think provide a self list is easier for management tools. > > > > Thanks > > Yan > > > From smooney at redhat.com Thu Aug 20 05:16:28 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 20 Aug 2020 06:16:28 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200820040116.GB24121@joy-OptiPlex-7040> References: <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <242591bb809b68c618f62fdc93d4f8ae7b146b6d.camel@redhat.com> <20200820040116.GB24121@joy-OptiPlex-7040> Message-ID: On Thu, 2020-08-20 at 12:01 +0800, Yan Zhao wrote: > On Thu, Aug 20, 2020 at 02:29:07AM +0100, Sean Mooney wrote: > > On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote: > > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > > > Daniel P. Berrangé wrote: > > > > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > > > > > |- [path to device] > > > > > > |--- migration > > > > > > | |--- self > > > > > > | | |---device_api > > > > > > | | |---mdev_type > > > > > > | | |---software_version > > > > > > | | |---device_id > > > > > > | | |---aggregator > > > > > > | |--- compatible > > > > > > | | |---device_api > > > > > > | | |---mdev_type > > > > > > | | |---software_version > > > > > > | | |---device_id > > > > > > | | |---aggregator > > > > > > > > > > > > > > > > > > Yes but: > > > > > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > > > - Attribute is coupled with kobject > > > > > > > > Is that really that bad? You have the device with an embedded kobject > > > > anyway, and you can just put things into an attribute group? > > > > > > > > [Also, I think that self/compatible split in the example makes things > > > > needlessly complex. Shouldn't semantic versioning and matching already > > > > cover nearly everything? I would expect very few cases that are more > > > > complex than that. Maybe the aggregation stuff, but I don't think we > > > > need that self/compatible split for that, either.] 
> >
> > > Hi Cornelia,
> > >
> > > The reason I want to declare compatible list of attributes is that
> > > sometimes it's not a simple 1:1 matching of source attributes and target attributes
> > > as I demonstrated below,
> > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to
> > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2),
> > >              (mdev_type i915-GVTg_V5_8 + aggregator 4)
> >
> > the way you are doing the nameing is till really confusing by the way
> > if this has not already been merged in the kernel can you chagne the mdev
> > so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1 instead of half the device
> >
> > currently you need to deived the aggratod by the number at the end of the mdev type to figure out
> > how much of the phsicial device is being used with is a very unfridly api convention
> >
> > the way aggrator are being proposed in general is not really someting i like but i thin this at least
> > is something that should be able to correct.
> >
> > with the complexity in the mdev type name + aggrator i suspect that this will never be support
> > in openstack nova directly requireing integration via cyborg unless we can pre partion the
> > device in to mdevs staicaly and just ignore this.
> >
> > this is way to vendor sepecif to integrate into something like openstack in nova unless we can guarentee
> > taht how aggreator work will be portable across vendors genericly.
> >
> > >
> > > and aggragator may be just one of such examples that 1:1 matching does not
> > > fit.
> >
> > for openstack nova i dont see us support anything beyond the 1:1 case where the mdev type does not change.
> >
> hi Sean,
> I understand it's hard for openstack. but 1:N is always meaningful.
> e.g.
> if source device 1 has cap A, it is compatible to
> device 2: cap A,
> device 3: cap A+B,
> device 4: cap A+B+C
> ....
> to allow openstack to detect it correctly, in compatible list of
> device 2, we would say compatible cap is A;
> device 3, compatible cap is A or A+B;
> device 4, compatible cap is A or A+B, or A+B+C;
>
> then if openstack finds device A's self cap A is contained in compatible
> cap of device 2/3/4, it can migrate device 1 to device 2,3,4.
>
> conversely, device 1's compatible cap is only A,
> so it is able to migrate device 2 to device 1, and it is not able to
> migrate device 3/4 to device 1.

Yes, we built the placement service around the idea of capabilities as traits on resource providers, which is why I originally asked whether we could model compatibility with feature flags.

We can easily model a device as supporting A, A+B or A+B+C and then select hosts and devices based on that, but the list of compatible devices you are proposing hides the feature information that we would be matching on.

Giving me the set of features you want, and listing the features available on each device, lets higher-level orchestration easily match a request to a host that can fulfil it; having a set of "other compatible devices" does not help with that.

So if a simple list of capabilities can be advertised, and if we know that two devices with the same capabilities are interchangeable, that is workable. I suspect that will not be the case, though, and that it would only work within a family of closely related mdevs, which I think is again an argument for not changing the mdev type and, at least initially, only looking at migration where the mdev type does not change.
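To make the matching style argued for above concrete, here is a minimal sketch of superset-based selection over explicit feature sets, in the spirit of CPU feature flags or placement traits. It is an illustration only: the device names and feature letters are invented, and this is not placement, libvirt or driver code.

# Minimal sketch of superset matching over advertised feature sets.
# Device names and features below are made up for illustration.

def is_compatible(source_features, target_features):
    """A target can accept the source if it offers every feature the
    source device currently exposes (i.e. it is a superset)."""
    return set(source_features) <= set(target_features)

devices = {
    "dev2": {"A"},
    "dev3": {"A", "B"},
    "dev4": {"A", "B", "C"},
}

source = {"A"}  # features exposed by the device the VM currently uses

candidates = [name for name, feats in devices.items()
              if is_compatible(source, feats)]
print(candidates)  # ['dev2', 'dev3', 'dev4'] -- all offer at least feature A

The point of the sketch is only that when each device advertises its features directly, the orchestrator can do the comparison itself; a precomputed list of "compatible devices" hides exactly the information this comparison needs.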
> > Thanks > Yan > > > i woudl really prefer if there was just one mdev type that repsented the minimal allcatable unit and the > > aggragaotr where used to create compostions of that. i.e instad of i915-GVTg_V5_2 beign half the device, > > have 1 mdev type i915-GVTg and if the device support 8 of them then we can aggrate 4 of i915-GVTg > > > > if you want to have muplie mdev type to model the different amoutn of the resouce e.g. i915-GVTg_small i915- > > GVTg_large > > that is totlaly fine too or even i915-GVTg_4 indcating it sis 4 of i915-GVTg > > > > failing that i would just expose an mdev type per composable resouce and allow us to compose them a the user level > > with > > some other construct mudeling a attament to the device. e.g. create composed mdev or somethig that is an aggreateion > > of > > multiple sub resouces each of which is an mdev. so kind of like how bond port work. we would create an mdev for each > > of > > the sub resouces and then create a bond or aggrated mdev by reference the other mdevs by uuid then attach only the > > aggreated mdev to the instance. > > > > the current aggrator syntax and sematic however make me rather uncofrotable when i think about orchestating vms on > > top > > of it even to boot them let alone migrate them. > > > > > > So, we explicitly list out self/compatible attributes, and management > > > tools only need to check if self attributes is contained compatible > > > attributes. > > > > > > or do you mean only compatible list is enough, and the management tools > > > need to find out self list by themselves? > > > But I think provide a self list is easier for management tools. > > > > > > Thanks > > > Yan > > > > > From yan.y.zhao at intel.com Thu Aug 20 06:27:25 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Thu, 20 Aug 2020 14:27:25 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <242591bb809b68c618f62fdc93d4f8ae7b146b6d.camel@redhat.com> <20200820040116.GB24121@joy-OptiPlex-7040> Message-ID: <20200820062725.GB24997@joy-OptiPlex-7040> On Thu, Aug 20, 2020 at 06:16:28AM +0100, Sean Mooney wrote: > On Thu, 2020-08-20 at 12:01 +0800, Yan Zhao wrote: > > On Thu, Aug 20, 2020 at 02:29:07AM +0100, Sean Mooney wrote: > > > On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote: > > > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > > > > Daniel P. Berrangé wrote: > > > > > > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > > > > On 2020/8/18 下午4:55, Daniel P. 
Berrangé wrote: > > > > > > > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > > > > > > > |- [path to device] > > > > > > > |--- migration > > > > > > > | |--- self > > > > > > > | | |---device_api > > > > > > > | | |---mdev_type > > > > > > > | | |---software_version > > > > > > > | | |---device_id > > > > > > > | | |---aggregator > > > > > > > | |--- compatible > > > > > > > | | |---device_api > > > > > > > | | |---mdev_type > > > > > > > | | |---software_version > > > > > > > | | |---device_id > > > > > > > | | |---aggregator > > > > > > > > > > > > > > > > > > > > > Yes but: > > > > > > > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > > > > - Attribute is coupled with kobject > > > > > > > > > > Is that really that bad? You have the device with an embedded kobject > > > > > anyway, and you can just put things into an attribute group? > > > > > > > > > > [Also, I think that self/compatible split in the example makes things > > > > > needlessly complex. Shouldn't semantic versioning and matching already > > > > > cover nearly everything? I would expect very few cases that are more > > > > > complex than that. Maybe the aggregation stuff, but I don't think we > > > > > need that self/compatible split for that, either.] > > > > > > > > Hi Cornelia, > > > > > > > > The reason I want to declare compatible list of attributes is that > > > > sometimes it's not a simple 1:1 matching of source attributes and target attributes > > > > as I demonstrated below, > > > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to > > > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), > > > > (mdev_type i915-GVTg_V5_8 + aggregator 4) > > > > > > the way you are doing the nameing is till really confusing by the way > > > if this has not already been merged in the kernel can you chagne the mdev > > > so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1 instead of half the device > > > > > > currently you need to deived the aggratod by the number at the end of the mdev type to figure out > > > how much of the phsicial device is being used with is a very unfridly api convention > > > > > > the way aggrator are being proposed in general is not really someting i like but i thin this at least > > > is something that should be able to correct. > > > > > > with the complexity in the mdev type name + aggrator i suspect that this will never be support > > > in openstack nova directly requireing integration via cyborg unless we can pre partion the > > > device in to mdevs staicaly and just ignore this. > > > > > > this is way to vendor sepecif to integrate into something like openstack in nova unless we can guarentee > > > taht how aggreator work will be portable across vendors genericly. > > > > > > > > > > > and aggragator may be just one of such examples that 1:1 matching does not > > > > fit. > > > > > > for openstack nova i dont see us support anything beyond the 1:1 case where the mdev type does not change. > > > > > > > hi Sean, > > I understand it's hard for openstack. but 1:N is always meaningful. > > e.g. 
> > if source device 1 has cap A, it is compatible to > > device 2: cap A, > > device 3: cap A+B, > > device 4: cap A+B+C > > .... > > to allow openstack to detect it correctly, in compatible list of > > device 2, we would say compatible cap is A; > > device 3, compatible cap is A or A+B; > > device 4, compatible cap is A or A+B, or A+B+C; > > > > then if openstack finds device A's self cap A is contained in compatible > > cap of device 2/3/4, it can migrate device 1 to device 2,3,4. > > > > conversely, device 1's compatible cap is only A, > > so it is able to migrate device 2 to device 1, and it is not able to > > migrate device 3/4 to device 1. > > yes we build the palcement servce aroudn the idea of capablites as traits on resocue providres. > which is why i originally asked if we coudl model compatibality with feature flags > > we can seaislyt model deivce as aupport A, A+B or A+B+C > and then select hosts and evice based on that but > > the list of compatable deivce you are propsoeing hide this feature infomation which whould be what we are matching on. > > give me a lset of feature you want and list ting the feature avaiable on each device allow highre level ocestation to > easily match the request to a host that can fulllfile it btu thave a set of other compatihble device does not help with > that > > so if a simple list a capabliteis can be advertiese d and if we know tha two dievce with the same capablity are > intercahangebale that is workabout i suspect that will not be the case however and it would onely work within a familay > of mdevs that are closely related. which i think agian is an argument for not changeing the mdev type and at least > intially only look at migatreion where the mdev type doee not change initally. > sorry Sean, I don't understand your words completely. Please allow me to write it down in my words, and please confirm if my understanding is right. 1. you mean you agree on that each field is regarded as a trait, and openstack can compare by itself if source trait is a subset of target trait, right? e.g. source device field1=A1 field2=A2+B2 field3=A3 target device field1=A1+B1 field2=A2+B2 filed3=A3 then openstack sees that field1/2/3 in source is a subset of field1/2/3 in target, so it's migratable to target? 2. mdev_type + aggregator make it hard to achieve the above elegant solution, so it's best to avoid the combined comparing of mdev_type + aggregator. do I understand it correctly? 3. you don't like self list and compatible list, because it is hard for openstack to compare different traits? e.g. if we have self list and compatible list, then as below, openstack needs to compare if self field1/2/3 is a subset of compatible field 1/2/3. source device: self field1=A1 self field2=A2+B2 self field3=A3 compatible field1=A1 compatible field2=A2;B2;A2+B2; compatible field3=A3 target device: self field1=A1+B1 self field2=A2+B2 self field3=A3 compatible field1=A1;B1;A1+B1; compatible field2=A2;B2;A2+B2; compatible field3=A3 Thanks Yan > > > > > > > i woudl really prefer if there was just one mdev type that repsented the minimal allcatable unit and the > > > aggragaotr where used to create compostions of that. i.e instad of i915-GVTg_V5_2 beign half the device, > > > have 1 mdev type i915-GVTg and if the device support 8 of them then we can aggrate 4 of i915-GVTg > > > > > > if you want to have muplie mdev type to model the different amoutn of the resouce e.g. 
i915-GVTg_small i915- > > > GVTg_large > > > that is totlaly fine too or even i915-GVTg_4 indcating it sis 4 of i915-GVTg > > > > > > failing that i would just expose an mdev type per composable resouce and allow us to compose them a the user level > > > with > > > some other construct mudeling a attament to the device. e.g. create composed mdev or somethig that is an aggreateion > > > of > > > multiple sub resouces each of which is an mdev. so kind of like how bond port work. we would create an mdev for each > > > of > > > the sub resouces and then create a bond or aggrated mdev by reference the other mdevs by uuid then attach only the > > > aggreated mdev to the instance. > > > > > > the current aggrator syntax and sematic however make me rather uncofrotable when i think about orchestating vms on > > > top > > > of it even to boot them let alone migrate them. > > > > > > > > So, we explicitly list out self/compatible attributes, and management > > > > tools only need to check if self attributes is contained compatible > > > > attributes. > > > > > > > > or do you mean only compatible list is enough, and the management tools > > > > need to find out self list by themselves? > > > > But I think provide a self list is easier for management tools. > > > > > > > > Thanks > > > > Yan > > > > > > > > > From cohuck at redhat.com Thu Aug 20 12:27:40 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Thu, 20 Aug 2020 14:27:40 +0200 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819065951.GB21172@joy-OptiPlex-7040> <20200819081338.GC21172@joy-OptiPlex-7040> Message-ID: <20200820142740.6513884d.cohuck@redhat.com> On Wed, 19 Aug 2020 17:28:38 +0800 Jason Wang wrote: > On 2020/8/19 下午4:13, Yan Zhao wrote: > > On Wed, Aug 19, 2020 at 03:39:50PM +0800, Jason Wang wrote: > >> On 2020/8/19 下午2:59, Yan Zhao wrote: > >>> On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: > >>>> On 2020/8/19 上午11:30, Yan Zhao wrote: > >>>>> hi All, > >>>>> could we decide that sysfs is the interface that every VFIO vendor driver > >>>>> needs to provide in order to support vfio live migration, otherwise the > >>>>> userspace management tool would not list the device into the compatible > >>>>> list? > >>>>> > >>>>> if that's true, let's move to the standardizing of the sysfs interface. > >>>>> (1) content > >>>>> common part: (must) > >>>>> - software_version: (in major.minor.bugfix scheme) > >>>> This can not work for devices whose features can be negotiated/advertised > >>>> independently. (E.g virtio devices) I thought the 'software_version' was supposed to describe kind of a 'protocol version' for the data we transmit? I.e., you add a new field, you bump the version number. > >>>> > >>> sorry, I don't understand here, why virtio devices need to use vfio interface? > >> > >> I don't see any reason that virtio devices can't be used by VFIO. Do you? > >> > >> Actually, virtio devices have been used by VFIO for many years: > >> > >> - passthrough a hardware virtio devices to userspace(VM) drivers > >> - using virtio PMD inside guest > >> > > So, what's different for it vs passing through a physical hardware via VFIO? > > > The difference is in the guest, the device could be either real hardware > or emulated ones. 
> > > > even though the features are negotiated dynamically, could you explain > > why it would cause software_version not work? > > > Virtio device 1 supports feature A, B, C > Virtio device 2 supports feature B, C, D > > So you can't migrate a guest from device 1 to device 2. And it's > impossible to model the features with versions. We're talking about the features offered by the device, right? Would it be sufficient to mandate that the target device supports the same features or a superset of the features supported by the source device? > > > > > > > >>> I think this thread is discussing about vfio related devices. > >>> > >>>>> - device_api: vfio-pci or vfio-ccw ... > >>>>> - type: mdev type for mdev device or > >>>>> a signature for physical device which is a counterpart for > >>>>> mdev type. > >>>>> > >>>>> device api specific part: (must) > >>>>> - pci id: pci id of mdev parent device or pci id of physical pci > >>>>> device (device_api is vfio-pci)API here. > >>>> So this assumes a PCI device which is probably not true. > >>>> > >>> for device_api of vfio-pci, why it's not true? > >>> > >>> for vfio-ccw, it's subchannel_type. > >> > >> Ok but having two different attributes for the same file is not good idea. > >> How mgmt know there will be a 3rd type? > > that's why some attributes need to be common. e.g. > > device_api: it's common because mgmt need to know it's a pci device or a > > ccw device. and the api type is already defined vfio.h. > > (The field is agreed by and actually suggested by Alex in previous mail) > > type: mdev_type for mdev. if mgmt does not understand it, it would not > > be able to create one compatible mdev device. > > software_version: mgmt can compare the major and minor if it understands > > this fields. > > > I think it would be helpful if you can describe how mgmt is expected to > work step by step with the proposed sysfs API. This can help people to > understand. My proposal would be: - check that device_api matches - check possible device_api specific attributes - check that type matches [I don't think the combination of mdev types and another attribute to determine compatibility is a good idea; actually, the current proposal confuses me every time I look at it] - check that software_version is compatible, assuming semantic versioning - check possible type-specific attributes > > Thanks for the patience. Since sysfs is uABI, when accepted, we need > support it forever. That's why we need to be careful. Nod. (...) 
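As a rough sketch of the ordered checks proposed above, a management tool that had already read a device's migration attributes out of sysfs into a plain dict could compare source and target roughly as follows. The attribute names mirror the fields discussed in this thread, but the concrete semantic-versioning rule (same major version, destination minor greater than or equal to source minor) is an assumption for illustration, not something the thread has settled on.

# Sketch only. Assumes attributes were already read from sysfs into dicts like:
#   {"device_api": "vfio-pci", "type": "i915-GVTg_V5_4",
#    "software_version": "0.1.0", "pci_id": "80865963"}

def version_ok(src, dst):
    # assumed rule: same major, destination minor >= source minor
    s_major, s_minor = (int(x) for x in src.split(".")[:2])
    d_major, d_minor = (int(x) for x in dst.split(".")[:2])
    return s_major == d_major and d_minor >= s_minor

def can_migrate(src, dst):
    if src["device_api"] != dst["device_api"]:
        return False
    if src["type"] != dst["type"]:   # simple case: no type change
        return False
    if not version_ok(src["software_version"], dst["software_version"]):
        return False
    # Remaining (device_api / type / vendor specific) attributes are compared
    # for equality here; a real tool might instead try to set a writable
    # attribute on the target to the source's value, as discussed earlier
    # for the aggregator.
    for key, value in src.items():
        if key in ("device_api", "type", "software_version"):
            continue
        if dst.get(key) != value:
            return False
    return True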
From smooney at redhat.com Thu Aug 20 13:24:26 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 20 Aug 2020 14:24:26 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200820062725.GB24997@joy-OptiPlex-7040> References: <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <242591bb809b68c618f62fdc93d4f8ae7b146b6d.camel@redhat.com> <20200820040116.GB24121@joy-OptiPlex-7040> <20200820062725.GB24997@joy-OptiPlex-7040> Message-ID: <47d216330e10152f0f5d27421da60a7b1c52e5f0.camel@redhat.com> On Thu, 2020-08-20 at 14:27 +0800, Yan Zhao wrote: > On Thu, Aug 20, 2020 at 06:16:28AM +0100, Sean Mooney wrote: > > On Thu, 2020-08-20 at 12:01 +0800, Yan Zhao wrote: > > > On Thu, Aug 20, 2020 at 02:29:07AM +0100, Sean Mooney wrote: > > > > On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote: > > > > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > > > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > > > > > Daniel P. Berrangé wrote: > > > > > > > > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > > > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > > > > > > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > > > > > > > > > |- [path to device] > > > > > > > > |--- migration > > > > > > > > | |--- self > > > > > > > > | | |---device_api > > > > > > > > | | |---mdev_type > > > > > > > > | | |---software_version > > > > > > > > | | |---device_id > > > > > > > > | | |---aggregator > > > > > > > > | |--- compatible > > > > > > > > | | |---device_api > > > > > > > > | | |---mdev_type > > > > > > > > | | |---software_version > > > > > > > > | | |---device_id > > > > > > > > | | |---aggregator > > > > > > > > > > > > > > > > > > > > > > > > Yes but: > > > > > > > > > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > > > > > - Attribute is coupled with kobject > > > > > > > > > > > > Is that really that bad? You have the device with an embedded kobject > > > > > > anyway, and you can just put things into an attribute group? > > > > > > > > > > > > [Also, I think that self/compatible split in the example makes things > > > > > > needlessly complex. Shouldn't semantic versioning and matching already > > > > > > cover nearly everything? I would expect very few cases that are more > > > > > > complex than that. Maybe the aggregation stuff, but I don't think we > > > > > > need that self/compatible split for that, either.] 
> > > > > > > > > > Hi Cornelia, > > > > > > > > > > The reason I want to declare compatible list of attributes is that > > > > > sometimes it's not a simple 1:1 matching of source attributes and target attributes > > > > > as I demonstrated below, > > > > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to > > > > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), > > > > > (mdev_type i915-GVTg_V5_8 + aggregator 4) > > > > > > > > the way you are doing the nameing is till really confusing by the way > > > > if this has not already been merged in the kernel can you chagne the mdev > > > > so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1 instead of half the device > > > > > > > > currently you need to deived the aggratod by the number at the end of the mdev type to figure out > > > > how much of the phsicial device is being used with is a very unfridly api convention > > > > > > > > the way aggrator are being proposed in general is not really someting i like but i thin this at least > > > > is something that should be able to correct. > > > > > > > > with the complexity in the mdev type name + aggrator i suspect that this will never be support > > > > in openstack nova directly requireing integration via cyborg unless we can pre partion the > > > > device in to mdevs staicaly and just ignore this. > > > > > > > > this is way to vendor sepecif to integrate into something like openstack in nova unless we can guarentee > > > > taht how aggreator work will be portable across vendors genericly. > > > > > > > > > > > > > > and aggragator may be just one of such examples that 1:1 matching does not > > > > > fit. > > > > > > > > for openstack nova i dont see us support anything beyond the 1:1 case where the mdev type does not change. > > > > > > > > > > hi Sean, > > > I understand it's hard for openstack. but 1:N is always meaningful. > > > e.g. > > > if source device 1 has cap A, it is compatible to > > > device 2: cap A, > > > device 3: cap A+B, > > > device 4: cap A+B+C > > > .... > > > to allow openstack to detect it correctly, in compatible list of > > > device 2, we would say compatible cap is A; > > > device 3, compatible cap is A or A+B; > > > device 4, compatible cap is A or A+B, or A+B+C; > > > > > > then if openstack finds device A's self cap A is contained in compatible > > > cap of device 2/3/4, it can migrate device 1 to device 2,3,4. > > > > > > conversely, device 1's compatible cap is only A, > > > so it is able to migrate device 2 to device 1, and it is not able to > > > migrate device 3/4 to device 1. > > > > yes we build the palcement servce aroudn the idea of capablites as traits on resocue providres. > > which is why i originally asked if we coudl model compatibality with feature flags > > > > we can seaislyt model deivce as aupport A, A+B or A+B+C > > and then select hosts and evice based on that but > > > > the list of compatable deivce you are propsoeing hide this feature infomation which whould be what we are matching > > on. 
> > > > give me a lset of feature you want and list ting the feature avaiable on each device allow highre level ocestation > > to > > easily match the request to a host that can fulllfile it btu thave a set of other compatihble device does not help > > with > > that > > > > so if a simple list a capabliteis can be advertiese d and if we know tha two dievce with the same capablity are > > intercahangebale that is workabout i suspect that will not be the case however and it would onely work within a > > familay > > of mdevs that are closely related. which i think agian is an argument for not changeing the mdev type and at least > > intially only look at migatreion where the mdev type doee not change initally. > > > > sorry Sean, I don't understand your words completely. > Please allow me to write it down in my words, and please confirm if my > understanding is right. > 1. you mean you agree on that each field is regarded as a trait, and > openstack can compare by itself if source trait is a subset of target trait, right? > e.g. > source device > field1=A1 > field2=A2+B2 > field3=A3 > > target device > field1=A1+B1 > field2=A2+B2 > filed3=A3 > > then openstack sees that field1/2/3 in source is a subset of field1/2/3 in > target, so it's migratable to target? yes this is basically how cpu feature work. if we see the host cpu on the dest is a supperset of the cpu feature used by the vm we know its safe to migrate. > > 2. mdev_type + aggregator make it hard to achieve the above elegant > solution, so it's best to avoid the combined comparing of mdev_type + aggregator. > do I understand it correctly? yes and no. one of the challange that mdevs pose right now is that sometiem mdev model independent resouces and sometimes multipe mdev types consume the same underlying resouces there is know way for openstack to know if i915-GVTg_V5_2 and i915-GVTg_V5_4 consume the same resouces or not. as such we cant do the accounting properly so i would much prefer to have just 1 mdev type i915-GVTg and which models the minimal allocatable unit and then say i want 4 of them comsed into 1 device then have a second mdev type that does that since what that means in pratice is we cannot trust the available_instances for a given mdev type as consuming a different mdev type might change it. aggrators makes that problem worse. which is why i siad i would prefer if instead of aggreator as prposed each consumable resouce was reported indepenedly as different mdev types and then we composed those like we would when bond ports creating an attachment or other logical aggration that refers to instance of mdevs of differing type which we expose as a singel mdev that is exposed to the guest. in a concreate example we might say create a aggreator of 64 cuda cores and 32 tensor cores and "bond them" or aggrate them as a single attachme mdev and provide that to a ml workload guest. a differnt guest could request 1 instace of the nvenc video encoder and one instance of the nvenc video decoder but no cuda or tensor for a video transcoding workload. if each of those componets are indepent mdev types and can be composed with that granularity then i think that approch is better then the current aggreator with vendor sepcific fileds. we can model the phsical device as being multipel nested resouces with different traits for each type of resouce and different capsities for the same. we can even model how many of the attachments/compositions can be done indepently if there is a limit on that. 
|- [parent physical device] |--- Vendor-specific-attributes [optional] |--- [mdev_supported_types] | |--- [] | | |--- create | | |--- name | | |--- available_instances | | |--- device_api | | |--- description | | |--- [devices] | |--- [] | | |--- create | | |--- name | | |--- available_instances | | |--- device_api | | |--- description | | |--- [devices] | |--- [] | |--- create | |--- name | |--- available_instances | |--- device_api | |--- description | |--- [devices] a benifit of this appoch is we would be the mdev types would not change on migration and we could jsut compuare a a simeple version stirgh and feature flag list to determin comaptiablity in a vendor neutral way. i dont nessisarly need to know what the vendeor flags mean just that the dest is a subset of the source and that the semaitic version numbers say the mdevs are compatible. > > 3. you don't like self list and compatible list, because it is hard for > openstack to compare different traits? > e.g. if we have self list and compatible list, then as below, openstack needs > to compare if self field1/2/3 is a subset of compatible field 1/2/3. currnetly we only use mdevs for vGPUs and in our documentaiton we tell customer to model the mdev_type as a trait and request it as a reuiqred trait. so for customer that are doing that today changing mdev types is not really an option. we would prefer that they request the feature they need instead of a spefic mdev type so we can select any that meets there needs for example we have a bunch of traits for cuda support https://github.com/openstack/os-traits/blob/master/os_traits/hw/gpu/cuda.py or driectx/vulkan/opengl https://github.com/openstack/os-traits/blob/master/os_traits/hw/gpu/api.py these are closely analogous to cpu feature flag lix avx or sse https://github.com/openstack/os-traits/blob/master/os_traits/hw/cpu/x86/__init__.py#L16 so when it comes to compatiablities it would be ideal if you could express capablities as something like a cpu feature flag then we can eaisly model those as traits. > > source device: > self field1=A1 > self field2=A2+B2 > self field3=A3 > > compatible field1=A1 > compatible field2=A2;B2;A2+B2; > compatible field3=A3 > > > target device: > self field1=A1+B1 > self field2=A2+B2 > self field3=A3 > > compatible field1=A1;B1;A1+B1; > compatible field2=A2;B2;A2+B2; > compatible field3=A3 > > > Thanks > Yan > > > > > > > > > > > > i woudl really prefer if there was just one mdev type that repsented the minimal allcatable unit and the > > > > aggragaotr where used to create compostions of that. i.e instad of i915-GVTg_V5_2 beign half the device, > > > > have 1 mdev type i915-GVTg and if the device support 8 of them then we can aggrate 4 of i915-GVTg > > > > > > > > if you want to have muplie mdev type to model the different amoutn of the resouce e.g. i915-GVTg_small i915- > > > > GVTg_large > > > > that is totlaly fine too or even i915-GVTg_4 indcating it sis 4 of i915-GVTg > > > > > > > > failing that i would just expose an mdev type per composable resouce and allow us to compose them a the user > > > > level > > > > with > > > > some other construct mudeling a attament to the device. e.g. create composed mdev or somethig that is an > > > > aggreateion > > > > of > > > > multiple sub resouces each of which is an mdev. so kind of like how bond port work. 
we would create an mdev for > > > > each > > > > of > > > > the sub resouces and then create a bond or aggrated mdev by reference the other mdevs by uuid then attach only > > > > the > > > > aggreated mdev to the instance. > > > > > > > > the current aggrator syntax and sematic however make me rather uncofrotable when i think about orchestating vms > > > > on > > > > top > > > > of it even to boot them let alone migrate them. > > > > > > > > > > So, we explicitly list out self/compatible attributes, and management > > > > > tools only need to check if self attributes is contained compatible > > > > > attributes. > > > > > > > > > > or do you mean only compatible list is enough, and the management tools > > > > > need to find out self list by themselves? > > > > > But I think provide a self list is easier for management tools. > > > > > > > > > > Thanks > > > > > Yan > > > > > > > > > > > > > From arnaud.morin at gmail.com Thu Aug 20 15:35:03 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Thu, 20 Aug 2020 15:35:03 +0000 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: References: <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> <6e68d1a3cfc4efff91d3668bb53805dc469673c6.camel@redhat.com> <65204b738f13fcea16b9b6d5a68149c89be73e6a.camel@redhat.com> Message-ID: <20200820153503.GY31915@sync> Hey all, TLDR: - Patch in [1] updated - Example of usage in [3] - Agree with fixing nova/rabbit/oslo but would like to keep this ping endpoint also - Totally agree with documentation needed Long: Thank you all for your review and for the great information you bring to that topic! First thing, we are not yet using that patch in production, but in testing/dev only for now (at OVH). But the plan is to use it in production ASAP. Also, we initially pushed that for neutron agent, that's why I missed the fact that nova already used the "ping" endpoint, sorry for that. Anyway, I dont care about the naming, so in latest patchset of [1], you will see that I changed the name of the endpoint following Ken Giusti suggestions. The bug reported in [2] looks very similar to what we saw. Thank you Sean for bringing that to attention in this thread. To detect this error, using the above "ping" endpoint in oslo, we can use a script like the one in [3] (sorry about it, I can write better python :p). As mentionned by Sean in a previous mail, I am calling effectively the topic "compute.host123456.sbg5.cloud.ovh.net" in "nova" exchange. My initial plan would be to identify topics related to a compute and do pings in all topics, to make sure that all of them are answering. I am not yet sure about how often and if this is a good plan btw. Anyway, the compute is reporting status as UP, but the ping is timeouting, which is exactly what I wanted to detect! I mostly agree with all your comments about the fact that this is a trick that we do as operator, and using the RPC bus is maybe not the best approach, but this is pragmatic and quite simple IMHO. What I also like in this solution is the fact that this is partialy outside of OpenStack: the endpoint is inside, but doing the ping is external. Monitoring OpenStack is not always easy, and sometimes we struggle on finding the root cause of some issues. Having such endpoint allow us to monitor OpenStack from an external point of view, but still in a deeper way. 
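For illustration, such an external check can stay very small. A rough sketch with oslo.messaging, assuming the endpoint from [1] ends up exposed under a name like "oslo_rpc_server_ping" (the final name is still being discussed in review), and with the broker URL and compute host below being examples only:

from oslo_config import cfg
import oslo_messaging

# Target topic "compute" + server "host123456..." routes to the
# "compute.host123456.sbg5.cloud.ovh.net" queue mentioned above.
transport = oslo_messaging.get_rpc_transport(
    cfg.CONF, url="rabbit://monitoring:secret@rabbit.example.net:5672/")
target = oslo_messaging.Target(topic="compute",
                               server="host123456.sbg5.cloud.ovh.net")
client = oslo_messaging.RPCClient(transport, target, timeout=10)

try:
    client.call({}, "oslo_rpc_server_ping")
    print("compute RPC queue is answering")
except oslo_messaging.MessagingTimeout:
    # The service can still report UP while its queue no longer delivers.
    print("RPC ping timed out")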
It's like a probe in your car telling you that even if you are still running, your engine is off :) Still, making sure that this bug is fixed by doing some work on (rabbit|oslo.messaging|nova|whatever} is the best thing to do. However, IMO, this does not prevent this rpc ping endpoint from existing. Last, but not least, I totally agree about documenting this, but also adding some documentation on how to configure rabbit and OpenStack services in a way that fit operator needs. There are plenty of parameters which could be tweaked on both OpenStack and rabbit side. IMO, we need to explain a little bit more what are the impact of setting a specific parameter to a given value. For example, in another discussion ([4]), we were talking about "durable" queues in rabbit. We manage to find that if we enable HA, we should also enable durability of queues. Anyway that's another topic, and this is also something we discuss in large-scale group. Thank you all, [1] https://review.opendev.org/#/c/735385/ [2] https://bugs.launchpad.net/nova/+bug/1854992 [3] http://paste.openstack.org/show/796990/ [4] http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016362.html -- Arnaud Morin On 13.08.20 - 17:17, Ken Giusti wrote: > On Thu, Aug 13, 2020 at 12:30 PM Ben Nemec wrote: > > > > > > > On 8/13/20 11:07 AM, Sean Mooney wrote: > > >> I think it's probably > > >> better to provide a well-defined endpoint for them to talk to rather > > >> than have everyone implement their own slightly different RPC ping > > >> mechanism. The docs for this feature should be very explicit that this > > >> is the only thing external code should be calling. > > > ya i think that is a good approch. > > > i would still prefer if people used say middelware to add a service ping > > admin api endpoint > > > instead of driectly calling the rpc endpoint to avoid exposing rabbitmq > > but that is out of scope of this discussion. > > > > Completely agree. In the long run I would like to see this replaced with > > better integrated healthchecking in OpenStack, but we've been talking > > about that for years and have made minimal progress. > > > > > > > >> > > >>> > > >>> so if this does actully detect somethign we can otherwise detect and > > the use cases involves using it within > > >>> the openstack services not form an external source then i think that > > is fine but we proably need to use another > > >>> name (alive? status?) or otherewise modify nova so that there is no > > conflict. > > >>>> > > >> > > >> If I understand your analysis of the bug correctly, this would have > > >> caught that type of outage after all since the failure was asymmetric. > > > am im not sure > > > it might yes looking at https://review.opendev.org/#/c/735385/6 > > > its not clear to me how the endpoint is invoked. is it doing a topic > > send or a direct send? > > > to detech the failure you would need to invoke a ping on the compute > > service and that ping would > > > have to been encured on the to nova topic exchante with a routing key of > > compute. > > > > > > if the compute topic queue was broken either because it was nolonger > > bound to the correct topic or due to some other > > > rabbitmq error then you woudl either get a message undeilverbale error > > of some kind with the mandaroy flag or likely a > > > timeout without the mandaroty flag. so if the ping would be routed usign > > a topic too compute. > > > then yes it would find this. 
> > > > > > although we can also detech this ourselves and fix it using the > > mandatory flag i think by just recreating the queue wehn > > > it extis but we get an undeliverable message, at least i think we can > > rabbit is not my main are of expertiese so it > > > woudl be nice is someone that know more about it can weigh in on that. > > > > I pinged Ken this morning to take a look at that. He should be able to > > tell us whether it's a good idea or crazy talk. :-) > > > > Like I can tell the difference between crazy and good ideas. Ben I thought > you knew me better. ;) > > As discussed you can enable the mandatory flag on a per RPCClient instance, > for example: > > _topts = oslo_messaging.TransportOptions(at_least_once=True) > client = oslo_messaging.RPCClient(self.transport, > self.target, > timeout=conf.timeout, > version_cap=conf.target_version, > transport_options=_topts).prepare() > > This will cause an rpc call/cast to fail if rabbitmq cannot find a queue > for the rpc request message [note the difference between 'queuing the > message' and 'having the message consumed' - the mandatory flag has nothing > to do with whether or not the message is eventually consumed]. > > Keep in mind that there may be some cases where having no active consumers > is ok and you do not want to get a delivery failure exception - > specifically fanout or perhaps cast. Depends on the use case. If there > are fanout use cases that fail or degrade if all present services don't get > a message then the mandatory flag will not detect an error if a subset of > the bindings are lost. > > My biggest concern with this type of failure (lost binding) is that > apparently the consumer is none the wiser when it happens. Without some > sort of event issued by rabbitmq the RPC server cannot detect this problem > and take corrective actions (or at least I cannot think of any ATM). > > > -- > Ken Giusti (kgiusti at gmail.com) From juliaashleykreger at gmail.com Thu Aug 20 15:49:14 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Thu, 20 Aug 2020 08:49:14 -0700 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: Message-ID: I'm having a sense of deja vu! Because of the way the mechanics work, the iscsi deploy driver is in an unfortunate position of being harder to troubleshoot and diagnose failures. Which basically means we've not been able to really identify common failures and add logic to handle them appropriately, like we are able to with a tcp socket and file download. Based on this alone, I think it makes a solid case for us to seriously consider deprecation. Overall, I'm +1 for the proposal and I believe over two cycles is the right way to go. I suspect we're going to have lots of push back from the TripleO community because there has been resistance to change their default usage in the past. As such I'm adding them to the subject so hopefully they will be at least aware. I guess my other worry is operators who already have a substantial operational infrastructure investment built around the iscsi deploy interface. I wonder why they didn't use direct, but maybe they have all migrated in the past ?5? years. This could just be a non-concern in reality, I'm just not sure. Of course, if someone is willing to step up and make the iscsi deployment interface their primary focus, that also shifts the discussion to making direct the default interface? 
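For operators who want to move over ahead of any deprecation, the switch itself is only configuration. A hedged ironic.conf sketch (option names as in the ironic admin docs; values are illustrative and need adapting per deployment):

[DEFAULT]
# Keep iscsi available during the transition, but prefer direct.
enabled_deploy_interfaces = direct,iscsi
default_deploy_interface = direct

[agent]
# Serve instance images from ironic's own HTTP server instead of Swift,
# as discussed in the proposal quoted below.
image_download_source = http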
-Julia On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur wrote: > > Hi all, > > Side note for those lacking context: this proposal concerns deprecating one of the ironic deploy interfaces detailed in https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. It does not affect the boot-from-iSCSI feature. > > I would like to propose deprecating and removing the 'iscsi' deploy interface over the course of the next 2 cycles. The reasons are: > 1) The iSCSI deploy is a source of occasional cryptic bugs when a target cannot be discovered or mounted properly. > 2) Its security is questionable: I don't think we even use authentication. > 3) Operators confusion: right now we default to the iSCSI deploy but pretty much direct everyone who cares about scalability or security to the 'direct' deploy. > 4) Cost of maintenance: our feature set is growing, our team - not so much. iscsi_deploy.py is 800 lines of code that can be removed, and some dependencies that can be dropped as well. > > As far as I can remember, we've kept the iSCSI deploy for two reasons: > 1) The direct deploy used to require Glance with Swift backend. The recently added [agent]image_download_source option allows caching and serving images via the ironic's HTTP server, eliminating this problem. I guess we'll have to switch to 'http' by default for this option to keep the out-of-box experience. > 2) Memory footprint of the direct deploy. With the raw images streaming we no longer have to cache the downloaded images in the agent memory, removing this problem as well (I'm not even sure how much of a problem it is in 2020, even my phone has 4GiB of RAM). > > If this proposal is accepted, I suggest to execute it as follows: > Victoria release: > 1) Put an early deprecation warning in the release notes. > 2) Announce the future change of the default value for [agent]image_download_source. > W release: > 3) Change [agent]image_download_source to 'http' by default. > 4) Remove iscsi from the default enabled_deploy_interfaces and move it to the back of the supported list (effectively making direct deploy the default). > X release: > 5) Remove the iscsi deploy code from both ironic and IPA. > > Thoughts, opinions, suggestions? > > Dmitry From sean.mcginnis at gmx.com Thu Aug 20 16:39:37 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 20 Aug 2020 11:39:37 -0500 Subject: [all] Proposed Wallaby cycle schedule In-Reply-To: <0083db2a-0ef7-99fa-0c45-fd170f7d7902@gmx.com> References: <2e56de68-c416-e3ea-f3da-caaf9399287d@gmx.com> <0083db2a-0ef7-99fa-0c45-fd170f7d7902@gmx.com> Message-ID: >> The current thinking is it will likely take place in May (nothing is >> set, just an educated guess, so please don't use that for any other >> planning). So for the sake of figuring out the release schedule, we are >> targeting a release date in early May. Hopefully this will then align >> well with event plans. >> >> I have a proposed release schedule up for review here: >> >> https://review.opendev.org/#/c/744729/ ... > > As an alternative option, I have proposed a 26 week option: > > https://review.opendev.org/#/c/745911/ > The majority of support so far has been for the 26 week schedule, with the only -1 votes going to the 29 week option. This is a final call to raise any objects or issues with either option. Unless something changes, we plan to approve the 26 week schedule early next week. Thanks! 
Sean From elfosardo at gmail.com Thu Aug 20 17:05:49 2020 From: elfosardo at gmail.com (Riccardo Pittau) Date: Thu, 20 Aug 2020 19:05:49 +0200 Subject: [ironic] next Victoria meetup In-Reply-To: References: Message-ID: Hello again! Friendly reminder about the vote to schedule the next Ironic Virtual Meetup! Since a lot of people are on vacation in this period, we've decided to postpone the final day for the vote to next Wednesday August 26 And we have an etherpad now! https://etherpad.opendev.org/p/Ironic-Victoria-midcycle Feel free to propose topics, we'll discuss also about the upcoming PTG and Forum. Thanks! A si biri Riccardo On Mon, Aug 17, 2020 at 6:29 PM Riccardo Pittau wrote: > Hello everyone! > > The time for the next Ironic virtual meetup is close! > It will be an opportunity to review what has been done in the last months, > exchange ideas and plan for the time before the upcoming victoria release, > with an eye towards the future. > > We're aiming to have the virtual meetup the first week of September > (Monday August 31 - Friday September 4) and split it in two days, with one > two-hours slot per day. > Please vote for your best time slots here: > https://doodle.com/poll/pi4x3kuxamf4nnpu > > We're planning to leave the vote open at least for the entire week until > Friday August 21, so to have enough time to announce the final slots and > planning early next week. > > Thanks! > > A si biri > > Riccardo > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Thu Aug 20 17:09:11 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Thu, 20 Aug 2020 19:09:11 +0200 Subject: [cloudkitty] Resuming CloudKitty IRC meetings Message-ID: Hello, We are resuming IRC meetings for the CloudKitty project using the existing calendar schedule. The first meeting will be on Monday August 24 at 1400 UTC in #cloudkitty on freenode, then every two weeks. Everyone is welcome: contributors, users, and anyone who would like to contribute or use CloudKitty but doesn't know how to get started. The agenda is available on Etherpad [1]. The meeting description [2] was stating that meetings were on the first and third Monday of the month, but the calendar schedule was using odd weeks. I've submitted a change [3] to synchronise the description: let's meet on odd weeks instead of using a month-based schedule. Thanks in advance to all of you helping to keep the project going. Pierre Riteau (priteau) [1] https://etherpad.opendev.org/p/cloudkitty-meeting-topics [2] http://eavesdrop.openstack.org/#CloudKitty_Team_Meeting [3] https://review.opendev.org/#/c/747256/ From dev.faz at gmail.com Thu Aug 20 17:16:17 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Thu, 20 Aug 2020 19:16:17 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: <20200818120708.GV31915@sync> References: <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <20200818120708.GV31915@sync> Message-ID: Hi, just another idea: Rabbitmq is able to count undelivered messages. We could use this information to detect the broken bindings (causing undeliverable messages). Anyone already doing this? I currently don't have a way to reproduce the broken bindings, so I'm unable to proof the idea. Seems we have to wait issue to happen again - what - hopefully - never happens :) Fabian Arnaud Morin schrieb am Di., 18. Aug. 2020, 14:07: > Hey all, > > About the vexxhost strategy to use only one rabbit server and manage HA > through > rabbit. 
> Do you plan to do the same for MariaDB/MySQL? > > -- > Arnaud Morin > > On 14.08.20 - 18:45, Fabian Zimmermann wrote: > > Hi, > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > one rabbitmq Container per Service. Just the kubernetes self healing is > > used as "ha" for rabbitmq. > > > > That seems to match with my finding: run rabbitmq standalone and use an > > external system to restart rabbitmq if required. > > > > Fabian > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, > 16:59: > > > > > Fabian, > > > > > > what do you mean? > > > > > > >> I think vexxhost is running (1) with their openstack-operator - for > > > reasons. > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > wrote: > > > > > > > > Hello again, > > > > > > > > just a short update about the results of my tests. > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > 1. without durable-queues and without replication - just one > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > 2. durable-queues and replication > > > > > > > > Any other combination of these settings leads to more or less issues > with > > > > > > > > * broken / non working bindings > > > > * broken queues > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > reasons. > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > replication but without durable-queues. > > > > > > > > May someone point me to the best way to document these findings to > some > > > official doc? > > > > I think a lot of installations out there will run into issues if - > under > > > load - a node fails. > > > > > > > > Fabian > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > dev.faz at gmail.com>: > > > >> > > > >> Hi, > > > >> > > > >> just did some short tests today in our test-environment (without > > > durable queues and without replication): > > > >> > > > >> * started a rally task to generate some load > > > >> * kill-9-ed rabbitmq on one node > > > >> * rally task immediately stopped and the cloud (mostly) stopped > working > > > >> > > > >> after some debugging i found (again) exchanges which had bindings to > > > queues, but these bindings didnt forward any msgs. > > > >> Wrote a small script to detect these broken bindings and will now > check > > > if this is "reproducible" > > > >> > > > >> then I will try "durable queues" and "durable queues with > replication" > > > to see if this helps. Even if I would expect > > > >> rabbitmq should be able to handle this without these "hidden broken > > > bindings" > > > >> > > > >> This just FYI. > > > >> > > > >> Fabian > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Aug 20 17:18:38 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 20 Aug 2020 12:18:38 -0500 Subject: [cloudkitty] Resuming CloudKitty IRC meetings In-Reply-To: References: Message-ID: <4a4ca182-62fc-a728-65bb-1001510f5af8@gmx.com> On 8/20/20 12:09 PM, Pierre Riteau wrote: > Hello, > > We are resuming IRC meetings for the CloudKitty project using the > existing calendar schedule. > The first meeting will be on Monday August 24 at 1400 UTC in > #cloudkitty on freenode, then every two weeks. > > Everyone is welcome: contributors, users, and anyone who would like to > contribute or use CloudKitty but doesn't know how to get started. 
> The agenda is available on Etherpad [1]. > > The meeting description [2] was stating that meetings were on the > first and third Monday of the month, but the calendar schedule was > using odd weeks. I've submitted a change [3] to synchronise the > description: let's meet on odd weeks instead of using a month-based > schedule. With what you said above, this is actually taking place on even weeks now. Can you clarify - is it a one-off that you will be holding this on the 4th Monday next week? Or do you actually intend to switch these to even weeks (in which case that patch is incorrect)? > > Thanks in advance to all of you helping to keep the project going. > > Pierre Riteau (priteau) > > [1] https://etherpad.opendev.org/p/cloudkitty-meeting-topics > [2] http://eavesdrop.openstack.org/#CloudKitty_Team_Meeting > [3] https://review.opendev.org/#/c/747256/ > From pierre at stackhpc.com Thu Aug 20 17:54:16 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Thu, 20 Aug 2020 19:54:16 +0200 Subject: [cloudkitty] Resuming CloudKitty IRC meetings In-Reply-To: <4a4ca182-62fc-a728-65bb-1001510f5af8@gmx.com> References: <4a4ca182-62fc-a728-65bb-1001510f5af8@gmx.com> Message-ID: On Thu, 20 Aug 2020 at 19:27, Sean McGinnis wrote: > > On 8/20/20 12:09 PM, Pierre Riteau wrote: > > Hello, > > > > We are resuming IRC meetings for the CloudKitty project using the > > existing calendar schedule. > > The first meeting will be on Monday August 24 at 1400 UTC in > > #cloudkitty on freenode, then every two weeks. > > > > Everyone is welcome: contributors, users, and anyone who would like to > > contribute or use CloudKitty but doesn't know how to get started. > > The agenda is available on Etherpad [1]. > > > > The meeting description [2] was stating that meetings were on the > > first and third Monday of the month, but the calendar schedule was > > using odd weeks. I've submitted a change [3] to synchronise the > > description: let's meet on odd weeks instead of using a month-based > > schedule. > > With what you said above, this is actually taking place on even weeks now. Unless I am mistaken, a schedule based on the Nth day of month is not fixed to even or odd weeks. For example, in June the first and third Monday were in weeks 23 and 25 (odd), but since July they take place in even weeks. What I am proposing is we disregard the existing description and go by the frequency defined in the yaml file, which is biweekly-odd. It means that people who have already imported the calendar invite will have the correct date (although not the right description). And we'll always have two weeks between each meeting, instead of sometimes three. > Can you clarify - is it a one-off that you will be holding this on the > 4th Monday next week? Or do you actually intend to switch these to even > weeks (in which case that patch is incorrect)? > > > > > Thanks in advance to all of you helping to keep the project going. > > > > Pierre Riteau (priteau) > > > > [1] https://etherpad.opendev.org/p/cloudkitty-meeting-topics > > [2] http://eavesdrop.openstack.org/#CloudKitty_Team_Meeting > > [3] https://review.opendev.org/#/c/747256/ > > > From fungi at yuggoth.org Thu Aug 20 18:24:39 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 20 Aug 2020 18:24:39 +0000 Subject: [cloudkitty] Resuming CloudKitty IRC meetings In-Reply-To: References: <4a4ca182-62fc-a728-65bb-1001510f5af8@gmx.com> Message-ID: <20200820182438.lrqp5baym5o37nhl@yuggoth.org> On 2020-08-20 19:54:16 +0200 (+0200), Pierre Riteau wrote: [...] 
> Unless I am mistaken, a schedule based on the Nth day of month is not > fixed to even or odd weeks. > For example, in June the first and third Monday were in weeks 23 and > 25 (odd), but since July they take place in even weeks. [...] Correct, this is documented in the README.rst for the yaml2ical library, which irc-meetings uses to render this metadata into scheduling: https://opendev.org/opendev/yaml2ical#user-content-frequencies "biweekly-odd Occurs on odd weeks (ISOweek % 2 == 1)" "Odd/Even and week numbers are based on the ISO week number. ISO weeks can be checked with %V in GNU date(1)" I think some people have assumed it's even/odd week numbers in a month rather than even/odd week numbers counting from the epoch. Technically we have frequencies like "first-tuesday" and "third-tuesday" to address specific week numbers in a month rather than truly alternating weeks. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From sean.mcginnis at gmx.com Thu Aug 20 18:37:53 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 20 Aug 2020 13:37:53 -0500 Subject: [cloudkitty] Resuming CloudKitty IRC meetings In-Reply-To: <20200820182438.lrqp5baym5o37nhl@yuggoth.org> References: <4a4ca182-62fc-a728-65bb-1001510f5af8@gmx.com> <20200820182438.lrqp5baym5o37nhl@yuggoth.org> Message-ID: > "biweekly-odd Occurs on odd weeks (ISOweek % 2 == 1)" > > "Odd/Even and week numbers are based on the ISO week number. ISO > weeks can be checked with %V in GNU date(1)" > > I think some people have assumed it's even/odd week numbers in a > month rather than even/odd week numbers counting from the epoch. > Technically we have frequencies like "first-tuesday" and > "third-tuesday" to address specific week numbers in a month rather > than truly alternating weeks. Based on the existing description, that appears to be the intent. So it had been the first and third weeks of the month, so therefore the odd weeks. But then we at least have a mismatch between phrasing used and what yaml2ical generates, so the description does need to be updated. From fungi at yuggoth.org Thu Aug 20 19:22:31 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 20 Aug 2020 19:22:31 +0000 Subject: [cloudkitty] Resuming CloudKitty IRC meetings In-Reply-To: References: <4a4ca182-62fc-a728-65bb-1001510f5af8@gmx.com> <20200820182438.lrqp5baym5o37nhl@yuggoth.org> Message-ID: <20200820192231.2cv3x3qqm5oaanaz@yuggoth.org> On 2020-08-20 13:37:53 -0500 (-0500), Sean McGinnis wrote: > > > "biweekly-odd Occurs on odd weeks (ISOweek % 2 == 1)" > > > > "Odd/Even and week numbers are based on the ISO week number. ISO > > weeks can be checked with %V in GNU date(1)" > > > > I think some people have assumed it's even/odd week numbers in a > > month rather than even/odd week numbers counting from the epoch. > > Technically we have frequencies like "first-tuesday" and > > "third-tuesday" to address specific week numbers in a month rather > > than truly alternating weeks. > Based on the existing description, that appears to be the intent. So it > had been the first and third weeks of the month, so therefore the odd > weeks. But then we at least have a mismatch between phrasing used and > what yaml2ical generates, so the description does need to be updated. And even I'm easily confused by this, as evidenced by the fact that above I confused epoch weeks with ISO annual week counts. 
As the README.rst suggests, if you have a "53-week" year then you get ISO odd weeks back to back between the end of that year and the start of the next, or a pair of ISO even weeks which are three weeks apart instead of two. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From arnaud.morin at gmail.com Thu Aug 20 19:28:40 2020 From: arnaud.morin at gmail.com (Arnaud MORIN) Date: Thu, 20 Aug 2020 21:28:40 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <20200818120708.GV31915@sync> Message-ID: Hello, Are you doing that using alternate exchange ? I started configuring it in our env but not yet finished. Cheers, Le jeu. 20 août 2020 à 19:16, Fabian Zimmermann a écrit : > Hi, > > just another idea: > > Rabbitmq is able to count undelivered messages. We could use this > information to detect the broken bindings (causing undeliverable messages). > > Anyone already doing this? > > I currently don't have a way to reproduce the broken bindings, so I'm > unable to proof the idea. > > Seems we have to wait issue to happen again - what - hopefully - never > happens :) > > Fabian > > Arnaud Morin schrieb am Di., 18. Aug. 2020, > 14:07: > >> Hey all, >> >> About the vexxhost strategy to use only one rabbit server and manage HA >> through >> rabbit. >> Do you plan to do the same for MariaDB/MySQL? >> >> -- >> Arnaud Morin >> >> On 14.08.20 - 18:45, Fabian Zimmermann wrote: >> > Hi, >> > >> > i read somewhere that vexxhosts kubernetes openstack-Operator is running >> > one rabbitmq Container per Service. Just the kubernetes self healing is >> > used as "ha" for rabbitmq. >> > >> > That seems to match with my finding: run rabbitmq standalone and use an >> > external system to restart rabbitmq if required. >> > >> > Fabian >> > >> > Satish Patel schrieb am Fr., 14. Aug. 2020, >> 16:59: >> > >> > > Fabian, >> > > >> > > what do you mean? >> > > >> > > >> I think vexxhost is running (1) with their openstack-operator - for >> > > reasons. >> > > >> > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann >> > > wrote: >> > > > >> > > > Hello again, >> > > > >> > > > just a short update about the results of my tests. >> > > > >> > > > I currently see 2 ways of running openstack+rabbitmq >> > > > >> > > > 1. without durable-queues and without replication - just one >> > > rabbitmq-process which gets (somehow) restarted if it fails. >> > > > 2. durable-queues and replication >> > > > >> > > > Any other combination of these settings leads to more or less >> issues with >> > > > >> > > > * broken / non working bindings >> > > > * broken queues >> > > > >> > > > I think vexxhost is running (1) with their openstack-operator - for >> > > reasons. >> > > > >> > > > I added [kolla], because kolla-ansible is installing rabbitmq with >> > > replication but without durable-queues. >> > > > >> > > > May someone point me to the best way to document these findings to >> some >> > > official doc? >> > > > I think a lot of installations out there will run into issues if - >> under >> > > load - a node fails. >> > > > >> > > > Fabian >> > > > >> > > > >> > > > Am Do., 13. Aug. 
2020 um 15:13 Uhr schrieb Fabian Zimmermann < >> > > dev.faz at gmail.com>: >> > > >> >> > > >> Hi, >> > > >> >> > > >> just did some short tests today in our test-environment (without >> > > durable queues and without replication): >> > > >> >> > > >> * started a rally task to generate some load >> > > >> * kill-9-ed rabbitmq on one node >> > > >> * rally task immediately stopped and the cloud (mostly) stopped >> working >> > > >> >> > > >> after some debugging i found (again) exchanges which had bindings >> to >> > > queues, but these bindings didnt forward any msgs. >> > > >> Wrote a small script to detect these broken bindings and will now >> check >> > > if this is "reproducible" >> > > >> >> > > >> then I will try "durable queues" and "durable queues with >> replication" >> > > to see if this helps. Even if I would expect >> > > >> rabbitmq should be able to handle this without these "hidden broken >> > > bindings" >> > > >> >> > > >> This just FYI. >> > > >> >> > > >> Fabian >> > > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Aug 20 20:02:45 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 20 Aug 2020 15:02:45 -0500 Subject: [release] Release countdown for week R-7 Aug 24 - 28 Message-ID: <20200820200245.GA212631@sm-workstation> Development Focus ----------------- We are entering the last weeks of the Victoria development cycle. From now until the final release, we'll send a countdown email like this every week. It's probably a good time for teams to take stock of their library and client work that needs to be completed yet. The non-client library freeze is coming up, followed closely by the client lib freeze. Please plan accordingly to avoid any last minute rushes to get key functionality in. General Information ------------------- Next week is the Extra-ATC freeze, in preparation for future elections. All contributions to OpenStack are valuable, but some are not expressed as Gerrit code changes. Please list active contributors to your project team who do not have a code contribution this cycle, and therefore won't automatically be considered an Active Technical Contributor and allowed to vote. This is done by adding extra-atcs to https://opendev.org/openstack/governance/src/branch/master/reference/projects.yaml before the Extra-ATC freeze on August 28. A quick reminder of the upcoming freeze dates. Those vary depending on deliverable type: * General libraries (except client libraries) need to have their last feature release before Non-client library freeze (Sept 3). Their stable branches are cut early. * Client libraries (think python-*client libraries) need to have their last feature release before Client library freeze (Sept 10) * Deliverables following a cycle-with-rc model (that would be most services) observe a Feature freeze on that same date, Sept 10. Any feature addition beyond that date should be discussed on the mailing-list and get PTL approval. After feature freeze, cycle-with-rc deliverables need to produce a first release candidate (and a stable branch) before RC1 deadline (Sept 24) * Deliverables following cycle-with-intermediary model can release as necessary, but in all cases before Final RC deadline (Oct 8) Finally, now is also a good time to start planning what highlights you want for your deliverables in the cycle highlights. The deadline to submit an initial version for those is set to Feature freeze (Sept 10). 
Background on cycle-highlights: http://lists.openstack.org/pipermail/openstack-dev/2017-December/125613.html Project Team Guide, Cycle-Highlights: https://docs.openstack.org/project-team-guide/release-management.html#cycle-highlights knelson [at] openstack.org/diablo_rojo on IRC is available if you need help selecting or writing your highlights Upcoming Deadlines & Dates -------------------------- Non-client library freeze: September 3 (R-6 week) Client library freeze: September 10 (R-5 week) Victoria-3 milestone: September 10 (R-5 week) Victoria release: October 14 From skaplons at redhat.com Thu Aug 20 20:24:20 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 20 Aug 2020 22:24:20 +0200 Subject: [Neutron] Drivers meeting agenda Message-ID: <20200820202420.flkrajceygsy7y37@skaplons-mac> Hi, Here is agenda for tomorrow's drivers meeting. We have 2 RFEs to discuss: * https://bugs.launchpad.net/neutron/+bug/1891334 - [RFE] Enable change of CIDR on a subnet * https://bugs.launchpad.net/neutron/+bug/1892200 - Make keepalived healthcheck more configurable There are also some points from Rodolfo on the on demand agenda but Rodolfo is on PTO this week so probably we will discuss those topics finally next week. See You on the meeting tomorrow and have a nice day :) -- Slawek Kaplonski Principal software engineer Red Hat From pierre at stackhpc.com Thu Aug 20 20:48:26 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Thu, 20 Aug 2020 22:48:26 +0200 Subject: [cloudkitty] Resuming CloudKitty IRC meetings In-Reply-To: <20200820192231.2cv3x3qqm5oaanaz@yuggoth.org> References: <4a4ca182-62fc-a728-65bb-1001510f5af8@gmx.com> <20200820182438.lrqp5baym5o37nhl@yuggoth.org> <20200820192231.2cv3x3qqm5oaanaz@yuggoth.org> Message-ID: Thanks for clearing up the confusion, I had not considered this alternative interpretation of even and odds weeks. And thanks for sharing the tidbit about 53-week years. Should I mention there are months with a fifth Monday as well? ;-) So, I would like to clarify that I am proposing meetings on odd ISO week numbers. This is an initial schedule, which we can adapt based on project activity. On Thu, 20 Aug 2020 at 21:34, Jeremy Stanley wrote: > > On 2020-08-20 13:37:53 -0500 (-0500), Sean McGinnis wrote: > > > > > "biweekly-odd Occurs on odd weeks (ISOweek % 2 == 1)" > > > > > > "Odd/Even and week numbers are based on the ISO week number. ISO > > > weeks can be checked with %V in GNU date(1)" > > > > > > I think some people have assumed it's even/odd week numbers in a > > > month rather than even/odd week numbers counting from the epoch. > > > Technically we have frequencies like "first-tuesday" and > > > "third-tuesday" to address specific week numbers in a month rather > > > than truly alternating weeks. > > Based on the existing description, that appears to be the intent. So it > > had been the first and third weeks of the month, so therefore the odd > > weeks. But then we at least have a mismatch between phrasing used and > > what yaml2ical generates, so the description does need to be updated. > > And even I'm easily confused by this, as evidenced by the fact that > above I confused epoch weeks with ISO annual week counts. As the > README.rst suggests, if you have a "53-week" year then you get ISO > odd weeks back to back between the end of that year and the start > of the next, or a pair of ISO even weeks which are three weeks apart > instead of two. 
> -- > Jeremy Stanley From openstack at nemebean.com Thu Aug 20 21:16:44 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 20 Aug 2020 16:16:44 -0500 Subject: [requirements][oslo] Inclusion of CONFspirator in openstack/requirements In-Reply-To: References: Message-ID: On 8/16/20 11:42 PM, Adrian Turjak wrote: > Hey OpenStackers! > > I'm hoping to add CONFspirator to openstack/requirements as I'm using it > Adjutant: > https://review.opendev.org/#/c/746436/ > > The library has been in Adjutant for a while but I didn't add it to > openstack/requirements, so I'm trying to remedy that now. I think it is > different enough from oslo.config and I think the features/differences > are ones that are unlikely to ever make sense in oslo.config without > breaking it for people who do use it as it is, or adding too much > complexity. > > I wanted to use oslo.config but quickly found that the way I was > currently doing config in Adjutant was heavily dependent on yaml, and > the ability to nest things. I was in a bind because I didn't have a > declarative config system like oslo.config, and the config for Adjutant > was a mess to maintain and understand (even for me, and I wrote it) with > random parts of the code pulling config that may or may not have been > set/declared. > > After finding oslo.config was not suitable for my rather weird needs, I > took oslo.config as a starting point and ended up writing another > library specific to my requirements in Adjutant, and rather than keeping > it internal to Adjutant, moved it to an external library. > > CONFspirator was built for a weird and complex edge case, because I have > plugins that need to dynamically load config on startup, which then has > to be lazy_loaded. I also have weird overlay logic for defaults that can > be overridden, and building it into the library made Adjutant simpler. I > also have nested config groups that need to be named dynamically to > allow plugin classes to be extended without subclasses sharing the same > config group name. I built something specific to my needs, that just so > happens to also be a potentially useful library for people wanting > something like oslo.config but that is targeted towards yaml and toml, > and the ability to nest groups. 
> > The docs are here: https://confspirator.readthedocs.io/ > The code is here: https://gitlab.com/catalyst-cloud/confspirator > > And for those interested in how I use it in Adjutant here are some > places of interest (be warned, it may be a rabbit hole): > https://opendev.org/openstack/adjutant/src/branch/master/adjutant/config > https://opendev.org/openstack/adjutant/src/branch/master/adjutant/feature_set.py > > https://opendev.org/openstack/adjutant/src/branch/master/adjutant/core.py > https://opendev.org/openstack/adjutant/src/branch/master/adjutant/api/v1/openstack.py#L35-L44 > > https://opendev.org/openstack/adjutant/src/branch/master/adjutant/actions/v1/projects.py#L155-L164 > > https://opendev.org/openstack/adjutant/src/branch/master/adjutant/actions/v1/base.py#L146 > > https://opendev.org/openstack/adjutant/src/branch/master/adjutant/tasks/v1/base.py#L30 > > https://opendev.org/openstack/adjutant/src/branch/master/adjutant/tasks/v1/base.py#L293 > > > If there are strong opinions about working to add this to oslo.config, > let's chat, as I'm not against merging this into it somehow if we find a > way that make sense, but while some aspects where similar, I felt that > this was cleaner without being part of oslo.config because the mindset I > was building towards seemed different and oslo.config didn't need my > complexity. Okay, I'll take a crack at discussing this from the Oslo side. First, we've tried to avoid adding YAML support to oslo.config for a couple of reasons: 1) consistency of configs across services. We didn't want to end up with a mix of ini and yaml files. 2) As you discovered, the oslo.config model isn't conducive to nested YAML layouts, so most of the benefits are lost anyway. There are exceptions, of course. Just within oslo, oslo.policy uses YAML configs, but it gives up most of the oslo.config niceties to do so. Policy had to reimplement things like deprecation handling because it's dealing with raw YAML instead of a config object. I believe there are other examples where services had to refer to a YAML file for their complex config opts. With all that said, I'm pretty sure a motivated person could write a YAML driver for oslo.config. It would introduce a layer of indirection - the service would refer to a .conf file containing just the driver config, which would then point to a separate .yaml file. I'm not sure you could implement nesting this way, but I haven't dug into the code to find out for sure. In general, given the complexity of what you're talking about I think a driver plugin would be the way to go, as opposed to trying to fit this all in with the core oslo.config functionality (assuming you/we decide to pursue integrating at all). There were a few other things you mentioned as features of the library. The following are some off-the-cuff thoughts without having looked too closely at the library, so if they're nonsense that's my excuse. ;-) "because I have plugins that need to dynamically load config on startup, which then has to be lazy_loaded" Something like this could probably be done. I believe this is kind of how the castellan driver in oslo.config works. Config values are looked up and cached on-demand, as opposed to all at once. "I also have weird overlay logic for defaults that can be overridden" My knee-jerk reaction to this is that oslo.config already supports overriding defaults, so I assume there's something about your use case that didn't mesh with that functionality? Or is this part of oslo.config that you reused? 
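To make that indirection concrete: with the pluggable config sources oslo.config already has, it might look something like the sketch below, where the "yaml" driver is the hypothetical new piece (today the reference examples are drivers like remote_file and castellan):

# service.conf - only the source definition lives here; everything else
# would come from the referenced YAML file. The "yaml" driver and its
# "path" option are hypothetical.
[DEFAULT]
config_source = extra_yaml

[extra_yaml]
driver = yaml
path = /etc/myservice/myservice.yaml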
"I also have nested config groups that need to be named dynamically to allow plugin classes to be extended without subclasses sharing the same config group name." Minus the nesting part, this is also something being done with oslo.config today. The config driver feature actually uses dynamically named groups, and I believe at least Cinder is using them too. They do cause a bit of grief for some config tools, like the validator, but it is possible to do something like this. Now, as to the question of whether we should try to integrate this library with oslo.config: I don't know. Helpful, right? ;-) I think answering that question definitively would take a deeper dive into what the new library is doing differently than I can commit to. As I noted above, I don't think the things you're doing are so far out in left field that it would be unreasonable to consider integrating into oslo.config, but the devil is in the details and there are a lot of details here that I don't know anything about. For example, will the oslo.config data model even accommodate nested groups? I suspect it doesn't now, but could it? Probably, but I can't say how difficult/disruptive it would be. If someone wanted to make incremental progress toward integration, I think writing a basic YAML driver for oslo.config would be a good start. It would be generally useful even if this work doesn't go anywhere, and it would provide a basis for a hypothetical future YAML-based driver providing CONFspirator functionality. From there we could start looking to integrate more advanced features one at a time. Apologies for the wall of text. I hope you got something out of this before your eyes glazed over. :-) -Ben From openstack at nemebean.com Thu Aug 20 21:41:07 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 20 Aug 2020 16:41:07 -0500 Subject: [largescale-sig][nova][neutron][oslo] RPC ping In-Reply-To: <20200820153503.GY31915@sync> References: <2af09e63936f75489946ea6b70c41d6e091531ee.camel@redhat.com> <7496bd35-856e-f48f-b6d8-65155b1777f1@openstack.org> <16a3adf0-2f51-dd7d-c729-7b27f1593980@nemebean.com> <6e68d1a3cfc4efff91d3668bb53805dc469673c6.camel@redhat.com> <65204b738f13fcea16b9b6d5a68149c89be73e6a.camel@redhat.com> <20200820153503.GY31915@sync> Message-ID: Thanks for your patience with this! In the last Oslo meeting we had discussed possibly adding some sort of ping client to oslo.messaging to provide a common interface to use this. That would mitigate some of the concerns about everyone having to write their own ping test and potentially sending incorrect messages on the rabbit bus. Obviously that would be done as a followup to this, but I thought I'd mention it in case anyone wants to take a crack at writing something up. On 8/20/20 10:35 AM, Arnaud Morin wrote: > Hey all, > > TLDR: > - Patch in [1] updated > - Example of usage in [3] > - Agree with fixing nova/rabbit/oslo but would like to keep this ping > endpoint also > - Totally agree with documentation needed > > Long: > > Thank you all for your review and for the great information you bring to > that topic! > > First thing, we are not yet using that patch in production, but in > testing/dev only for now (at OVH). > But the plan is to use it in production ASAP. > > Also, we initially pushed that for neutron agent, that's why I missed > the fact that nova already used the "ping" endpoint, sorry for that. 
> > Anyway, I dont care about the naming, so in latest patchset of [1], you > will see that I changed the name of the endpoint following Ken Giusti > suggestions. > > The bug reported in [2] looks very similar to what we saw. > Thank you Sean for bringing that to attention in this thread. > > To detect this error, using the above "ping" endpoint in oslo, we can > use a script like the one in [3] (sorry about it, I can write better > python :p). > As mentionned by Sean in a previous mail, I am calling effectively > the topic "compute.host123456.sbg5.cloud.ovh.net" in "nova" exchange. > My initial plan would be to identify topics related to a compute and do > pings in all topics, to make sure that all of them are answering. > I am not yet sure about how often and if this is a good plan btw. > > Anyway, the compute is reporting status as UP, but the ping is > timeouting, which is exactly what I wanted to detect! > > I mostly agree with all your comments about the fact that this is a > trick that we do as operator, and using the RPC bus is maybe not the > best approach, but this is pragmatic and quite simple IMHO. > What I also like in this solution is the fact that this is partialy > outside of OpenStack: the endpoint is inside, but doing the ping is > external. > Monitoring OpenStack is not always easy, and sometimes we struggle on > finding the root cause of some issues. Having such endpoint > allow us to monitor OpenStack from an external point of view, but still > in a deeper way. > It's like a probe in your car telling you that even if you are still > running, your engine is off :) > > Still, making sure that this bug is fixed by doing some work on > (rabbit|oslo.messaging|nova|whatever} is the best thing to do. > > However, IMO, this does not prevent this rpc ping endpoint from > existing. > > Last, but not least, I totally agree about documenting this, but also > adding some documentation on how to configure rabbit and OpenStack > services in a way that fit operator needs. > There are plenty of parameters which could be tweaked on both OpenStack > and rabbit side. IMO, we need to explain a little bit more what are the > impact of setting a specific parameter to a given value. > For example, in another discussion ([4]), we were talking about > "durable" queues in rabbit. We manage to find that if we enable HA, we > should also enable durability of queues. > > Anyway that's another topic, and this is also something we discuss in > large-scale group. > > Thank you all, > > [1] https://review.opendev.org/#/c/735385/ > [2] https://bugs.launchpad.net/nova/+bug/1854992 > [3] http://paste.openstack.org/show/796990/ > [4] http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016362.html > > From its-openstack at zohocorp.com Fri Aug 21 04:23:29 2020 From: its-openstack at zohocorp.com (its-openstack at zohocorp.com) Date: Fri, 21 Aug 2020 09:53:29 +0530 Subject: per user quota not working properly Message-ID: <1740f41dfb5.b8327d7677873.277850771393902971@zohocorp.com> Dear openstack, We are facing a peculiar issue with regards to users quota of resources. e.g: s.no project user instance quota no instance created 1 test - 10 2 test user1 2 2 3 test user2 2 error "quota over" 4 test user3 3 able to create only 1 instance 5 test user4 no user quota defined able to create 10 instance As you see from mentioned table. when user1,user2, has instance quota of 2 and when user1 has created 2 instance, user2 unable to create instance. 
but user3 able to create only 1 more instance, user 4 has no quota applied so project quota 10 will be applied and he can create 10 instance. the quota is applied to each user but not tracked for each user, so this defeats the purpose of per user quota. Please help us with resolving this issue.     Regards, sysadmin team -------------- next part -------------- An HTML attachment was scrubbed... URL: From adriant at catalystcloud.nz Fri Aug 21 04:59:24 2020 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Fri, 21 Aug 2020 16:59:24 +1200 Subject: [requirements][oslo] Inclusion of CONFspirator in openstack/requirements In-Reply-To: <431d53d1-d92c-913b-f0a5-2be33b0c4e7a@catalyst.net.nz> References: <431d53d1-d92c-913b-f0a5-2be33b0c4e7a@catalyst.net.nz> Message-ID: <8c83ddc7-3f53-3dc5-8116-436bf6f03064@catalystcloud.nz> On 21/08/20 9:16 am, Ben Nemec wrote: > > In general, given the complexity of what you're talking about I think > a driver plugin would be the way to go, as opposed to trying to fit > this all in with the core oslo.config functionality (assuming you/we > decide to pursue integrating at all). > > There were a few other things you mentioned as features of the > library. The following are some off-the-cuff thoughts without having > looked too closely at the library, so if they're nonsense that's my > excuse. > > "because I have plugins that need to dynamically load config on > startup, which then has to be lazy_loaded" > > Something like this could probably be done. I believe this is kind of > how the castellan driver in oslo.config works. Config values are > looked up and cached on-demand, as opposed to all at once. This one is a little weird, but essentially the way this works this in Adjutant: I load the config so I can start the app and go through base logic and loading plugins, but the config groups that are pulled from plugins aren't added to my config group tree until AFTER the config has already been loaded. So part of the config is usable, but some hasn't yet fully been loaded until after the plugins are done, and then that subtree will lazy_load itself when first accessed. It means that until a given lazy_loaded group is actually accessed as config, the config tree underneath it can still have config options added. It's likely not too crazy to do this in oslo, and have groups only read from the cached source (loaded file dict) when first accessed. > > "I also have weird overlay logic for defaults that can be overridden" > > My knee-jerk reaction to this is that oslo.config already supports > overriding defaults, so I assume there's something about your use case > that didn't mesh with that functionality? Or is this part of > oslo.config that you reused? Sooo, this one is a little special because what this feature lets you do is take any group in the config tree once loaded, and call the overlay function on it with either a dict, or another group. The returned value will be a deep copy of the config tree with the values present in the given dict/group overlaid on that original group. As in a depth first dict update, where only keys that exist on the overriding dict will be updated in the copy of the original dict. I need to write the docs for this... 
with a sane example, but here is my unit test for it: https://gitlab.com/catalyst-cloud/confspirator/-/blob/master/confspirator/tests/test_configs.py#L211 I use this in Adjutant by having some config groups which are a default for something, and another place where many things can override that default via another group in the config, so I create an overlay copy and pass that to the place that needs the config it it's most specific case when the code actually pulls values from the conf group I passed it. See: https://opendev.org/openstack/adjutant/src/branch/master/adjutant/actions/v1/base.py#L146-L167 I hope that helps, because it is in my mind a little odd to explain, but it allows some useful things when reusing actions in different tasks. > "I also have nested config groups that need to be named dynamically to > allow plugin classes to be extended without subclasses sharing the > same config group name." > > Minus the nesting part, this is also something being done with > oslo.config today. The config driver feature actually uses dynamically > named groups, and I believe at least Cinder is using them too. They do > cause a bit of grief for some config tools, like the validator, but it > is possible to do something like this. Cool! yeah I implemented that feature in my code because I ran into a case of confliciting namespaces and needed to find a better way to handle subclasses that needed to have different dynamic names for their config groups. > > Now, as to the question of whether we should try to integrate this > library with oslo.config: I don't know. Helpful, right? That's mostly where we got to last time I asked in #openstack-oslo! > > I think answering that question definitively would take a deeper dive > into what the new library is doing differently than I can commit to. > As I noted above, I don't think the things you're doing are so far out > in left field that it would be unreasonable to consider integrating > into oslo.config, but the devil is in the details and there are a lot > of details here that I don't know anything about. For example, will > the oslo.config data model even accommodate nested groups? I suspect > it doesn't now, but could it? Probably, but I can't say how > difficult/disruptive it would be. I looked into it briefly, and to do what I wanted, while also maintaining oslo.config how it was... ended up a bit messy, so I gave up because it would take too long and the politics of trying to get it merged/reviewed wouldn't be worth the effort most likely. > > If someone wanted to make incremental progress toward integration, I > think writing a basic YAML driver for oslo.config would be a good > start. It would be generally useful even if this work doesn't go > anywhere, and it would provide a basis for a hypothetical future > YAML-based driver providing CONFspirator functionality. From there we > could start looking to integrate more advanced features one at a time. > > Apologies for the wall of text. I hope you got something out of this > before your eyes glazed over. > > -Ben > Thanks for the wall of text! It was useful! I think ultimately it may be safer just maintaining my own separate library. For people who don't want to use .ini and prefer yaml/toml it's simpler, and for people who prefer .ini and don't need nesting etc, it's safer to keep oslo.config as it is. If someone does find anything I've done that makes sense in oslo.config I'd be happy to work porting it over, but I don't want to make structural changes to it. 
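Coming back to the overlay behaviour described above: stripped of CONFspirator's own classes, the dict case is roughly just a recursive update applied to a deep copy. This is an illustration only (names made up, not the library's actual code):

    import copy

    def overlay(base, override):
        """Depth-first dict update on a deep copy of 'base'."""
        result = copy.deepcopy(base)
        for key, value in override.items():
            if isinstance(value, dict) and isinstance(result.get(key), dict):
                result[key] = overlay(result[key], value)
            else:
                result[key] = value
        return result

    defaults = {'email': {'subject': 'welcome', 'reply_to': 'no-reply@example.com'}}
    task_overrides = {'email': {'subject': 'project created'}}

    print(overlay(defaults, task_overrides))
    # {'email': {'subject': 'project created', 'reply_to': 'no-reply@example.com'}}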
I'll always keep an eye on oslo.config, and may occasionally steal the odd idea if you add something cool, but I think other than my use of types.py and some of the Opt classes as a base, my code has diverged quite a bit. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Fri Aug 21 07:06:24 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Fri, 21 Aug 2020 09:06:24 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <20200818120708.GV31915@sync> Message-ID: Hi, don't understand what you mean with "alternate exchange"? I'm doing all my tests on my DEV-Env? It's a completely separated / dedicated (virtual) cluster. I just enabled the feature and wrote a small script to read the metrics from the api. I'm having some "dropped msg" in my cluster, just trying to figure out if they are "normal". Fabian Am Do., 20. Aug. 2020 um 21:28 Uhr schrieb Arnaud MORIN : > > Hello, > Are you doing that using alternate exchange ? > I started configuring it in our env but not yet finished. > > Cheers, > > Le jeu. 20 août 2020 à 19:16, Fabian Zimmermann a écrit : >> >> Hi, >> >> just another idea: >> >> Rabbitmq is able to count undelivered messages. We could use this information to detect the broken bindings (causing undeliverable messages). >> >> Anyone already doing this? >> >> I currently don't have a way to reproduce the broken bindings, so I'm unable to proof the idea. >> >> Seems we have to wait issue to happen again - what - hopefully - never happens :) >> >> Fabian >> >> Arnaud Morin schrieb am Di., 18. Aug. 2020, 14:07: >>> >>> Hey all, >>> >>> About the vexxhost strategy to use only one rabbit server and manage HA through >>> rabbit. >>> Do you plan to do the same for MariaDB/MySQL? >>> >>> -- >>> Arnaud Morin >>> >>> On 14.08.20 - 18:45, Fabian Zimmermann wrote: >>> > Hi, >>> > >>> > i read somewhere that vexxhosts kubernetes openstack-Operator is running >>> > one rabbitmq Container per Service. Just the kubernetes self healing is >>> > used as "ha" for rabbitmq. >>> > >>> > That seems to match with my finding: run rabbitmq standalone and use an >>> > external system to restart rabbitmq if required. >>> > >>> > Fabian >>> > >>> > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: >>> > >>> > > Fabian, >>> > > >>> > > what do you mean? >>> > > >>> > > >> I think vexxhost is running (1) with their openstack-operator - for >>> > > reasons. >>> > > >>> > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann >>> > > wrote: >>> > > > >>> > > > Hello again, >>> > > > >>> > > > just a short update about the results of my tests. >>> > > > >>> > > > I currently see 2 ways of running openstack+rabbitmq >>> > > > >>> > > > 1. without durable-queues and without replication - just one >>> > > rabbitmq-process which gets (somehow) restarted if it fails. >>> > > > 2. durable-queues and replication >>> > > > >>> > > > Any other combination of these settings leads to more or less issues with >>> > > > >>> > > > * broken / non working bindings >>> > > > * broken queues >>> > > > >>> > > > I think vexxhost is running (1) with their openstack-operator - for >>> > > reasons. >>> > > > >>> > > > I added [kolla], because kolla-ansible is installing rabbitmq with >>> > > replication but without durable-queues. >>> > > > >>> > > > May someone point me to the best way to document these findings to some >>> > > official doc? 
>>> > > > I think a lot of installations out there will run into issues if - under >>> > > load - a node fails. >>> > > > >>> > > > Fabian >>> > > > >>> > > > >>> > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < >>> > > dev.faz at gmail.com>: >>> > > >> >>> > > >> Hi, >>> > > >> >>> > > >> just did some short tests today in our test-environment (without >>> > > durable queues and without replication): >>> > > >> >>> > > >> * started a rally task to generate some load >>> > > >> * kill-9-ed rabbitmq on one node >>> > > >> * rally task immediately stopped and the cloud (mostly) stopped working >>> > > >> >>> > > >> after some debugging i found (again) exchanges which had bindings to >>> > > queues, but these bindings didnt forward any msgs. >>> > > >> Wrote a small script to detect these broken bindings and will now check >>> > > if this is "reproducible" >>> > > >> >>> > > >> then I will try "durable queues" and "durable queues with replication" >>> > > to see if this helps. Even if I would expect >>> > > >> rabbitmq should be able to handle this without these "hidden broken >>> > > bindings" >>> > > >> >>> > > >> This just FYI. >>> > > >> >>> > > >> Fabian >>> > > From skaplons at redhat.com Fri Aug 21 07:21:55 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 21 Aug 2020 09:21:55 +0200 Subject: [neutron] Wallaby PTG planning Message-ID: <20200821072155.pfdjttikum5r54hz@skaplons-mac> Hi, It's again that time of the cycle (time flies) when we need to start thinking about next cycle already. As You probably know, next virtual PTG will be in October 26-30. I need to book some space for the Neuton team before 11th of September so I prepared doodle [1] with possible time slots. Please reply there what are the best days and hours for You so we can try to schedule our sessions in the time slots which fits best most of us :) Please fill this doodle before 4.09 so I will have time to summarize it and book some slots for us. I also prepared etherpad [2]. Please add Your name if You are going to attend the PTG sessions. Please also add proposals of the topics which You want to discuss during the PTG. [1] https://doodle.com/poll/2ppmnua2nuva5nyp [2] https://etherpad.opendev.org/p/neutron-wallaby-ptg -- Slawek Kaplonski Principal software engineer Red Hat From emiller at genesishosting.com Fri Aug 21 07:43:04 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Fri, 21 Aug 2020 02:43:04 -0500 Subject: Using os_token Message-ID: <046E9C0290DD9149B106B72FC9156BEA04814560@gmsxchsvr01.thecreation.com> Hi, It looks like the OS_TOKEN environment variable can be set so a token can be re-used instead of a new authentication for each CLI command with the OpenStack Client. I'm a little confused as to how this works and haven't found any good documentation on the subject. I would have expected to be able to: 0) set appropriate OS_* variables for password authentication 1) create a token using "openstack token issue" 2) unset all OS_* environment variables 3) set OS_TOKEN to the token's value provided in #1 4) set OS_AUTH_TYPE to "v3token" 5) set OS_AUTH_URL to the respective KeyStone endpoint 6) set OS_IDENTITY_API_VERSION to "3" 7) use the CLI as normal However, I get a "The service catalog is empty." message. Maybe I'm missing something above or am I completely misunderstanding the purpose of the OS_TOKEN variable? >From examples I have seen, it looks like the token can be used in a REST API call. 
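Something along these lines, with the token passed in the X-Auth-Token header (the endpoint URL below is just an illustration):

    import requests

    token = '<value from "openstack token issue">'

    # list servers straight from the compute API, no new token issued
    resp = requests.get(
        'https://compute.example.com:8774/v2.1/servers',
        headers={'X-Auth-Token': token},
    )
    print(resp.status_code)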
Is there a way to use an existing token with the CLI, instead, so a new token is not issued for every CLI command instantiation? Thanks! Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From kotobi at dkrz.de Fri Aug 21 07:54:49 2020 From: kotobi at dkrz.de (Amjad Kotobi) Date: Fri, 21 Aug 2020 09:54:49 +0200 Subject: [glance][horizon][ops] Dashboard show Forbidden 403 but openstack cli works fine Message-ID: <04C431DA-85DE-40FB-92CE-3A7273E98E11@dkrz.de> Hi, We are running Train release, currently facing below error when change to “Images” panel on the project which I have “User” role. The image is visibility is public. Error: Forbidden. Insufficient permissions of the requested operation Error: Unable to retrieve the project. By using openstack-cli everything works and I do not face “Forbidden” 403, but in dashboard “access.log” it shows "GET /dashboard/api/keystone/projects/7331defd55ef479fbf0a9a1ac3fe9055 HTTP/1.1” 403 http://xxxxxx/dashboard/project/images” I checked: 1. glance: policy.json from both dashboard and api/registry hosts are same. 2. From dashboard API images is 2 As soon as I change the OWNER of image to the project which my role = USER the error disappears. Any ideas or previous encounter similar to this issue? Thanks Amjad -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5223 bytes Desc: not available URL: From arnaud.morin at gmail.com Fri Aug 21 08:13:32 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Fri, 21 Aug 2020 08:13:32 +0000 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200818120708.GV31915@sync> Message-ID: <20200821081332.GZ31915@sync> Hey, I am talking about that: https://www.rabbitmq.com/ae.html Cheers, -- Arnaud Morin On 21.08.20 - 09:06, Fabian Zimmermann wrote: > Hi, > > don't understand what you mean with "alternate exchange"? I'm doing > all my tests on my DEV-Env? It's a completely separated / dedicated > (virtual) cluster. > > I just enabled the feature and wrote a small script to read the > metrics from the api. > > I'm having some "dropped msg" in my cluster, just trying to figure out > if they are "normal". > > Fabian > > Am Do., 20. Aug. 2020 um 21:28 Uhr schrieb Arnaud MORIN > : > > > > Hello, > > Are you doing that using alternate exchange ? > > I started configuring it in our env but not yet finished. > > > > Cheers, > > > > Le jeu. 20 août 2020 à 19:16, Fabian Zimmermann a écrit : > >> > >> Hi, > >> > >> just another idea: > >> > >> Rabbitmq is able to count undelivered messages. We could use this information to detect the broken bindings (causing undeliverable messages). > >> > >> Anyone already doing this? > >> > >> I currently don't have a way to reproduce the broken bindings, so I'm unable to proof the idea. > >> > >> Seems we have to wait issue to happen again - what - hopefully - never happens :) > >> > >> Fabian > >> > >> Arnaud Morin schrieb am Di., 18. Aug. 2020, 14:07: > >>> > >>> Hey all, > >>> > >>> About the vexxhost strategy to use only one rabbit server and manage HA through > >>> rabbit. > >>> Do you plan to do the same for MariaDB/MySQL? > >>> > >>> -- > >>> Arnaud Morin > >>> > >>> On 14.08.20 - 18:45, Fabian Zimmermann wrote: > >>> > Hi, > >>> > > >>> > i read somewhere that vexxhosts kubernetes openstack-Operator is running > >>> > one rabbitmq Container per Service. 
Just the kubernetes self healing is > >>> > used as "ha" for rabbitmq. > >>> > > >>> > That seems to match with my finding: run rabbitmq standalone and use an > >>> > external system to restart rabbitmq if required. > >>> > > >>> > Fabian > >>> > > >>> > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > >>> > > >>> > > Fabian, > >>> > > > >>> > > what do you mean? > >>> > > > >>> > > >> I think vexxhost is running (1) with their openstack-operator - for > >>> > > reasons. > >>> > > > >>> > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > >>> > > wrote: > >>> > > > > >>> > > > Hello again, > >>> > > > > >>> > > > just a short update about the results of my tests. > >>> > > > > >>> > > > I currently see 2 ways of running openstack+rabbitmq > >>> > > > > >>> > > > 1. without durable-queues and without replication - just one > >>> > > rabbitmq-process which gets (somehow) restarted if it fails. > >>> > > > 2. durable-queues and replication > >>> > > > > >>> > > > Any other combination of these settings leads to more or less issues with > >>> > > > > >>> > > > * broken / non working bindings > >>> > > > * broken queues > >>> > > > > >>> > > > I think vexxhost is running (1) with their openstack-operator - for > >>> > > reasons. > >>> > > > > >>> > > > I added [kolla], because kolla-ansible is installing rabbitmq with > >>> > > replication but without durable-queues. > >>> > > > > >>> > > > May someone point me to the best way to document these findings to some > >>> > > official doc? > >>> > > > I think a lot of installations out there will run into issues if - under > >>> > > load - a node fails. > >>> > > > > >>> > > > Fabian > >>> > > > > >>> > > > > >>> > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > >>> > > dev.faz at gmail.com>: > >>> > > >> > >>> > > >> Hi, > >>> > > >> > >>> > > >> just did some short tests today in our test-environment (without > >>> > > durable queues and without replication): > >>> > > >> > >>> > > >> * started a rally task to generate some load > >>> > > >> * kill-9-ed rabbitmq on one node > >>> > > >> * rally task immediately stopped and the cloud (mostly) stopped working > >>> > > >> > >>> > > >> after some debugging i found (again) exchanges which had bindings to > >>> > > queues, but these bindings didnt forward any msgs. > >>> > > >> Wrote a small script to detect these broken bindings and will now check > >>> > > if this is "reproducible" > >>> > > >> > >>> > > >> then I will try "durable queues" and "durable queues with replication" > >>> > > to see if this helps. Even if I would expect > >>> > > >> rabbitmq should be able to handle this without these "hidden broken > >>> > > bindings" > >>> > > >> > >>> > > >> This just FYI. > >>> > > >> > >>> > > >> Fabian > >>> > > From dev.faz at gmail.com Fri Aug 21 08:28:32 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Fri, 21 Aug 2020 10:28:32 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: <20200821081332.GZ31915@sync> References: <20200818120708.GV31915@sync> <20200821081332.GZ31915@sync> Message-ID: Hi, yeah, that's what I'm currently using. I also tried to use the unroutable-counters, but these are only available for channels, which may not have any bindings, so there is no way to find the "root cause" I created an AE "unroutable" and wrote a script to show me the msgs placed here.. 
currently I get -- 20 Exchange: q-agent-notifier-network-delete_fanout, RoutingKey: 226 Exchange: q-agent-notifier-port-delete_fanout, RoutingKey: 88 Exchange: q-agent-notifier-port-update_fanout, RoutingKey: 388 Exchange: q-agent-notifier-security_group-update_fanout, RoutingKey: -- I think I will start another thread to debug the reason for this, because it has nothing to do with "broken bindings". Fabian Am Fr., 21. Aug. 2020 um 10:13 Uhr schrieb Arnaud Morin : > > Hey, > I am talking about that: > https://www.rabbitmq.com/ae.html > > Cheers, > > -- > Arnaud Morin > > On 21.08.20 - 09:06, Fabian Zimmermann wrote: > > Hi, > > > > don't understand what you mean with "alternate exchange"? I'm doing > > all my tests on my DEV-Env? It's a completely separated / dedicated > > (virtual) cluster. > > > > I just enabled the feature and wrote a small script to read the > > metrics from the api. > > > > I'm having some "dropped msg" in my cluster, just trying to figure out > > if they are "normal". > > > > Fabian > > > > Am Do., 20. Aug. 2020 um 21:28 Uhr schrieb Arnaud MORIN > > : > > > > > > Hello, > > > Are you doing that using alternate exchange ? > > > I started configuring it in our env but not yet finished. > > > > > > Cheers, > > > > > > Le jeu. 20 août 2020 à 19:16, Fabian Zimmermann a écrit : > > >> > > >> Hi, > > >> > > >> just another idea: > > >> > > >> Rabbitmq is able to count undelivered messages. We could use this information to detect the broken bindings (causing undeliverable messages). > > >> > > >> Anyone already doing this? > > >> > > >> I currently don't have a way to reproduce the broken bindings, so I'm unable to proof the idea. > > >> > > >> Seems we have to wait issue to happen again - what - hopefully - never happens :) > > >> > > >> Fabian > > >> > > >> Arnaud Morin schrieb am Di., 18. Aug. 2020, 14:07: > > >>> > > >>> Hey all, > > >>> > > >>> About the vexxhost strategy to use only one rabbit server and manage HA through > > >>> rabbit. > > >>> Do you plan to do the same for MariaDB/MySQL? > > >>> > > >>> -- > > >>> Arnaud Morin > > >>> > > >>> On 14.08.20 - 18:45, Fabian Zimmermann wrote: > > >>> > Hi, > > >>> > > > >>> > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > >>> > one rabbitmq Container per Service. Just the kubernetes self healing is > > >>> > used as "ha" for rabbitmq. > > >>> > > > >>> > That seems to match with my finding: run rabbitmq standalone and use an > > >>> > external system to restart rabbitmq if required. > > >>> > > > >>> > Fabian > > >>> > > > >>> > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > >>> > > > >>> > > Fabian, > > >>> > > > > >>> > > what do you mean? > > >>> > > > > >>> > > >> I think vexxhost is running (1) with their openstack-operator - for > > >>> > > reasons. > > >>> > > > > >>> > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > >>> > > wrote: > > >>> > > > > > >>> > > > Hello again, > > >>> > > > > > >>> > > > just a short update about the results of my tests. > > >>> > > > > > >>> > > > I currently see 2 ways of running openstack+rabbitmq > > >>> > > > > > >>> > > > 1. without durable-queues and without replication - just one > > >>> > > rabbitmq-process which gets (somehow) restarted if it fails. > > >>> > > > 2. 
durable-queues and replication > > >>> > > > > > >>> > > > Any other combination of these settings leads to more or less issues with > > >>> > > > > > >>> > > > * broken / non working bindings > > >>> > > > * broken queues > > >>> > > > > > >>> > > > I think vexxhost is running (1) with their openstack-operator - for > > >>> > > reasons. > > >>> > > > > > >>> > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > >>> > > replication but without durable-queues. > > >>> > > > > > >>> > > > May someone point me to the best way to document these findings to some > > >>> > > official doc? > > >>> > > > I think a lot of installations out there will run into issues if - under > > >>> > > load - a node fails. > > >>> > > > > > >>> > > > Fabian > > >>> > > > > > >>> > > > > > >>> > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > >>> > > dev.faz at gmail.com>: > > >>> > > >> > > >>> > > >> Hi, > > >>> > > >> > > >>> > > >> just did some short tests today in our test-environment (without > > >>> > > durable queues and without replication): > > >>> > > >> > > >>> > > >> * started a rally task to generate some load > > >>> > > >> * kill-9-ed rabbitmq on one node > > >>> > > >> * rally task immediately stopped and the cloud (mostly) stopped working > > >>> > > >> > > >>> > > >> after some debugging i found (again) exchanges which had bindings to > > >>> > > queues, but these bindings didnt forward any msgs. > > >>> > > >> Wrote a small script to detect these broken bindings and will now check > > >>> > > if this is "reproducible" > > >>> > > >> > > >>> > > >> then I will try "durable queues" and "durable queues with replication" > > >>> > > to see if this helps. Even if I would expect > > >>> > > >> rabbitmq should be able to handle this without these "hidden broken > > >>> > > bindings" > > >>> > > >> > > >>> > > >> This just FYI. > > >>> > > >> > > >>> > > >> Fabian > > >>> > > From emiller at genesishosting.com Fri Aug 21 08:29:30 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Fri, 21 Aug 2020 03:29:30 -0500 Subject: Using os_token Message-ID: <046E9C0290DD9149B106B72FC9156BEA04814561@gmsxchsvr01.thecreation.com> I happened to run across an unrelated github issue: https://github.com/terraform-providers/terraform-provider-openstack/issu es/271 which gave me a clue to what I was missing. I needed to include some additional variables (see steps 7 through 9 below). Revised steps - which works fine with the OpenStack Client: 0) set appropriate OS_* variables for password authentication 1) create a token using "openstack token issue" 2) unset all OS_* environment variables 3) set OS_TOKEN to the token's value provided in #1 4) set OS_AUTH_TYPE to "v3token" 5) set OS_AUTH_URL to the respective KeyStone endpoint 6) set OS_IDENTITY_API_VERSION to "3" 7) set OS_PROJECT_DOMAIN_ID as appropriate 8) set OS_PROJECT_NAME as appropriate 9) set OS_REGION_NAME as appropriate 10) use the CLI as normal This shaves anywhere from 0.2 to 0.6 seconds off of a test command I'm running when compared to password authentication (which normally takes about 2.5 seconds to run), where a new token is issued each time. openstack token revoke works as expected too. Eric From dev.faz at gmail.com Fri Aug 21 08:32:09 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Fri, 21 Aug 2020 10:32:09 +0200 Subject: [neutron][ops] q-agent-notifier exchanges without bindings. Message-ID: Hi, im currently on the way to analyse some rabbitmq-issues. 
atm im taking a look on "unroutable messages", so I * created an Alternative Exchange and Queue: "unroutable" * created a policy to send all unroutable msgs to this exchange/queue. * wrote a script to show me the msgs placed here.. currently I get Seems like my neutron is placing msgs in these exchanges, but there is nobody listening/binding to: -- 20 Exchange: q-agent-notifier-network-delete_fanout, RoutingKey: 226 Exchange: q-agent-notifier-port-delete_fanout, RoutingKey: 88 Exchange: q-agent-notifier-port-update_fanout, RoutingKey: 388 Exchange: q-agent-notifier-security_group-update_fanout, RoutingKey: -- Is someone able to give me a hint where to look at / how to debug this? Fabian From moreira.belmiro.email.lists at gmail.com Fri Aug 21 09:26:35 2020 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Fri, 21 Aug 2020 11:26:35 +0200 Subject: [nova][ops] Live migration and CPU features In-Reply-To: <20200819092130.GX31915@sync> References: <44347504ff7308a6c3b4155060c778fad368a002.camel@redhat.com> <20200819092130.GX31915@sync> Message-ID: Hi, thank you all for your comments/suggestions. Having a "custom" cpu_mode seems the best option for our use case. "host-passhtough" is problematic when the hardware is retired and instances need to be moved to newer compute nodes. Belmiro On Wed, Aug 19, 2020 at 11:21 AM Arnaud Morin wrote: > > Hello, > > We have the same kind of issue. > To help mitigate it, we do segregation and also use cpu_mode=custom, so we > can use a model which is close to our hardware (cpu_model=Haswell-noTSX) > and add extra_flags when needed. > > This is painful. > > Cheers, > > -- > Arnaud Morin > > On 18.08.20 - 16:16, Sean Mooney wrote: > > On Tue, 2020-08-18 at 17:06 +0200, Fabian Zimmermann wrote: > > > Hi, > > > > > > We are using the "custom"-way. But this does not protect you from all > issues. > > > > > > We had problems with a new cpu-generation not (jet) detected correctly > > > in an libvirt-version. So libvirt failed back to the "desktop"-cpu of > > > this newer generation, but didnt support/detect some features => > > > blocked live-migration. > > yes that is common when using really new hardware. having previouly > worked > > at intel and hitting this often that one of the reason i tend to default > to host-passthouh > > and recommend using AZ or aggreate to segreatate the cloud for live > migration. > > > > in the case where your libvirt does not know about the new cpus your > best approch is to use the > > newest server cpu model that it know about and then if you really need > the new fature you can try > > to add theem using the config options but that is effectivly the same > as using host-passhtough > > which is why i default to that as a workaround instead. > > > > > > > > Fabian > > > > > > Am Di., 18. Aug. 2020 um 16:54 Uhr schrieb Belmiro Moreira > > > : > > > > > > > > Hi, > > > > in our infrastructure we have always compute nodes that need a > hardware intervention and as a consequence they are > > > > rebooted, bringing a new kernel, kvm, ... > > > > > > > > In order to have a good compromise between performance and > flexibility (live migration) we have been using "host- > > > > model" for the "cpu_mode" configuration of our service VMs. We > didn't expect to have CPU compatibility issues > > > > because we have the same hardware type per cell. 
> > > > > > > > The problem is that when a compute node is rebooted the instance > domain is recreated with the new cpu features that > > > > were introduced because of the reboot (using centOS). > > > > > > > > If there are new CPU features exposed, this basically blocks live > migration to all the non rebooted compute nodes > > > > (those cpu features are not exposed, yet). The nova-scheduler > doesn't know about them when scheduling the live > > > > migration destination. > > > > > > > > I wonder how other operators are solving this issue. > > > > I don't like stopping OS upgrades. > > > > What I'm considering is to define a "custom" cpu_mode for each > hardware type. > > > > > > > > I would appreciate your comments and learn how you are solving this > problem. > > > > > > > > Belmiro > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Fri Aug 21 11:29:15 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Fri, 21 Aug 2020 13:29:15 +0200 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200818120708.GV31915@sync> <20200821081332.GZ31915@sync> Message-ID: Hi, just to keep you updated. It seems these "q-agent-notifier"-exchanges are not used by every possible neutron-driver/agent-backend, so it seems to be fine to have unrouted msgs here. I was (again) able to get some broken bindings in my dev-cluster. The counters for "unrouted msg" are increased, but the msgs sent to these exchanges/bindings/queues are *NOT* placed in the alternate-exchange. It's quite bad, because of the above "normal" unrouted msgs we could not just use the counter as "error-indicator". I think I will try to create a valid "bind" in above exchanges, so these will not increment the "unroutable"-counter and use the counter as monitoring-target. Fabian Am Fr., 21. Aug. 2020 um 10:28 Uhr schrieb Fabian Zimmermann : > > Hi, > > yeah, that's what I'm currently using. > > I also tried to use the unroutable-counters, but these are only > available for channels, which may not have any bindings, so there is > no way to find the "root cause" > > I created an AE "unroutable" and wrote a script to show me the msgs > placed here.. currently I get > > -- > 20 Exchange: q-agent-notifier-network-delete_fanout, RoutingKey: > 226 Exchange: q-agent-notifier-port-delete_fanout, RoutingKey: > 88 Exchange: q-agent-notifier-port-update_fanout, RoutingKey: > 388 Exchange: q-agent-notifier-security_group-update_fanout, RoutingKey: > -- > > I think I will start another thread to debug the reason for this, > because it has nothing to do with "broken bindings". > > Fabian > > Am Fr., 21. Aug. 2020 um 10:13 Uhr schrieb Arnaud Morin > : > > > > Hey, > > I am talking about that: > > https://www.rabbitmq.com/ae.html > > > > Cheers, > > > > -- > > Arnaud Morin > > > > On 21.08.20 - 09:06, Fabian Zimmermann wrote: > > > Hi, > > > > > > don't understand what you mean with "alternate exchange"? I'm doing > > > all my tests on my DEV-Env? It's a completely separated / dedicated > > > (virtual) cluster. > > > > > > I just enabled the feature and wrote a small script to read the > > > metrics from the api. > > > > > > I'm having some "dropped msg" in my cluster, just trying to figure out > > > if they are "normal". > > > > > > Fabian > > > > > > Am Do., 20. Aug. 2020 um 21:28 Uhr schrieb Arnaud MORIN > > > : > > > > > > > > Hello, > > > > Are you doing that using alternate exchange ? 
> > > > I started configuring it in our env but not yet finished. > > > > > > > > Cheers, > > > > > > > > Le jeu. 20 août 2020 à 19:16, Fabian Zimmermann a écrit : > > > >> > > > >> Hi, > > > >> > > > >> just another idea: > > > >> > > > >> Rabbitmq is able to count undelivered messages. We could use this information to detect the broken bindings (causing undeliverable messages). > > > >> > > > >> Anyone already doing this? > > > >> > > > >> I currently don't have a way to reproduce the broken bindings, so I'm unable to proof the idea. > > > >> > > > >> Seems we have to wait issue to happen again - what - hopefully - never happens :) > > > >> > > > >> Fabian > > > >> > > > >> Arnaud Morin schrieb am Di., 18. Aug. 2020, 14:07: > > > >>> > > > >>> Hey all, > > > >>> > > > >>> About the vexxhost strategy to use only one rabbit server and manage HA through > > > >>> rabbit. > > > >>> Do you plan to do the same for MariaDB/MySQL? > > > >>> > > > >>> -- > > > >>> Arnaud Morin > > > >>> > > > >>> On 14.08.20 - 18:45, Fabian Zimmermann wrote: > > > >>> > Hi, > > > >>> > > > > >>> > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > > >>> > one rabbitmq Container per Service. Just the kubernetes self healing is > > > >>> > used as "ha" for rabbitmq. > > > >>> > > > > >>> > That seems to match with my finding: run rabbitmq standalone and use an > > > >>> > external system to restart rabbitmq if required. > > > >>> > > > > >>> > Fabian > > > >>> > > > > >>> > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > >>> > > > > >>> > > Fabian, > > > >>> > > > > > >>> > > what do you mean? > > > >>> > > > > > >>> > > >> I think vexxhost is running (1) with their openstack-operator - for > > > >>> > > reasons. > > > >>> > > > > > >>> > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > >>> > > wrote: > > > >>> > > > > > > >>> > > > Hello again, > > > >>> > > > > > > >>> > > > just a short update about the results of my tests. > > > >>> > > > > > > >>> > > > I currently see 2 ways of running openstack+rabbitmq > > > >>> > > > > > > >>> > > > 1. without durable-queues and without replication - just one > > > >>> > > rabbitmq-process which gets (somehow) restarted if it fails. > > > >>> > > > 2. durable-queues and replication > > > >>> > > > > > > >>> > > > Any other combination of these settings leads to more or less issues with > > > >>> > > > > > > >>> > > > * broken / non working bindings > > > >>> > > > * broken queues > > > >>> > > > > > > >>> > > > I think vexxhost is running (1) with their openstack-operator - for > > > >>> > > reasons. > > > >>> > > > > > > >>> > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > >>> > > replication but without durable-queues. > > > >>> > > > > > > >>> > > > May someone point me to the best way to document these findings to some > > > >>> > > official doc? > > > >>> > > > I think a lot of installations out there will run into issues if - under > > > >>> > > load - a node fails. > > > >>> > > > > > > >>> > > > Fabian > > > >>> > > > > > > >>> > > > > > > >>> > > > Am Do., 13. Aug. 
2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > >>> > > dev.faz at gmail.com>: > > > >>> > > >> > > > >>> > > >> Hi, > > > >>> > > >> > > > >>> > > >> just did some short tests today in our test-environment (without > > > >>> > > durable queues and without replication): > > > >>> > > >> > > > >>> > > >> * started a rally task to generate some load > > > >>> > > >> * kill-9-ed rabbitmq on one node > > > >>> > > >> * rally task immediately stopped and the cloud (mostly) stopped working > > > >>> > > >> > > > >>> > > >> after some debugging i found (again) exchanges which had bindings to > > > >>> > > queues, but these bindings didnt forward any msgs. > > > >>> > > >> Wrote a small script to detect these broken bindings and will now check > > > >>> > > if this is "reproducible" > > > >>> > > >> > > > >>> > > >> then I will try "durable queues" and "durable queues with replication" > > > >>> > > to see if this helps. Even if I would expect > > > >>> > > >> rabbitmq should be able to handle this without these "hidden broken > > > >>> > > bindings" > > > >>> > > >> > > > >>> > > >> This just FYI. > > > >>> > > >> > > > >>> > > >> Fabian > > > >>> > > From adriant at catalystcloud.nz Fri Aug 21 11:30:24 2020 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Fri, 21 Aug 2020 23:30:24 +1200 Subject: Using os_token In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04814561@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04814561@gmsxchsvr01.thecreation.com> Message-ID: hah, that's my issue. The problem though is that keystoneauth actually does fetch a new token every time even when you supply it with one, but that new token is based on the one you supply, and is a scoped token. It's likely the api for getting a token from an existing one is faster than password auth. I wish there was a way to  have the tools actually reuse a given scoped token rather than fetch a new one every time... OS_TOKEN is also useful/important because of MFA, which otherwise wouldn't work unless you reuse a token. And I'm hoping that when someone has time to work MFA support properly into the cli tool they can hopefully also think about how to make the token reuse better. On 21/08/20 8:29 pm, Eric K. Miller wrote: > I happened to run across an unrelated github issue: > https://github.com/terraform-providers/terraform-provider-openstack/issu > es/271 > > which gave me a clue to what I was missing. I needed to include some > additional variables (see steps 7 through 9 below). > > Revised steps - which works fine with the OpenStack Client: > 0) set appropriate OS_* variables for password authentication > 1) create a token using "openstack token issue" > 2) unset all OS_* environment variables > 3) set OS_TOKEN to the token's value provided in #1 > 4) set OS_AUTH_TYPE to "v3token" > 5) set OS_AUTH_URL to the respective KeyStone endpoint > 6) set OS_IDENTITY_API_VERSION to "3" > 7) set OS_PROJECT_DOMAIN_ID as appropriate > 8) set OS_PROJECT_NAME as appropriate > 9) set OS_REGION_NAME as appropriate > 10) use the CLI as normal > > This shaves anywhere from 0.2 to 0.6 seconds off of a test command I'm > running when compared to password authentication (which normally takes > about 2.5 seconds to run), where a new token is issued each time. > > openstack token revoke works as expected too. 
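For completeness, the same token reuse from Python via keystoneauth looks roughly like this (URL and scope values are illustrative), and as noted above keystoneauth still exchanges the supplied token for a fresh scoped one under the hood:

    from keystoneauth1.identity import v3
    from keystoneauth1 import session

    auth = v3.Token(
        auth_url='https://keystone.example.com:5000/v3',
        token='<existing token>',
        project_name='demo',
        project_domain_id='default',
    )
    sess = session.Session(auth=auth)

    # the re-scoped token is cached for the lifetime of this plugin/session
    print(sess.get_token())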
> > Eric > > > From sandeep.ee.nagendra at gmail.com Thu Aug 20 16:32:01 2020 From: sandeep.ee.nagendra at gmail.com (sandeep) Date: Thu, 20 Aug 2020 22:02:01 +0530 Subject: Cliff auto completion not working inside interactive mode Message-ID: Hi Team, In my system, I am trying auto completion for my CLI application. *CLIFF version - cliff==3.4.0* Auto complete works fine on bash prompt. But inside the interactive shell, auto complete does not work. Below is the screenshot for the help command inside the interactive shell. [image: image.png] Now, if I type swm and press tab, it lists all the sub commands under it. But, swm s gives swm "s and further command auto completion does not work. [image: image.png] Could you please let me know what could be the problem? Is this a known issue? or Am i missing something? Thanks, Sandeep -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 28735 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 7345 bytes Desc: not available URL: From jasowang at redhat.com Fri Aug 21 03:14:41 2020 From: jasowang at redhat.com (Jason Wang) Date: Fri, 21 Aug 2020 11:14:41 +0800 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: <20200820142740.6513884d.cohuck@redhat.com> References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819065951.GB21172@joy-OptiPlex-7040> <20200819081338.GC21172@joy-OptiPlex-7040> <20200820142740.6513884d.cohuck@redhat.com> Message-ID: On 2020/8/20 下午8:27, Cornelia Huck wrote: > On Wed, 19 Aug 2020 17:28:38 +0800 > Jason Wang wrote: > >> On 2020/8/19 下午4:13, Yan Zhao wrote: >>> On Wed, Aug 19, 2020 at 03:39:50PM +0800, Jason Wang wrote: >>>> On 2020/8/19 下午2:59, Yan Zhao wrote: >>>>> On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: >>>>>> On 2020/8/19 上午11:30, Yan Zhao wrote: >>>>>>> hi All, >>>>>>> could we decide that sysfs is the interface that every VFIO vendor driver >>>>>>> needs to provide in order to support vfio live migration, otherwise the >>>>>>> userspace management tool would not list the device into the compatible >>>>>>> list? >>>>>>> >>>>>>> if that's true, let's move to the standardizing of the sysfs interface. >>>>>>> (1) content >>>>>>> common part: (must) >>>>>>> - software_version: (in major.minor.bugfix scheme) >>>>>> This can not work for devices whose features can be negotiated/advertised >>>>>> independently. (E.g virtio devices) > I thought the 'software_version' was supposed to describe kind of a > 'protocol version' for the data we transmit? I.e., you add a new field, > you bump the version number. Ok, but since we mandate backward compatibility of uABI, is this really worth to have a version for sysfs? (Searching on sysfs shows no examples like this) > >>>>>> >>>>> sorry, I don't understand here, why virtio devices need to use vfio interface? >>>> I don't see any reason that virtio devices can't be used by VFIO. Do you? 
>>>> >>>> Actually, virtio devices have been used by VFIO for many years: >>>> >>>> - passthrough a hardware virtio devices to userspace(VM) drivers >>>> - using virtio PMD inside guest >>>> >>> So, what's different for it vs passing through a physical hardware via VFIO? >> >> The difference is in the guest, the device could be either real hardware >> or emulated ones. >> >> >>> even though the features are negotiated dynamically, could you explain >>> why it would cause software_version not work? >> >> Virtio device 1 supports feature A, B, C >> Virtio device 2 supports feature B, C, D >> >> So you can't migrate a guest from device 1 to device 2. And it's >> impossible to model the features with versions. > We're talking about the features offered by the device, right? Would it > be sufficient to mandate that the target device supports the same > features or a superset of the features supported by the source device? Yes. > >> >>> >>>>> I think this thread is discussing about vfio related devices. >>>>> >>>>>>> - device_api: vfio-pci or vfio-ccw ... >>>>>>> - type: mdev type for mdev device or >>>>>>> a signature for physical device which is a counterpart for >>>>>>> mdev type. >>>>>>> >>>>>>> device api specific part: (must) >>>>>>> - pci id: pci id of mdev parent device or pci id of physical pci >>>>>>> device (device_api is vfio-pci)API here. >>>>>> So this assumes a PCI device which is probably not true. >>>>>> >>>>> for device_api of vfio-pci, why it's not true? >>>>> >>>>> for vfio-ccw, it's subchannel_type. >>>> Ok but having two different attributes for the same file is not good idea. >>>> How mgmt know there will be a 3rd type? >>> that's why some attributes need to be common. e.g. >>> device_api: it's common because mgmt need to know it's a pci device or a >>> ccw device. and the api type is already defined vfio.h. >>> (The field is agreed by and actually suggested by Alex in previous mail) >>> type: mdev_type for mdev. if mgmt does not understand it, it would not >>> be able to create one compatible mdev device. >>> software_version: mgmt can compare the major and minor if it understands >>> this fields. >> >> I think it would be helpful if you can describe how mgmt is expected to >> work step by step with the proposed sysfs API. This can help people to >> understand. > My proposal would be: > - check that device_api matches > - check possible device_api specific attributes > - check that type matches [I don't think the combination of mdev types > and another attribute to determine compatibility is a good idea; Any reason for this? Actually if we only use mdev type to detect the compatibility, it would be much more easier. Otherwise, we are actually re-inventing mdev types. E.g can we have the same mdev types with different device_api and other attributes? > actually, the current proposal confuses me every time I look at it] > - check that software_version is compatible, assuming semantic > versioning > - check possible type-specific attributes I'm not sure if this is too complicated. And I suspect there will be vendor specific attributes: - for compatibility check: I think we should either modeling everything via mdev type or making it totally vendor specific. Having something in the middle will bring a lot of burden - for provisioning: it's still not clear. 
As shown in this proposal, for NVME we may need to set remote_url, but unless there will be a subclass (NVME) in the mdev (which I guess not), we can't prevent vendor from using another attribute name, in this case, tricks like attributes iteration in some sub directory won't work. So even if we had some common API for compatibility check, the provisioning API is still vendor specific ... Thanks > >> Thanks for the patience. Since sysfs is uABI, when accepted, we need >> support it forever. That's why we need to be careful. > Nod. > > (...) From mahdi.abbasi.2013 at gmail.com Fri Aug 21 08:04:41 2020 From: mahdi.abbasi.2013 at gmail.com (mahdi abbasi) Date: Fri, 21 Aug 2020 12:34:41 +0430 Subject: Nova Docker Message-ID: Hi openstack development team, Given that the nova-docker peoject is np longer availble, is there any solution for creating a docker instanace in openstack? Best regards Mahdi -------------- next part -------------- An HTML attachment was scrubbed... URL: From dbengt at redhat.com Fri Aug 21 09:58:09 2020 From: dbengt at redhat.com (Daniel Bengtsson) Date: Fri, 21 Aug 2020 11:58:09 +0200 Subject: Can't fetch from opendev. In-Reply-To: <20200818152414.s5srmotngy7a7w7r@yuggoth.org> References: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> <20200817143703.c5rh3eqcl3ihxy4m@yuggoth.org> <6590e740-00f1-ee60-ac00-5872039e0cb0@redhat.com> <20200818152414.s5srmotngy7a7w7r@yuggoth.org> Message-ID: <2307899b-c1ac-43f0-4fa9-bd61e164979f@redhat.com> On 8/18/20 5:24 PM, Jeremy Stanley wrote: > and it just hangs indefinitely and never returns an error? Yes. > One reason I suspect this might be the problem is that GitHub is > IPv4-only, so if you have something black-holing or blocking traffic > for global IPv6 routes, then that could cause the behavior you're > observing. I have the same problem with the -4 option. From rosmaita.fossdev at gmail.com Fri Aug 21 13:36:50 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Fri, 21 Aug 2020 09:36:50 -0400 Subject: [cinder] reviewing priorities for next few weeks Message-ID: <5ac97d24-571c-c25a-a896-d27b5413f98d@gmail.com> Hello Cinderinos, We're near the end of week R-8, and people are getting antsy about having their changes reviewed. So here are the Cinder project reviewing priorities over the next few weeks. (1) os-brick The Victoria release of os-brick must take place during week R-6 (i.e., by 31 August). Hence, os-brick reviews are the project's TOP PRIORITY right now. Among os-brick reviews, these are the most important changes: - support for volume-local-cache feature https://review.opendev.org/663549 - support for cinderlib RBD use https://review.opendev.org/#/q/topic:cinderlib-changes+status:open - code cleanup (should be quick reviews) https://review.opendev.org/#/q/status:open+project:openstack/os-brick+branch:master+topic:major-bump There are some other open reviews in master; if they interest you, go for it. But the above are release-critical. (2) cinder features requiring python-cinderclient support The Victoria release of python-cinderclient must take place during week R-5 (i.e., by 7 September). - project-level default volume-types cinder: https://review.opendev.org/737707 cinderclient: https://review.opendev.org/739223/ - active-active support https://review.opendev.org/#/q/topic:a-a-support+status:open There are other open reviews in master for python-cinderclient that could use some eyes. I didn't see anything major, but it would be good to look in case I missed something. 
(3) other features (including drivers) The feature freeze is the end of R-5 (i.e., 8 September) - Use the blueprints to find these: https://blueprints.launchpad.net/cinder/victoria And we'll be reviewing everything else also, but items in the 3 categories above get priority. If your patch is in category (3) or not-prioritized, you can always help speed things up by reviewing the higher-priority items. We're almost at the end of the Victoria cycle. Let's have a productive few weeks! brian From fungi at yuggoth.org Fri Aug 21 13:45:54 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 21 Aug 2020 13:45:54 +0000 Subject: Can't fetch from opendev. In-Reply-To: <2307899b-c1ac-43f0-4fa9-bd61e164979f@redhat.com> References: <58c9ecb6-d1cc-df2f-caa8-693ed3f03d00@redhat.com> <20200817143703.c5rh3eqcl3ihxy4m@yuggoth.org> <6590e740-00f1-ee60-ac00-5872039e0cb0@redhat.com> <20200818152414.s5srmotngy7a7w7r@yuggoth.org> <2307899b-c1ac-43f0-4fa9-bd61e164979f@redhat.com> Message-ID: <20200821134553.cpbt2q3l5gw7zbvk@yuggoth.org> On 2020-08-21 11:58:09 +0200 (+0200), Daniel Bengtsson wrote: > On 8/18/20 5:24 PM, Jeremy Stanley wrote: [...] > > and it just hangs indefinitely and never returns an error? > > Yes. > > > One reason I suspect this might be the problem is that GitHub is > > IPv4-only, so if you have something black-holing or blocking traffic > > for global IPv6 routes, then that could cause the behavior you're > > observing. > > I have the same problem with the -4 option. You mentioned earlier that you and your colleague are both using a work VPN. If this is a full-tunnel VPN, or a split-tunnel providing conflicting routes, it's possible something within the work network is eating or not properly rerouting your packets or the return responses. Have you tried a git fetch with the VPN temporarily turned off? Are you able to browse https://opendev.org/ from the same system? Are you running git from directly within your system, or are you running it inside a virtual machine/container on your system? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jean-philippe at evrard.me Fri Aug 21 14:22:23 2020 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Fri, 21 Aug 2020 16:22:23 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: Message-ID: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> On Wed, Aug 19, 2020, at 06:23, Fabian Zimmermann wrote: > Hi, > > if nobody complains I also would like to request core status to help getting the project further. > > Fabian Zimmermann Let's hope this will not be lost in the list :) From jean-philippe at evrard.me Fri Aug 21 14:35:49 2020 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Fri, 21 Aug 2020 16:35:49 +0200 Subject: [releases] Dropping my releases core/release-manager hat Message-ID: Hello folks, I am sad to announce that, while super motivated to keep helping the team, I cannot reliably and consistantly do my duties of core in the releases team, due to my current duties at work. It's been a while I haven't significantly helped the release team, and the team deserve all the transparency and clarity it can get about its contributors. It's time for me to step down. It's been a pleasure to help the team while it lasted. If you are looking for a team to get involved in OpenStack, make no mistake, the release team is awesome. 
Thank you everyone in the team, you were all amazing and so welcoming :) Regards, Jean-Philippe Evrard (evrardjp) From radoslaw.piliszek at gmail.com Fri Aug 21 14:42:59 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 21 Aug 2020 16:42:59 +0200 Subject: Nova Docker In-Reply-To: References: Message-ID: Hi, You might be interested in Zun. [1] [1] https://opendev.org/openstack/zun -yoctozepto On Fri, Aug 21, 2020 at 3:36 PM mahdi abbasi wrote: > > Hi openstack development team, > > Given that the nova-docker peoject is np longer availble, is there any solution for creating a docker instanace in openstack? > > Best regards > Mahdi From hberaud at redhat.com Fri Aug 21 15:21:59 2020 From: hberaud at redhat.com (Herve Beraud) Date: Fri, 21 Aug 2020 17:21:59 +0200 Subject: [releases] Dropping my releases core/release-manager hat In-Reply-To: References: Message-ID: Thanks for all the things you have done as a team member! Le ven. 21 août 2020 à 16:39, Jean-Philippe Evrard a écrit : > Hello folks, > > I am sad to announce that, while super motivated to keep helping the team, > I cannot reliably and consistantly do my duties of core in the releases > team, due to my current duties at work. > > It's been a while I haven't significantly helped the release team, and the > team deserve all the transparency and clarity it can get about its > contributors. It's time for me to step down. > > It's been a pleasure to help the team while it lasted. If you are looking > for a team to get involved in OpenStack, make no mistake, the release team > is awesome. Thank you everyone in the team, you were all amazing and so > welcoming :) > > Regards, > Jean-Philippe Evrard (evrardjp) > > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From dev.faz at gmail.com Fri Aug 21 16:59:19 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Fri, 21 Aug 2020 18:59:19 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: Hi, As long as there are enough cores to keep the project running everything is fine :) Fabian Jean-Philippe Evrard schrieb am Fr., 21. Aug. 2020, 16:32: > > On Wed, Aug 19, 2020, at 06:23, Fabian Zimmermann wrote: > > Hi, > > > > if nobody complains I also would like to request core status to help > getting the project further. > > > > Fabian Zimmermann > > Let's hope this will not be lost in the list :) > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cohuck at redhat.com Fri Aug 21 14:52:55 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Fri, 21 Aug 2020 16:52:55 +0200 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819065951.GB21172@joy-OptiPlex-7040> <20200819081338.GC21172@joy-OptiPlex-7040> <20200820142740.6513884d.cohuck@redhat.com> Message-ID: <20200821165255.53e26628.cohuck@redhat.com> On Fri, 21 Aug 2020 11:14:41 +0800 Jason Wang wrote: > On 2020/8/20 下午8:27, Cornelia Huck wrote: > > On Wed, 19 Aug 2020 17:28:38 +0800 > > Jason Wang wrote: > > > >> On 2020/8/19 下午4:13, Yan Zhao wrote: > >>> On Wed, Aug 19, 2020 at 03:39:50PM +0800, Jason Wang wrote: > >>>> On 2020/8/19 下午2:59, Yan Zhao wrote: > >>>>> On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: > >>>>>> On 2020/8/19 上午11:30, Yan Zhao wrote: > >>>>>>> hi All, > >>>>>>> could we decide that sysfs is the interface that every VFIO vendor driver > >>>>>>> needs to provide in order to support vfio live migration, otherwise the > >>>>>>> userspace management tool would not list the device into the compatible > >>>>>>> list? > >>>>>>> > >>>>>>> if that's true, let's move to the standardizing of the sysfs interface. > >>>>>>> (1) content > >>>>>>> common part: (must) > >>>>>>> - software_version: (in major.minor.bugfix scheme) > >>>>>> This can not work for devices whose features can be negotiated/advertised > >>>>>> independently. (E.g virtio devices) > > I thought the 'software_version' was supposed to describe kind of a > > 'protocol version' for the data we transmit? I.e., you add a new field, > > you bump the version number. > > > Ok, but since we mandate backward compatibility of uABI, is this really > worth to have a version for sysfs? (Searching on sysfs shows no examples > like this) I was not thinking about the sysfs interface, but rather about the data that is sent over while migrating. E.g. we find out that sending some auxiliary data is a good idea and bump to version 1.1.0; version 1.0.0 cannot deal with the extra data, but version 1.1.0 can deal with the older data stream. (...) > >>>>>>> - device_api: vfio-pci or vfio-ccw ... > >>>>>>> - type: mdev type for mdev device or > >>>>>>> a signature for physical device which is a counterpart for > >>>>>>> mdev type. > >>>>>>> > >>>>>>> device api specific part: (must) > >>>>>>> - pci id: pci id of mdev parent device or pci id of physical pci > >>>>>>> device (device_api is vfio-pci)API here. > >>>>>> So this assumes a PCI device which is probably not true. > >>>>>> > >>>>> for device_api of vfio-pci, why it's not true? > >>>>> > >>>>> for vfio-ccw, it's subchannel_type. > >>>> Ok but having two different attributes for the same file is not good idea. > >>>> How mgmt know there will be a 3rd type? > >>> that's why some attributes need to be common. e.g. > >>> device_api: it's common because mgmt need to know it's a pci device or a > >>> ccw device. and the api type is already defined vfio.h. > >>> (The field is agreed by and actually suggested by Alex in previous mail) > >>> type: mdev_type for mdev. if mgmt does not understand it, it would not > >>> be able to create one compatible mdev device. > >>> software_version: mgmt can compare the major and minor if it understands > >>> this fields. 
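[Editor's note: as a side illustration of the "compare major, then accept an equal or newer minor" rule discussed above, a toy shell check is sketched below. The version values are made up and nothing here reflects an agreed-upon sysfs layout; it only shows the comparison logic.]

  # toy illustration only: "src" is the version exposed at the source,
  # "dst" the version offered at the destination; both values (and any
  # sysfs paths they would be read from) are hypothetical
  src=2.1.0
  dst=2.3.1
  src_major=$(echo "$src" | cut -d. -f1); src_minor=$(echo "$src" | cut -d. -f2)
  dst_major=$(echo "$dst" | cut -d. -f1); dst_minor=$(echo "$dst" | cut -d. -f2)
  if [ "$src_major" -eq "$dst_major" ] && [ "$dst_minor" -ge "$src_minor" ]; then
      echo "compatible"
  else
      echo "incompatible"
  fi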
> >> > >> I think it would be helpful if you can describe how mgmt is expected to > >> work step by step with the proposed sysfs API. This can help people to > >> understand. > > My proposal would be: > > - check that device_api matches > > - check possible device_api specific attributes > > - check that type matches [I don't think the combination of mdev types > > and another attribute to determine compatibility is a good idea; > > > Any reason for this? Actually if we only use mdev type to detect the > compatibility, it would be much more easier. Otherwise, we are actually > re-inventing mdev types. > > E.g can we have the same mdev types with different device_api and other > attributes? In the end, the mdev type is represented as a string; but I'm not sure we can expect that two types with the same name, but a different device_api are related in any way. If we e.g. compare vfio-pci and vfio-ccw, they are fundamentally different. I was mostly concerned about the aggregation proposal, where type A + aggregation value b might be compatible with type B + aggregation value a. > > > > actually, the current proposal confuses me every time I look at it] > > - check that software_version is compatible, assuming semantic > > versioning > > - check possible type-specific attributes > > > I'm not sure if this is too complicated. And I suspect there will be > vendor specific attributes: > > - for compatibility check: I think we should either modeling everything > via mdev type or making it totally vendor specific. Having something in > the middle will bring a lot of burden FWIW, I'm for a strict match on mdev type, and flexibility in per-type attributes. > - for provisioning: it's still not clear. As shown in this proposal, for > NVME we may need to set remote_url, but unless there will be a subclass > (NVME) in the mdev (which I guess not), we can't prevent vendor from > using another attribute name, in this case, tricks like attributes > iteration in some sub directory won't work. So even if we had some > common API for compatibility check, the provisioning API is still vendor > specific ... Yes, I'm not sure how to deal with the "same thing for different vendors" problem. We can try to make sure that in-kernel drivers play nicely, but not much more. From mahdi.abbasi.2013 at gmail.com Fri Aug 21 17:13:47 2020 From: mahdi.abbasi.2013 at gmail.com (mahdi abbasi) Date: Fri, 21 Aug 2020 21:43:47 +0430 Subject: Nova Docker In-Reply-To: References: Message-ID: Thanks a lot On Fri, 21 Aug 2020, 19:13 Radosław Piliszek, wrote: > Hi, > > You might be interested in Zun. [1] > > [1] https://opendev.org/openstack/zun > > -yoctozepto > > On Fri, Aug 21, 2020 at 3:36 PM mahdi abbasi > wrote: > > > > Hi openstack development team, > > > > Given that the nova-docker peoject is np longer availble, is there any > solution for creating a docker instanace in openstack? > > > > Best regards > > Mahdi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyliu0592 at hotmail.com Fri Aug 21 18:42:11 2020 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Fri, 21 Aug 2020 18:42:11 +0000 Subject: [Kolla Ansible] host maintenance Message-ID: Hi, I wonder if it's supported by Kolla Ansible to deploy a specific host and add it into existing cluster, like replace a control host or compute host? Thanks! 
Tony From dev.faz at gmail.com Fri Aug 21 20:19:42 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Fri, 21 Aug 2020 22:19:42 +0200 Subject: [Kolla Ansible] host maintenance In-Reply-To: References: Message-ID: Hi, seems like someone else is trying to migrate an existing setup to kolla 😉 We currently try it step by step. 1. Use kolla images instead of self developed builder. 2. Generate suitable kolla configuration file layout 3. Hopefully kolla-ansible will hand over But we are still in PoC state. Fabian Tony Liu schrieb am Fr., 21. Aug. 2020, 20:49: > Hi, > > I wonder if it's supported by Kolla Ansible to deploy a specific > host and add it into existing cluster, like replace a control > host or compute host? > > > Thanks! > Tony > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jasonanderson at uchicago.edu Fri Aug 21 20:45:07 2020 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Fri, 21 Aug 2020 20:45:07 +0000 Subject: [Kolla Ansible] host maintenance In-Reply-To: References: Message-ID: Hello, if you are working on a migration to Kolla, there is a nice guide written by StackHPC that provides one example approach for this complicated maneuver: https://www.stackhpc.com/migrating-to-kolla.html Perhaps not relevant to your specific case, but it can offer some guidance! Cheers, /Jason On Aug 21, 2020, at 3:19 PM, Fabian Zimmermann > wrote: Hi, seems like someone else is trying to migrate an existing setup to kolla 😉 We currently try it step by step. 1. Use kolla images instead of self developed builder. 2. Generate suitable kolla configuration file layout 3. Hopefully kolla-ansible will hand over But we are still in PoC state. Fabian Tony Liu > schrieb am Fr., 21. Aug. 2020, 20:49: Hi, I wonder if it's supported by Kolla Ansible to deploy a specific host and add it into existing cluster, like replace a control host or compute host? Thanks! Tony -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyliu0592 at hotmail.com Fri Aug 21 22:15:25 2020 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Fri, 21 Aug 2020 22:15:25 +0000 Subject: [Kolla Ansible] host maintenance In-Reply-To: References: Message-ID: Actually, in my case, the setup is originally deploy by Kolla Ansible. Other than the initial deployment, I am looking for using Kolla Ansible for maintenance operations. What I am looking for, eg. replace a host, can surely be done by manual steps or customized script. I'd like to know if they are automated by Kolla Ansible. Thanks! Tony > -----Original Message----- > From: Jason Anderson > Sent: Friday, August 21, 2020 1:45 PM > To: Fabian Zimmermann > Cc: Tony Liu ; openstack-discuss discuss at lists.openstack.org> > Subject: Re: [Kolla Ansible] host maintenance > > Hello, if you are working on a migration to Kolla, there is a nice guide > written by StackHPC that provides one example approach for this > complicated maneuver: https://www.stackhpc.com/migrating-to-kolla.html > > Perhaps not relevant to your specific case, but it can offer some > guidance! > > Cheers, > /Jason > > > > On Aug 21, 2020, at 3:19 PM, Fabian Zimmermann > wrote: > > Hi, > > seems like someone else is trying to migrate an existing setup to > kolla 😉 > > We currently try it step by step. > > 1. Use kolla images instead of self developed builder. > 2. Generate suitable kolla configuration file layout > 3. Hopefully kolla-ansible will hand over > > But we are still in PoC state. 
> > Fabian > > Tony Liu > > schrieb am Fr., 21. Aug. 2020, 20:49: > > > Hi, > > I wonder if it's supported by Kolla Ansible to deploy a > specific > host and add it into existing cluster, like replace a control > host or compute host? > > > Thanks! > Tony > > > > From emiller at genesishosting.com Sat Aug 22 00:03:00 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Fri, 21 Aug 2020 19:03:00 -0500 Subject: Using os_token In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04814561@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04814568@gmsxchsvr01.thecreation.com> > The problem though is that keystoneauth actually does fetch a new token > every time even when you supply it with one, but that new token is based > on the one you supply, and is a scoped token. It's likely the api for > getting a token from an existing one is faster than password auth. I > wish there was a way to  have the tools actually reuse a given scoped > token rather than fetch a new one every time... Interesting! I was kinda wondering if that was actually what was happening. It still seems like quite a bit of a delay compared to running the OpenStack Client and running commands on its command line repeatedly (as opposed to loading the OpenStack Client each time). I assumed that there was still some work to load Python, etc., but using --debug does show a pull of the service catalog, which is slow. It definitely would be nice to have a way to save/load the "session" that is created by the OpenStack Client to avoid all of the overhead, or, as you said, provide a scoped token. I tested the performance again to be sure I wasn't going crazy, and with OS_TOKEN, it is definitely between 0.2 and 0.6 (most of the time at the higher end of this) seconds faster. Any improvement is good. > OS_TOKEN is also useful/important because of MFA, which otherwise > wouldn't work unless you reuse a token. And I'm hoping that when someone > has time to work MFA support properly into the cli tool they can > hopefully also think about how to make the token reuse better. You mentioned "MFA support properly". What issue exists? I'm interested since I was about to look into this next. Thanks! Eric From emiller at genesishosting.com Sat Aug 22 00:09:53 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Fri, 21 Aug 2020 19:09:53 -0500 Subject: [Kolla Ansible] host maintenance In-Reply-To: References: Message-ID: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> > Actually, in my case, the setup is originally deploy by > Kolla Ansible. Other than the initial deployment, I am > looking for using Kolla Ansible for maintenance operations. > What I am looking for, eg. replace a host, can surely be > done by manual steps or customized script. I'd like to know > if they are automated by Kolla Ansible. We do this often by simply using the "limit" flag in Kolla Ansible to only include the controllers and new compute node (after adding the compute node to the multinode.ini file). Specify "reconfigure" for the action, and not "install". Eric From tonyliu0592 at hotmail.com Sat Aug 22 01:14:49 2020 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Sat, 22 Aug 2020 01:14:49 +0000 Subject: [Kolla Ansible] host maintenance In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> Message-ID: Thanks Eric! I will run some tests to validate. 
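[Editor's note: for readers following along, a minimal sketch of the approach Eric describes above. The inventory file name, the "control" group and the host name come from a typical Kolla Ansible multinode setup and are placeholders; exact flags can differ between releases.]

  # after adding the new compute host to the [compute] group in multinode.ini,
  # reconfigure only the controllers plus the new host
  kolla-ansible -i multinode.ini reconfigure --limit control,new-compute-01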
Tony > -----Original Message----- > From: Eric K. Miller > Sent: Friday, August 21, 2020 5:10 PM > To: openstack-discuss > Subject: RE: [Kolla Ansible] host maintenance > > > Actually, in my case, the setup is originally deploy by Kolla Ansible. > > Other than the initial deployment, I am looking for using Kolla > > Ansible for maintenance operations. > > What I am looking for, eg. replace a host, can surely be done by > > manual steps or customized script. I'd like to know if they are > > automated by Kolla Ansible. > > We do this often by simply using the "limit" flag in Kolla Ansible to > only include the controllers and new compute node (after adding the > compute node to the multinode.ini file). Specify "reconfigure" for the > action, and not "install". > > Eric From 358111907 at qq.com Sat Aug 22 01:24:48 2020 From: 358111907 at qq.com (=?gb18030?B?wO7WvtS2?=) Date: Sat, 22 Aug 2020 09:24:48 +0800 Subject: About Devstack Message-ID: I'm sorry to disturb you. Recently, I tried to install openstack through devstack. When I input "./stack.sh". I can install openstack successfully. Then I tried to create a cloud instance and use the public network 172.24.4.0/24 which is created during installation( this subnet is created by default, I didn't configure network informartion in local.conf before installation). And the instance can access to the Internet smoothly. But the instance will not access the Internet when I reboot my server  (physical machine). After rebooting, I input "sudo ifconfig br-ex 172.24.4.1/24 up", the instance can access my server IP, but it can't PING the gateway addresses of my server. Of course, the instance also can't access the Internet. But my server can PING it's gateway and access to the Internet. Finally, the cloud instance can only communicate with my server. I tried many methods to restore the network environment of my openstack. But I can't find the reason. So I need your help. I install the version of devstack is stable/train. Thank you very much! -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at mailbox.org Sat Aug 22 02:33:44 2020 From: berndbausch at mailbox.org (Bernd Bausch) Date: Sat, 22 Aug 2020 11:33:44 +0900 Subject: OpenStack user survey data available? Message-ID: Is there a way to get access to raw user survey data? Some of the graphical data on the analytics page is unreadable, in particular information about Neutron's drivers: The analytics FAQ tells me to contact heidijoy at openstack.org for questions, but email to this address bounces back. 550 5.1.1 : Email address could not be found, or was misspelled (G8) Thanks much, Bernd -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jjojlgndnihopfcd.png Type: image/png Size: 37546 bytes Desc: not available URL: From allison at openstack.org Sat Aug 22 17:09:56 2020 From: allison at openstack.org (Allison Price) Date: Sat, 22 Aug 2020 12:09:56 -0500 Subject: OpenStack user survey data available? In-Reply-To: References: Message-ID: Hi Bernd, Thanks for reaching out. We will change the FAQ with updated contact information, but I can help you on this request. Attached is a spreadsheet with the anonymous data distribution for the Neutron driver question from the 2019 survey. The 2020 data will be available soon, so please let me know if that’s data you would like as well. 
If there are other specific questions you would like anonymous data on,
please let me know, as it does need to be pulled manually.

Cheers,
Allison

Allison Price
OpenStack Foundation
allison at openstack.org

> On Aug 21, 2020, at 9:33 PM, Bernd Bausch  wrote:
> 
> Is there a way to get access to raw user survey data? Some of the
> graphical data on the analytics page is unreadable, in particular
> information about Neutron's drivers:
> 
> The analytics FAQ tells me to contact heidijoy at openstack.org for
> questions, but email to this address bounces back.
> 
> 550 5.1.1 : Email address could not be found, or was misspelled (G8)
> 
> Thanks much,
> 
> Bernd
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2019_networking drivers.xlsx
Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 10184 bytes
Desc: not available
URL: 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mahdi.abbasi.2013 at gmail.com  Sat Aug 22 16:43:58 2020
From: mahdi.abbasi.2013 at gmail.com (mahdi abbasi)
Date: Sat, 22 Aug 2020 21:13:58 +0430
Subject: python-zunclient
Message-ID: 

Hi openstack development team,

I installed python-zunclient successfully, but the openstack appcontainer
command still returns "not found". Please help me.

Best regards
Mahdi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hongbin034 at gmail.com  Sat Aug 22 19:10:55 2020
From: hongbin034 at gmail.com (Hongbin Lu)
Date: Sat, 22 Aug 2020 15:10:55 -0400
Subject: python-zunclient
In-Reply-To: 
References: 
Message-ID: 

How did you install python-zunclient, and how did you install
openstackclient? My best guess is a mix-up of the python2/3 environments.
Mind pasting the output of the following commands?

$ pip --version
$ pip freeze
$ pip3 freeze

On Sat, Aug 22, 2020 at 3:03 PM mahdi abbasi  wrote:

> Hi openstack development team,
>
> I installed python-zunclient successfully, but the openstack appcontainer
> command still returns "not found". Please help me.
>
> Best regards
> Mahdi
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jomin0613 at gmail.com  Sun Aug 23 12:18:32 2020
From: jomin0613 at gmail.com (Mingi Jo)
Date: Sun, 23 Aug 2020 21:18:32 +0900
Subject: [keystone] openstack token auth scope system Question
Message-ID: 

Hi, I'm studying OpenStack. If you use OpenStack with a keystone token on
all machines and there is a project in the endpoint URL, the API request
cannot be made properly. The error returned is a 400, and the request
fails.

We've looked into this and found
https://bugs.launchpad.net/cinder/+bug/1745905
Here is the bug report, and it looks like that work has been completed.
However, the installation guides for various services such as cinder,
swift, and probe require a project to be included in the endpoint URL,
which seems contradictory. Is there any way to fix this?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From berndbausch at gmail.com  Sun Aug 23 12:39:28 2020
From: berndbausch at gmail.com (Bernd Bausch)
Date: Sun, 23 Aug 2020 21:39:28 +0900
Subject: [simplification] Making ask.openstack.org read-only
In-Reply-To: 
References: 
Message-ID: <648c6ac3-0ab8-e442-ed9b-fbbfbbea16f7@gmail.com>

Thanks for calling me out, but I am certainly not the only one answering
questions.
After the notification feature broke down entirely, leaving me no way to see which questions I am involved in, it's indeed time to move on. I agree with the change as well. Bernd. On 8/18/2020 7:44 PM, Thierry Carrez wrote: > Hi everyone, > > This has been discussed several times on this mailing list in the > past, but we never got to actually pull the plug. > > Ask.openstack.org was launched in 2013. The reason for hosting our own > setup was to be able to support multiple languages, while > StackOverflow rejected our proposal to have our own openstack-branded > StackExchange site. The Chinese ask.o.o side never really took off. > The English side also never really worked perfectly (like email alerts > are hopelessly broken), but we figured it would get better with time > if a big community formed around it. > > Fast-forward to 2020 and the instance is lacking volunteers to help > run it, while the code (and our customization of it) has become more > complicated to maintain. It regularly fails one way or another, and > questions there often go unanswered, making us look bad. Of the top 30 > users, most have abandoned the platform since 2017, leaving only Bernd > Bausch actively engaging and helping moderate questions lately. We > have called for volunteers several times, but the offers for help > never really materialized. > > At the same time, people are asking OpenStack questions on > StackOverflow, and sometimes getting answers there[1]. The > fragmentation of the "questions" space is not helping users getting > good answers. > > I think it's time to pull the plug, make ask.openstack.org read-only > (so that links to old answers are not lost) and redirect users to the > mailing-list and the "OpenStack" tag on StackOverflow. I picked > StackOverflow since it seems to have the most openstack questions > (2,574 on SO, 76 on SuperUser and 430 on ServerFault). > > We discussed that option several times, but I now proposed a change to > actually make it happen: > > https://review.opendev.org/#/c/746497/ > > It's always a difficult decision to make to kill a resource, but I > feel like in this case, consolidation and simplification would help. > > Thoughts, comments? > > [1] https://stackoverflow.com/questions/tagged/openstack > From mahdi.abbasi.2013 at gmail.com Sat Aug 22 19:27:13 2020 From: mahdi.abbasi.2013 at gmail.com (mahdi abbasi) Date: Sat, 22 Aug 2020 23:57:13 +0430 Subject: python-zunclient In-Reply-To: References: Message-ID: Thanks Hongbin, This Issue has been resolved. On Sat, 22 Aug 2020, 23:41 Hongbin Lu, wrote: > How did you install python-zunclient? and how did you install > openstackclient? My best guess is the mess up of python2/3 environment. > Mind pasting output of the following commands? > > $ pip --version > $ pip freeze > $ pip3 freeze > > On Sat, Aug 22, 2020 at 3:03 PM mahdi abbasi > wrote: > >> Hi openstack development team, >> >> I installed python-zunclient successfully but openstack appcontainer >> command steal return not found. Please help me. >> >> Best regards >> Mahdi >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alterriu at gmail.com Mon Aug 24 05:31:43 2020 From: alterriu at gmail.com (Popoi Zen) Date: Mon, 24 Aug 2020 12:31:43 +0700 Subject: [ovn][neutron][ussuri] DPDK support on OVN Openstack? Openstack Doc contradictive Information? Message-ID: I want to implement DPDK on my Openstack using OVN as mechanism driver. 
I read 2 documentation: [1] https://docs.openstack.org/neutron/ussuri/admin/ovn/dpdk.html [2] https://docs.openstack.org/neutron/ussuri/admin/config-ovs-dpdk.html In first doc [1], it is said that DPDK has been supported on OVN. But in second doc [2] it is said `The support of this feature is not yet present in ML2 OVN and ODL mechanism drivers.` Which one is true? because I have TCP checksum issue when implement DPDK on OVN same like this: https://bugs.launchpad.net/neutron/+bug/1832021. It has patched on OVS but not work on OVN. Regards, -------------- next part -------------- An HTML attachment was scrubbed... URL: From alterriu at gmail.com Mon Aug 24 06:57:59 2020 From: alterriu at gmail.com (Popoi Zen) Date: Mon, 24 Aug 2020 13:57:59 +0700 Subject: [ovn][neutron][ussuri] DPDK support on OVN Openstack? Openstack Doc contradictive Information? In-Reply-To: References: Message-ID: Ah, it seems like vhost-user-reconnect feature which is not yet supported. But the problem still, I cant get metadata because of tcp checksum error. On Mon, Aug 24, 2020, 12:31 Popoi Zen wrote: > I want to implement DPDK on my Openstack using OVN as mechanism driver. I > read 2 documentation: > [1] https://docs.openstack.org/neutron/ussuri/admin/ovn/dpdk.html > [2] https://docs.openstack.org/neutron/ussuri/admin/config-ovs-dpdk.html > > In first doc [1], it is said that DPDK has been supported on OVN. But in > second doc [2] it is said `The support of this feature is not yet present > in ML2 OVN and ODL mechanism drivers.` > > Which one is true? because I have TCP checksum issue when implement DPDK > on OVN same like this: https://bugs.launchpad.net/neutron/+bug/1832021. > It has patched on OVS but not work on OVN. > > > Regards, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at gmail.com Mon Aug 24 07:32:31 2020 From: berndbausch at gmail.com (Bernd Bausch) Date: Mon, 24 Aug 2020 16:32:31 +0900 Subject: About Devstack In-Reply-To: References: Message-ID: <17933c25-2fe9-4f0e-b1c5-797d4a97a5dc@gmail.com> Devstack is not meant to be restarted. However, setting the IP address on br-ex and bringing it up is normally sufficient to re-establish networking. After that, you probably still need to recreate the loop devices for Cinder and Swift. What I don't understand: The external network that Devstack sets up by default, named "public", is fake. It's not external, and it's not connected to the outside world at all, thus the IP address range of 172.24.4.0/24. How your instances were able to access the internet without any manual tweaking is a mystery to me. If you did some manual tweaking, I guess it was lost when you rebooted. Perhaps you had a non-persistent routing table entry that connected 172.24.4.0/24 to the outside world? Bernd. On 8/22/2020 10:24 AM, 李志远 wrote: > I'm sorry to disturb you. > Recently, I tried to install openstack through devstack. When I input > "./stack.sh". I can install openstack successfully. > Then I tried to create a cloud instance and use the public network > 172.24.4.0/24 which is created during > installation( this subnet is created by default, I didn't configure > network informartion in local.conf before installation). And the > instance can access to the Internet smoothly. > But the instance will not access the Internet when I reboot my server  > (physical machine). After rebooting, I input "sudo ifconfig br-ex > 172.24.4.1/24 up", the instance can access my > server IP, but it can't PING the gateway addresses of my server. 
Of > course, the instance also can't access the Internet. But my server can > PING it's gateway and access to the Internet. Finally, the cloud > instance can only communicate with my server. > I tried many methods to restore the network environment of my > openstack. But I can't find the reason. So I need your help. I install > the version of devstack is stable/train. Thank you very much! -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Mon Aug 24 07:46:13 2020 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 24 Aug 2020 08:46:13 +0100 Subject: [Kolla Ansible] host maintenance In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> Message-ID: On Sat, 22 Aug 2020 at 01:10, Eric K. Miller wrote: > > > Actually, in my case, the setup is originally deploy by > > Kolla Ansible. Other than the initial deployment, I am > > looking for using Kolla Ansible for maintenance operations. > > What I am looking for, eg. replace a host, can surely be > > done by manual steps or customized script. I'd like to know > > if they are automated by Kolla Ansible. > > We do this often by simply using the "limit" flag in Kolla Ansible to only include the controllers and new compute node (after adding the compute node to the multinode.ini file). Specify "reconfigure" for the action, and not "install". We need some better docs around this, and I think they will be added soon. Some things to watch out for: * if adding a new controller, ensure that if using --limit, all controllers are included and do not use serial mode * if removing a controller, reconfigure other controllers to update the RabbitMQ & Galera cluster nodes etc. > > Eric From arne.wiebalck at cern.ch Mon Aug 24 08:24:05 2020 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Mon, 24 Aug 2020 10:24:05 +0200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: Message-ID: Hi! CERN's deployment is using the iscsi deploy interface since we started with Ironic a couple of years ago (and we installed around 5000 nodes with it by now). The reason we chose it at the time was simplicity: we did not (and still do not) have a Swift backend to Glance, and the iscsi interface provided a straightforward alternative. While we have not seen obscure bugs/issues with it, I can certainly back the scalability issues mentioned by Dmitry: the tunneling of the images through the controllers can create issues when deploying hundreds of nodes at the same time. The security of the iscsi interface is less of a concern in our specific environment. So, why did we not move to direct (yet)? In addition to the lack of Swift, mostly since iscsi works for us and the scalability issues were not that much of a burning problem ... so we focused on other things :) Here are some thoughts/suggestions for this discussion: How would 'direct' work with other Glance backends (like Ceph/RBD in our case)? If using direct requires to duplicate images from Glance to Ironic (or somewhere else) to be served, I think this would be an argument against deprecating iscsi. Equally, if this would require to completely move the Glance backend to something else, like from RBD to RadosGW, I would not expect happy operators. (Does anyone know if RadosGW could even replace Swift for this specific use case?) Do we have numbers on how many deployments use iscsi vs direct? 
If many rely on iscsi, I would also suggest to establish a migration guide for operators on how to move from iscsi to direct, for the various configs. Recent versions of Glance support multiple backends, so a migration path may be to add a new (direct compatible) backend for new images. Cheers, Arne On 20.08.20 17:49, Julia Kreger wrote: > I'm having a sense of deja vu! > > Because of the way the mechanics work, the iscsi deploy driver is in > an unfortunate position of being harder to troubleshoot and diagnose > failures. Which basically means we've not been able to really identify > common failures and add logic to handle them appropriately, like we > are able to with a tcp socket and file download. Based on this alone, > I think it makes a solid case for us to seriously consider > deprecation. > > Overall, I'm +1 for the proposal and I believe over two cycles is the > right way to go. > > I suspect we're going to have lots of push back from the TripleO > community because there has been resistance to change their default > usage in the past. As such I'm adding them to the subject so hopefully > they will be at least aware. > > I guess my other worry is operators who already have a substantial > operational infrastructure investment built around the iscsi deploy > interface. I wonder why they didn't use direct, but maybe they have > all migrated in the past ?5? years. This could just be a non-concern > in reality, I'm just not sure. > > Of course, if someone is willing to step up and make the iscsi > deployment interface their primary focus, that also shifts the > discussion to making direct the default interface? > > -Julia > > > On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur wrote: >> >> Hi all, >> >> Side note for those lacking context: this proposal concerns deprecating one of the ironic deploy interfaces detailed in https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. It does not affect the boot-from-iSCSI feature. >> >> I would like to propose deprecating and removing the 'iscsi' deploy interface over the course of the next 2 cycles. The reasons are: >> 1) The iSCSI deploy is a source of occasional cryptic bugs when a target cannot be discovered or mounted properly. >> 2) Its security is questionable: I don't think we even use authentication. >> 3) Operators confusion: right now we default to the iSCSI deploy but pretty much direct everyone who cares about scalability or security to the 'direct' deploy. >> 4) Cost of maintenance: our feature set is growing, our team - not so much. iscsi_deploy.py is 800 lines of code that can be removed, and some dependencies that can be dropped as well. >> >> As far as I can remember, we've kept the iSCSI deploy for two reasons: >> 1) The direct deploy used to require Glance with Swift backend. The recently added [agent]image_download_source option allows caching and serving images via the ironic's HTTP server, eliminating this problem. I guess we'll have to switch to 'http' by default for this option to keep the out-of-box experience. >> 2) Memory footprint of the direct deploy. With the raw images streaming we no longer have to cache the downloaded images in the agent memory, removing this problem as well (I'm not even sure how much of a problem it is in 2020, even my phone has 4GiB of RAM). >> >> If this proposal is accepted, I suggest to execute it as follows: >> Victoria release: >> 1) Put an early deprecation warning in the release notes. 
>> 2) Announce the future change of the default value for [agent]image_download_source. >> W release: >> 3) Change [agent]image_download_source to 'http' by default. >> 4) Remove iscsi from the default enabled_deploy_interfaces and move it to the back of the supported list (effectively making direct deploy the default). >> X release: >> 5) Remove the iscsi deploy code from both ironic and IPA. >> >> Thoughts, opinions, suggestions? >> >> Dmitry > From dtantsur at redhat.com Mon Aug 24 08:32:57 2020 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Mon, 24 Aug 2020 10:32:57 +0200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: Message-ID: Hi, On Mon, Aug 24, 2020 at 10:24 AM Arne Wiebalck wrote: > Hi! > > CERN's deployment is using the iscsi deploy interface since we started > with Ironic a couple of years ago (and we installed around 5000 nodes > with it by now). The reason we chose it at the time was simplicity: we > did not (and still do not) have a Swift backend to Glance, and the iscsi > interface provided a straightforward alternative. > > While we have not seen obscure bugs/issues with it, I can certainly back > the scalability issues mentioned by Dmitry: the tunneling of the images > through the controllers can create issues when deploying hundreds of > nodes at the same time. The security of the iscsi interface is less of a > concern in our specific environment. > > So, why did we not move to direct (yet)? In addition to the lack of > Swift, mostly since iscsi works for us and the scalability issues were > not that much of a burning problem ... so we focused on other things :) > > Here are some thoughts/suggestions for this discussion: > > How would 'direct' work with other Glance backends (like Ceph/RBD in our > case)? If using direct requires to duplicate images from Glance to > Ironic (or somewhere else) to be served, I think this would be an > argument against deprecating iscsi. > With image_download_source=http ironic will download the image to the conductor to be able serve it to the node. Which is exactly what the iscsi is doing, so not much of a change for you (except for s/iSCSI/HTTP/ as a means of serving the image). Would it be an option for you to test direct deploy with image_download_source=http? > > Equally, if this would require to completely move the Glance backend to > something else, like from RBD to RadosGW, I would not expect happy > operators. (Does anyone know if RadosGW could even replace Swift for > this specific use case?) > AFAIK ironic works with RadosGW, we have some support code for it. > > Do we have numbers on how many deployments use iscsi vs direct? If many > rely on iscsi, I would also suggest to establish a migration guide for > operators on how to move from iscsi to direct, for the various configs. > Recent versions of Glance support multiple backends, so a migration path > may be to add a new (direct compatible) backend for new images. > I don't have any numbers, but a migration guide is a must in any case. I expect most TripleO consumers to use the iscsi deploy, but only because it's the default. Their Edge solution uses the direct deploy. I've polled a few operators I know, they all (except for you, obviously :) seem to use the direct deploy. Metal3 uses direct deploy. Dmitry > > Cheers, > Arne > > On 20.08.20 17:49, Julia Kreger wrote: > > I'm having a sense of deja vu! 
> > > > Because of the way the mechanics work, the iscsi deploy driver is in > > an unfortunate position of being harder to troubleshoot and diagnose > > failures. Which basically means we've not been able to really identify > > common failures and add logic to handle them appropriately, like we > > are able to with a tcp socket and file download. Based on this alone, > > I think it makes a solid case for us to seriously consider > > deprecation. > > > > Overall, I'm +1 for the proposal and I believe over two cycles is the > > right way to go. > > > > I suspect we're going to have lots of push back from the TripleO > > community because there has been resistance to change their default > > usage in the past. As such I'm adding them to the subject so hopefully > > they will be at least aware. > > > > I guess my other worry is operators who already have a substantial > > operational infrastructure investment built around the iscsi deploy > > interface. I wonder why they didn't use direct, but maybe they have > > all migrated in the past ?5? years. This could just be a non-concern > > in reality, I'm just not sure. > > > > Of course, if someone is willing to step up and make the iscsi > > deployment interface their primary focus, that also shifts the > > discussion to making direct the default interface? > > > > -Julia > > > > > > On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur > wrote: > >> > >> Hi all, > >> > >> Side note for those lacking context: this proposal concerns deprecating > one of the ironic deploy interfaces detailed in > https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. It > does not affect the boot-from-iSCSI feature. > >> > >> I would like to propose deprecating and removing the 'iscsi' deploy > interface over the course of the next 2 cycles. The reasons are: > >> 1) The iSCSI deploy is a source of occasional cryptic bugs when a > target cannot be discovered or mounted properly. > >> 2) Its security is questionable: I don't think we even use > authentication. > >> 3) Operators confusion: right now we default to the iSCSI deploy but > pretty much direct everyone who cares about scalability or security to the > 'direct' deploy. > >> 4) Cost of maintenance: our feature set is growing, our team - not so > much. iscsi_deploy.py is 800 lines of code that can be removed, and some > dependencies that can be dropped as well. > >> > >> As far as I can remember, we've kept the iSCSI deploy for two reasons: > >> 1) The direct deploy used to require Glance with Swift backend. The > recently added [agent]image_download_source option allows caching and > serving images via the ironic's HTTP server, eliminating this problem. I > guess we'll have to switch to 'http' by default for this option to keep the > out-of-box experience. > >> 2) Memory footprint of the direct deploy. With the raw images streaming > we no longer have to cache the downloaded images in the agent memory, > removing this problem as well (I'm not even sure how much of a problem it > is in 2020, even my phone has 4GiB of RAM). > >> > >> If this proposal is accepted, I suggest to execute it as follows: > >> Victoria release: > >> 1) Put an early deprecation warning in the release notes. > >> 2) Announce the future change of the default value for > [agent]image_download_source. > >> W release: > >> 3) Change [agent]image_download_source to 'http' by default. 
> >> 4) Remove iscsi from the default enabled_deploy_interfaces and move it > to the back of the supported list (effectively making direct deploy the > default). > >> X release: > >> 5) Remove the iscsi deploy code from both ironic and IPA. > >> > >> Thoughts, opinions, suggestions? > >> > >> Dmitry > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucasagomes at gmail.com Mon Aug 24 08:34:30 2020 From: lucasagomes at gmail.com (Lucas Alvares Gomes) Date: Mon, 24 Aug 2020 09:34:30 +0100 Subject: [neutron] Bug Deputy Report Aug 17-24 Message-ID: Hi, This is the Neutron bug report of the week of 2020-08-17. High: * https://bugs.launchpad.net/neutron/+bug/1891673 - "qrouter ns ip rules not deleted when fip removed from vm" Assigned to: hopem * https://bugs.launchpad.net/neutron/+bug/1892017 - "Neutron server logs are too big in the gate jobs" Assigned to: slaweq * https://bugs.launchpad.net/neutron/+bug/1892477 - "[OVN] Avoid nb_cfg update notification flooding during agents health check" Assigned to: lucasagomes * https://bugs.launchpad.net/neutron/+bug/1892489 - "[Prefix delegation] When subnet with PD enabled is added to the router, L3 agent fails on waiting for LLAs to be available" Assigned to: slaweq Needs further triage: * https://bugs.launchpad.net/neutron/+bug/1892405 - "Removing router interface causes router to stop routing between all" Unassigned * https://bugs.launchpad.net/neutron/+bug/1892496 - "500 on SG deletion: Cannot delete or update a parent row" Unassigned Medium: * https://bugs.launchpad.net/neutron/+bug/1892364 - "L3 agent prefix delegation - adding new subnet to the router fails" Assigned to: brian-haley * https://bugs.launchpad.net/neutron/+bug/1892362 - "Restarting L3 agent when PD is used fails due to IPAddressAlreadyExists error" Assigned to: slaweq Wishlist: * https://bugs.launchpad.net/neutron/+bug/1892200 - "Make keepalived healthcheck more configurable" Unassigned From arne.wiebalck at cern.ch Mon Aug 24 09:03:15 2020 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Mon, 24 Aug 2020 11:03:15 +0200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: Message-ID: <91efc60b-6995-eda0-4ff9-7d6ae6a31641@cern.ch> Hi Dmitry, On 24.08.20 10:32, Dmitry Tantsur wrote: > Hi, > > On Mon, Aug 24, 2020 at 10:24 AM Arne Wiebalck > wrote: > > Hi! > > CERN's deployment is using the iscsi deploy interface since we started > with Ironic a couple of years ago (and we installed around 5000 nodes > with it by now). The reason we chose it at the time was simplicity: we > did not (and still do not) have a Swift backend to Glance, and the iscsi > interface provided a straightforward alternative. > > While we have not seen obscure bugs/issues with it, I can certainly back > the scalability issues mentioned by Dmitry: the tunneling of the images > through the controllers can create issues when deploying hundreds of > nodes at the same time. The security of the iscsi interface is less > of a > concern in our specific environment. > > So, why did we not move to direct (yet)? In addition to the lack of > Swift, mostly since iscsi works for us and the scalability issues were > not that much of a burning problem ... so we focused on other things :) > > Here are some thoughts/suggestions for this discussion: > > How would 'direct' work with other Glance backends (like Ceph/RBD in > our > case)? 
If using direct requires to duplicate images from Glance to > Ironic (or somewhere else) to be served, I think this would be an > argument against deprecating iscsi. > > > With image_download_source=http ironic will download the image to the > conductor to be able serve it to the node. Which is exactly what the > iscsi is doing, so not much of a change for you (except for > s/iSCSI/HTTP/ as a means of serving the image). > > Would it be an option for you to test direct deploy with > image_download_source=http? Oh, absolutely! I was not aware that setting this option would make Ironic act as an image buffer (I thought this would expect some URL the admin had to provide) ... I will try this and let you know. > > > Equally, if this would require to completely move the Glance backend to > something else, like from RBD to RadosGW, I would not expect happy > operators. (Does anyone know if RadosGW could even replace Swift for > this specific use case?) > > > AFAIK ironic works with RadosGW, we have some support code for it. I was mostly asking to see if RadosGW is a (longer term) option to fully benefit from direct's inherent scaling. > > > Do we have numbers on how many deployments use iscsi vs direct? If many > rely on iscsi, I would also suggest to establish a migration guide for > operators on how to move from iscsi to direct, for the various configs. > Recent versions of Glance support multiple backends, so a migration path > may be to add a new (direct compatible) backend for new images. > > > I don't have any numbers, but a migration guide is a must in any case. > > I expect most TripleO consumers to use the iscsi deploy, but only > because it's the default. Their Edge solution uses the direct deploy. > I've polled a few operators I know, they all (except for you, obviously > :) seem to use the direct deploy. Metal3 uses direct deploy. Thanks! Arne > Dmitry > > > Cheers, >   Arne > > On 20.08.20 17:49, Julia Kreger wrote: > > I'm having a sense of deja vu! > > > > Because of the way the mechanics work, the iscsi deploy driver is in > > an unfortunate position of being harder to troubleshoot and diagnose > > failures. Which basically means we've not been able to really > identify > > common failures and add logic to handle them appropriately, like we > > are able to with a tcp socket and file download. Based on this alone, > > I think it makes a solid case for us to seriously consider > > deprecation. > > > > Overall, I'm +1 for the proposal and I believe over two cycles is the > > right way to go. > > > > I suspect we're going to have lots of push back from the TripleO > > community because there has been resistance to change their default > > usage in the past. As such I'm adding them to the subject so > hopefully > > they will be at least aware. > > > > I guess my other worry is operators who already have a substantial > > operational infrastructure investment built around the iscsi deploy > > interface. I wonder why they didn't use direct, but maybe they have > > all migrated in the past ?5? years. This could just be a non-concern > > in reality, I'm just not sure. > > > > Of course, if someone is willing to step up and make the iscsi > > deployment interface their primary focus, that also shifts the > > discussion to making direct the default interface? 
> > > > -Julia > > > > > > On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur > > wrote: > >> > >> Hi all, > >> > >> Side note for those lacking context: this proposal concerns > deprecating one of the ironic deploy interfaces detailed in > https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. > It does not affect the boot-from-iSCSI feature. > >> > >> I would like to propose deprecating and removing the 'iscsi' > deploy interface over the course of the next 2 cycles. The reasons are: > >> 1) The iSCSI deploy is a source of occasional cryptic bugs when > a target cannot be discovered or mounted properly. > >> 2) Its security is questionable: I don't think we even use > authentication. > >> 3) Operators confusion: right now we default to the iSCSI deploy > but pretty much direct everyone who cares about scalability or > security to the 'direct' deploy. > >> 4) Cost of maintenance: our feature set is growing, our team - > not so much. iscsi_deploy.py is 800 lines of code that can be > removed, and some dependencies that can be dropped as well. > >> > >> As far as I can remember, we've kept the iSCSI deploy for two > reasons: > >> 1) The direct deploy used to require Glance with Swift backend. > The recently added [agent]image_download_source option allows > caching and serving images via the ironic's HTTP server, eliminating > this problem. I guess we'll have to switch to 'http' by default for > this option to keep the out-of-box experience. > >> 2) Memory footprint of the direct deploy. With the raw images > streaming we no longer have to cache the downloaded images in the > agent memory, removing this problem as well (I'm not even sure how > much of a problem it is in 2020, even my phone has 4GiB of RAM). > >> > >> If this proposal is accepted, I suggest to execute it as follows: > >> Victoria release: > >> 1) Put an early deprecation warning in the release notes. > >> 2) Announce the future change of the default value for > [agent]image_download_source. > >> W release: > >> 3) Change [agent]image_download_source to 'http' by default. > >> 4) Remove iscsi from the default enabled_deploy_interfaces and > move it to the back of the supported list (effectively making direct > deploy the default). > >> X release: > >> 5) Remove the iscsi deploy code from both ironic and IPA. > >> > >> Thoughts, opinions, suggestions? > >> > >> Dmitry > > > From aaronzhu1121 at gmail.com Mon Aug 24 09:13:24 2020 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Mon, 24 Aug 2020 17:13:24 +0800 Subject: [MURANO] Murano Class error when try to deploy WordPress APP In-Reply-To: References: Message-ID: Hi, Sorry for the later reply. Recently I am busy with some internal works, I don't have much time to debug this. I think you should check the app package first. I will debug it when I am free. İzzettin Erdem 于2020年8月20日 周四20:47写道: > Hello everyone, > > WordPress needs Mysql, HTTP and Zabbix Server/Agent. These apps run > individually with succes but when I try to deploy WordPress App on Murano > it gives the error about Apache HTTP that mentioned below. > > How can I fix this? Do you have any suggestions? > > Error: > http://paste.openstack.org/show/796980/ > http://paste.openstack.org/show/796983/ (cont.) > > > -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From luyao.zhong at intel.com  Mon Aug 24 09:19:56 2020
From: luyao.zhong at intel.com (Zhong, Luyao)
Date: Mon, 24 Aug 2020 09:19:56 +0000
Subject: [Nova] We are dropping the 'delete_instance_files' virt driver interface
Message-ID: <183EFA13E8A23E4AA7057ED9BCC1102E3E542685@SHSMSX107.ccr.corp.intel.com>

Hi all, especially maintainers of out-of-tree drivers,

Please pay attention to this change.
https://review.opendev.org/#/c/714653/

We are dropping the 'delete_instance_files' virt driver interface and will
use "cleanup_instance" to take charge of lingering instance cleanup,
including deleting instance files and whatever we add in the future.

Best Regards,
Luyao

From thierry at openstack.org  Mon Aug 24 10:02:19 2020
From: thierry at openstack.org (Thierry Carrez)
Date: Mon, 24 Aug 2020 12:02:19 +0200
Subject: [largescale-sig] Next meeting: August 26, 8utc
Message-ID: 

Hi everyone,

Our next meeting will be an EU-APAC-friendly meeting, on Wednesday,
August 26 at 8 UTC[1] in the #openstack-meeting-3 channel on IRC:

https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200826T08

Feel free to add topics to our agenda at:

https://etherpad.openstack.org/p/large-scale-sig-meeting

A reminder of the TODOs we had from last meeting, in case you have time to
make progress on them:

- amorin to add some meat to the wiki page before we push the Nova doc
  patch further
- all to describe briefly how you solved metrics/billing in your
  deployment in https://etherpad.openstack.org/p/large-scale-sig-documentation

Talk to you all on Wednesday,

-- 
Thierry Carrez

From smooney at redhat.com  Mon Aug 24 11:41:51 2020
From: smooney at redhat.com (Sean Mooney)
Date: Mon, 24 Aug 2020 12:41:51 +0100
Subject: About Devstack
In-Reply-To: <17933c25-2fe9-4f0e-b1c5-797d4a97a5dc@gmail.com>
References: <17933c25-2fe9-4f0e-b1c5-797d4a97a5dc@gmail.com>
Message-ID: <9d33bc9f1bc819c4041b2b7217c652e685ed35c7.camel@redhat.com>

On Mon, 2020-08-24 at 16:32 +0900, Bernd Bausch wrote:
> Devstack is not meant to be restarted. However, setting the IP address
> on br-ex and bringing it up is normally sufficient to re-establish
> networking. After that, you probably still need to recreate the loop
> devices for Cinder and Swift.
devstack used to support restart with a separate script. That was removed,
but when we later swapped to systemd it almost fixed restart. There are a
couple of things that are not done properly to allow restart to work, but
really we should probably just fix those. Some run directories are missing
after the reboot, which prevents some wsgi services from restarting; if you
correct that, I think that is basically all that is needed.
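[Editor's note: a rough sketch of the kind of manual fix-up discussed above for a rebooted devstack host. The run directory, the systemd unit glob and the outbound interface name are assumptions and will differ between setups; the 172.24.4.0/24 values come from this thread.]

  # restore the address devstack normally puts on br-ex
  sudo ip addr add 172.24.4.1/24 dev br-ex
  sudo ip link set br-ex up

  # recreate any run directories the wsgi services complain about, then restart
  sudo mkdir -p /var/run/uwsgi        # assumed path, check the failing unit's logs
  sudo systemctl restart "devstack@*"

  # optionally NAT the floating range out of the host (eth0 is a placeholder)
  sudo sysctl -w net.ipv4.ip_forward=1
  sudo iptables -t nat -A POSTROUTING -s 172.24.4.0/24 -o eth0 -j MASQUERADE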
there are some devstack docs on how to configure networing https://github.com/openstack/devstack/blob/master/doc/source/networking.rst but basically devstack will not majically make your openstack network routeable on your physical network you have to do some manual operatoin on your router to make that happen. > > Bernd. > > On 8/22/2020 10:24 AM, 閺夊骸绻旀潻锟 wrote: > > I'm sorry to disturb you. > > Recently, I tried to install openstack through devstack. When I input > > "./stack.sh". I can install openstack successfully. > > Then I tried to create a cloud instance and use the public network > > 172.24.4.0/24 which is created during > > installation( this subnet is created by default, I didn't configure > > network informartion in local.conf before installation). And the > > instance can access to the Internet smoothly. > > But the instance will not access the Internet when I reboot my server鑱 > > (physical machine). After rebooting, I input "sudo ifconfig br-ex > > 172.24.4.1/24 up", the instance can access my > > server IP, but it can't PING the gateway addresses of my server. Of > > course, the instance also can't access the Internet. But my server can > > PING it's gateway and access to the Internet. Finally, the cloud > > instance can only communicate with my server. > > I tried many methods to restore the network environment of my > > openstack. But I can't find the reason. So I need your help. I install > > the version of devstack is stable/train. Thank you very much! From smooney at redhat.com Mon Aug 24 11:52:48 2020 From: smooney at redhat.com (Sean Mooney) Date: Mon, 24 Aug 2020 12:52:48 +0100 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: Message-ID: <0800d06e870cc5370ada0a85c5e4aaf3b329107d.camel@redhat.com> On Mon, 2020-08-24 at 10:32 +0200, Dmitry Tantsur wrote: > Hi, > > On Mon, Aug 24, 2020 at 10:24 AM Arne Wiebalck > wrote: > > > Hi! > > > > CERN's deployment is using the iscsi deploy interface since we started > > with Ironic a couple of years ago (and we installed around 5000 nodes > > with it by now). The reason we chose it at the time was simplicity: we > > did not (and still do not) have a Swift backend to Glance, and the iscsi > > interface provided a straightforward alternative. > > > > While we have not seen obscure bugs/issues with it, I can certainly back > > the scalability issues mentioned by Dmitry: the tunneling of the images > > through the controllers can create issues when deploying hundreds of > > nodes at the same time. The security of the iscsi interface is less of a > > concern in our specific environment. > > > > So, why did we not move to direct (yet)? In addition to the lack of > > Swift, mostly since iscsi works for us and the scalability issues were > > not that much of a burning problem ... so we focused on other things :) > > > > Here are some thoughts/suggestions for this discussion: > > > > How would 'direct' work with other Glance backends (like Ceph/RBD in our > > case)? If using direct requires to duplicate images from Glance to > > Ironic (or somewhere else) to be served, I think this would be an > > argument against deprecating iscsi. > > > > With image_download_source=http ironic will download the image to the > conductor to be able serve it to the node. Which is exactly what the iscsi > is doing, so not much of a change for you (except for s/iSCSI/HTTP/ as a > means of serving the image). > > Would it be an option for you to test direct deploy with > image_download_source=http? 
I think if there is still an option to not force deployments to alter any of their other services, this is likely OK, but I think the onus should be on the ironic and TripleO teams to ensure there is an upgrade path for those users before this deprecation becomes a removal, without deploying Swift or a Swift-compatible API, e.g. RadosGW. Perhaps a CI job could be put in place, maybe using grenade, that starts with iscsi and moves to direct with http provided, to show that just setting that will allow the conductor to download the image from glance and serve it to the IPA. Unlike CERN, I just use ironic in a tiny home deployment where I have an all-in-one deployment + 4 additional nodes for ironic. I can't deploy Swift as all my disks are already in use for Cinder, so down the line when I eventually upgrade to Victoria and Wallaby I would either have to drop ironic or not upgrade it if there is not an option to just pull the image from glance, or glance via the conductor. Enhancing the IPA to pull directly from glance would also probably work for many who use iscsi today, but that would depend on your network topology I guess. > > > > > > > > Equally, if this would require to completely move the Glance backend to > > > something else, like from RBD to RadosGW, I would not expect happy > > > operators. (Does anyone know if RadosGW could even replace Swift for > > > this specific use case?) > > > > AFAIK ironic works with RadosGW, we have some support code for it. > > > > > > Do we have numbers on how many deployments use iscsi vs direct? If many > > > rely on iscsi, I would also suggest to establish a migration guide for > > > operators on how to move from iscsi to direct, for the various configs. > > > Recent versions of Glance support multiple backends, so a migration path > > > may be to add a new (direct compatible) backend for new images. > > > > I don't have any numbers, but a migration guide is a must in any case. > > I expect most TripleO consumers to use the iscsi deploy, but only because > > it's the default. Their Edge solution uses the direct deploy. I've polled a > > few operators I know, they all (except for you, obviously :) seem to use > > the direct deploy. Metal3 uses direct deploy. > > > > Dmitry > > > > > > > > > > Cheers, > > > Arne > > > > > > On 20.08.20 17:49, Julia Kreger wrote: > > > > I'm having a sense of deja vu! > > > > > > > > Because of the way the mechanics work, the iscsi deploy driver is in > > > > an unfortunate position of being harder to troubleshoot and diagnose > > > > failures. Which basically means we've not been able to really identify > > > > common failures and add logic to handle them appropriately, like we > > > > are able to with a tcp socket and file download. Based on this alone, > > > > I think it makes a solid case for us to seriously consider > > > > deprecation. > > > > > > > > Overall, I'm +1 for the proposal and I believe over two cycles is the > > > > right way to go. > > > > > > > > I suspect we're going to have lots of push back from the TripleO > > > > community because there has been resistance to change their default > > > > usage in the past. As such I'm adding them to the subject so hopefully > > > > they will be at least aware. > > > > > > > > I guess my other worry is operators who already have a substantial > > > > operational infrastructure investment built around the iscsi deploy > > > > interface. I wonder why they didn't use direct, but maybe they have > > > > all migrated in the past ?5? years. This could just be a non-concern > > > > in reality, I'm just not sure.
> > > > > > Of course, if someone is willing to step up and make the iscsi > > > deployment interface their primary focus, that also shifts the > > > discussion to making direct the default interface? > > > > > > -Julia > > > > > > > > > On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur > > > > wrote: > > > > > > > > Hi all, > > > > > > > > Side note for those lacking context: this proposal concerns deprecating > > > > one of the ironic deploy interfaces detailed in > > https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. It > > does not affect the boot-from-iSCSI feature. > > > > > > > > I would like to propose deprecating and removing the 'iscsi' deploy > > > > interface over the course of the next 2 cycles. The reasons are: > > > > 1) The iSCSI deploy is a source of occasional cryptic bugs when a > > > > target cannot be discovered or mounted properly. > > > > 2) Its security is questionable: I don't think we even use > > > > authentication. > > > > 3) Operators confusion: right now we default to the iSCSI deploy but > > > > pretty much direct everyone who cares about scalability or security to the > > 'direct' deploy. > > > > 4) Cost of maintenance: our feature set is growing, our team - not so > > > > much. iscsi_deploy.py is 800 lines of code that can be removed, and some > > dependencies that can be dropped as well. > > > > > > > > As far as I can remember, we've kept the iSCSI deploy for two reasons: > > > > 1) The direct deploy used to require Glance with Swift backend. The > > > > recently added [agent]image_download_source option allows caching and > > serving images via the ironic's HTTP server, eliminating this problem. I > > guess we'll have to switch to 'http' by default for this option to keep the > > out-of-box experience. > > > > 2) Memory footprint of the direct deploy. With the raw images streaming > > > > we no longer have to cache the downloaded images in the agent memory, > > removing this problem as well (I'm not even sure how much of a problem it > > is in 2020, even my phone has 4GiB of RAM). > > > > > > > > If this proposal is accepted, I suggest to execute it as follows: > > > > Victoria release: > > > > 1) Put an early deprecation warning in the release notes. > > > > 2) Announce the future change of the default value for > > > > [agent]image_download_source. > > > > W release: > > > > 3) Change [agent]image_download_source to 'http' by default. > > > > 4) Remove iscsi from the default enabled_deploy_interfaces and move it > > > > to the back of the supported list (effectively making direct deploy the > > default). > > > > X release: > > > > 5) Remove the iscsi deploy code from both ironic and IPA. > > > > > > > > Thoughts, opinions, suggestions? > > > > > > > > Dmitry > > > > From fungi at yuggoth.org Mon Aug 24 11:58:37 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 24 Aug 2020 11:58:37 +0000 Subject: [ovn][neutron][ussuri] DPDK support on OVN Openstack? Openstack Doc contradictive Information? In-Reply-To: References: Message-ID: <20200824115836.jysj7yegqkrpndhn@yuggoth.org> On 2020-08-24 13:57:59 +0700 (+0700), Popoi Zen wrote: [...] > I cant get metadata because of tcp checksum error. [...] I'm not familiar with the rest of the challenges you're facing, but consider using configdrive for metadata access. It's generally more reliable and resilient than trying to retrieve metadata over the network. 
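If you want to try it quickly, config drive can be requested per server at boot time, roughly like this (the image/flavor/network names are just placeholders):

    openstack server create --config-drive True --image <image> --flavor <flavor> --network <network> test-server

or enforced for every instance by setting force_config_drive = True in the [DEFAULT] section of nova.conf on the compute nodes. The full details are in the config drive documentation: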
https://docs.openstack.org/nova/ussuri/admin/config-drive.html -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dtantsur at redhat.com Mon Aug 24 12:05:47 2020 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Mon, 24 Aug 2020 14:05:47 +0200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: <0800d06e870cc5370ada0a85c5e4aaf3b329107d.camel@redhat.com> References: <0800d06e870cc5370ada0a85c5e4aaf3b329107d.camel@redhat.com> Message-ID: On Mon, Aug 24, 2020 at 1:52 PM Sean Mooney wrote: > On Mon, 2020-08-24 at 10:32 +0200, Dmitry Tantsur wrote: > > Hi, > > > > On Mon, Aug 24, 2020 at 10:24 AM Arne Wiebalck > > wrote: > > > > > Hi! > > > > > > CERN's deployment is using the iscsi deploy interface since we started > > > with Ironic a couple of years ago (and we installed around 5000 nodes > > > with it by now). The reason we chose it at the time was simplicity: we > > > did not (and still do not) have a Swift backend to Glance, and the > iscsi > > > interface provided a straightforward alternative. > > > > > > While we have not seen obscure bugs/issues with it, I can certainly > back > > > the scalability issues mentioned by Dmitry: the tunneling of the images > > > through the controllers can create issues when deploying hundreds of > > > nodes at the same time. The security of the iscsi interface is less of > a > > > concern in our specific environment. > > > > > > So, why did we not move to direct (yet)? In addition to the lack of > > > Swift, mostly since iscsi works for us and the scalability issues were > > > not that much of a burning problem ... so we focused on other things :) > > > > > > Here are some thoughts/suggestions for this discussion: > > > > > > How would 'direct' work with other Glance backends (like Ceph/RBD in > our > > > case)? If using direct requires to duplicate images from Glance to > > > Ironic (or somewhere else) to be served, I think this would be an > > > argument against deprecating iscsi. > > > > > > > With image_download_source=http ironic will download the image to the > > conductor to be able serve it to the node. Which is exactly what the > iscsi > > is doing, so not much of a change for you (except for s/iSCSI/HTTP/ as a > > means of serving the image). > > > > Would it be an option for you to test direct deploy with > > image_download_source=http? > i think if there is still an option to not force deployemnt to altere any > of there > other sevices this is likely ok but i think the onious shoudl be on the > ironic > and ooo teams to ensure there is an upgrade path for those useres before > this deprecation > becomes a removal without deploying swift or a swift compatibale api e.g. > RadosGW > Swift is NOT a requirement (nor is RadosGW) when image_download_source=http is used. Any glance backend (or no glance at all) will work. > > perhaps a ci job could be put in place maybe using grenade that starts > with iscsi and moves > to direct with http porvided to show that just setting that weill allow > the conductor to download > the image from glance and server it to the ipa. > We already have CI jobs that do it, I'm not sure what grenade would win us? At this point, we keep grenade jobs barely working at all (actually, the multinode grenade job is not working), we cannot add anything there. 
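For anyone who wants to try the direct deploy with the HTTP image source ahead of any change of defaults, it is only configuration. A rough sketch (image_download_source is the option discussed above; the other option names are from memory, so double-check them against the admin guide for your release):

    [DEFAULT]
    enabled_deploy_interfaces = direct,iscsi
    default_deploy_interface = direct

    [agent]
    image_download_source = http

in ironic.conf, and individual nodes can be switched with something like "openstack baremetal node set <node> --deploy-interface direct".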
Dmitry > > > unlike cern i just use ironic in a tiny home deployment where i have an > all in one deployment + 4 addtional > nodes for ironic. i cant deploy swift as all my disks are already in use > for cinder so down the line when > i eventually upgrade to vicortia and wallaby i would either have to drop > ironic or not upgrade it > if there is not a option to just pull the image from glance or glance via > the conductor. enhancing the ipa > to pull directly from glance would also proably work for many who use > iscsi today but that would depend on your network > toplogy i guess. > > > > > > > > > > Equally, if this would require to completely move the Glance backend to > > > something else, like from RBD to RadosGW, I would not expect happy > > > operators. (Does anyone know if RadosGW could even replace Swift for > > > this specific use case?) > > > > > > > AFAIK ironic works with RadosGW, we have some support code for it. > > > > > > > > > > Do we have numbers on how many deployments use iscsi vs direct? If many > > > rely on iscsi, I would also suggest to establish a migration guide for > > > operators on how to move from iscsi to direct, for the various configs. > > > Recent versions of Glance support multiple backends, so a migration > path > > > may be to add a new (direct compatible) backend for new images. > > > > > > > I don't have any numbers, but a migration guide is a must in any case. > > > > I expect most TripleO consumers to use the iscsi deploy, but only because > > it's the default. Their Edge solution uses the direct deploy. I've > polled a > > few operators I know, they all (except for you, obviously :) seem to use > > the direct deploy. Metal3 uses direct deploy. > > > > Dmitry > > > > > > > > > > Cheers, > > > Arne > > > > > > On 20.08.20 17:49, Julia Kreger wrote: > > > > I'm having a sense of deja vu! > > > > > > > > Because of the way the mechanics work, the iscsi deploy driver is in > > > > an unfortunate position of being harder to troubleshoot and diagnose > > > > failures. Which basically means we've not been able to really > identify > > > > common failures and add logic to handle them appropriately, like we > > > > are able to with a tcp socket and file download. Based on this alone, > > > > I think it makes a solid case for us to seriously consider > > > > deprecation. > > > > > > > > Overall, I'm +1 for the proposal and I believe over two cycles is the > > > > right way to go. > > > > > > > > I suspect we're going to have lots of push back from the TripleO > > > > community because there has been resistance to change their default > > > > usage in the past. As such I'm adding them to the subject so > hopefully > > > > they will be at least aware. > > > > > > > > I guess my other worry is operators who already have a substantial > > > > operational infrastructure investment built around the iscsi deploy > > > > interface. I wonder why they didn't use direct, but maybe they have > > > > all migrated in the past ?5? years. This could just be a non-concern > > > > in reality, I'm just not sure. > > > > > > > > Of course, if someone is willing to step up and make the iscsi > > > > deployment interface their primary focus, that also shifts the > > > > discussion to making direct the default interface? 
> > > > > > > > -Julia > > > > > > > > > > > > On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur > > > > > > wrote: > > > > > > > > > > Hi all, > > > > > > > > > > Side note for those lacking context: this proposal concerns > deprecating > > > > > > one of the ironic deploy interfaces detailed in > > > https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. > It > > > does not affect the boot-from-iSCSI feature. > > > > > > > > > > I would like to propose deprecating and removing the 'iscsi' deploy > > > > > > interface over the course of the next 2 cycles. The reasons are: > > > > > 1) The iSCSI deploy is a source of occasional cryptic bugs when a > > > > > > target cannot be discovered or mounted properly. > > > > > 2) Its security is questionable: I don't think we even use > > > > > > authentication. > > > > > 3) Operators confusion: right now we default to the iSCSI deploy > but > > > > > > pretty much direct everyone who cares about scalability or security to > the > > > 'direct' deploy. > > > > > 4) Cost of maintenance: our feature set is growing, our team - not > so > > > > > > much. iscsi_deploy.py is 800 lines of code that can be removed, and > some > > > dependencies that can be dropped as well. > > > > > > > > > > As far as I can remember, we've kept the iSCSI deploy for two > reasons: > > > > > 1) The direct deploy used to require Glance with Swift backend. The > > > > > > recently added [agent]image_download_source option allows caching and > > > serving images via the ironic's HTTP server, eliminating this problem. > I > > > guess we'll have to switch to 'http' by default for this option to > keep the > > > out-of-box experience. > > > > > 2) Memory footprint of the direct deploy. With the raw images > streaming > > > > > > we no longer have to cache the downloaded images in the agent memory, > > > removing this problem as well (I'm not even sure how much of a problem > it > > > is in 2020, even my phone has 4GiB of RAM). > > > > > > > > > > If this proposal is accepted, I suggest to execute it as follows: > > > > > Victoria release: > > > > > 1) Put an early deprecation warning in the release notes. > > > > > 2) Announce the future change of the default value for > > > > > > [agent]image_download_source. > > > > > W release: > > > > > 3) Change [agent]image_download_source to 'http' by default. > > > > > 4) Remove iscsi from the default enabled_deploy_interfaces and > move it > > > > > > to the back of the supported list (effectively making direct deploy the > > > default). > > > > > X release: > > > > > 5) Remove the iscsi deploy code from both ironic and IPA. > > > > > > > > > > Thoughts, opinions, suggestions? > > > > > > > > > > Dmitry > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Mon Aug 24 12:06:22 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?q?Bal=E1zs?= Gibizer) Date: Mon, 24 Aug 2020 14:06:22 +0200 Subject: [nova] virtual PTG and Forum planning Message-ID: Hi, As you probably know the next virtual PTG will be held between October 26-30. I need to book time slots for Nova [1] so please add your availability to the doodle [2] before 7th of September. I have created an etherpad [3] to collect the PTG topics for the Nova sessions. Feel free to add your topics. Also there will be a Forum between October 19-23 [4]. You can use the PTG etherpad [3] to brainstorm forum topics before the official CFP opens. 
Cheers, gibi [1] https://ethercalc.openstack.org/7xp2pcbh1ncb [2] https://doodle.com/poll/a5pgqh7bypq8piew [3] https://etherpad.opendev.org/p/nova-wallaby-ptg [4] https://wiki.openstack.org/wiki/Forum/Virtual202 From smooney at redhat.com Mon Aug 24 12:26:50 2020 From: smooney at redhat.com (Sean Mooney) Date: Mon, 24 Aug 2020 13:26:50 +0100 Subject: [ovn][neutron][ussuri] DPDK support on OVN Openstack? Openstack Doc contradictive Information? In-Reply-To: <20200824115836.jysj7yegqkrpndhn@yuggoth.org> References: <20200824115836.jysj7yegqkrpndhn@yuggoth.org> Message-ID: <3df1492ddefd4066df24736b8eba273992ba1d5a.camel@redhat.com> On Mon, 2020-08-24 at 11:58 +0000, Jeremy Stanley wrote: > On 2020-08-24 13:57:59 +0700 (+0700), Popoi Zen wrote: > [...] > > I cant get metadata because of tcp checksum error. > > [...] > > I'm not familiar with the rest of the challenges you're facing, but > consider using configdrive for metadata access. It's generally more > reliable and resilient than trying to retrieve metadata over the > network. > > https://docs.openstack.org/nova/ussuri/admin/config-drive.html ovn should not change the packet processing vs ml2/ovs with dpdk if you are haveing checksum issue its proably an issue with your underlying network or dpdk/ovs not the fact your using ovn. From arnaud.morin at gmail.com Mon Aug 24 12:41:54 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Mon, 24 Aug 2020 12:41:54 +0000 Subject: [neutron][ops] q-agent-notifier exchanges without bindings. In-Reply-To: References: Message-ID: <20200824124154.GA31915@sync> Hey, I did exactly the same on my side. I also have unroutable messages going in my alternate exchange, related to the same exchanges (q-agent-notifier-security_group-update_fanout, etc.) Did you figured out why you have unroutable messages like this? Are you using a custom neutron driver? Cheers, -- Arnaud Morin On 21.08.20 - 10:32, Fabian Zimmermann wrote: > Hi, > > im currently on the way to analyse some rabbitmq-issues. > > atm im taking a look on "unroutable messages", so I > > * created an Alternative Exchange and Queue: "unroutable" > * created a policy to send all unroutable msgs to this exchange/queue. > * wrote a script to show me the msgs placed here.. currently I get > > Seems like my neutron is placing msgs in these exchanges, but there is > nobody listening/binding to: > -- > 20 Exchange: q-agent-notifier-network-delete_fanout, RoutingKey: > 226 Exchange: q-agent-notifier-port-delete_fanout, RoutingKey: > 88 Exchange: q-agent-notifier-port-update_fanout, RoutingKey: > 388 Exchange: q-agent-notifier-security_group-update_fanout, RoutingKey: > -- > > Is someone able to give me a hint where to look at / how to debug this? > > Fabian > From gagehugo at gmail.com Mon Aug 24 13:17:43 2020 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 24 Aug 2020 08:17:43 -0500 Subject: [openstack-helm] OpenStack-Helm Meeting Aug 25th Cancelled Message-ID: Good morning, The openstack-helm meeting for tomorrow, Aug 25th 2020 will be cancelled, we will see you all next week! -------------- next part -------------- An HTML attachment was scrubbed... URL: From gagehugo at gmail.com Mon Aug 24 13:18:43 2020 From: gagehugo at gmail.com (Gage Hugo) Date: Mon, 24 Aug 2020 08:18:43 -0500 Subject: [security] Security SIG Meeting Aug 27th Cancelled Message-ID: Good morning, The security SIG meeting for Thursday, Aug 27th 2020 will be cancelled. We will meet next week at the usual time. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From mkopec at redhat.com Mon Aug 24 14:12:17 2020 From: mkopec at redhat.com (Martin Kopec) Date: Mon, 24 Aug 2020 16:12:17 +0200 Subject: [all] READMEs of zuul roles not rendered properly - missing content Message-ID: Hello everyone, I've noticed that READMEs of zuul roles within openstack projects are not rendered properly on opendev.org - ".. zuul:rolevar::" syntax seems to be the problem. Although it's rendered well on github.com, see f.e. [1] [2]. I wonder if there were some changes in the supported README syntax. Also the ".. zuul:rolevar::" syntax throws errors on online rst formatters I was testing on, however, it's rendered fine by md online formatters - maybe opendev.org is more rst strict in case of .rst files than github? Any ideas? [1] https://opendev.org/openstack/tempest/src/branch/master/roles/run-tempest [2] https://github.com/openstack/tempest/tree/master/roles/run-tempest Thanks, -- Martin Kopec Quality Engineer Red Hat EMEA -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Mon Aug 24 14:19:04 2020 From: eblock at nde.ag (Eugen Block) Date: Mon, 24 Aug 2020 14:19:04 +0000 Subject: [horizon] default create_volume setting can't be changed Message-ID: <20200824141904.Horde.biUwyDcXRQDK2D0KW6vwbE1@webmail.nde.ag> Hi *, we recently upgraded from Ocata to Train and I'm struggling with a specific setting: I believe since Pike version the default for "create_volume" changed to "true" when launching instances from Horizon dashboard. I would like to change that to "false" and set it in our custom /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.d/_100_local_settings.py: LAUNCH_INSTANCE_DEFAULTS = { 'config_drive': False, 'create_volume': False, 'hide_create_volume': False, 'disable_image': False, 'disable_instance_snapshot': False, 'disable_volume': False, 'disable_volume_snapshot': False, 'enable_scheduler_hints': True, } Other configs from this file work as expected, so that custom file can't be the reason. After apache and memcached restart nothing changes, the default is still "true". Can anyone shed some light, please? I haven't tried other configs yet so I can't tell if more options are affected. Thanks! Eugen From fungi at yuggoth.org Mon Aug 24 14:36:18 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 24 Aug 2020 14:36:18 +0000 Subject: [all][infra] READMEs of zuul roles not rendered properly - missing content In-Reply-To: References: Message-ID: <20200824143618.7xdecj67m5jzwpkz@yuggoth.org> On 2020-08-24 16:12:17 +0200 (+0200), Martin Kopec wrote: > I've noticed that READMEs of zuul roles within openstack projects > are not rendered properly on opendev.org - ".. zuul:rolevar::" > syntax seems to be the problem. Although it's rendered well on > github.com, see f.e. [1] [2]. > > I wonder if there were some changes in the supported README > syntax. Also the ".. zuul:rolevar::" syntax throws errors on > online rst formatters I was testing on, however, it's rendered > fine by md online formatters - maybe opendev.org is more rst > strict in case of .rst files than github? > > Any ideas? 
> > [1] https://opendev.org/openstack/tempest/src/branch/master/roles/run-tempest > [2] https://github.com/openstack/tempest/tree/master/roles/run-tempest Those wrappers rely on the zuul_sphinx plugin, which needs to be included in docs builds thusly: That extension allows you to build job and role documentation like this: https://zuul-ci.org/docs/zuul-jobs/ If the project doesn't plan to build such documentation, they can of course be omitted (openstack/tempest's doc/source/conf.py doesn't use zuul_sphinx that I can see). As far as why the README rendering in Gitea is tripping over it, sounds likely to be a bug in whatever reStructuredText parser library it uses. Was this working up until recently? We did just upgrade Gitea again in the past week or two. To be entirely honest, I wish Gitea didn't automatically attempt to render RST files, that makes it harder to actually refer to the source code for them, and it's a source code browser not a CMS for publishing documentation, but apparently this is a feature many other users do like for some reason. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From cboylan at sapwetik.org Mon Aug 24 15:05:41 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Mon, 24 Aug 2020 08:05:41 -0700 Subject: =?UTF-8?Q?Re:_[all][infra]_READMEs_of_zuul_roles_not_rendered_properly_-?= =?UTF-8?Q?_missing_content?= In-Reply-To: <20200824143618.7xdecj67m5jzwpkz@yuggoth.org> References: <20200824143618.7xdecj67m5jzwpkz@yuggoth.org> Message-ID: On Mon, Aug 24, 2020, at 7:36 AM, Jeremy Stanley wrote: > On 2020-08-24 16:12:17 +0200 (+0200), Martin Kopec wrote: > > I've noticed that READMEs of zuul roles within openstack projects > > are not rendered properly on opendev.org - ".. zuul:rolevar::" > > syntax seems to be the problem. Although it's rendered well on > > github.com, see f.e. [1] [2]. > > > > I wonder if there were some changes in the supported README > > syntax. Also the ".. zuul:rolevar::" syntax throws errors on > > online rst formatters I was testing on, however, it's rendered > > fine by md online formatters - maybe opendev.org is more rst > > strict in case of .rst files than github? > > > > Any ideas? > > > > [1] https://opendev.org/openstack/tempest/src/branch/master/roles/run-tempest > > [2] https://github.com/openstack/tempest/tree/master/roles/run-tempest > > Those wrappers rely on the zuul_sphinx plugin, which needs to be > included in docs builds thusly: > > https://opendev.org/zuul/zuul-jobs/src/commit/1e92a67db6f5fa3f3284d5b1928f104c428187f3/doc/source/conf.py#L24 > > > That extension allows you to build job and role documentation like > this: > > https://zuul-ci.org/docs/zuul-jobs/ > > If the project doesn't plan to build such documentation, they can of > course be omitted (openstack/tempest's doc/source/conf.py doesn't > use zuul_sphinx that I can see). As far as why the README rendering > in Gitea is tripping over it, sounds likely to be a bug in whatever > reStructuredText parser library it uses. Was this working up until > recently? We did just upgrade Gitea again in the past week or two. We use pandoc to render the rst files, and the choice of tool and commands to run is driven entirely by config [3]. If we want to switch to another tool we'll want to ensure the new tool is installed in our container image [4]. Its possible that simply using the right pandoc options would get us what we want here? 
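For anyone curious, the external renderer hookup in Gitea is just an app.ini section along these lines (from memory, not our exact settings; [3] below has the real thing):

    [markup.restructuredtext]
    ENABLED = true
    FILE_EXTENSIONS = .rst
    RENDER_COMMAND = "pandoc --from rst --to html"
    IS_INPUT_FILE = false

so swapping the tool or its options is a one-line configuration change.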
> > To be entirely honest, I wish Gitea didn't automatically attempt to > render RST files, that makes it harder to actually refer to the > source code for them, and it's a source code browser not a CMS for > publishing documentation, but apparently this is a feature many > other users do like for some reason. We can change this behavior by removing the external renderer (though I expect we're in the minority of preferring ability to link to the source here). [3] https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gitea/templates/app.ini.j2#L88-L95 [4] https://opendev.org/opendev/system-config/src/branch/master/docker/gitea/Dockerfile#L92-L94 > -- > Jeremy Stanley From emilien at redhat.com Mon Aug 24 15:28:12 2020 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 24 Aug 2020 11:28:12 -0400 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: <1597847922905.32607@binero.com> Message-ID: I went ahead and added Takashi to the newly created puppet-tripleo core group in Gerrit. Thanks again for your hard work! On Thu, Aug 20, 2020 at 9:25 AM Karthik, Rajini wrote: > +1 . > > > > Rajini > > > > *From:* Wesley Hayutin > *Sent:* Wednesday, August 19, 2020 9:09 PM > *To:* openstack-discuss > *Cc:* Emilien Macchi > *Subject:* Re: [tripleo] Proposing Takashi Kajinami to be core on > puppet-tripleo > > > > [EXTERNAL EMAIL] > > > > > > On Wed, Aug 19, 2020 at 8:40 AM Tobias Urdin > wrote: > > Big +1 from an outsider :)) > > > > Best regards > > Tobias > > > ------------------------------ > > *From:* Rabi Mishra > *Sent:* Wednesday, August 19, 2020 3:37 PM > *To:* Emilien Macchi > *Cc:* openstack-discuss > *Subject:* Re: [tripleo] Proposing Takashi Kajinami to be core on > puppet-tripleo > > > > +1 > > > > On Tue, Aug 18, 2020 at 8:03 PM Emilien Macchi wrote: > > Hi people, > > > > If you don't know Takashi yet, he has been involved in the Puppet > OpenStack project and helped *a lot* in its maintenance (and by maintenance > I mean not-funny-work). When our community was getting smaller and smaller, > he joined us and our review velicity went back to eleven. He became a core > maintainer very quickly and we're glad to have him onboard. > > > > He's also been involved in taking care of puppet-tripleo for a few months > and I believe he has more than enough knowledge on the module to provide > core reviews and be part of the core maintainer group. I also noticed his > amount of contribution (bug fixes, improvements, reviews, etc) in other > TripleO repos and I'm confident he'll make his road to be core in TripleO > at some point. For now I would like him to propose him to be core in > puppet-tripleo. > > > > As usual, any feedback is welcome but in the meantime I want to thank > Takashi for his work in TripleO and we're super happy to have new > contributors! > > > > Thanks, > > -- > > Emilien Macchi > > > > > -- > > Regards, > > Rabi Mishra > > > > > > +1, thanks for your contributions Takashi! > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Mon Aug 24 15:47:51 2020 From: satish.txt at gmail.com (Satish Patel) Date: Mon, 24 Aug 2020 11:47:51 -0400 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: References: <20200806144016.GP31915@sync> <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <28f04c4eff84aa6d15424f3de3706ae9ec361fa7.camel@redhat.com> Message-ID: Sorry for the late reply Sean, When you said Cells is only a nova feature what does that mean? Correct me if i am wrong here, only nova means i can deploy rabbitmq in cells to just handl nova-* services but not neutron or any other services right? On Sun, Aug 16, 2020 at 9:37 AM Sean Mooney wrote: > > On Sat, 2020-08-15 at 20:13 -0400, Satish Patel wrote: > > Hi Sean, > > > > Sounds good, but running rabbitmq for each service going to be little > > overhead also, how do you scale cluster (Yes we can use cellv2 but its > > not something everyone like to do because of complexity). > > my understanding is that when using rabbitmq adding multiple rabbitmq servers in a cluster lowers > througput vs jsut 1 rabbitmq instance for any given excahnge. that is because the content of > the queue need to be syconised across the cluster. so if cinder nova and neutron share > a 3 node cluster and your compaure that to the same service deployed with cinder nova and neuton > each having there on rabbitmq service then the independent deployment will tend to out perform the > clustered solution. im not really sure if that has change i know tha thow clustering has been donw has evovled > over the years but in the past clustering was the adversary of scaling. > > > If we thinks > > rabbitMQ is growing pain then why community not looking for > > alternative option (kafka) etc..? > we have looked at alternivives several times > rabbit mq wroks well enough ans scales well enough for most deployments. > there other amqp implimantation that scale better then rabbit, > activemq and qpid are both reported to scale better but they perfrom worse > out of the box and need to be carfully tuned > > in the past zeromq has been supported but peole did not maintain it. > > kafka i dont think is a good alternative but nats https://nats.io/ might be. > > for what its worth all nova deployment are cellv2 deployments with 1 cell from around pike/rocky > and its really not that complex. cells_v1 was much more complex bug part of the redesign > for cells_v2 was makeing sure there is only 1 code path. adding a second cell just need another > cell db and conductor to be deployed assuming you startted with a super conductor in the first > place. the issue is cells is only a nova feature no other service have cells so it does not help > you with cinder or neutron. as such cinder an neutron likely be the services that hit scaling limits first. > adopign cells in other services is not nessaryally the right approch either but when we talk about scale > we do need to keep in mind that cells is just for nova today. > > > > > > On Fri, Aug 14, 2020 at 3:09 PM Sean Mooney wrote: > > > > > > On Fri, 2020-08-14 at 18:45 +0200, Fabian Zimmermann wrote: > > > > Hi, > > > > > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > > > one rabbitmq Container per Service. Just the kubernetes self healing is > > > > used as "ha" for rabbitmq. > > > > > > > > That seems to match with my finding: run rabbitmq standalone and use an > > > > external system to restart rabbitmq if required. 
> > > > > > thats the design that was orginally planned for kolla-kubernetes orrignally > > > > > > each service was to be deployed with its own rabbit mq server if it required one > > > and if it crashed it woudl just be recreated by k8s. it perfromace better then a cluster > > > and if you trust k8s or the external service enough to ensure it is recteated it > > > should be as effective a solution. you dont even need k8s to do that but it seams to be > > > a good fit if your prepared to ocationally loose inflight rpcs. > > > if you not then you can configure rabbit to persite all message to disk and mont that on a shared > > > file system like nfs or cephfs so that when the rabbit instance is recreated the queue contency is > > > perserved. assuming you can take the perfromance hit of writing all messages to disk that is. > > > > > > > > Fabian > > > > > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > > > > > > > Fabian, > > > > > > > > > > what do you mean? > > > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > > > reasons. > > > > > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > > > wrote: > > > > > > > > > > > > Hello again, > > > > > > > > > > > > just a short update about the results of my tests. > > > > > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > > > > > 1. without durable-queues and without replication - just one > > > > > > > > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > > > 2. durable-queues and replication > > > > > > > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > > > > > > > * broken / non working bindings > > > > > > * broken queues > > > > > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > > > > > > > > reasons. > > > > > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > > > > > > > > replication but without durable-queues. > > > > > > > > > > > > May someone point me to the best way to document these findings to some > > > > > > > > > > official doc? > > > > > > I think a lot of installations out there will run into issues if - under > > > > > > > > > > load - a node fails. > > > > > > > > > > > > Fabian > > > > > > > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > > > > > > > > dev.faz at gmail.com>: > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > just did some short tests today in our test-environment (without > > > > > > > > > > durable queues and without replication): > > > > > > > > > > > > > > * started a rally task to generate some load > > > > > > > * kill-9-ed rabbitmq on one node > > > > > > > * rally task immediately stopped and the cloud (mostly) stopped working > > > > > > > > > > > > > > after some debugging i found (again) exchanges which had bindings to > > > > > > > > > > queues, but these bindings didnt forward any msgs. > > > > > > > Wrote a small script to detect these broken bindings and will now check > > > > > > > > > > if this is "reproducible" > > > > > > > > > > > > > > then I will try "durable queues" and "durable queues with replication" > > > > > > > > > > to see if this helps. Even if I would expect > > > > > > > rabbitmq should be able to handle this without these "hidden broken > > > > > > > > > > bindings" > > > > > > > > > > > > > > This just FYI. 
> > > > > > > > > > > > > > Fabian > > > > > From tonyliu0592 at hotmail.com Mon Aug 24 16:53:02 2020 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Mon, 24 Aug 2020 16:53:02 +0000 Subject: [Kolla Ansible] host maintenance In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> Message-ID: > -----Original Message----- > From: Mark Goddard > Sent: Monday, August 24, 2020 12:46 AM > To: Eric K. Miller > Cc: openstack-discuss > Subject: Re: [Kolla Ansible] host maintenance > > On Sat, 22 Aug 2020 at 01:10, Eric K. Miller > wrote: > > > > > Actually, in my case, the setup is originally deploy by Kolla > > > Ansible. Other than the initial deployment, I am looking for using > > > Kolla Ansible for maintenance operations. > > > What I am looking for, eg. replace a host, can surely be done by > > > manual steps or customized script. I'd like to know if they are > > > automated by Kolla Ansible. > > > > We do this often by simply using the "limit" flag in Kolla Ansible to > only include the controllers and new compute node (after adding the > compute node to the multinode.ini file). Specify "reconfigure" for the > action, and not "install". > > We need some better docs around this, and I think they will be added > soon. Some things to watch out for: > > * if adding a new controller, ensure that if using --limit, all > controllers are included and do not use serial mode What I tried was to replace a controller, where I don't need to update other controllers, because there is no address update. If there is address update caused by controller change, then all controllers have to be included to get update. What's "serial mode"? > * if removing a controller, reconfigure other controllers to update the > RabbitMQ & Galera cluster nodes etc. In this case, are those services who don't need any updates going to be restarted or untouched? Thanks! Tony From mark at stackhpc.com Mon Aug 24 18:20:46 2020 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 24 Aug 2020 19:20:46 +0100 Subject: [Kolla Ansible] host maintenance In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> Message-ID: On Mon, 24 Aug 2020 at 17:53, Tony Liu wrote: > > > -----Original Message----- > > From: Mark Goddard > > Sent: Monday, August 24, 2020 12:46 AM > > To: Eric K. Miller > > Cc: openstack-discuss > > Subject: Re: [Kolla Ansible] host maintenance > > > > On Sat, 22 Aug 2020 at 01:10, Eric K. Miller > > wrote: > > > > > > > Actually, in my case, the setup is originally deploy by Kolla > > > > Ansible. Other than the initial deployment, I am looking for using > > > > Kolla Ansible for maintenance operations. > > > > What I am looking for, eg. replace a host, can surely be done by > > > > manual steps or customized script. I'd like to know if they are > > > > automated by Kolla Ansible. > > > > > > We do this often by simply using the "limit" flag in Kolla Ansible to > > only include the controllers and new compute node (after adding the > > compute node to the multinode.ini file). Specify "reconfigure" for the > > action, and not "install". > > > > We need some better docs around this, and I think they will be added > > soon. Some things to watch out for: > > > > * if adding a new controller, ensure that if using --limit, all > > controllers are included and do not use serial mode > > What I tried was to replace a controller, where I don't need to > update other controllers, because there is no address update. 
> > If there is address update caused by controller change, then all > controllers have to be included to get update. While this may work at the moment, we have just merged a change that prevents this. For keystone, we need access to all controllers, to determine whether it is a new cluster or a new node in an existing cluster. > > What's "serial mode"? Ansible has a feature to run plays in batches of some % of the hosts. In Kolla Ansible you can e.g. export ANSIBLE_SERIAL=0.1. It's an advanced use case and needs some care. > > > * if removing a controller, reconfigure other controllers to update the > > RabbitMQ & Galera cluster nodes etc. > > In this case, are those services who don't need any updates going > to be restarted or untouched? > > Thanks! > Tony > From tonyliu0592 at hotmail.com Mon Aug 24 18:50:04 2020 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Mon, 24 Aug 2020 18:50:04 +0000 Subject: [Kolla Ansible] host maintenance In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> Message-ID: > -----Original Message----- > From: Mark Goddard > Sent: Monday, August 24, 2020 11:21 AM > To: Tony Liu > Cc: Eric K. Miller ; openstack-discuss > > Subject: Re: [Kolla Ansible] host maintenance > > On Mon, 24 Aug 2020 at 17:53, Tony Liu wrote: > > > > > -----Original Message----- > > > From: Mark Goddard > > > Sent: Monday, August 24, 2020 12:46 AM > > > To: Eric K. Miller > > > Cc: openstack-discuss > > > Subject: Re: [Kolla Ansible] host maintenance > > > > > > On Sat, 22 Aug 2020 at 01:10, Eric K. Miller > > > > > > wrote: > > > > > > > > > Actually, in my case, the setup is originally deploy by Kolla > > > > > Ansible. Other than the initial deployment, I am looking for > > > > > using Kolla Ansible for maintenance operations. > > > > > What I am looking for, eg. replace a host, can surely be done by > > > > > manual steps or customized script. I'd like to know if they are > > > > > automated by Kolla Ansible. > > > > > > > > We do this often by simply using the "limit" flag in Kolla Ansible > > > > to > > > only include the controllers and new compute node (after adding the > > > compute node to the multinode.ini file). Specify "reconfigure" for > > > the action, and not "install". > > > > > > We need some better docs around this, and I think they will be added > > > soon. Some things to watch out for: > > > > > > * if adding a new controller, ensure that if using --limit, all > > > controllers are included and do not use serial mode > > > > What I tried was to replace a controller, where I don't need to update > > other controllers, because there is no address update. > > > > If there is address update caused by controller change, then all > > controllers have to be included to get update. > > While this may work at the moment, we have just merged a change that > prevents this. For keystone, we need access to all controllers, to > determine whether it is a new cluster or a new node in an existing > cluster. > > > > > What's "serial mode"? > > Ansible has a feature to run plays in batches of some % of the hosts. > In Kolla Ansible you can e.g. export ANSIBLE_SERIAL=0.1. It's an > advanced use case and needs some care. > > > > > > * if removing a controller, reconfigure other controllers to update > > > the RabbitMQ & Galera cluster nodes etc. > > > > In this case, are those services who don't need any updates going to > > be restarted or untouched? Could you comment on this? This is my biggest concern. 
I'd like to ensure services who don't need update remain untouched. Thanks! Tony From mnaser at vexxhost.com Mon Aug 24 18:54:40 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 24 Aug 2020 14:54:40 -0400 Subject: [nova][neutron][oslo][ops][kolla] rabbit bindings issue In-Reply-To: <20200818120708.GV31915@sync> References: <1a338d7e-c82c-cda2-2d47-b5aebb999142@openstack.org> <20200818120708.GV31915@sync> Message-ID: On Tue, Aug 18, 2020 at 8:11 AM Arnaud Morin wrote: > > Hey all, > > About the vexxhost strategy to use only one rabbit server and manage HA through > rabbit. > Do you plan to do the same for MariaDB/MySQL? We use a MySQL operator to deploy a good o'l master/slave replication cluster and point towards the master, for every service, for two reasons: 1) We always pointed to a master Galera system anyways, multi-master was overcomplicated for no real advantage 2) The failover time vs the complexity of Galera (and how often we failover) favours #1 3) We use "orchestrator" by GitHub which manages all the promotions/etc for us > -- > Arnaud Morin > > On 14.08.20 - 18:45, Fabian Zimmermann wrote: > > Hi, > > > > i read somewhere that vexxhosts kubernetes openstack-Operator is running > > one rabbitmq Container per Service. Just the kubernetes self healing is > > used as "ha" for rabbitmq. > > > > That seems to match with my finding: run rabbitmq standalone and use an > > external system to restart rabbitmq if required. > > > > Fabian > > > > Satish Patel schrieb am Fr., 14. Aug. 2020, 16:59: > > > > > Fabian, > > > > > > what do you mean? > > > > > > >> I think vexxhost is running (1) with their openstack-operator - for > > > reasons. > > > > > > On Fri, Aug 14, 2020 at 7:28 AM Fabian Zimmermann > > > wrote: > > > > > > > > Hello again, > > > > > > > > just a short update about the results of my tests. > > > > > > > > I currently see 2 ways of running openstack+rabbitmq > > > > > > > > 1. without durable-queues and without replication - just one > > > rabbitmq-process which gets (somehow) restarted if it fails. > > > > 2. durable-queues and replication > > > > > > > > Any other combination of these settings leads to more or less issues with > > > > > > > > * broken / non working bindings > > > > * broken queues > > > > > > > > I think vexxhost is running (1) with their openstack-operator - for > > > reasons. > > > > > > > > I added [kolla], because kolla-ansible is installing rabbitmq with > > > replication but without durable-queues. > > > > > > > > May someone point me to the best way to document these findings to some > > > official doc? > > > > I think a lot of installations out there will run into issues if - under > > > load - a node fails. > > > > > > > > Fabian > > > > > > > > > > > > Am Do., 13. Aug. 2020 um 15:13 Uhr schrieb Fabian Zimmermann < > > > dev.faz at gmail.com>: > > > >> > > > >> Hi, > > > >> > > > >> just did some short tests today in our test-environment (without > > > durable queues and without replication): > > > >> > > > >> * started a rally task to generate some load > > > >> * kill-9-ed rabbitmq on one node > > > >> * rally task immediately stopped and the cloud (mostly) stopped working > > > >> > > > >> after some debugging i found (again) exchanges which had bindings to > > > queues, but these bindings didnt forward any msgs. 
> > > >> Wrote a small script to detect these broken bindings and will now check > > > if this is "reproducible" > > > >> > > > >> then I will try "durable queues" and "durable queues with replication" > > > to see if this helps. Even if I would expect > > > >> rabbitmq should be able to handle this without these "hidden broken > > > bindings" > > > >> > > > >> This just FYI. > > > >> > > > >> Fabian > > > > -- Mohammed Naser VEXXHOST, Inc. From pierre at stackhpc.com Mon Aug 24 19:30:25 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Mon, 24 Aug 2020 21:30:25 +0200 Subject: [blazar] IRC meetings cancelled this week Message-ID: Hello, Apologies for the short notice: due to scheduling conflicts, I am not available to chair either of the Blazar IRC meetings this week. I propose that we cancel them. Thanks, Pierre Riteau (priteau) From sbaker at redhat.com Mon Aug 24 21:55:23 2020 From: sbaker at redhat.com (Steve Baker) Date: Tue, 25 Aug 2020 09:55:23 +1200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: <0800d06e870cc5370ada0a85c5e4aaf3b329107d.camel@redhat.com> Message-ID: On 25/08/20 12:05 am, Dmitry Tantsur wrote: > > > On Mon, Aug 24, 2020 at 1:52 PM Sean Mooney > wrote: > > On Mon, 2020-08-24 at 10:32 +0200, Dmitry Tantsur wrote: > > Hi, > > > > On Mon, Aug 24, 2020 at 10:24 AM Arne Wiebalck > > > > wrote: > > > > > Hi! > > > > > > CERN's deployment is using the iscsi deploy interface since we > started > > > with Ironic a couple of years ago (and we installed around > 5000 nodes > > > with it by now). The reason we chose it at the time was > simplicity: we > > > did not (and still do not) have a Swift backend to Glance, and > the iscsi > > > interface provided a straightforward alternative. > > > > > > While we have not seen obscure bugs/issues with it, I can > certainly back > > > the scalability issues mentioned by Dmitry: the tunneling of > the images > > > through the controllers can create issues when deploying > hundreds of > > > nodes at the same time. The security of the iscsi interface is > less of a > > > concern in our specific environment. > > > > > > So, why did we not move to direct (yet)? In addition to the > lack of > > > Swift, mostly since iscsi works for us and the scalability > issues were > > > not that much of a burning problem ... so we focused on other > things :) > > > > > > Here are some thoughts/suggestions for this discussion: > > > > > > How would 'direct' work with other Glance backends (like > Ceph/RBD in our > > > case)? If using direct requires to duplicate images from Glance to > > > Ironic (or somewhere else) to be served, I think this would be an > > > argument against deprecating iscsi. > > > > > > > With image_download_source=http ironic will download the image > to the > > conductor to be able serve it to the node. Which is exactly what > the iscsi > > is doing, so not much of a change for you (except for > s/iSCSI/HTTP/ as a > > means of serving the image). > > > > Would it be an option for you to test direct deploy with > > image_download_source=http? > i think if there is still an option to not force deployemnt to > altere any of there > other sevices this is likely ok but i think the onious shoudl be > on the ironic > and ooo teams to ensure there is an upgrade path for those useres > before this deprecation > becomes a removal without deploying swift or a swift compatibale > api e.g. RadosGW > > > Swift is NOT a requirement (nor is RadosGW) when > image_download_source=http is used. 
Any glance backend (or no glance > at all) will work. Even though the TripleO undercloud has swift, I'd be inclined to do image_download_source=http so that it can scale out to minions, and so we're not relying on a single-node swift for image serving -------------- next part -------------- An HTML attachment was scrubbed... URL: From tkajinam at redhat.com Mon Aug 24 23:02:17 2020 From: tkajinam at redhat.com (Takashi Kajinami) Date: Tue, 25 Aug 2020 08:02:17 +0900 Subject: [tripleo] Proposing Takashi Kajinami to be core on puppet-tripleo In-Reply-To: References: <1597847922905.32607@binero.com> Message-ID: Thank you, Emilien and the others who shared your kind feedback. It's my great pleasure and honor to have this happen. I'll keep doing my best to make more contribution to TripleO project, On Tue, Aug 25, 2020 at 12:30 AM Emilien Macchi wrote: > I went ahead and added Takashi to the newly created puppet-tripleo core > group in Gerrit. > > Thanks again for your hard work! > > On Thu, Aug 20, 2020 at 9:25 AM Karthik, Rajini > wrote: > >> +1 . >> >> >> >> Rajini >> >> >> >> *From:* Wesley Hayutin >> *Sent:* Wednesday, August 19, 2020 9:09 PM >> *To:* openstack-discuss >> *Cc:* Emilien Macchi >> *Subject:* Re: [tripleo] Proposing Takashi Kajinami to be core on >> puppet-tripleo >> >> >> >> [EXTERNAL EMAIL] >> >> >> >> >> >> On Wed, Aug 19, 2020 at 8:40 AM Tobias Urdin >> wrote: >> >> Big +1 from an outsider :)) >> >> >> >> Best regards >> >> Tobias >> >> >> ------------------------------ >> >> *From:* Rabi Mishra >> *Sent:* Wednesday, August 19, 2020 3:37 PM >> *To:* Emilien Macchi >> *Cc:* openstack-discuss >> *Subject:* Re: [tripleo] Proposing Takashi Kajinami to be core on >> puppet-tripleo >> >> >> >> +1 >> >> >> >> On Tue, Aug 18, 2020 at 8:03 PM Emilien Macchi >> wrote: >> >> Hi people, >> >> >> >> If you don't know Takashi yet, he has been involved in the Puppet >> OpenStack project and helped *a lot* in its maintenance (and by maintenance >> I mean not-funny-work). When our community was getting smaller and smaller, >> he joined us and our review velicity went back to eleven. He became a core >> maintainer very quickly and we're glad to have him onboard. >> >> >> >> He's also been involved in taking care of puppet-tripleo for a few months >> and I believe he has more than enough knowledge on the module to provide >> core reviews and be part of the core maintainer group. I also noticed his >> amount of contribution (bug fixes, improvements, reviews, etc) in other >> TripleO repos and I'm confident he'll make his road to be core in TripleO >> at some point. For now I would like him to propose him to be core in >> puppet-tripleo. >> >> >> >> As usual, any feedback is welcome but in the meantime I want to thank >> Takashi for his work in TripleO and we're super happy to have new >> contributors! >> >> >> >> Thanks, >> >> -- >> >> Emilien Macchi >> >> >> >> >> -- >> >> Regards, >> >> Rabi Mishra >> >> >> >> >> >> +1, thanks for your contributions Takashi! >> > > > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam47priya at gmail.com Tue Aug 25 04:29:50 2020 From: sam47priya at gmail.com (Sam P) Date: Tue, 25 Aug 2020 13:29:50 +0900 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: Hi All, I add the following members to the core team. 
Fabian Zimmermann dev.faz at googlemail.com Jegor van Opdorpjegor at greenedge.cloud Radosław Piliszekradoslaw.piliszek at gmail.com suzhengweisugar-2008 at 163.com Please let me or other core members know if any one else would like to join the core team. --- Regards, Sampath On Sat, Aug 22, 2020 at 2:08 AM Fabian Zimmermann wrote: > > Hi, > > As long as there are enough cores to keep the project running everything is fine :) > > Fabian > > Jean-Philippe Evrard schrieb am Fr., 21. Aug. 2020, 16:32: >> >> >> On Wed, Aug 19, 2020, at 06:23, Fabian Zimmermann wrote: >> > Hi, >> > >> > if nobody complains I also would like to request core status to help getting the project further. >> > >> > Fabian Zimmermann >> >> Let's hope this will not be lost in the list :) >> From sam47priya at gmail.com Tue Aug 25 04:47:25 2020 From: sam47priya at gmail.com (Sam P) Date: Tue, 25 Aug 2020 13:47:25 +0900 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: Thank you all for volunteering to maintain the project. > Please let me know how we should proceed with the meetings. > I can start them on Tuesdays at 7 AM UTC. > And since the Masakari own channel is quite a peaceful one, I would > suggest to run them there directly. > What are your thoughts? :-) I think #openstack-masakari channel is all set to conduct the meeting. I am totally OK with that. And Tuesday at 7AM UTC also works for me. Previously we conducted the meeting every two weeks (on even weeks). How about others? Please add comments to the following review. https://review.opendev.org/#/c/747819/ --- Regards, Sampath On Tue, Aug 25, 2020 at 1:29 PM Sam P wrote: > > Hi All, > > I add the following members to the core team. > > Fabian Zimmermann dev.faz at googlemail.com > Jegor van Opdorpjegor at greenedge.cloud > Radosław Piliszekradoslaw.piliszek at gmail.com > suzhengweisugar-2008 at 163.com > > Please let me or other core members know if any one else would like to > join the core team. > --- Regards, > Sampath > > On Sat, Aug 22, 2020 at 2:08 AM Fabian Zimmermann wrote: > > > > Hi, > > > > As long as there are enough cores to keep the project running everything is fine :) > > > > Fabian > > > > Jean-Philippe Evrard schrieb am Fr., 21. Aug. 2020, 16:32: > >> > >> > >> On Wed, Aug 19, 2020, at 06:23, Fabian Zimmermann wrote: > >> > Hi, > >> > > >> > if nobody complains I also would like to request core status to help getting the project further. > >> > > >> > Fabian Zimmermann > >> > >> Let's hope this will not be lost in the list :) > >> From yasufum.o at gmail.com Tue Aug 25 05:27:51 2020 From: yasufum.o at gmail.com (Yasufumi Ogawa) Date: Tue, 25 Aug 2020 14:27:51 +0900 Subject: [tacker] IRC meeting Message-ID: <636004ca-130b-58ee-c769-19169926fcee@gmail.com> Hi tacker team, I am not available to join IRC meeting today unfortunately. I would like to suggest to anyone host the meeting, or skip it if no items. Thanks, Yasufumi From luis.ramirez at opencloud.es Tue Aug 25 05:32:23 2020 From: luis.ramirez at opencloud.es (Luis Ramirez) Date: Tue, 25 Aug 2020 07:32:23 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: +1 I'll try to do my best. 
Please add me to the core Br, Luis Rmz Blockchain, DevOps & Open Source Cloud Solutions Architect ---------------------------------------- Founder & CEO OpenCloud.es luis.ramirez at opencloud.es Skype ID: d.overload Hangouts: luis.ramirez at opencloud.es [image: ] +34 911 950 123 / [image: ]+39 392 1289553 / [image: ]+49 152 26917722 / Česká republika: +420 774 274 882 ----------------------------------------------------- El mar., 25 ago. 2020 a las 6:52, Sam P () escribió: > Thank you all for volunteering to maintain the project. > > Please let me know how we should proceed with the meetings. > > I can start them on Tuesdays at 7 AM UTC. > > And since the Masakari own channel is quite a peaceful one, I would > > suggest to run them there directly. > > What are your thoughts? :-) > I think #openstack-masakari channel is all set to conduct the meeting. > I am totally OK with that. And Tuesday at 7AM UTC also works for me. > Previously we conducted the meeting every two weeks (on even weeks). > How about others? > Please add comments to the following review. > https://review.opendev.org/#/c/747819/ > > --- Regards, > Sampath > > On Tue, Aug 25, 2020 at 1:29 PM Sam P wrote: > > > > Hi All, > > > > I add the following members to the core team. > > > > Fabian Zimmermann dev.faz at googlemail.com > > Jegor van Opdorpjegor at greenedge.cloud > > Radosław Piliszekradoslaw.piliszek at gmail.com > > suzhengweisugar-2008 at 163.com > > > > Please let me or other core members know if any one else would like to > > join the core team. > > --- Regards, > > Sampath > > > > On Sat, Aug 22, 2020 at 2:08 AM Fabian Zimmermann > wrote: > > > > > > Hi, > > > > > > As long as there are enough cores to keep the project running > everything is fine :) > > > > > > Fabian > > > > > > Jean-Philippe Evrard schrieb am Fr., 21. > Aug. 2020, 16:32: > > >> > > >> > > >> On Wed, Aug 19, 2020, at 06:23, Fabian Zimmermann wrote: > > >> > Hi, > > >> > > > >> > if nobody complains I also would like to request core status to > help getting the project further. > > >> > > > >> > Fabian Zimmermann > > >> > > >> Let's hope this will not be lost in the list :) > > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyliu0592 at hotmail.com Tue Aug 25 06:00:18 2020 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Tue, 25 Aug 2020 06:00:18 +0000 Subject: [Monasca] Kolla docker image on Docker Hub Message-ID: Hi, Are those Monasca Kolla container images kolla/centos-binary-monasca-* on Docker Hub? I only see kolla/centos-binary-monasca-grafana. I am running Kolla Ansible to deploy Monasca and got this failure. ======== docker.errors.ImageNotFound: 404 Client Error: Not Found (\"pull access denied for kolla/centos-binary-monasca-api, repository does not exist or may require \\'docker login\\': denied: requested access to the resource is denied\") ======== Thanks! Tony From arnaud.morin at gmail.com Tue Aug 25 06:07:15 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Tue, 25 Aug 2020 06:07:15 +0000 Subject: [neutron][ops] q-agent-notifier exchanges without bindings. In-Reply-To: <20200824124154.GA31915@sync> References: <20200824124154.GA31915@sync> Message-ID: <20200825060715.GB31915@sync> Hi again, If I understand correctly neutron code, we have security group rule update notified twice: First with SecurityGroupServerNotifierRpcMixin [1] Second with ResourcesPushRpcApi [2] Can someone involved in neutron code confirm that? 
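(A rough sketch of how the broker side can be checked, for anyone who wants to reproduce this. It assumes the RabbitMQ management plugin and rabbitmqadmin are available and that neutron uses the default vhost - adjust credentials and vhost to your own setup.)

# list the neutron fanout exchanges
rabbitmqadmin list exchanges name | grep q-agent-notifier

# show what (if anything) is bound to one of them; an empty result
# means messages published to that exchange are unroutable
rabbitmqadmin list bindings source destination_type destination \
  | grep q-agent-notifier-security_group-update_fanout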
It seems that, in OVS agent implementation, [1] is not used (my agent is not consuming those messages), but neutron server is sending messages in this exchange. This is why I have unroutable messages. [1] https://github.com/openstack/neutron/blob/3793f1f3888a85fc5e48c0e94e6a9f3c05e95c43/neutron/db/securitygroups_rpc_base.py#L40 [2] https://github.com/openstack/neutron/blob/f8b990736ba91af098e467608c6dfa0b801ec19c/neutron/api/rpc/handlers/resources_rpc.py#L198 -- Arnaud Morin On 24.08.20 - 12:41, Arnaud Morin wrote: > Hey, > > I did exactly the same on my side. > I also have unroutable messages going in my alternate exchange, related > to the same exchanges (q-agent-notifier-security_group-update_fanout, > etc.) > > Did you figured out why you have unroutable messages like this? > Are you using a custom neutron driver? > > Cheers, > > -- > Arnaud Morin > > On 21.08.20 - 10:32, Fabian Zimmermann wrote: > > Hi, > > > > im currently on the way to analyse some rabbitmq-issues. > > > > atm im taking a look on "unroutable messages", so I > > > > * created an Alternative Exchange and Queue: "unroutable" > > * created a policy to send all unroutable msgs to this exchange/queue. > > * wrote a script to show me the msgs placed here.. currently I get > > > > Seems like my neutron is placing msgs in these exchanges, but there is > > nobody listening/binding to: > > -- > > 20 Exchange: q-agent-notifier-network-delete_fanout, RoutingKey: > > 226 Exchange: q-agent-notifier-port-delete_fanout, RoutingKey: > > 88 Exchange: q-agent-notifier-port-update_fanout, RoutingKey: > > 388 Exchange: q-agent-notifier-security_group-update_fanout, RoutingKey: > > -- > > > > Is someone able to give me a hint where to look at / how to debug this? > > > > Fabian > > From arne.wiebalck at cern.ch Tue Aug 25 06:30:14 2020 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Tue, 25 Aug 2020 08:30:14 +0200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: <0800d06e870cc5370ada0a85c5e4aaf3b329107d.camel@redhat.com> Message-ID: Hi Steve, On 24.08.20 23:55, Steve Baker wrote: > > On 25/08/20 12:05 am, Dmitry Tantsur wrote: >> >> >> On Mon, Aug 24, 2020 at 1:52 PM Sean Mooney > > wrote: >> >> On Mon, 2020-08-24 at 10:32 +0200, Dmitry Tantsur wrote: >> > Hi, >> > >> > On Mon, Aug 24, 2020 at 10:24 AM Arne Wiebalck >> > >> > wrote: >> > >> > > Hi! >> > > >> > > CERN's deployment is using the iscsi deploy interface since we >> started >> > > with Ironic a couple of years ago (and we installed around >> 5000 nodes >> > > with it by now). The reason we chose it at the time was >> simplicity: we >> > > did not (and still do not) have a Swift backend to Glance, and >> the iscsi >> > > interface provided a straightforward alternative. >> > > >> > > While we have not seen obscure bugs/issues with it, I can >> certainly back >> > > the scalability issues mentioned by Dmitry: the tunneling of >> the images >> > > through the controllers can create issues when deploying >> hundreds of >> > > nodes at the same time. The security of the iscsi interface is >> less of a >> > > concern in our specific environment. >> > > >> > > So, why did we not move to direct (yet)? In addition to the >> lack of >> > > Swift, mostly since iscsi works for us and the scalability >> issues were >> > > not that much of a burning problem ... 
so we focused on other >> things :) >> > > >> > > Here are some thoughts/suggestions for this discussion: >> > > >> > > How would 'direct' work with other Glance backends (like >> Ceph/RBD in our >> > > case)? If using direct requires to duplicate images from Glance to >> > > Ironic (or somewhere else) to be served, I think this would be an >> > > argument against deprecating iscsi. >> > > >> > >> > With image_download_source=http ironic will download the image >> to the >> > conductor to be able serve it to the node. Which is exactly what >> the iscsi >> > is doing, so not much of a change for you (except for >> s/iSCSI/HTTP/ as a >> > means of serving the image). >> > >> > Would it be an option for you to test direct deploy with >> > image_download_source=http? >> i think if there is still an option to not force deployemnt to >> altere any of there >> other sevices this is likely ok but i think the onious shoudl be >> on the ironic >> and ooo teams to ensure there is an upgrade path for those useres >> before this deprecation >> becomes a removal without deploying swift or a swift compatibale >> api e.g. RadosGW >> >> >> Swift is NOT a requirement (nor is RadosGW) when >> image_download_source=http is used. Any glance backend (or no glance >> at all) will work. > > Even though the TripleO undercloud has swift, I'd be inclined to do > image_download_source=http so that it can scale out to minions, and so > we're not relying on a single-node swift for image serving This makes it sound a little like 'direct' with image_download_source=http would be easily scalable ... but it is only if you can (and are willing to) scale the Ironic control plane through which the images are still tunneled (and Glance behind it ... not sure if there is any caching of images inside the Ironic controllers). Seems to be the case for you and TripleO, but it may not be the case in other setups, using conductor groups may complicated things, for instance. So, from what I see, image_download_source=http is a good option to move deployments off the iscsi deploy interface, but it does not bring the same (scalability) advantages you would get from a setup where Glance is backed by a scalable Swift or RadosGW backend. Cheers, Arne From radoslaw.piliszek at gmail.com Tue Aug 25 07:08:31 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 25 Aug 2020 09:08:31 +0200 Subject: [Monasca] Kolla docker image on Docker Hub In-Reply-To: References: Message-ID: Hi Tony, RDO does not package Monasca so it does not exist in the binary flavour (except for dedicated Grafana). Please consult [1]. Your immediate workaround is to use source flavour for Monasca. [1] https://docs.openstack.org/kolla/ussuri/support_matrix.html -yoctozepto On Tue, Aug 25, 2020 at 8:11 AM Tony Liu wrote: > > Hi, > > Are those Monasca Kolla container images > kolla/centos-binary-monasca-* on Docker Hub? > I only see kolla/centos-binary-monasca-grafana. > > I am running Kolla Ansible to deploy Monasca and got this failure. > ======== > docker.errors.ImageNotFound: 404 Client Error: Not Found (\"pull access denied for kolla/centos-binary-monasca-api, repository does not exist or may require \\'docker login\\': denied: requested access to the resource is denied\") > ======== > > Thanks! 
> Tony > > From eblock at nde.ag Tue Aug 25 07:42:12 2020 From: eblock at nde.ag (Eugen Block) Date: Tue, 25 Aug 2020 07:42:12 +0000 Subject: [horizon] default create_volume setting can't be changed In-Reply-To: <20200824141904.Horde.biUwyDcXRQDK2D0KW6vwbE1@webmail.nde.ag> Message-ID: <20200825074212.Horde.X1xxti0Yt3f-evdwG6CWJyC@webmail.nde.ag> Update: I found one (the right?) place to change the default to false: /srv/www/openstack-dashboard/static/dashboard/project/workflow/launch-instance/launch-instance-model.service.js // create_volume_default: true, create_volume_default: false, I've been struggling for years now with these dashboard settings, it started with /srv/www/openstack-dashboard/openstack_dashboard/dashboards/project/instances/workflows/create_instance.py where I needed to remove the default disk identifier (hard-coded "vda") when we were still running with xen hypervisors to let nova change the disk name. Then this changed and it had to be one of these files, I can't remember which was first, I just know that after some months I had to apply my changes to the other file, too: /srv/www/openstack-dashboard/static/dashboard/project/workflow/launch-instance/launch-instance-model.service.js /srv/www/openstack-dashboard/openstack_dashboard/dashboards/project/static/dashboard/project/workflow/launch-instance/source/source.controller.js I'm not a developer but I must say, I don't really understand this setup and why it changes all the time. Of course I might be looking in the wrong places, it would be great if someone could point me to the right direction! I'm also willing to provide more information if necessary. > Other configs from this file work as expected, so that custom file > can't be the reason. I might be wrong about that, too. I noticed that although I disabled the debug settings in /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.d/_100_local_settings.py DEBUG = False I was still seeing debug messages. I had to turn them off in /srv/www/openstack-dashboard/openstack_dashboard/settings.py to be applied. So there might be other changes not applied from our custom config file. I'd really appreciate it if anyone could comment on this. Thanks, Eugen Zitat von Eugen Block : > Hi *, > > we recently upgraded from Ocata to Train and I'm struggling with a > specific setting: I believe since Pike version the default for > "create_volume" changed to "true" when launching instances from > Horizon dashboard. I would like to change that to "false" and set it > in our custom > /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.d/_100_local_settings.py: > > > LAUNCH_INSTANCE_DEFAULTS = { > 'config_drive': False, > 'create_volume': False, > 'hide_create_volume': False, > 'disable_image': False, > 'disable_instance_snapshot': False, > 'disable_volume': False, > 'disable_volume_snapshot': False, > 'enable_scheduler_hints': True, > } > > Other configs from this file work as expected, so that custom file > can't be the reason. > After apache and memcached restart nothing changes, the default is > still "true". Can anyone shed some light, please? I haven't tried > other configs yet so I can't tell if more options are affected. > > Thanks! 
> Eugen From mark at stackhpc.com Tue Aug 25 07:42:18 2020 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 25 Aug 2020 08:42:18 +0100 Subject: [Monasca] Kolla docker image on Docker Hub In-Reply-To: References: Message-ID: On Tue, 25 Aug 2020 at 08:10, Radosław Piliszek wrote: > > Hi Tony, > > RDO does not package Monasca so it does not exist in the binary > flavour (except for dedicated Grafana). > > Please consult [1]. > Your immediate workaround is to use source flavour for Monasca. > > [1] https://docs.openstack.org/kolla/ussuri/support_matrix.html > > -yoctozepto > > On Tue, Aug 25, 2020 at 8:11 AM Tony Liu wrote: > > > > Hi, > > > > Are those Monasca Kolla container images > > kolla/centos-binary-monasca-* on Docker Hub? > > I only see kolla/centos-binary-monasca-grafana. > > > > I am running Kolla Ansible to deploy Monasca and got this failure. > > ======== > > docker.errors.ImageNotFound: 404 Client Error: Not Found (\"pull access denied for kolla/centos-binary-monasca-api, repository does not exist or may require \\'docker login\\': denied: requested access to the resource is denied\") > > ======== Please follow the kolla documentation for deploying monasca, which includes forcing the use of source images: https://docs.openstack.org/kolla-ansible/latest/reference/logging-and-monitoring/monasca-guide.html. > > > > Thanks! > > Tony > > > > > From dtantsur at redhat.com Tue Aug 25 07:46:42 2020 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 25 Aug 2020 09:46:42 +0200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: <0800d06e870cc5370ada0a85c5e4aaf3b329107d.camel@redhat.com> References: <0800d06e870cc5370ada0a85c5e4aaf3b329107d.camel@redhat.com> Message-ID: On Mon, Aug 24, 2020 at 1:52 PM Sean Mooney wrote: > On Mon, 2020-08-24 at 10:32 +0200, Dmitry Tantsur wrote: > > Hi, > > > > On Mon, Aug 24, 2020 at 10:24 AM Arne Wiebalck > > wrote: > > > > > Hi! > > > > > > CERN's deployment is using the iscsi deploy interface since we started > > > with Ironic a couple of years ago (and we installed around 5000 nodes > > > with it by now). The reason we chose it at the time was simplicity: we > > > did not (and still do not) have a Swift backend to Glance, and the > iscsi > > > interface provided a straightforward alternative. > > > > > > While we have not seen obscure bugs/issues with it, I can certainly > back > > > the scalability issues mentioned by Dmitry: the tunneling of the images > > > through the controllers can create issues when deploying hundreds of > > > nodes at the same time. The security of the iscsi interface is less of > a > > > concern in our specific environment. > > > > > > So, why did we not move to direct (yet)? In addition to the lack of > > > Swift, mostly since iscsi works for us and the scalability issues were > > > not that much of a burning problem ... so we focused on other things :) > > > > > > Here are some thoughts/suggestions for this discussion: > > > > > > How would 'direct' work with other Glance backends (like Ceph/RBD in > our > > > case)? If using direct requires to duplicate images from Glance to > > > Ironic (or somewhere else) to be served, I think this would be an > > > argument against deprecating iscsi. > > > > > > > With image_download_source=http ironic will download the image to the > > conductor to be able serve it to the node. Which is exactly what the > iscsi > > is doing, so not much of a change for you (except for s/iSCSI/HTTP/ as a > > means of serving the image). 
> > > > Would it be an option for you to test direct deploy with > > image_download_source=http? > i think if there is still an option to not force deployemnt to altere any > of there > other sevices this is likely ok but i think the onious shoudl be on the > ironic > and ooo teams to ensure there is an upgrade path for those useres before > this deprecation > becomes a removal without deploying swift or a swift compatibale api e.g. > RadosGW > > perhaps a ci job could be put in place maybe using grenade that starts > with iscsi and moves > to direct with http porvided to show that just setting that weill allow > the conductor to download > the image from glance and server it to the ipa. > This is the CI job with direct deploy in a low RAM environment with a large image (CentOS) without Swift: https://zuul.opendev.org/t/openstack/build/58f623d90435470f9095eb68202c25f8 The change is https://review.opendev.org/#/c/747413/ Dmitry > > > unlike cern i just use ironic in a tiny home deployment where i have an > all in one deployment + 4 addtional > nodes for ironic. i cant deploy swift as all my disks are already in use > for cinder so down the line when > i eventually upgrade to vicortia and wallaby i would either have to drop > ironic or not upgrade it > if there is not a option to just pull the image from glance or glance via > the conductor. enhancing the ipa > to pull directly from glance would also proably work for many who use > iscsi today but that would depend on your network > toplogy i guess. > > > > > > > > > > Equally, if this would require to completely move the Glance backend to > > > something else, like from RBD to RadosGW, I would not expect happy > > > operators. (Does anyone know if RadosGW could even replace Swift for > > > this specific use case?) > > > > > > > AFAIK ironic works with RadosGW, we have some support code for it. > > > > > > > > > > Do we have numbers on how many deployments use iscsi vs direct? If many > > > rely on iscsi, I would also suggest to establish a migration guide for > > > operators on how to move from iscsi to direct, for the various configs. > > > Recent versions of Glance support multiple backends, so a migration > path > > > may be to add a new (direct compatible) backend for new images. > > > > > > > I don't have any numbers, but a migration guide is a must in any case. > > > > I expect most TripleO consumers to use the iscsi deploy, but only because > > it's the default. Their Edge solution uses the direct deploy. I've > polled a > > few operators I know, they all (except for you, obviously :) seem to use > > the direct deploy. Metal3 uses direct deploy. > > > > Dmitry > > > > > > > > > > Cheers, > > > Arne > > > > > > On 20.08.20 17:49, Julia Kreger wrote: > > > > I'm having a sense of deja vu! > > > > > > > > Because of the way the mechanics work, the iscsi deploy driver is in > > > > an unfortunate position of being harder to troubleshoot and diagnose > > > > failures. Which basically means we've not been able to really > identify > > > > common failures and add logic to handle them appropriately, like we > > > > are able to with a tcp socket and file download. Based on this alone, > > > > I think it makes a solid case for us to seriously consider > > > > deprecation. > > > > > > > > Overall, I'm +1 for the proposal and I believe over two cycles is the > > > > right way to go. 
> > > > > > > > I suspect we're going to have lots of push back from the TripleO > > > > community because there has been resistance to change their default > > > > usage in the past. As such I'm adding them to the subject so > hopefully > > > > they will be at least aware. > > > > > > > > I guess my other worry is operators who already have a substantial > > > > operational infrastructure investment built around the iscsi deploy > > > > interface. I wonder why they didn't use direct, but maybe they have > > > > all migrated in the past ?5? years. This could just be a non-concern > > > > in reality, I'm just not sure. > > > > > > > > Of course, if someone is willing to step up and make the iscsi > > > > deployment interface their primary focus, that also shifts the > > > > discussion to making direct the default interface? > > > > > > > > -Julia > > > > > > > > > > > > On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur > > > > > > wrote: > > > > > > > > > > Hi all, > > > > > > > > > > Side note for those lacking context: this proposal concerns > deprecating > > > > > > one of the ironic deploy interfaces detailed in > > > https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. > It > > > does not affect the boot-from-iSCSI feature. > > > > > > > > > > I would like to propose deprecating and removing the 'iscsi' deploy > > > > > > interface over the course of the next 2 cycles. The reasons are: > > > > > 1) The iSCSI deploy is a source of occasional cryptic bugs when a > > > > > > target cannot be discovered or mounted properly. > > > > > 2) Its security is questionable: I don't think we even use > > > > > > authentication. > > > > > 3) Operators confusion: right now we default to the iSCSI deploy > but > > > > > > pretty much direct everyone who cares about scalability or security to > the > > > 'direct' deploy. > > > > > 4) Cost of maintenance: our feature set is growing, our team - not > so > > > > > > much. iscsi_deploy.py is 800 lines of code that can be removed, and > some > > > dependencies that can be dropped as well. > > > > > > > > > > As far as I can remember, we've kept the iSCSI deploy for two > reasons: > > > > > 1) The direct deploy used to require Glance with Swift backend. The > > > > > > recently added [agent]image_download_source option allows caching and > > > serving images via the ironic's HTTP server, eliminating this problem. > I > > > guess we'll have to switch to 'http' by default for this option to > keep the > > > out-of-box experience. > > > > > 2) Memory footprint of the direct deploy. With the raw images > streaming > > > > > > we no longer have to cache the downloaded images in the agent memory, > > > removing this problem as well (I'm not even sure how much of a problem > it > > > is in 2020, even my phone has 4GiB of RAM). > > > > > > > > > > If this proposal is accepted, I suggest to execute it as follows: > > > > > Victoria release: > > > > > 1) Put an early deprecation warning in the release notes. > > > > > 2) Announce the future change of the default value for > > > > > > [agent]image_download_source. > > > > > W release: > > > > > 3) Change [agent]image_download_source to 'http' by default. > > > > > 4) Remove iscsi from the default enabled_deploy_interfaces and > move it > > > > > > to the back of the supported list (effectively making direct deploy the > > > default). > > > > > X release: > > > > > 5) Remove the iscsi deploy code from both ironic and IPA. > > > > > > > > > > Thoughts, opinions, suggestions? 
> > > > > > > > > > Dmitry > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Tue Aug 25 07:55:08 2020 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 25 Aug 2020 08:55:08 +0100 Subject: [Kolla Ansible] host maintenance In-Reply-To: References: <046E9C0290DD9149B106B72FC9156BEA04814569@gmsxchsvr01.thecreation.com> Message-ID: On Mon, 24 Aug 2020 at 19:50, Tony Liu wrote: > > > -----Original Message----- > > From: Mark Goddard > > Sent: Monday, August 24, 2020 11:21 AM > > To: Tony Liu > > Cc: Eric K. Miller ; openstack-discuss > > > > Subject: Re: [Kolla Ansible] host maintenance > > > > On Mon, 24 Aug 2020 at 17:53, Tony Liu wrote: > > > > > > > -----Original Message----- > > > > From: Mark Goddard > > > > Sent: Monday, August 24, 2020 12:46 AM > > > > To: Eric K. Miller > > > > Cc: openstack-discuss > > > > Subject: Re: [Kolla Ansible] host maintenance > > > > > > > > On Sat, 22 Aug 2020 at 01:10, Eric K. Miller > > > > > > > > wrote: > > > > > > > > > > > Actually, in my case, the setup is originally deploy by Kolla > > > > > > Ansible. Other than the initial deployment, I am looking for > > > > > > using Kolla Ansible for maintenance operations. > > > > > > What I am looking for, eg. replace a host, can surely be done by > > > > > > manual steps or customized script. I'd like to know if they are > > > > > > automated by Kolla Ansible. > > > > > > > > > > We do this often by simply using the "limit" flag in Kolla Ansible > > > > > to > > > > only include the controllers and new compute node (after adding the > > > > compute node to the multinode.ini file). Specify "reconfigure" for > > > > the action, and not "install". > > > > > > > > We need some better docs around this, and I think they will be added > > > > soon. Some things to watch out for: > > > > > > > > * if adding a new controller, ensure that if using --limit, all > > > > controllers are included and do not use serial mode > > > > > > What I tried was to replace a controller, where I don't need to update > > > other controllers, because there is no address update. > > > > > > If there is address update caused by controller change, then all > > > controllers have to be included to get update. > > > > While this may work at the moment, we have just merged a change that > > prevents this. For keystone, we need access to all controllers, to > > determine whether it is a new cluster or a new node in an existing > > cluster. > > > > > > > > What's "serial mode"? > > > > Ansible has a feature to run plays in batches of some % of the hosts. > > In Kolla Ansible you can e.g. export ANSIBLE_SERIAL=0.1. It's an > > advanced use case and needs some care. > > > > > > > > > * if removing a controller, reconfigure other controllers to update > > > > the RabbitMQ & Galera cluster nodes etc. > > > > > > In this case, are those services who don't need any updates going to > > > be restarted or untouched? > > Could you comment on this? This is my biggest concern. I'd like > to ensure services who don't need update remain untouched. In general, Kolla Ansible will only restart containers if the config files or container configuration changes. There is a bug in Ansible which means that this isn't always true, e.g. if nova-api needs to restart, we may also restart nova-conductor on the same host. See https://bugs.launchpad.net/kolla-ansible/+bug/1863510 > > Thanks! 
> Tony > From radoslaw.piliszek at gmail.com Tue Aug 25 08:01:35 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 25 Aug 2020 10:01:35 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: Hi Luis, I've added you. -yoctozepto On Tue, Aug 25, 2020 at 7:35 AM Luis Ramirez wrote: > +1 I'll try to do my best. Please add me to the core > > Br, > Luis Rmz > Blockchain, DevOps & Open Source Cloud Solutions Architect > ---------------------------------------- > Founder & CEO > OpenCloud.es > luis.ramirez at opencloud.es > Skype ID: d.overload > Hangouts: luis.ramirez at opencloud.es > [image: ] +34 911 950 123 / [image: ]+39 392 1289553 / [image: ]+49 > 152 26917722 / Česká republika: +420 774 274 882 > ----------------------------------------------------- > > > El mar., 25 ago. 2020 a las 6:52, Sam P () escribió: > >> Thank you all for volunteering to maintain the project. >> > Please let me know how we should proceed with the meetings. >> > I can start them on Tuesdays at 7 AM UTC. >> > And since the Masakari own channel is quite a peaceful one, I would >> > suggest to run them there directly. >> > What are your thoughts? :-) >> I think #openstack-masakari channel is all set to conduct the meeting. >> I am totally OK with that. And Tuesday at 7AM UTC also works for me. >> Previously we conducted the meeting every two weeks (on even weeks). >> How about others? >> Please add comments to the following review. >> https://review.opendev.org/#/c/747819/ >> >> --- Regards, >> Sampath >> >> On Tue, Aug 25, 2020 at 1:29 PM Sam P wrote: >> > >> > Hi All, >> > >> > I add the following members to the core team. >> > >> > Fabian Zimmermann dev.faz at googlemail.com >> > Jegor van Opdorpjegor at greenedge.cloud >> > Radosław Piliszekradoslaw.piliszek at gmail.com >> > suzhengweisugar-2008 at 163.com >> > >> > Please let me or other core members know if any one else would like to >> > join the core team. >> > --- Regards, >> > Sampath >> > >> > On Sat, Aug 22, 2020 at 2:08 AM Fabian Zimmermann >> wrote: >> > > >> > > Hi, >> > > >> > > As long as there are enough cores to keep the project running >> everything is fine :) >> > > >> > > Fabian >> > > >> > > Jean-Philippe Evrard schrieb am Fr., 21. >> Aug. 2020, 16:32: >> > >> >> > >> >> > >> On Wed, Aug 19, 2020, at 06:23, Fabian Zimmermann wrote: >> > >> > Hi, >> > >> > >> > >> > if nobody complains I also would like to request core status to >> help getting the project further. >> > >> > >> > >> > Fabian Zimmermann >> > >> >> > >> Let's hope this will not be lost in the list :) >> > >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Aug 25 08:03:32 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 25 Aug 2020 10:03:32 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: Hi Sampath, Thanks for handling this. I'll sit down to clean up the queue a bit and ask other new cores to co-review and merge a few waiting patches. -yoctozepto On Tue, Aug 25, 2020 at 6:40 AM Sam P wrote: > > Hi All, > > I add the following members to the core team. 
> > Fabian Zimmermann dev.faz at googlemail.com > Jegor van Opdorpjegor at greenedge.cloud > Radosław Piliszekradoslaw.piliszek at gmail.com > suzhengweisugar-2008 at 163.com > > Please let me or other core members know if any one else would like to > join the core team. > --- Regards, > Sampath > > On Sat, Aug 22, 2020 at 2:08 AM Fabian Zimmermann wrote: > > > > Hi, > > > > As long as there are enough cores to keep the project running everything is fine :) > > > > Fabian > > > > Jean-Philippe Evrard schrieb am Fr., 21. Aug. 2020, 16:32: > >> > >> > >> On Wed, Aug 19, 2020, at 06:23, Fabian Zimmermann wrote: > >> > Hi, > >> > > >> > if nobody complains I also would like to request core status to help getting the project further. > >> > > >> > Fabian Zimmermann > >> > >> Let's hope this will not be lost in the list :) > >> > From radoslaw.piliszek at gmail.com Tue Aug 25 08:08:52 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 25 Aug 2020 10:08:52 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: Hi New Cores, Please join #openstack-masakari on Freenode and let me know your IRC nicknames so that we can recognize each other. I probably know some of your nicks already but it's best to refresh. :-) The string in my message signature is my IRC nick in case you were wondering what spell that is. :-) -yoctozepto On Tue, Aug 25, 2020 at 6:49 AM Sam P wrote: > > Thank you all for volunteering to maintain the project. > > Please let me know how we should proceed with the meetings. > > I can start them on Tuesdays at 7 AM UTC. > > And since the Masakari own channel is quite a peaceful one, I would > > suggest to run them there directly. > > What are your thoughts? :-) > I think #openstack-masakari channel is all set to conduct the meeting. > I am totally OK with that. And Tuesday at 7AM UTC also works for me. > Previously we conducted the meeting every two weeks (on even weeks). > How about others? > Please add comments to the following review. > https://review.opendev.org/#/c/747819/ > > --- Regards, > Sampath > > On Tue, Aug 25, 2020 at 1:29 PM Sam P wrote: > > > > Hi All, > > > > I add the following members to the core team. > > > > Fabian Zimmermann dev.faz at googlemail.com > > Jegor van Opdorpjegor at greenedge.cloud > > Radosław Piliszekradoslaw.piliszek at gmail.com > > suzhengweisugar-2008 at 163.com > > > > Please let me or other core members know if any one else would like to > > join the core team. > > --- Regards, > > Sampath > > > > On Sat, Aug 22, 2020 at 2:08 AM Fabian Zimmermann wrote: > > > > > > Hi, > > > > > > As long as there are enough cores to keep the project running everything is fine :) > > > > > > Fabian > > > > > > Jean-Philippe Evrard schrieb am Fr., 21. Aug. 2020, 16:32: > > >> > > >> > > >> On Wed, Aug 19, 2020, at 06:23, Fabian Zimmermann wrote: > > >> > Hi, > > >> > > > >> > if nobody complains I also would like to request core status to help getting the project further. 
> > >> > > > >> > Fabian Zimmermann > > >> > > >> Let's hope this will not be lost in the list :) > > >> > From zapiec at gonicus.de Tue Aug 25 08:44:16 2020 From: zapiec at gonicus.de (Benjamin Zapiec) Date: Tue, 25 Aug 2020 10:44:16 +0200 Subject: Scaling control nodes Message-ID: <23e0e705-1446-dc32-74d2-5959fdba6368@gonicus.de> Hello everyone, while trying openstack i referred to the red hat installation documentation which is okay but lead to one question. It looks like there is no problem in scaling compute nodes if you run out of resources. But scaling the controller nodes is not supported by red hat. Since I'm using the official tripleo openstack version and not the red hat version i was wondering if this is not supported by the openstack project. Having in mind that red hat doesn't support this i was looking for something that tells me that it is supported (or not) by the tripleo openstack project. But i didn't found anything explicit. So may you tell me if it is possible to scale up Controller Nodes? And if not which component is not scalable by tripleo? Is it possible to create an controller profile that is scalable? Best regards -- Benjamin Zapiec (System Engineer) * GONICUS GmbH * Moehnestrasse 55 (Kaiserhaus) * D-59755 Arnsberg * Tel.: +49 2932 916-0 * Fax: +49 2932 916-245 * http://www.GONICUS.de * Sitz der Gesellschaft: Moehnestrasse 55 * D-59755 Arnsberg * Geschaeftsfuehrer: Rainer Luelsdorf, Alfred Schroeder * Vorsitzender des Beirats: Juergen Michels * Amtsgericht Arnsberg * HRB 1968 Wir erfüllen unsere Informationspflichten zum Datenschutz gem. der Artikel 13 und 14 DS-GVO durch Veröffentlichung auf unserer Internetseite unter: https://www.gonicus.de/datenschutz oder durch Zusendung auf Ihre formlose Anfrage. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From sandeep.ee.nagendra at gmail.com Tue Aug 25 09:34:41 2020 From: sandeep.ee.nagendra at gmail.com (sandeep) Date: Tue, 25 Aug 2020 15:04:41 +0530 Subject: [cliff] [dev] Cliff auto completion not working inside interactive mode Message-ID: Hi Team, In my system, I am trying auto completion for my CLI application. CLIFF version - cliff==3.4.0 Auto complete works fine on bash prompt. But inside the interactive shell, auto complete does not work. Below is the output for the help command inside the interactive shell. (appcli) help Miscellaneous help topics: ========================== help Application commands (type help ): ========================================= complete snapshot list reports service restart service-object-type app2 service restart service-object-type app3 service restart service-object-type app4 service show state service-object-type app1 service show state service-object-type app2 service show state service-object-type app3 service show state service-object-type app4 swm rollback node swm cancel sw-update swm downgrade node swm list sw-info swm show sw-info swm start sw-update file swm start sw-downgrade help Now, if I type swm and press tab, it lists all the sub commands under it. (appcli) swm cancel sw-update list sw-info start sw-update file downgrade node rollback node show sw-info start sw-downgrade But if i type, (appcli) swm s gives below output, (appcli) swm "s It stops at this point and further pressing tab does not autocomplete. Could you please let me know what could be the problem? Is this a known issue? or Am i missing something? 
Thanks, Sandeep -------------- next part -------------- An HTML attachment was scrubbed... URL: From hjensas at redhat.com Tue Aug 25 10:35:47 2020 From: hjensas at redhat.com (Harald Jensas) Date: Tue, 25 Aug 2020 12:35:47 +0200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: Message-ID: On 8/20/20 5:49 PM, Julia Kreger wrote: > I suspect we're going to have lots of push back from the TripleO > community because there has been resistance to change their default > usage in the past. As such I'm adding them to the subject so hopefully > they will be at least aware. Since TripleO already support using the direct interface, it's recommended and tested by the TripleO group focusing on edge type deployments, switching to direct by default might not be too much of a hassle for TripleO. We may want to change the disk-image format used by TripleO to raw as well, to benefit from the raw image streaming capabilities? Or would enabling image_download_source = http convert the images as they are cached on conductors? (see question inline below.) > On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur wrote: >> >> Hi all, >> >> Side note for those lacking context: this proposal concerns deprecating one of the ironic deploy interfaces detailed in https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. It does not affect the boot-from-iSCSI feature. >> >> I would like to propose deprecating and removing the 'iscsi' deploy interface over the course of the next 2 cycles. The reasons are: >> 1) The iSCSI deploy is a source of occasional cryptic bugs when a target cannot be discovered or mounted properly. >> 2) Its security is questionable: I don't think we even use authentication. >> 3) Operators confusion: right now we default to the iSCSI deploy but pretty much direct everyone who cares about scalability or security to the 'direct' deploy. >> 4) Cost of maintenance: our feature set is growing, our team - not so much. iscsi_deploy.py is 800 lines of code that can be removed, and some dependencies that can be dropped as well. >> >> As far as I can remember, we've kept the iSCSI deploy for two reasons: >> 1) The direct deploy used to require Glance with Swift backend. The recently added [agent]image_download_source option allows caching and serving images via the ironic's HTTP server, eliminating this problem. I guess we'll have to switch to 'http' by default for this option to keep the out-of-box experience. >> 2) Memory footprint of the direct deploy. With the raw images streaming we no longer have to cache the downloaded images in the agent memory, removing this problem as well (I'm not even sure how much of a problem it is in 2020, even my phone has 4GiB of RAM). >> When using image_download_source = http, does Ironic convert non-raw images when they are placed on each conductors cache? To benefit from the raw image streaming? >> If this proposal is accepted, I suggest to execute it as follows: >> Victoria release: >> 1) Put an early deprecation warning in the release notes. >> 2) Announce the future change of the default value for [agent]image_download_source. >> W release: >> 3) Change [agent]image_download_source to 'http' by default. >> 4) Remove iscsi from the default enabled_deploy_interfaces and move it to the back of the supported list (effectively making direct deploy the default). >> X release: >> 5) Remove the iscsi deploy code from both ironic and IPA. >> >> Thoughts, opinions, suggestions? 
>> >> Dmitry > From dtantsur at redhat.com Tue Aug 25 10:59:47 2020 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 25 Aug 2020 12:59:47 +0200 Subject: [ironic][tripleo] RFC: deprecate the iSCSI deploy interface? In-Reply-To: References: Message-ID: On Tue, Aug 25, 2020 at 12:39 PM Harald Jensas wrote: > On 8/20/20 5:49 PM, Julia Kreger wrote: > > I suspect we're going to have lots of push back from the TripleO > > community because there has been resistance to change their default > > usage in the past. As such I'm adding them to the subject so hopefully > > they will be at least aware. > > Since TripleO already support using the direct interface, it's > recommended and tested by the TripleO group focusing on edge type > deployments, switching to direct by default might not be too much of a > hassle for TripleO. > ++ > > We may want to change the disk-image format used by TripleO to raw as > well, to benefit from the raw image streaming capabilities? Or would > enabling image_download_source = http convert the images as they are > cached on conductors? (see question inline below.) > > > > On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur > wrote: > >> > >> Hi all, > >> > >> Side note for those lacking context: this proposal concerns deprecating > one of the ironic deploy interfaces detailed in > https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. It > does not affect the boot-from-iSCSI feature. > >> > >> I would like to propose deprecating and removing the 'iscsi' deploy > interface over the course of the next 2 cycles. The reasons are: > >> 1) The iSCSI deploy is a source of occasional cryptic bugs when a > target cannot be discovered or mounted properly. > >> 2) Its security is questionable: I don't think we even use > authentication. > >> 3) Operators confusion: right now we default to the iSCSI deploy but > pretty much direct everyone who cares about scalability or security to the > 'direct' deploy. > >> 4) Cost of maintenance: our feature set is growing, our team - not so > much. iscsi_deploy.py is 800 lines of code that can be removed, and some > dependencies that can be dropped as well. > >> > >> As far as I can remember, we've kept the iSCSI deploy for two reasons: > >> 1) The direct deploy used to require Glance with Swift backend. The > recently added [agent]image_download_source option allows caching and > serving images via the ironic's HTTP server, eliminating this problem. I > guess we'll have to switch to 'http' by default for this option to keep the > out-of-box experience. > >> 2) Memory footprint of the direct deploy. With the raw images streaming > we no longer have to cache the downloaded images in the agent memory, > removing this problem as well (I'm not even sure how much of a problem it > is in 2020, even my phone has 4GiB of RAM). > >> > > When using image_download_source = http, does Ironic convert non-raw > images when they are placed on each conductors cache? To benefit from > the raw image streaming? > Yes, unless it's explicitly disabled. Although storing raw images from the beginning may make deployments a bit faster and save some disk space for this conversion. Dmitry > > >> If this proposal is accepted, I suggest to execute it as follows: > >> Victoria release: > >> 1) Put an early deprecation warning in the release notes. > >> 2) Announce the future change of the default value for > [agent]image_download_source. > >> W release: > >> 3) Change [agent]image_download_source to 'http' by default. 
> >> 4) Remove iscsi from the default enabled_deploy_interfaces and move it > to the back of the supported list (effectively making direct deploy the > default). > >> X release: > >> 5) Remove the iscsi deploy code from both ironic and IPA. > >> > >> Thoughts, opinions, suggestions? > >> > >> Dmitry > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From masayuki.igawa at gmail.com Tue Aug 25 11:06:54 2020 From: masayuki.igawa at gmail.com (Masayuki Igawa) Date: Tue, 25 Aug 2020 20:06:54 +0900 Subject: [qa] Wallaby PTG planning Message-ID: Hi, We need to start thinking about the next cycle already. As you probably know, next virtual PTG will be held in October 26-30[0]. I prepared an etherpad[1] to discuss and track our topics. So, please add your name if you are going to attend the PTG session. And also, please add your proposals of the topics which you want to discuss during the PTG. I also made a doodle[2] with possible time slots. Please put your best days and hours so that we can try to schedule and book our sessions in the time slots. [0] https://www.openstack.org/ptg/ [1] https://etherpad.opendev.org/p/qa-wallaby-ptg [2] https://doodle.com/poll/qqd7ayz3i4ubnsbb Best Regards, -- Masayuki Igawa Key fingerprint = C27C 2F00 3A2A 999A 903A 753D 290F 53ED C899 BF89 From radoslaw.piliszek at gmail.com Tue Aug 25 11:45:52 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 25 Aug 2020 13:45:52 +0200 Subject: [qa] Wallaby PTG planning In-Reply-To: References: Message-ID: Thanks, Masayuki. I added myself. I hope we can get it non-colliding with Kolla meetings this time. I'll try to do a better job at early collision detection. :-) -yoctozepto On Tue, Aug 25, 2020 at 1:16 PM Masayuki Igawa wrote: > > Hi, > > We need to start thinking about the next cycle already. > As you probably know, next virtual PTG will be held in October 26-30[0]. > > I prepared an etherpad[1] to discuss and track our topics. So, please add > your name if you are going to attend the PTG session. And also, please add > your proposals of the topics which you want to discuss during the PTG. > > I also made a doodle[2] with possible time slots. Please put your best days and hours > so that we can try to schedule and book our sessions in the time slots. > > [0] https://www.openstack.org/ptg/ > [1] https://etherpad.opendev.org/p/qa-wallaby-ptg > [2] https://doodle.com/poll/qqd7ayz3i4ubnsbb > > Best Regards, > -- Masayuki Igawa > Key fingerprint = C27C 2F00 3A2A 999A 903A 753D 290F 53ED C899 BF89 > From mnaser at vexxhost.com Tue Aug 25 13:23:27 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 25 Aug 2020 09:23:27 -0400 Subject: [tc] weekly update Message-ID: Hi everyone, Here’s an update for what happened in the OpenStack TC this week. You can get more information by checking for changes in openstack/governance repository. We've also included a few references to some important mailing list threads that you should check out. 
# Patches ## Open Reviews - Add assert:supports-standalone https://review.opendev.org/722399 - Add etcd3gw to Oslo https://review.opendev.org/747188 - Update and simplify comparison of working groups https://review.opendev.org/746763 - Drop requirement of 1/3 positive TC votes to land https://review.opendev.org/746711 - Resolution to define distributed leadership for projects https://review.opendev.org/744995 - Move towards dual office hours in diff TZ https://review.opendev.org/746167 - Create starter-kit:kubernetes-in-virt tag https://review.opendev.org/736369 - Drop all exceptions for legacy validation https://review.opendev.org/745403 - Move towards single office hour https://review.opendev.org/745200 ## General Changes - Fix names inside check-review-status https://review.opendev.org/745913 # Email Threads - Zuul Native Jobs Goal Update #2: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016561.html - Masakari Project Aliveness: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016520.html - vPTG October 2020 Signup: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016497.html - OpenStack Client vs python-*clients: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016409.html Thanks for reading! Mohammed & Kendall -- Mohammed Naser VEXXHOST, Inc. From ts-takahashi at nec.com Tue Aug 25 05:40:52 2020 From: ts-takahashi at nec.com (=?utf-8?B?VEFLQUhBU0hJIFRPU0hJQUtJKOmrmOapi+OAgOaVj+aYjik=?=) Date: Tue, 25 Aug 2020 05:40:52 +0000 Subject: [tacker] IRC meeting In-Reply-To: <636004ca-130b-58ee-c769-19169926fcee@gmail.com> References: <636004ca-130b-58ee-c769-19169926fcee@gmail.com> Message-ID: Hi Yasufumi and Tacker team, Can I host the meeting? I have 1 topic, feedback from NFV-TST. Regards, Toshiaki -------------------------------------------------  Toshiaki Takahashi  E-mail: ts-takahashi at nec.com ------------------------------------------------- > -----Original Message----- > From: Yasufumi Ogawa > Sent: Tuesday, August 25, 2020 2:28 PM > To: openstack-discuss > Subject: [tacker] IRC meeting > > Hi tacker team, > > I am not available to join IRC meeting today unfortunately. I would like to > suggest to anyone host the meeting, or skip it if no items. > > Thanks, > Yasufumi From sandeep.ee.nagendra at gmail.com Tue Aug 25 06:15:41 2020 From: sandeep.ee.nagendra at gmail.com (sandeep) Date: Tue, 25 Aug 2020 11:45:41 +0530 Subject: [Cliff] [dev] auto completion not working inside interactive mode In-Reply-To: References: Message-ID: Hi Team, In my system, I am trying auto completion for my CLI application. *CLIFF version - cliff==3.4.0* Auto complete works fine on bash prompt. But inside the interactive shell, auto complete does not work. Below is the screenshot for the help command inside the interactive shell. [image: image.png] Now, if I type swm and press tab, it lists all the sub commands under it. But, swm s gives swm "s and further command auto completion does not work. [image: image.png] Could you please let me know what could be the problem? Is this a known issue? or Am i missing something? Thanks, Sandeep -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 28735 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 7345 bytes Desc: not available URL: From amy at demarco.com Tue Aug 25 05:28:33 2020 From: amy at demarco.com (Amy Marrich) Date: Tue, 25 Aug 2020 00:28:33 -0500 Subject: [openstack-community] Error add member to pool ( OCTAVIA ) when using SSL to verify In-Reply-To: <692B1576-9AB1-46F9-9328-0D510DDCEE01@hxcore.ol> References: <692B1576-9AB1-46F9-9328-0D510DDCEE01@hxcore.ol> Message-ID: <59EC5E93-FC3F-4EDC-A874-9A2F466B37DC@demarco.com> Adding the OpenStack discuss list. Amy (spotz) > On Aug 24, 2020, at 11:14 PM, Vinh Nguyen Duc wrote: > >  > Dear Openstack community, > > My name is Duc Vinh, I am newer in Openstack > I am deploy Openstack Ussuri on Centos8 , I am using three nodes controller with High Availability topology and using HAproxy to verify cert for connect HTTPS, > I have trouble with project Octavia, I cannot add member in a pool after created Loadbalancer, listener, pool ( everything is fine). > Here is my log and configuration file: > > LOGS: > > 2020-08-25 10:55:42.872 226250 DEBUG octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Neutron extension security-group found enabled _check_extension_enabled /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > 2020-08-25 10:55:42.892 226250 DEBUG octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Neutron extension dns-integration is not enabled _check_extension_enabled /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:70 > 2020-08-25 10:55:42.911 226250 DEBUG octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Neutron extension qos found enabled _check_extension_enabled /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > 2020-08-25 10:55:42.933 226250 DEBUG octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Neutron extension allowed-address-pairs found enabled _check_extension_enabled /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > 2020-08-25 10:55:43.068 226250 WARNING keystoneauth.identity.generic.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Failed to discover available identity versions when contacting https://192.168.10.150:5000. Attempting to parse version from URL.: keystoneauth1.exceptions.connection.SSLError: SSL exception connecting to https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Error retrieving subnet (subnet id: 035f3183-f469-415f-b536-b4a81364e814.: keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. 
SSL exception connecting to https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base chunked=chunked) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._validate_conn(conn) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 839, in _validate_conn > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base conn.connect() > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 344, in connect > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base ssl_context=context) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 367, in ssl_wrap_socket > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return context.wrap_socket(sock) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", line 365, in wrap_socket > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base _context=self, _session=session) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", line 776, in __init__ > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self.do_handshake() > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._sslobj.do_handshake() > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._sslobj.do_handshake() > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base During handling of the above exception, another exception occurred: > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 449, in send > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base timeout=timeout > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File 
"/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base _stacktrace=sys.exc_info()[2]) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base raise MaxRetryError(_pool, url, error or ResponseError(cause)) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base During handling of the above exception, another exception occurred: > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1004, in _send_request > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resp = self.session.request(method, url, **kwargs) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 533, in request > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resp = self.send(prep, **send_kwargs) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 646, in send > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base r = adapter.send(request, **kwargs) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 514, in send > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base raise SSLError(e, request=request) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base requests.exceptions.SSLError: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base During handling of the above exception, another exception occurred: > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", line 138, in _do_create_plugin > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base authenticated=False) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 610, in get_discovery > 2020-08-25 10:55:43.070 226250 ERROR 
octavia.network.drivers.neutron.base authenticated=authenticated) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 1452, in get_discovery > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base disc = Discover(session, url, authenticated=authenticated) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 536, in __init__ > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base authenticated=authenticated) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 102, in get_version_data > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resp = session.get(url, headers=headers, authenticated=authenticated) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1123, in get > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return self.request(url, 'GET', **kwargs) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 913, in request > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resp = send(**kwargs) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1008, in _send_request > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base raise exceptions.SSLError(msg) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base keystoneauth1.exceptions.connection.SSLError: SSL exception connecting to https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base During handling of the above exception, another exception occurred: > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py", line 193, in _get_resource > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resource_type)(resource_id) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 869, in show_subnet > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return self.get(self.subnet_path % (subnet), params=_params) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 354, in get > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base headers=headers, params=params) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File 
"/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 331, in retry_request > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base headers=headers, params=params) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 282, in do_request > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base headers=headers) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 339, in do_request > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._check_uri_length(url) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 332, in _check_uri_length > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base uri_len = len(self.endpoint_url) + len(url) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 346, in endpoint_url > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return self.get_endpoint() > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 282, in get_endpoint > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return self.session.get_endpoint(auth or self.auth, **kwargs) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1225, in get_endpoint > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return auth.get_endpoint(self, **kwargs) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 380, in get_endpoint > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base allow_version_hack=allow_version_hack, **kwargs) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 271, in get_endpoint_data > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base service_catalog = self.get_access(session).service_catalog > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 134, in get_access > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self.auth_ref = self.get_auth_ref(session) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", line 206, in get_auth_ref > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._plugin = self._do_create_plugin(session) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", line 161, in _do_create_plugin > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base 'auth_url is correct. 
%s' % e) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. SSL exception connecting to https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > 2020-08-25 10:55:43.074 226250 DEBUG wsme.api [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Client-side error: Subnet 035f3183-f469-415f-b536-b4a81364e814 not found. format_exception /usr/lib/python3.6/site-packages/wsme/api.py:222 > 2020-08-25 10:55:43.076 226250 DEBUG octavia.common.keystone [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Request path is / and it does not require keystone authentication process_request /usr/lib/python3.6/site-packages/octavia/common/keystone.py:77 > 2020-08-25 10:55:43.080 226250 DEBUG octavia.common.keystone [req-5091d326-0cb4-4ae1-bf4b-9ef6b9313dca - - - - -] Request path is / and it does not require keystone authentication process_request /usr/lib/python3.6/site-packages/octavia/common/keystone.py:77 > > Configuration: > [root at controller01 ~]# cat /etc/octavia/octavia.conf > [DEFAULT] > > log_dir = /var/log/octavia > debug = True > transport_url = rabbit://openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.178:5672,openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.179:5672,openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.28:5672 > > [api_settings] > api_base_uri = https://192.168.10.150:9876 > bind_host = 192.168.10.178 > bind_port = 9876 > auth_strategy = keystone > healthcheck_enabled = True > allow_tls_terminated_listeners = True > > [database] > connection = mysql+pymysql://octavia:FUkbii8AY4G6H9LxbJ2RRlOzHN61X8PI8FrMcuXQ at 192.168.10.150/octavia > max_retries = -1 > > [health_manager] > bind_port = 5555 > bind_ip = 192.168.10.178 > controller_ip_port_list = 192.168.10.178:5555, 192.168.10.179:5555, 192.168.10.28:5555 > heartbeat_key = insecure > > [keystone_authtoken] > service_token_roles_required = True > www_authenticate_uri = https://192.168.10.150:5000 > auth_url = https://192.168.10.150:5000 > region_name = Hanoi > memcached_servers = 192.168.10.178:11211,192.168.10.179:11211,192.168.10.28:11211 > auth_type = password > project_domain_name = Default > user_domain_name = Default > project_name = service > username = octavia > password = esGn3rN3iJOAD2HXmqznFPI9oAY2wQNDWYwqJaCH > cafile = /etc/ssl/private/haproxy.pem > insecure = false > > > [certificates] > cert_generator = local_cert_generator > #server_certs_key_passphrase = insecure-key-do-not-use-this-key > ca_private_key_passphrase = esGn3rN3iJOAD2HXmqznFPI9oAY2wQNDWYwqJaCH > ca_private_key = /etc/octavia/certs/server_ca.key.pem > ca_certificate = /etc/octavia/certs/server_ca.cert.pem > region_name = Hanoi > ca_certificates_file = /etc/ssl/private/haproxy.pem > endpoint_type = internal > > [networking] > #allow_vip_network_id = True > #allow_vip_subnet_id = True > #allow_vip_port_id = True > > [haproxy_amphora] > #bind_port = 9443 > server_ca = /etc/octavia/certs/server_ca.cert.pem > client_cert = /etc/octavia/certs/client.cert-and-key.pem > base_path = 
/var/lib/octavia > base_cert_dir = /var/lib/octavia/certs > connection_max_retries = 1500 > connection_retry_interval = 1 > > [controller_worker] > amp_image_tag = amphora > amp_ssh_key_name = octavia > amp_secgroup_list = 80f44b73-dc9f-48aa-a0b8-8b78e5c6585c > amp_boot_network_list = 04425cb2-5963-48f5-a229-b89b7c6036bd > amp_flavor_id = 200 > network_driver = allowed_address_pairs_driver > compute_driver = compute_nova_driver > amphora_driver = amphora_haproxy_rest_driver > client_ca = /etc/octavia/certs/client_ca.cert.pem > loadbalancer_topology = SINGLE > amp_active_retries = 9999 > > [task_flow] > [oslo_messaging] > topic = octavia_prov > rpc_thread_pool_size = 2 > > [house_keeping] > [amphora_agent] > [keepalived_vrrp] > > [service_auth] > auth_url = https://192.168.10.150:5000 > auth_type = password > project_domain_name = default > user_domain_name = default > project_name = admin > username = admin > password = F35sXAYW5qDlMGfQbhmexIx12DqrQdpw6ixAseTd > cafile = /etc/ssl/private/haproxy.pem > region_name = Hanoi > memcached_servers = 192.168.10.178:11211,192.168.10.179:11211,192.168.10.28:11211 > #insecure = true > > > [glance] > ca_certificates_file = /etc/ssl/private/haproxy.pem > region_name = Hanoi > endpoint_type = internal > insecure = false > > [neutron] > ca_certificates_file = /etc/ssl/private/haproxy.pem > region_name = Hanoi > endpoint_type = internal > insecure = false > > [cinder] > ca_certificates_file = /etc/ssl/private/haproxy.pem > region_name = Hanoi > endpoint_type = internal > insecure = false > > [nova] > ca_certificates_file = /etc/ssl/private/haproxy.pem > region_name = Hanoi > endpoint_type = internal > insecure = false > > [oslo_policy] > #policy_file = /etc/octavia/policy.json > > [oslo_messaging_notifications] > transport_url = rabbit://openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.178:5672,openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.179:5672,openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.28:5672 > > _______________________________________________ > Community mailing list > Community at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/community -------------- next part -------------- An HTML attachment was scrubbed... URL: From harishkumarivaturi at gmail.com Tue Aug 25 12:30:27 2020 From: harishkumarivaturi at gmail.com (HARISH KUMAR Ivaturi) Date: Tue, 25 Aug 2020 14:30:27 +0200 Subject: Openstack with Nginx Support Message-ID: Hi I am Harish Kumar, Master Student at BTH, Karlskrona, Sweden. I am working on my Master thesis at BTH and my thesis topic is Performance evaluation of OpenStack with HTTP/3. I have successfully built curl and nginx with HTTP/3 support and I am performing some commands using curl for generating tokens so i could access the services of OpenStack. OpenStack relies with the Apache web server and I could not get any results using Nginx HTTP/3 . I would like to ask if there is any official documentation on OpenStack relying with Nginx?, I have searched in the internet reg. this info but could not get any, I would like to use nginx instead of apache web server , so I could get some results by performing curl and commands and nginx web server (with http/3 support). Please let me know and if there is any content please share with me. I hope you have understood this. It would be helpful for my Master Thesis. BR Harish Kumar -------------- next part -------------- An HTML attachment was scrubbed... 
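One common pattern for the setup asked about above, since most OpenStack API services (Keystone, for example) are WSGI applications, is to run them under uwsgi and let Nginx terminate TLS in front of them instead of Apache. A minimal, untested sketch of such an Nginx server block follows; every path, port and socket name is an assumption for illustration only, and the HTTP/3 listener directives would additionally depend on the patched Nginx build in use:

server {
    # TLS termination in Nginx; certificate paths are placeholders.
    listen 5000 ssl;
    ssl_certificate     /etc/ssl/private/keystone.crt;
    ssl_certificate_key /etc/ssl/private/keystone.key;

    location / {
        # Proxy to a Keystone public-API uwsgi instance bound to a local
        # socket (for example, uwsgi running the keystone-wsgi-public script).
        include uwsgi_params;
        uwsgi_pass unix:/var/run/uwsgi/keystone-wsgi-public.socket;
    }
}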
URL: From wbedyk at suse.de Tue Aug 25 12:56:51 2020 From: wbedyk at suse.de (Witek Bedyk) Date: Tue, 25 Aug 2020 14:56:51 +0200 Subject: [monasca] Deprecate monasca-transform repository Message-ID: Hello, this message is to announce the deprecation of openstack/monasca-transform repository. The project will not accept new development on master branch but accept fixes on stable branches. It will follow the process described in Project Team Guide [1]. Please reply to this message until Sept. 7 if you would like to take over the development and maintenance of this repository. Thanks Witek [1] https://docs.openstack.org/project-team-guide/repository.html#deprecating-a-repository From rosmaita.fossdev at gmail.com Tue Aug 25 14:16:09 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 25 Aug 2020 10:16:09 -0400 Subject: [all][infra] READMEs of zuul roles not rendered properly - missing content In-Reply-To: References: <20200824143618.7xdecj67m5jzwpkz@yuggoth.org> Message-ID: <14978702-3919-943f-2750-3ecae1201a68@gmail.com> On 8/24/20 11:05 AM, Clark Boylan wrote: > On Mon, Aug 24, 2020, at 7:36 AM, Jeremy Stanley wrote: >> On 2020-08-24 16:12:17 +0200 (+0200), Martin Kopec wrote: >>> I've noticed that READMEs of zuul roles within openstack projects >>> are not rendered properly on opendev.org - ".. zuul:rolevar::" >>> syntax seems to be the problem. Although it's rendered well on >>> github.com, see f.e. [1] [2]. [snip] >> To be entirely honest, I wish Gitea didn't automatically attempt to >> render RST files, that makes it harder to actually refer to the >> source code for them, and it's a source code browser not a CMS for >> publishing documentation, but apparently this is a feature many >> other users do like for some reason. > > We can change this behavior by removing the external renderer (though I expect we're in the minority of preferring ability to link to the source here). This may be a bigger minority that you think ... I put up a patch to change the default behavior to not render RST, so anyone with a strong opinion, please comment on the patch: https://review.opendev.org/#/c/747796/ > > [3] https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gitea/templates/app.ini.j2#L88-L95 > [4] https://opendev.org/opendev/system-config/src/branch/master/docker/gitea/Dockerfile#L92-L94 > >> -- >> Jeremy Stanley > From emilien at redhat.com Tue Aug 25 14:32:32 2020 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 25 Aug 2020 10:32:32 -0400 Subject: [tripleo] no recheck please Message-ID: We're hitting the docker rate limits very badly right now and while our mitigation patch will land [1], please refrain from approving or recheck patches for now. I've cleared the gate and I'll take care of re-adding these patches into the gate when things will be stable again. [1] https://review.opendev.org/#/c/746993 Thanks for your understanding and your patience! -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From pramchan at yahoo.com Tue Aug 25 15:53:39 2020 From: pramchan at yahoo.com (prakash RAMCHANDRAN) Date: Tue, 25 Aug 2020 15:53:39 +0000 (UTC) Subject: Openstack with Nginx Support (HARISH KUMAR Ivaturi) In-Reply-To: References: Message-ID: <680450507.7321322.1598370819069@mail.yahoo.com> Harish, Note Horizon dashboard is based on Django framework over Apache. 
Thus logically it should work if you deploy Django over Nginx and please refer to link getting Django and once you have that rest should work as the Model, View, Controller (MVC)  take care of addressing the rest. I have not seen any Ngnix deployment of Open stack, but a single domain Open stack Controller  should be possible to deploy with Nginx. You can also reach out to Ngnix or F5 team to help you out, as this is a good  exercise for leveraging capability of Nginix for OpenSrack https://uwsgi-docs.readthedocs.io/en/latest/tutorials/Django_and_nginx.html ThanksPrakash ---------------------------------------------------------------------- Message: 1 Date: Tue, 25 Aug 2020 14:30:27 +0200 From: HARISH KUMAR Ivaturi To: openstack-discuss at lists.openstack.org Subject: Openstack with Nginx Support Message-ID:     Content-Type: text/plain; charset="utf-8" Hi I am Harish Kumar, Master Student at BTH, Karlskrona, Sweden. I am working on my Master thesis at BTH and my thesis topic is Performance evaluation of OpenStack with HTTP/3. I have successfully built curl and nginx with HTTP/3 support and I am performing some commands using curl for generating tokens so i could access the services of OpenStack. OpenStack relies with the Apache web server and I could not get any results using Nginx HTTP/3 . I would like to ask if there is any official documentation on OpenStack relying with Nginx?, I have searched in the internet reg. this info but could not get any, I would like to use nginx instead of apache web server , so I could get some results by performing curl and commands and nginx web server (with http/3 support). Please let me know and if there is any content please share with me. I hope you have understood this. It would be helpful for my Master Thesis. BR Harish Kumar -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue Aug 25 16:23:49 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 25 Aug 2020 16:23:49 +0000 Subject: [OSSA-2020-006] Nova: Live migration fails to update persistent domain XML (CVE-2020-17376) Message-ID: <20200825162348.heaisepopqhmnfli@yuggoth.org> =================================================================== OSSA-2020-006: Live migration fails to update persistent domain XML =================================================================== :Date: August 25, 2020 :CVE: CVE-2020-17376 Affects ~~~~~~~ - Nova: <19.3.1, >=20.0.0 <20.3.1, ==21.0.0 Description ~~~~~~~~~~~ Tadayoshi Hosoya (NEC) and Lee Yarwood (Red Hat) reported a vulnerability in Nova live migration. By performing a soft reboot of an instance which has previously undergone live migration, a user may gain access to destination host devices that share the same paths as host devices previously referenced by the virtual machine on the source. This can include block devices that map to different Cinder volumes on the destination than the source. The risk is increased significantly in non-default configurations allowing untrusted users to initiate live migrations, so administrators may consider temporarily disabling this in policy if they cannot upgrade immediately. This only impacts deployments where users are allowed to perform soft reboots of server instances; it is recommended to disable soft reboots in policy (only allowing hard reboots) until the fix can be applied. 
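For deployments in the non-default configuration mentioned above (untrusted users allowed to initiate live migrations), a further temporary mitigation is to restore the admin-only default for the live-migration policy until the fix is applied. A minimal override sketch, assuming Nova's standard rule name and a JSON policy file; the value shown is simply Nova's upstream default:

{
    "os_compute_api:os-migrate-server:migrate_live": "rule:admin_api"
}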
Patches ~~~~~~~ - https://review.opendev.org/747978 (Pike) - https://review.opendev.org/747976 (Queens) - https://review.opendev.org/747975 (Rocky) - https://review.opendev.org/747974 (Stein) - https://review.opendev.org/747973 (Train) - https://review.opendev.org/747972 (Ussuri) - https://review.opendev.org/747969 (Victoria) Credits ~~~~~~~ - Tadayoshi Hosoya from NEC (CVE-2020-17376) - Lee Yarwood from Red Hat (CVE-2020-17376) References ~~~~~~~~~~ - https://launchpad.net/bugs/1890501 - http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-17376 Notes ~~~~~ - The stable/rocky, stable/queens, and stable/pike branches are under extended maintenance and will receive no new point releases, but patches for them are provided as a courtesy. -- Jeremy Stanley OpenStack Vulnerability Management Team -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rfolco at redhat.com Tue Aug 25 23:22:16 2020 From: rfolco at redhat.com (Rafael Folco) Date: Tue, 25 Aug 2020 20:22:16 -0300 Subject: [tripleo] TripleO CI Summary: Unified Sprint 31 Message-ID: Greetings, The TripleO CI team has just completed **Unified Sprint 31** (July 31 thru Aug 20). The following is a summary of completed work during this sprint cycle*: - Continued building internal component and integration pipelines for rhos-16.2. - Added more jobs to the component and integration pipelines. - Completed promoter code and test scenarios to run on CentOS8/Python3. - Continued merging changes to switch to the new configuration engine in promoter code. - Merged all patches for CentOS-7 -> CentOS-8 stable/train upstream migration. - Design improvements to Tempest scenario manager are under review. - Python3 support on diskimage-builder and buildimage role in tripleo-ci repo is under review. - Ruck/Rover recorded notes [1]. The planned work for the next sprint extends the work started in the previous sprint and focuses on the following: - Downstream OSP 16.2 pipeline. - Next-gen promoter changes (new configuration engine). - Dependency pipeline design to early detect breakages in the OS. - New container naming prefix on Victoria/Master onwards. The Ruck and Rover for this sprint are Arx Cruz (arxcruz) and Amol Kahat (akahat). Please direct questions or queries to them regarding CI status or issues in #tripleo, ideally to whomever has the ‘|ruck’ suffix on their nick. Ruck/rover notes to be tracked in hackmd [2]. Thanks, rfolco *TripleO-CI team is now using an internal JIRA instance to track sprint work [1] https://hackmd.io/QnprH9-yRTi6uWlEfaahoQ [2] https://hackmd.io/FUalpr55TJuy28QLp2tLng -- Folco -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Wed Aug 26 02:06:54 2020 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 25 Aug 2020 22:06:54 -0400 Subject: [tripleo] no recheck please In-Reply-To: References: Message-ID: We merged: https://review.opendev.org/747953 - Disable docker.io mirrors (makes direct calls against registry API instead of going through proxy via single public IP and hit rate limits) https://review.opendev.org/746993 - Use new modify_only_with_source (reduces number of hits against registry API) So for now it's safe to recheck / +2 +A patches again. 
Thanks for your patience On Tue, Aug 25, 2020 at 10:32 AM Emilien Macchi wrote: > We're hitting the docker rate limits very badly right now and while our > mitigation patch will land [1], please refrain from approving or recheck > patches for now. > I've cleared the gate and I'll take care of re-adding these patches into > the gate when things will be stable again. > > [1] https://review.opendev.org/#/c/746993 > > Thanks for your understanding and your patience! > -- > Emilien Macchi > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From Istvan.Szabo at agoda.com Wed Aug 26 06:57:33 2020 From: Istvan.Szabo at agoda.com (Szabo, Istvan (Agoda)) Date: Wed, 26 Aug 2020 06:57:33 +0000 Subject: DB Prune Message-ID: <859fb3c996514c2ead3fc1ce3de4210b@SG-AGMBX-6002.agoda.local> Hi, We have a cluster where the user continuously spawn and delete servers which makes the db even in compressed state 1.1GB. I'm sure it has a huge amount of trash because this is a cicd environment and the prod just uses 75MB. How is it possible to cleanup the db on a safe way, what should be the steps? Best regards, Istvan ________________________________ This message is confidential and is for the sole use of the intended recipient(s). It may also be privileged or otherwise protected by copyright or other legal rules. If you have received it by mistake please let us know by reply email and delete it from your system. It is prohibited to copy this message or disclose its content to anyone. Any confidentiality or privilege is not waived or lost by any mistaken delivery or unauthorized disclosure of the message. All messages sent to and from Agoda may be monitored to ensure compliance with company policies, to protect the company's interests and to remove potential malware. Electronic messages may be intercepted, amended, lost or deleted, or contain viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre-samuel.le-stang at corp.ovh.com Wed Aug 26 07:12:23 2020 From: pierre-samuel.le-stang at corp.ovh.com (Pierre-Samuel LE STANG) Date: Wed, 26 Aug 2020 09:12:23 +0200 Subject: DB Prune In-Reply-To: <859fb3c996514c2ead3fc1ce3de4210b@SG-AGMBX-6002.agoda.local> References: <859fb3c996514c2ead3fc1ce3de4210b@SG-AGMBX-6002.agoda.local> Message-ID: <20200826071223.dih7c7pbeat3sqah@corp.ovh.com> Hey, You may have a look at OSArchiver (OpenStack DB archiver) which is a tool we use at OVHCloud to archive our OpenStack databases. We open sourced it last year but this is not an official OpenStack tool. https://github.com/ovh/osarchiver -- PS Szabo, Istvan (Agoda) wrote on mer. [2020-août-26 06:57:33 +0000]: > Hi, > > We have a cluster where the user continuously spawn and delete servers which > makes the db even in compressed state 1.1GB. > I’m sure it has a huge amount of trash because this is a cicd environment and > the prod just uses 75MB. > How is it possible to cleanup the db on a safe way, what should be the steps? > > Best regards, > Istvan > > > ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ > > This message is confidential and is for the sole use of the intended recipient > (s). It may also be privileged or otherwise protected by copyright or other > legal rules. If you have received it by mistake please let us know by reply > email and delete it from your system. It is prohibited to copy this message or > disclose its content to anyone. 
Any confidentiality or privilege is not waived > or lost by any mistaken delivery or unauthorized disclosure of the message. All > messages sent to and from Agoda may be monitored to ensure compliance with > company policies, to protect the company's interests and to remove potential > malware. Electronic messages may be intercepted, amended, lost or deleted, or > contain viruses. > -- Pierre-Samuel Le Stang From arne.wiebalck at cern.ch Wed Aug 26 08:30:56 2020 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Wed, 26 Aug 2020 10:30:56 +0200 Subject: [baremetal-sig][ironic] Future work and regular meetings Message-ID: <4f6c5ffd-0929-f516-4299-f69892b1d434@cern.ch> Dear all, With the release of the bare metal white paper [0] the bare metal SIG has completed its first target and is now ready to tackle new challenges. A number of potential topics the SIG could work on were raised during the recent opendev events. The suggestions are summarised on the bare metal etherpad [1]. To select and organise the future work, we feel that it may be better to start with regular meetings, though: the current idea is once a month, for one hour, on zoom. Based on the experience with the ad-hoc meetings we had so far I have set up a doodle to pick the exact slot: https://doodle.com/poll/3hpypw73455t2g24 If interested, please respond by the end of this week. Equally, if you have additional suggestions for the next focus of the SIG, do not hesitate to add them to [1]. Thanks! Arne [0] https://www.openstack.org/use-cases/bare-metal/how-ironic-delivers-abstraction-and-automation-using-open-source-infrastructure [1] https://etherpad.opendev.org/p/bare-metal-sig From zhangbailin at inspur.com Wed Aug 26 08:33:44 2020 From: zhangbailin at inspur.com (=?gb2312?B?QnJpbiBaaGFuZyjVxbDZwdYp?=) Date: Wed, 26 Aug 2020 08:33:44 +0000 Subject: [nova] Add 'accel_uuids' parameter to rebuild() function of the virt driver Message-ID: Hi all. In Ussuri release we were completed the nova-cyborg-interaction feature, but there are some operations of instance were blocked [1], we will support evacuate/rebuild [2] and/or shelve/unshelve [2] instance with accelerator in Victoria release. In [2] we will add 'accel_uuids' parameter to the rebuild() method of virt driver and Ironic driver, in virt/driver [4] we are not implemented the rebuild() method, and the 'accel_uuids' will be ignored in virt/ironic/driver. [1] https://docs.openstack.org/api-guide/compute/accelerator-support.html [2] Cyborg evacuate/rebuild support https://review.opendev.org/#/c/715326 [3] Cyborg shelve/unshelve support https://review.opendev.org/#/c/729563 [4] https://github.com/openstack/nova/blob/master/nova/virt/driver.py#L285 [5] https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L1669 brinzhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From amotoki at gmail.com Wed Aug 26 08:38:00 2020 From: amotoki at gmail.com (Akihiro Motoki) Date: Wed, 26 Aug 2020 17:38:00 +0900 Subject: [horizon] default create_volume setting can't be changed In-Reply-To: <20200824141904.Horde.biUwyDcXRQDK2D0KW6vwbE1@webmail.nde.ag> References: <20200824141904.Horde.biUwyDcXRQDK2D0KW6vwbE1@webmail.nde.ag> Message-ID: Hi Eugen, I also noticed this and filed a bug report at https://bugs.launchpad.net/horizon/+bug/1892990. It was caused by a missing comma in REST_API_REQUIRED_SETTINGS in openstack_dashboard/defaults.py. It was fixed in the master this month. It affects stable/train and stable/ussuri branches. 
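As a short illustration of why a single missing comma can silently hide a setting (the adjacent entry names below are examples, not necessarily the exact pair in defaults.py): adjacent string literals in a Python list are concatenated into one item instead of raising an error, so the affected setting simply disappears from the required list.

REST_API_REQUIRED_SETTINGS = [
    'LAUNCH_INSTANCE_DEFAULTS'   # missing comma merges this with the next literal
    'OPENSTACK_IMAGE_FORMATS',
]
print(REST_API_REQUIRED_SETTINGS)
# ['LAUNCH_INSTANCE_DEFAULTSOPENSTACK_IMAGE_FORMATS']
# The merged name matches no real setting, so LAUNCH_INSTANCE_DEFAULTS is
# never exposed to the dashboard and its defaults cannot be overridden.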
I proposed backports to ussuri and train respectively. Cloud you try the stable/train backport? https://review.opendev.org/#/q/I1eae4be4464f55a29d169403a70c958c3b8a308b Thanks, Akihiro Motoki (irc: amotoki) On Mon, Aug 24, 2020 at 11:21 PM Eugen Block wrote: > > Hi *, > > we recently upgraded from Ocata to Train and I'm struggling with a > specific setting: I believe since Pike version the default for > "create_volume" changed to "true" when launching instances from > Horizon dashboard. I would like to change that to "false" and set it > in our custom > /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.d/_100_local_settings.py: > > > LAUNCH_INSTANCE_DEFAULTS = { > 'config_drive': False, > 'create_volume': False, > 'hide_create_volume': False, > 'disable_image': False, > 'disable_instance_snapshot': False, > 'disable_volume': False, > 'disable_volume_snapshot': False, > 'enable_scheduler_hints': True, > } > > Other configs from this file work as expected, so that custom file > can't be the reason. > After apache and memcached restart nothing changes, the default is > still "true". Can anyone shed some light, please? I haven't tried > other configs yet so I can't tell if more options are affected. > > Thanks! > Eugen > > From eblock at nde.ag Wed Aug 26 08:55:09 2020 From: eblock at nde.ag (Eugen Block) Date: Wed, 26 Aug 2020 08:55:09 +0000 Subject: [horizon] default create_volume setting can't be changed In-Reply-To: References: <20200824141904.Horde.biUwyDcXRQDK2D0KW6vwbE1@webmail.nde.ag> Message-ID: <20200826085509.Horde.q7sOlWnVkEVEkT1R3RLt0NM@webmail.nde.ag> Hi, thank you very much for the confirmation and the bug report. Setting the comma seems to do the trick, I reverted my own changes and only added the comma, after restarting apache the dashboard applied my settings. Thanks for the quick solution! Best regards, Eugen Zitat von Akihiro Motoki : > Hi Eugen, > > I also noticed this and filed a bug report at > https://bugs.launchpad.net/horizon/+bug/1892990. > It was caused by a missing comma in REST_API_REQUIRED_SETTINGS in > openstack_dashboard/defaults.py. > It was fixed in the master this month. It affects stable/train and > stable/ussuri branches. > > I proposed backports to ussuri and train respectively. > Cloud you try the stable/train backport? > https://review.opendev.org/#/q/I1eae4be4464f55a29d169403a70c958c3b8a308b > > Thanks, > Akihiro Motoki (irc: amotoki) > > On Mon, Aug 24, 2020 at 11:21 PM Eugen Block wrote: >> >> Hi *, >> >> we recently upgraded from Ocata to Train and I'm struggling with a >> specific setting: I believe since Pike version the default for >> "create_volume" changed to "true" when launching instances from >> Horizon dashboard. I would like to change that to "false" and set it >> in our custom >> /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.d/_100_local_settings.py: >> >> >> LAUNCH_INSTANCE_DEFAULTS = { >> 'config_drive': False, >> 'create_volume': False, >> 'hide_create_volume': False, >> 'disable_image': False, >> 'disable_instance_snapshot': False, >> 'disable_volume': False, >> 'disable_volume_snapshot': False, >> 'enable_scheduler_hints': True, >> } >> >> Other configs from this file work as expected, so that custom file >> can't be the reason. >> After apache and memcached restart nothing changes, the default is >> still "true". Can anyone shed some light, please? I haven't tried >> other configs yet so I can't tell if more options are affected. >> >> Thanks! 
>> Eugen >> >> From thierry at openstack.org Wed Aug 26 09:00:52 2020 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 26 Aug 2020 11:00:52 +0200 Subject: [largescale-sig] Next meeting: August 26, 8utc In-Reply-To: References: Message-ID: <362079ea-ef67-4e7b-a4f5-2f9ea17e7f95@openstack.org> During our meeting today we discussed Summit/PTG plans, and agreed to request one Forum session on scaling stories, and one PTG short meeting to replace our regular meeting that week. Meeting logs at: http://eavesdrop.openstack.org/meetings/large_scale_sig/2020/large_scale_sig.2020-08-26-08.00.html TODOs: - all to contact US large deployment friends to invite them to next EU-US meeting - ttx to request Forum/PTG sessions - belmoreira, ttx to push for OSops resurrection - all to describe briefly how you solved metrics/billing in your deployment in https://etherpad.openstack.org/p/large-scale-sig-documentation - masahito to push latest patches to oslo.metrics - ttx to look into a basic test framework for oslo,metrics - amorin to see if oslo.metrics could be tested at OVH Next meetings: Sep 9, 16:00UTC; Sep 23, 8:00UTC (#openstack-meeting-3) -- Thierry Carrez (ttx) From balazs.gibizer at est.tech Wed Aug 26 12:15:29 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?q?Bal=E1zs?= Gibizer) Date: Wed, 26 Aug 2020 14:15:29 +0200 Subject: DB Prune In-Reply-To: <859fb3c996514c2ead3fc1ce3de4210b@SG-AGMBX-6002.agoda.local> References: <859fb3c996514c2ead3fc1ce3de4210b@SG-AGMBX-6002.agoda.local> Message-ID: On Wed, Aug 26, 2020 at 06:57, "Szabo, Istvan (Agoda)" wrote: > Hi, > > We have a cluster where the user continuously spawn and delete > servers which makes the db even in compressed state 1.1GB. > I’m sure it has a huge amount of trash because this is a cicd > environment and the prod just uses 75MB. > How is it possible to cleanup the db on a safe way, what should be > the steps? > From Nova perspective you can get rid of the data of the already deleted instances via the following two commands: nova-manage db archive_deleted_rows nova-manage db purge Cheers, gibi [1]https://docs.openstack.org/nova/latest/cli/nova-manage.html > > Best regards, > Istvan > > > This message is confidential and is for the sole use of the intended > recipient(s). It may also be privileged or otherwise protected by > copyright or other legal rules. If you have received it by mistake > please let us know by reply email and delete it from your system. It > is prohibited to copy this message or disclose its content to anyone. > Any confidentiality or privilege is not waived or lost by any > mistaken delivery or unauthorized disclosure of the message. All > messages sent to and from Agoda may be monitored to ensure compliance > with company policies, to protect the company's interests and to > remove potential malware. Electronic messages may be intercepted, > amended, lost or deleted, or contain viruses. From sean.mcginnis at gmx.com Wed Aug 26 14:12:22 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 26 Aug 2020 09:12:22 -0500 Subject: [releases] Dropping my releases core/release-manager hat In-Reply-To: References: Message-ID: <4d71f05a-69db-76b8-6976-a4a2784f1124@gmx.com> On 8/21/20 9:35 AM, Jean-Philippe Evrard wrote: > Hello folks, > > I am sad to announce that, while super motivated to keep helping the team, I cannot reliably and consistantly do my duties of core in the releases team, due to my current duties at work. 
> > It's been a while I haven't significantly helped the release team, and the team deserve all the transparency and clarity it can get about its contributors. It's time for me to step down. > > It's been a pleasure to help the team while it lasted. If you are looking for a team to get involved in OpenStack, make no mistake, the release team is awesome. Thank you everyone in the team, you were all amazing and so welcoming :) > > Regards, > Jean-Philippe Evrard (evrardjp) > Thanks for all your help with everything you've done JP. Just let us know if the situation changes in the future. Sean From witold.bedyk at suse.com Wed Aug 26 15:38:27 2020 From: witold.bedyk at suse.com (Witek Bedyk) Date: Wed, 26 Aug 2020 17:38:27 +0200 Subject: [monasca] Retire monasca-analytics repository Message-ID: Hello, this message is to announce the retirement of openstack/monasca-analytics repository. The project will not accept any new patches. It will follow the process described in Project Team Guide [1]. Please reply to this message until Sept. 7 if you would like to take over the development and maintenance of this repository. Thanks Witek [1] https://docs.openstack.org/project-team-guide/repository.html#retiring-a-repository From mark at stackhpc.com Wed Aug 26 16:17:31 2020 From: mark at stackhpc.com (Mark Goddard) Date: Wed, 26 Aug 2020 17:17:31 +0100 Subject: [kolla] Kayobe config walkthrough docs call Message-ID: Hi, In today's kolla IRC meeting we proposed to have a meeting to discuss the long awaited Kayobe configuration walkthrough documentation. We'll try to agree on an approach, and get people signed up for writing parts or all of it. The proposed meeting time is tomorrow (27th August) at 15:00 - 16:00 UTC, the same slot as the Kolla Klub (which is still on summer break). Please reply if you would like to attend but cannot make this slot. Google meet link: https://meet.google.com/xfg-ieza-qrz Regards, Mark From radoslaw.piliszek at gmail.com Wed Aug 26 17:38:03 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 26 Aug 2020 19:38:03 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: On Tue, Aug 25, 2020 at 10:03 AM Radosław Piliszek wrote: > I'll sit down to clean up the queue a bit and ask other new cores to > co-review and merge a few waiting patches. Aaand it's been done. :-) -yoctozepto From dev.faz at gmail.com Wed Aug 26 18:05:08 2020 From: dev.faz at gmail.com (Fabian Zimmermann) Date: Wed, 26 Aug 2020 20:05:08 +0200 Subject: [tc][masakari] Project aliveness (was: [masakari] Meetings) In-Reply-To: References: <6868fdd8-54cd-4ccf-a3d7-ffecf5eb601b@www.fastmail.com> Message-ID: <22517cce-3644-2918-b38a-c6fafac7aab4@googlemail.com> Hi, Am 26.08.20 um 19:38 schrieb Radosław Piliszek: > > Aaand it's been done. 
:-) I will check my emails tomorrow :) Have a nice evening, Fabian From cohuck at redhat.com Tue Aug 25 14:39:25 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Tue, 25 Aug 2020 16:39:25 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200820031621.GA24997@joy-OptiPlex-7040> References: <20200810074631.GA29059@joy-OptiPlex-7040> <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <20200819212234.223667b3@x1.home> <20200820031621.GA24997@joy-OptiPlex-7040> Message-ID: <20200825163925.1c19b0f0.cohuck@redhat.com> On Thu, 20 Aug 2020 11:16:21 +0800 Yan Zhao wrote: > On Wed, Aug 19, 2020 at 09:22:34PM -0600, Alex Williamson wrote: > > On Thu, 20 Aug 2020 08:39:22 +0800 > > Yan Zhao wrote: > > > > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > > > Daniel P. Berrangé wrote: > > > > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > > > > > |- [path to device] > > > > > > |--- migration > > > > > > | |--- self > > > > > > | | |---device_api > > > > > > | | |---mdev_type > > > > > > | | |---software_version > > > > > > | | |---device_id > > > > > > | | |---aggregator > > > > > > | |--- compatible > > > > > > | | |---device_api > > > > > > | | |---mdev_type > > > > > > | | |---software_version > > > > > > | | |---device_id > > > > > > | | |---aggregator > > > > > > > > > > > > > > > > > > Yes but: > > > > > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > > > - Attribute is coupled with kobject > > > > > > > > Is that really that bad? You have the device with an embedded kobject > > > > anyway, and you can just put things into an attribute group? > > > > > > > > [Also, I think that self/compatible split in the example makes things > > > > needlessly complex. Shouldn't semantic versioning and matching already > > > > cover nearly everything? I would expect very few cases that are more > > > > complex than that. Maybe the aggregation stuff, but I don't think we > > > > need that self/compatible split for that, either.] > > > Hi Cornelia, > > > > > > The reason I want to declare compatible list of attributes is that > > > sometimes it's not a simple 1:1 matching of source attributes and target attributes > > > as I demonstrated below, > > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to > > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), > > > (mdev_type i915-GVTg_V5_8 + aggregator 4) > > > > > > and aggragator may be just one of such examples that 1:1 matching does not > > > fit. > > > > If you're suggesting that we need a new 'compatible' set for every > > aggregation, haven't we lost the purpose of aggregation? 
For example, > > rather than having N mdev types to represent all the possible > > aggregation values, we have a single mdev type with N compatible > > migration entries, one for each possible aggregation value. BTW, how do > > we have multiple compatible directories? compatible0001, > > compatible0002? Thanks, > > > do you think the bin_attribute I proposed yesterday good? > Then we can have a single compatible with a variable in the mdev_type and > aggregator. > > mdev_type=i915-GVTg_V5_{val1:int:2,4,8} > aggregator={val1}/2 I'm not really a fan of binary attributes other than in cases where we have some kind of binary format to begin with. IIUC, we basically have: - different partitioning (expressed in the mdev_type) - different number of partitions (expressed via the aggregator) - devices being compatible if the partitioning:aggregator ratio is the same (The multiple mdev_type variants seem to come from avoiding extra creation parameters, IIRC?) Would it be enough to export base_type=i915-GVTg_V5 aggregation_ratio= to express the various combinations that are compatible without the need for multiple sets of attributes? From anlin.kong at gmail.com Tue Aug 25 21:41:19 2020 From: anlin.kong at gmail.com (Lingxian Kong) Date: Wed, 26 Aug 2020 09:41:19 +1200 Subject: [openstack-community] Error add member to pool ( OCTAVIA ) when using SSL to verify In-Reply-To: <59EC5E93-FC3F-4EDC-A874-9A2F466B37DC@demarco.com> References: <692B1576-9AB1-46F9-9328-0D510DDCEE01@hxcore.ol> <59EC5E93-FC3F-4EDC-A874-9A2F466B37DC@demarco.com> Message-ID: >From the log, it seems like the HTTPS communication with Neutron failed, can you successfully talk to Neutron using HTTPS? You can also try to simulate the code here https://github.com/openstack/octavia/blob/stable%2Fussuri/octavia/network/drivers/neutron/base.py#L38 for testing. --- Lingxian Kong Senior Software Engineer Catalyst Cloud www.catalystcloud.nz On Wed, Aug 26, 2020 at 2:25 AM Amy Marrich wrote: > Adding the OpenStack discuss list. > > Amy (spotz) > > On Aug 24, 2020, at 11:14 PM, Vinh Nguyen Duc > wrote: > >  > > Dear Openstack community, > > > > My name is Duc Vinh, I am newer in Openstack > > I am deploy Openstack Ussuri on Centos8 , I am using three nodes > controller with High Availability topology and using HAproxy to verify > cert for connect HTTPS, > > I have trouble with project Octavia, I cannot add member in a pool after > created Loadbalancer, listener, pool ( everything is fine). 
> > Here is my log and configuration file: > > > > *LOGS: * > > > > 2020-08-25 10:55:42.872 226250 DEBUG octavia.network.drivers.neutron.base > [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - > 8259463ce052437396afa845933afe4b - default default] Neutron extension > security-group found enabled _check_extension_enabled > /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > > 2020-08-25 10:55:42.892 226250 DEBUG octavia.network.drivers.neutron.base > [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - > 8259463ce052437396afa845933afe4b - default default] Neutron extension > dns-integration is not enabled _check_extension_enabled > /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:70 > > 2020-08-25 10:55:42.911 226250 DEBUG octavia.network.drivers.neutron.base > [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - > 8259463ce052437396afa845933afe4b - default default] Neutron extension qos > found enabled _check_extension_enabled > /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > > 2020-08-25 10:55:42.933 226250 DEBUG octavia.network.drivers.neutron.base > [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - > 8259463ce052437396afa845933afe4b - default default] Neutron extension > allowed-address-pairs found enabled _check_extension_enabled > /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > > 2020-08-25 10:55:43.068 226250 WARNING keystoneauth.identity.generic.base > [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - > 8259463ce052437396afa845933afe4b - default default] Failed to discover > available identity versions when contacting https://192.168.10.150:5000. > Attempting to parse version from URL.: > keystoneauth1.exceptions.connection.SSLError: SSL exception connecting to > https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', > port=5000): Max retries exceeded with url: / (Caused by > SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify > failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - > 8259463ce052437396afa845933afe4b - default default] Error retrieving subnet > (subnet id: 035f3183-f469-415f-b536-b4a81364e814.: > keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find > versioned identity endpoints when attempting to authenticate. Please check > that your auth_url is correct. 
SSL exception connecting to > https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', > port=5000): Max retries exceeded with url: / (Caused by > SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify > failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in > urlopen > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base chunked=chunked) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in > _make_request > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base self._validate_conn(conn) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 839, in > _validate_conn > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base conn.connect() > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 344, in > connect > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base ssl_context=context) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 367, in > ssl_wrap_socket > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base return context.wrap_socket(sock) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", > line 365, in wrap_socket > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base _context=self, _session=session) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", > line 776, in __init__ > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base self.do_handshake() > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", > line 1036, in do_handshake > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base self._sslobj.do_handshake() > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", > line 648, in do_handshake > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base self._sslobj.do_handshake() > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed > (_ssl.c:897) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > During handling of the above exception, another exception occurred: > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/requests/adapters.py", line 449, in send > > 2020-08-25 10:55:43.070 226250 ERROR > 
octavia.network.drivers.neutron.base timeout=timeout > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in > urlopen > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > _stacktrace=sys.exc_info()[2]) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in > increment > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base raise MaxRetryError(_pool, url, > error or ResponseError(cause)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > urllib3.exceptions.MaxRetryError: > HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded > with url: / (Caused by SSLError(SSLError(1, '[SSL: > CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > During handling of the above exception, another exception occurred: > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1004, in > _send_request > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base resp = > self.session.request(method, url, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/requests/sessions.py", line 533, in > request > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base resp = self.send(prep, > **send_kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/requests/sessions.py", line 646, in send > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base r = adapter.send(request, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/requests/adapters.py", line 514, in send > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base raise SSLError(e, request=request) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > requests.exceptions.SSLError: HTTPSConnectionPool(host='192.168.10.150', > port=5000): Max retries exceeded with url: / (Caused by > SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify > failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > During handling of the above exception, another exception occurred: > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", > line 138, in _do_create_plugin > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base 
authenticated=False) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line > 610, in get_discovery > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base authenticated=authenticated) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 1452, in > get_discovery > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base disc = Discover(session, url, > authenticated=authenticated) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 536, in > __init__ > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base authenticated=authenticated) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 102, in > get_version_data > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base resp = session.get(url, > headers=headers, authenticated=authenticated) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1123, in > get > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base return self.request(url, 'GET', > **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 913, in > request > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base resp = send(**kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1008, in > _send_request > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base raise exceptions.SSLError(msg) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > keystoneauth1.exceptions.connection.SSLError: SSL exception connecting to > https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', > port=5000): Max retries exceeded with url: / (Caused by > SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify > failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > During handling of the above exception, another exception occurred: > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > File > "/usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py", > line 193, in _get_resource > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base resource_type)(resource_id) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 869, > in show_subnet > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base return self.get(self.subnet_path % > (subnet), params=_params) > > 2020-08-25 
10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 354, > in get > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base headers=headers, params=params) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 331, > in retry_request > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base headers=headers, params=params) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 282, > in do_request > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base headers=headers) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 339, in > do_request > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base self._check_uri_length(url) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 332, in > _check_uri_length > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base uri_len = len(self.endpoint_url) + > len(url) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 346, in > endpoint_url > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base return self.get_endpoint() > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 282, in > get_endpoint > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base return > self.session.get_endpoint(auth or self.auth, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1225, in > get_endpoint > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base return auth.get_endpoint(self, > **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line > 380, in get_endpoint > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base > allow_version_hack=allow_version_hack, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line > 271, in get_endpoint_data > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base service_catalog = > self.get_access(session).service_catalog > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line > 134, in get_access > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base self.auth_ref = > self.get_auth_ref(session) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", > line 206, in get_auth_ref > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base self._plugin 
= > self._do_create_plugin(session) > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base File > "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", > line 161, in _do_create_plugin > > 2020-08-25 10:55:43.070 226250 ERROR > octavia.network.drivers.neutron.base 'auth_url is correct. %s' % e) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find > versioned identity endpoints when attempting to authenticate. Please check > that your auth_url is correct. SSL exception connecting to > https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', > port=5000): Max retries exceeded with url: / (Caused by > SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify > failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.074 226250 DEBUG wsme.api > [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - > 8259463ce052437396afa845933afe4b - default default] Client-side error: > Subnet 035f3183-f469-415f-b536-b4a81364e814 not found. format_exception > /usr/lib/python3.6/site-packages/wsme/api.py:222 > > 2020-08-25 10:55:43.076 226250 DEBUG octavia.common.keystone > [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - > 8259463ce052437396afa845933afe4b - default default] Request path is / and > it does not require keystone authentication process_request > /usr/lib/python3.6/site-packages/octavia/common/keystone.py:77 > > 2020-08-25 10:55:43.080 226250 DEBUG octavia.common.keystone > [req-5091d326-0cb4-4ae1-bf4b-9ef6b9313dca - - - - -] Request path is / and > it does not require keystone authentication process_request > /usr/lib/python3.6/site-packages/octavia/common/keystone.py:77 > > > > *Configuration:* > > [root at controller01 ~]# cat /etc/octavia/octavia.conf > > [DEFAULT] > > > > log_dir = /var/log/octavia > > debug = True > > transport_url = rabbit:// > openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.178:5672, > openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.179:5672, > openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.28:5672 > > > > [api_settings] > > api_base_uri = https://192.168.10.150:9876 > > bind_host = 192.168.10.178 > > bind_port = 9876 > > auth_strategy = keystone > > healthcheck_enabled = True > > allow_tls_terminated_listeners = True > > > > [database] > > connection = mysql+pymysql:// > octavia:FUkbii8AY4G6H9LxbJ2RRlOzHN61X8PI8FrMcuXQ at 192.168.10.150/octavia > > max_retries = -1 > > > > [health_manager] > > bind_port = 5555 > > bind_ip = 192.168.10.178 > > controller_ip_port_list = 192.168.10.178:5555, 192.168.10.179:5555, > 192.168.10.28:5555 > > heartbeat_key = insecure > > > > [keystone_authtoken] > > service_token_roles_required = True > > www_authenticate_uri = https://192.168.10.150:5000 > > auth_url = https://192.168.10.150:5000 > > region_name = Hanoi > > memcached_servers = 192.168.10.178:11211,192.168.10.179:11211, > 192.168.10.28:11211 > > auth_type = password > > project_domain_name = Default > > user_domain_name = Default > > project_name = service > > username = octavia > > password = esGn3rN3iJOAD2HXmqznFPI9oAY2wQNDWYwqJaCH > > cafile = /etc/ssl/private/haproxy.pem > > insecure = false > > > > > > [certificates] > > cert_generator = local_cert_generator > > #server_certs_key_passphrase = insecure-key-do-not-use-this-key > > ca_private_key_passphrase = esGn3rN3iJOAD2HXmqznFPI9oAY2wQNDWYwqJaCH > > 
ca_private_key = /etc/octavia/certs/server_ca.key.pem > > ca_certificate = /etc/octavia/certs/server_ca.cert.pem > > region_name = Hanoi > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > endpoint_type = internal > > > > [networking] > > #allow_vip_network_id = True > > #allow_vip_subnet_id = True > > #allow_vip_port_id = True > > > > [haproxy_amphora] > > #bind_port = 9443 > > server_ca = /etc/octavia/certs/server_ca.cert.pem > > client_cert = /etc/octavia/certs/client.cert-and-key.pem > > base_path = /var/lib/octavia > > base_cert_dir = /var/lib/octavia/certs > > connection_max_retries = 1500 > > connection_retry_interval = 1 > > > > [controller_worker] > > amp_image_tag = amphora > > amp_ssh_key_name = octavia > > amp_secgroup_list = 80f44b73-dc9f-48aa-a0b8-8b78e5c6585c > > amp_boot_network_list = 04425cb2-5963-48f5-a229-b89b7c6036bd > > amp_flavor_id = 200 > > network_driver = allowed_address_pairs_driver > > compute_driver = compute_nova_driver > > amphora_driver = amphora_haproxy_rest_driver > > client_ca = /etc/octavia/certs/client_ca.cert.pem > > loadbalancer_topology = SINGLE > > amp_active_retries = 9999 > > > > [task_flow] > > [oslo_messaging] > > topic = octavia_prov > > rpc_thread_pool_size = 2 > > > > [house_keeping] > > [amphora_agent] > > [keepalived_vrrp] > > > > [service_auth] > > auth_url = https://192.168.10.150:5000 > > auth_type = password > > project_domain_name = default > > user_domain_name = default > > project_name = admin > > username = admin > > password = F35sXAYW5qDlMGfQbhmexIx12DqrQdpw6ixAseTd > > cafile = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > memcached_servers = 192.168.10.178:11211,192.168.10.179:11211, > 192.168.10.28:11211 > > #insecure = true > > > > > > [glance] > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > endpoint_type = internal > > insecure = false > > > > [neutron] > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > endpoint_type = internal > > insecure = false > > > > [cinder] > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > endpoint_type = internal > > insecure = false > > > > [nova] > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > endpoint_type = internal > > insecure = false > > > > [oslo_policy] > > #policy_file = /etc/octavia/policy.json > > > > [oslo_messaging_notifications] > > transport_url = rabbit:// > openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.178:5672, > openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.179:5672, > openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.28:5672 > > > _______________________________________________ > Community mailing list > Community at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/community > > -------------- next part -------------- An HTML attachment was scrubbed... 
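A CERTIFICATE_VERIFY_FAILED error like the one in the log above can be narrowed
down by repeating the keystone request outside of Octavia with the same CA
bundle. The short Python sketch below is illustrative only: the URL and CA path
are simply the auth_url and cafile values from the octavia.conf shown above,
and requests is used in place of keystoneauth just to keep the check minimal.

    import requests

    AUTH_URL = "https://192.168.10.150:5000"      # [keystone_authtoken] auth_url
    CA_FILE = "/etc/ssl/private/haproxy.pem"      # cafile from octavia.conf

    try:
        # Roughly the same GET that keystoneauth issues for version discovery.
        resp = requests.get(AUTH_URL, verify=CA_FILE, timeout=10)
        print("TLS verification OK, HTTP status:", resp.status_code)
    except requests.exceptions.SSLError as exc:
        # Same failure mode as the CERTIFICATE_VERIFY_FAILED traceback above.
        print("TLS verification failed:", exc)

If this reproduces the SSLError, the certificate served on port 5000 does not
chain to anything in /etc/ssl/private/haproxy.pem, and either that bundle or
the endpoint certificate needs fixing; if it succeeds, the problem is more
likely elsewhere in the Octavia configuration.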
URL: 
From yan.y.zhao at intel.com Wed Aug 26 06:41:17 2020
From: yan.y.zhao at intel.com (Yan Zhao)
Date: Wed, 26 Aug 2020 14:41:17 +0800
Subject: device compatibility interface for live migration with assigned devices
In-Reply-To: <20200825163925.1c19b0f0.cohuck@redhat.com>
References: <20200814051601.GD15344@joy-OptiPlex-7040>
 <20200818085527.GB20215@redhat.com>
 <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com>
 <20200818091628.GC20215@redhat.com>
 <20200818113652.5d81a392.cohuck@redhat.com>
 <20200820003922.GE21172@joy-OptiPlex-7040>
 <20200819212234.223667b3@x1.home>
 <20200820031621.GA24997@joy-OptiPlex-7040>
 <20200825163925.1c19b0f0.cohuck@redhat.com>
Message-ID: <20200826064117.GA22243@joy-OptiPlex-7040>

On Tue, Aug 25, 2020 at 04:39:25PM +0200, Cornelia Huck wrote:
<...>
> > do you think the bin_attribute I proposed yesterday good?
> > Then we can have a single compatible with a variable in the mdev_type and
> > aggregator.
> >
> > mdev_type=i915-GVTg_V5_{val1:int:2,4,8}
> > aggregator={val1}/2
>
> I'm not really a fan of binary attributes other than in cases where we
> have some kind of binary format to begin with.
>
> IIUC, we basically have:
> - different partitioning (expressed in the mdev_type)
> - different number of partitions (expressed via the aggregator)
> - devices being compatible if the partitioning:aggregator ratio is the
> same
>
> (The multiple mdev_type variants seem to come from avoiding extra
> creation parameters, IIRC?)
>
> Would it be enough to export
> base_type=i915-GVTg_V5
> aggregation_ratio=
>
> to express the various combinations that are compatible without the
> need for multiple sets of attributes?
yes. I agree we need to decouple the mdev type name and the aggregator for
compatibility detection purposes.

Please allow me to describe the history and motivation of introducing the
aggregator.

Initially, we have the fixed mdev_types i915-GVTg_V5_1, i915-GVTg_V5_2,
i915-GVTg_V5_4 and i915-GVTg_V5_8; the number after i915-GVTg_V5 represents
the max number of instances allowed to be created for that type. It also
identifies how many resources are to be allocated for each type. So far
these have worked well for current Intel vGPUs, i.e. cutting the physical
GPU into several virtual pieces and sharing them among several VMs in a
pure mediation way. Fixed types are provided in advance, as we thought they
could meet the needs of most users, and users can tell the hardware
capability they acquired from the type name: the bigger the number, the
smaller the piece of physical hardware.

Then, when it comes to Scalable IOV in the near future, one physical device
can be cut into a large number of units at the hardware layer. The single
unit to be assigned to a guest can be very small, while one to several
units are grouped into an mdev. The fixed type scheme then becomes
cumbersome. Therefore, a new attribute, aggregator, is introduced to
specify the number of resources to be assigned based on the base resource
specified in the type name. E.g. if the type name is dsa-1dwq and the
aggregator is 30, then the resources assignable to the guest are 30 wqs in
a single created mdev; if the type name is dsa-2dwq and the aggregator is
15, then the resources assignable to the guest are also 30 wqs in a single
created mdev. (In this example, the rule for defining the type name is
different from the GVT case; here 1dwq means the wq number is 1. Yes, both
are current reality. :) )

Previously, we wanted to regard the two mdevs created with dsa-1dwq x 30
and dsa-2dwq x 15 as compatible, because the two mdevs consist of equal
resources.
But, as it's a burden to upper layer, we agree that if this condition happens, we still treat the two as incompatible. To fix it, either the driver should expose dsa-1dwq only, or the target dsa-2dwq needs to be destroyed and reallocated via dsa-1dwq x 30. Does it make sense? Thanks Yan From yan.y.zhao at intel.com Wed Aug 26 08:54:11 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Wed, 26 Aug 2020 16:54:11 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <47d216330e10152f0f5d27421da60a7b1c52e5f0.camel@redhat.com> References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <242591bb809b68c618f62fdc93d4f8ae7b146b6d.camel@redhat.com> <20200820040116.GB24121@joy-OptiPlex-7040> <20200820062725.GB24997@joy-OptiPlex-7040> <47d216330e10152f0f5d27421da60a7b1c52e5f0.camel@redhat.com> Message-ID: <20200826085411.GB22243@joy-OptiPlex-7040> On Thu, Aug 20, 2020 at 02:24:26PM +0100, Sean Mooney wrote: > On Thu, 2020-08-20 at 14:27 +0800, Yan Zhao wrote: > > On Thu, Aug 20, 2020 at 06:16:28AM +0100, Sean Mooney wrote: > > > On Thu, 2020-08-20 at 12:01 +0800, Yan Zhao wrote: > > > > On Thu, Aug 20, 2020 at 02:29:07AM +0100, Sean Mooney wrote: > > > > > On Thu, 2020-08-20 at 08:39 +0800, Yan Zhao wrote: > > > > > > On Tue, Aug 18, 2020 at 11:36:52AM +0200, Cornelia Huck wrote: > > > > > > > On Tue, 18 Aug 2020 10:16:28 +0100 > > > > > > > Daniel P. Berrangé wrote: > > > > > > > > > > > > > > > On Tue, Aug 18, 2020 at 05:01:51PM +0800, Jason Wang wrote: > > > > > > > > > On 2020/8/18 下午4:55, Daniel P. Berrangé wrote: > > > > > > > > > > > > > > > > > > On Tue, Aug 18, 2020 at 11:24:30AM +0800, Jason Wang wrote: > > > > > > > > > > > > > > > > > > On 2020/8/14 下午1:16, Yan Zhao wrote: > > > > > > > > > > > > > > > > > > On Thu, Aug 13, 2020 at 12:24:50PM +0800, Jason Wang wrote: > > > > > > > > > > > > > > > > > > On 2020/8/10 下午3:46, Yan Zhao wrote: > > > > > > > > > we actually can also retrieve the same information through sysfs, .e.g > > > > > > > > > > > > > > > > > > |- [path to device] > > > > > > > > > |--- migration > > > > > > > > > | |--- self > > > > > > > > > | | |---device_api > > > > > > > > > | | |---mdev_type > > > > > > > > > | | |---software_version > > > > > > > > > | | |---device_id > > > > > > > > > | | |---aggregator > > > > > > > > > | |--- compatible > > > > > > > > > | | |---device_api > > > > > > > > > | | |---mdev_type > > > > > > > > > | | |---software_version > > > > > > > > > | | |---device_id > > > > > > > > > | | |---aggregator > > > > > > > > > > > > > > > > > > > > > > > > > > > Yes but: > > > > > > > > > > > > > > > > > > - You need one file per attribute (one syscall for one attribute) > > > > > > > > > - Attribute is coupled with kobject > > > > > > > > > > > > > > Is that really that bad? You have the device with an embedded kobject > > > > > > > anyway, and you can just put things into an attribute group? > > > > > > > > > > > > > > [Also, I think that self/compatible split in the example makes things > > > > > > > needlessly complex. Shouldn't semantic versioning and matching already > > > > > > > cover nearly everything? I would expect very few cases that are more > > > > > > > complex than that. Maybe the aggregation stuff, but I don't think we > > > > > > > need that self/compatible split for that, either.] 
> > > > > > > > > > > > Hi Cornelia, > > > > > > > > > > > > The reason I want to declare compatible list of attributes is that > > > > > > sometimes it's not a simple 1:1 matching of source attributes and target attributes > > > > > > as I demonstrated below, > > > > > > source mdev of (mdev_type i915-GVTg_V5_2 + aggregator 1) is compatible to > > > > > > target mdev of (mdev_type i915-GVTg_V5_4 + aggregator 2), > > > > > > (mdev_type i915-GVTg_V5_8 + aggregator 4) > > > > > > > > > > the way you are doing the nameing is till really confusing by the way > > > > > if this has not already been merged in the kernel can you chagne the mdev > > > > > so that mdev_type i915-GVTg_V5_2 is 2 of mdev_type i915-GVTg_V5_1 instead of half the device > > > > > > > > > > currently you need to deived the aggratod by the number at the end of the mdev type to figure out > > > > > how much of the phsicial device is being used with is a very unfridly api convention > > > > > > > > > > the way aggrator are being proposed in general is not really someting i like but i thin this at least > > > > > is something that should be able to correct. > > > > > > > > > > with the complexity in the mdev type name + aggrator i suspect that this will never be support > > > > > in openstack nova directly requireing integration via cyborg unless we can pre partion the > > > > > device in to mdevs staicaly and just ignore this. > > > > > > > > > > this is way to vendor sepecif to integrate into something like openstack in nova unless we can guarentee > > > > > taht how aggreator work will be portable across vendors genericly. > > > > > > > > > > > > > > > > > and aggragator may be just one of such examples that 1:1 matching does not > > > > > > fit. > > > > > > > > > > for openstack nova i dont see us support anything beyond the 1:1 case where the mdev type does not change. > > > > > > > > > > > > > hi Sean, > > > > I understand it's hard for openstack. but 1:N is always meaningful. > > > > e.g. > > > > if source device 1 has cap A, it is compatible to > > > > device 2: cap A, > > > > device 3: cap A+B, > > > > device 4: cap A+B+C > > > > .... > > > > to allow openstack to detect it correctly, in compatible list of > > > > device 2, we would say compatible cap is A; > > > > device 3, compatible cap is A or A+B; > > > > device 4, compatible cap is A or A+B, or A+B+C; > > > > > > > > then if openstack finds device A's self cap A is contained in compatible > > > > cap of device 2/3/4, it can migrate device 1 to device 2,3,4. > > > > > > > > conversely, device 1's compatible cap is only A, > > > > so it is able to migrate device 2 to device 1, and it is not able to > > > > migrate device 3/4 to device 1. > > > > > > yes we build the palcement servce aroudn the idea of capablites as traits on resocue providres. > > > which is why i originally asked if we coudl model compatibality with feature flags > > > > > > we can seaislyt model deivce as aupport A, A+B or A+B+C > > > and then select hosts and evice based on that but > > > > > > the list of compatable deivce you are propsoeing hide this feature infomation which whould be what we are matching > > > on. 
> > > > > > give me a lset of feature you want and list ting the feature avaiable on each device allow highre level ocestation > > > to > > > easily match the request to a host that can fulllfile it btu thave a set of other compatihble device does not help > > > with > > > that > > > > > > so if a simple list a capabliteis can be advertiese d and if we know tha two dievce with the same capablity are > > > intercahangebale that is workabout i suspect that will not be the case however and it would onely work within a > > > familay > > > of mdevs that are closely related. which i think agian is an argument for not changeing the mdev type and at least > > > intially only look at migatreion where the mdev type doee not change initally. > > > > > > > sorry Sean, I don't understand your words completely. > > Please allow me to write it down in my words, and please confirm if my > > understanding is right. > > 1. you mean you agree on that each field is regarded as a trait, and > > openstack can compare by itself if source trait is a subset of target trait, right? > > e.g. > > source device > > field1=A1 > > field2=A2+B2 > > field3=A3 > > > > target device > > field1=A1+B1 > > field2=A2+B2 > > filed3=A3 > > > > then openstack sees that field1/2/3 in source is a subset of field1/2/3 in > > target, so it's migratable to target? > > yes this is basically how cpu feature work. > if we see the host cpu on the dest is a supperset of the cpu feature used > by the vm we know its safe to migrate. got it. glad to know it :) > > > > 2. mdev_type + aggregator make it hard to achieve the above elegant > > solution, so it's best to avoid the combined comparing of mdev_type + aggregator. > > do I understand it correctly? > yes and no. one of the challange that mdevs pose right now is that sometiem mdev model > independent resouces and sometimes multipe mdev types consume the same underlying resouces > there is know way for openstack to know if i915-GVTg_V5_2 and i915-GVTg_V5_4 consume the same resouces > or not. as such we cant do the accounting properly so i would much prefer to have just 1 mdev type > i915-GVTg and which models the minimal allocatable unit and then say i want 4 of them comsed into 1 device > then have a second mdev type that does that since > > what that means in pratice is we cannot trust the available_instances for a given mdev type > as consuming a different mdev type might change it. aggrators makes that problem worse. > which is why i siad i would prefer if instead of aggreator as prposed each consumable > resouce was reported indepenedly as different mdev types and then we composed those > like we would when bond ports creating an attachment or other logical aggration that refers > to instance of mdevs of differing type which we expose as a singel mdev that is exposed to the guest. > in a concreate example we might say create a aggreator of 64 cuda cores and 32 tensor cores and "bond them" > or aggrate them as a single attachme mdev and provide that to a ml workload guest. a differnt guest could request > 1 instace of the nvenc video encoder and one instance of the nvenc video decoder but no cuda or tensor for a video > transcoding workload. > The "bond" you described is a little different from the intension of the aggregator we introduced for scalable IOV. (as explained in another mail to Cornelia https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg06523.html). But any way, we agree that mdevs are not compatible if mdev_types are not compatible. 
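To make the subset comparison discussed above concrete, here is a minimal
illustrative sketch (not code from Nova, placement, or the kernel; the field
names are just the ones used in the example earlier in this thread). It treats
each exposed field as a set of capability flags and allows migration only when
every source set is contained in the corresponding target set, much like CPU
feature flags:

    def is_migratable(src, dst):
        # src and dst map a field/attribute name to a set of capability flags.
        # Migration is allowed only if every source set is a subset of the
        # corresponding target set (i.e. the target is a superset).
        return all(flags <= dst.get(field, set())
                   for field, flags in src.items())

    source = {"field1": {"A1"}, "field2": {"A2", "B2"}, "field3": {"A3"}}
    target = {"field1": {"A1", "B1"}, "field2": {"A2", "B2"}, "field3": {"A3"}}

    print(is_migratable(source, target))  # True  - target offers A1, A2+B2, A3
    print(is_migratable(target, source))  # False - source lacks B1, no reverse

A semantic version comparison (same major version, target minor version at
least the source's) would sit alongside such a check, as suggested earlier in
the thread.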
> if each of those componets are indepent mdev types and can be composed with that granularity then i think that approch > is better then the current aggreator with vendor sepcific fileds. > we can model the phsical device as being multipel nested resouces with different traits for each type of resouce and > different capsities for the same. we can even model how many of the attachments/compositions can be done indepently > if there is a limit on that. > > |- [parent physical device] > |--- Vendor-specific-attributes [optional] > |--- [mdev_supported_types] > | |--- [] > | | |--- create > | | |--- name > | | |--- available_instances > | | |--- device_api > | | |--- description > | | |--- [devices] > | |--- [] > | | |--- create > | | |--- name > | | |--- available_instances > | | |--- device_api > | | |--- description > | | |--- [devices] > | |--- [] > | |--- create > | |--- name > | |--- available_instances > | |--- device_api > | |--- description > | |--- [devices] > > a benifit of this appoch is we would be the mdev types would not change on migration > and we could jsut compuare a a simeple version stirgh and feature flag list to determin comaptiablity > in a vendor neutral way. i dont nessisarly need to know what the vendeor flags mean just that the dest is a subset of > the source and that the semaitic version numbers say the mdevs are compatible. > > as aggregator and some other attributes are only meaningful after devices are created, and vendors' naming of mdev types are not unified, do you think below way is good? |- [parent physical device] |--- [mdev_supported_types] | |--- [] | | |--- create | | |--- name | | |--- available_instances | | |--- compatible_type [must] | | |--- Vendor-specific-compatible-type-attributes [optional] | | |--- device_api [must] | | |--- software_version [must] | | |--- description | | |--- [devices] | | |--------[] | | | |--- vendor-specific-compatible-device-attriutes [optional] all vendor specific compatible attributes begin with compatible in name. in GVT's current case, |- 0000\:00\:02.0 |--- mdev_supported_types | |--- i915-GVTg_V5_8 | | |--- create | | |--- name | | |--- available_instances | | |--- compatible_type : i915-GVTg_V5_8, i915-GVTg_V4_8 | | |--- device_api : vfio-pci | | |--- software_version : 1.0.0 | | |--- compatible_pci_ids : 5931, 591b | | |--- description | | |--- devices | | | |- 882cc4da-dede-11e7-9180-078a62063ab1 | | | | | --- aggregator : 1 | | | | | --- compatible_aggregator : 1 suppose 882cc4da-dede-11e7-9180-078a62063ab1 is a src mdev. the sequence for openstack to find a compatible mdev in my mind is that 1. make src mdev type and compatible_type as traits. 2. look for a mdev type that is either i915-GVTg_V4_8 or i915-GVTg_V5_8 as that in compatible_type. (this is just an example, currently we only support migration between mdevs whose attributes are all matching, from mdev type to aggregator, to pci_ids) 3. if 2 fails, try to find a mdev type whose compatible_type is a superset of src compatible_type. if found one, go to step 4; otherwise, quit. 4. check if device_api, software_version under the type are compatible. 5. check if other vendor specific type attributes under the type are compatible. - check if src compatible_pci_ids is a subset of target compatible_pci_ids. 6. check if device is created and not occupied, if not, create one. 7. check if vendor specific attributes under the device are compatible. - check if src compatible_aggregator is a subset of target compatible_aggregator. 
if fails, try to find counterpart attribute of vendor specific device attribute and set target value according to compatible_xxx in source side. (for compatible_aggregator, its counterpart is aggregator.) if attribute aggregator exists, step 7 succeeds when setting of its value succeeds. if attribute aggregator does not exist, step 7 fails. 8. a compatible target is found. not sure if the above steps look good to you. some changes are required for compatibility check for physical device when mdev_type is absent. but let's first arrive at consensus for mdevs first :) > > 3. you don't like self list and compatible list, because it is hard for > > openstack to compare different traits? > > e.g. if we have self list and compatible list, then as below, openstack needs > > to compare if self field1/2/3 is a subset of compatible field 1/2/3. > currnetly we only use mdevs for vGPUs and in our documentaiton we tell customer > to model the mdev_type as a trait and request it as a reuiqred trait. > so for customer that are doing that today changing mdev types is not really an option. > we would prefer that they request the feature they need instead of a spefic mdev type > so we can select any that meets there needs > for example we have a bunch of traits for cuda support > https://github.com/openstack/os-traits/blob/master/os_traits/hw/gpu/cuda.py > or driectx/vulkan/opengl https://github.com/openstack/os-traits/blob/master/os_traits/hw/gpu/api.py > these are closely analogous to cpu feature flag lix avx or sse > https://github.com/openstack/os-traits/blob/master/os_traits/hw/cpu/x86/__init__.py#L16 > > so when it comes to compatiablities it would be ideal if you could express capablities as something like > a cpu feature flag then we can eaisly model those as traits. > > > > source device: > > self field1=A1 > > self field2=A2+B2 > > self field3=A3 > > > > compatible field1=A1 > > compatible field2=A2;B2;A2+B2; > > compatible field3=A3 > > > > > > target device: > > self field1=A1+B1 > > self field2=A2+B2 > > self field3=A3 > > > > compatible field1=A1;B1;A1+B1; > > compatible field2=A2;B2;A2+B2; > > compatible field3=A3 > > > > > > Thanks > > Yan > > > > > > > > > > > > > > > > > i woudl really prefer if there was just one mdev type that repsented the minimal allcatable unit and the > > > > > aggragaotr where used to create compostions of that. i.e instad of i915-GVTg_V5_2 beign half the device, > > > > > have 1 mdev type i915-GVTg and if the device support 8 of them then we can aggrate 4 of i915-GVTg > > > > > > > > > > if you want to have muplie mdev type to model the different amoutn of the resouce e.g. i915-GVTg_small i915- > > > > > GVTg_large > > > > > that is totlaly fine too or even i915-GVTg_4 indcating it sis 4 of i915-GVTg > > > > > > > > > > failing that i would just expose an mdev type per composable resouce and allow us to compose them a the user > > > > > level > > > > > with > > > > > some other construct mudeling a attament to the device. e.g. create composed mdev or somethig that is an > > > > > aggreateion > > > > > of > > > > > multiple sub resouces each of which is an mdev. so kind of like how bond port work. we would create an mdev for > > > > > each > > > > > of > > > > > the sub resouces and then create a bond or aggrated mdev by reference the other mdevs by uuid then attach only > > > > > the > > > > > aggreated mdev to the instance. 
> > > > > > > > > > the current aggrator syntax and sematic however make me rather uncofrotable when i think about orchestating vms > > > > > on > > > > > top > > > > > of it even to boot them let alone migrate them. > > > > > > > > > > > > So, we explicitly list out self/compatible attributes, and management > > > > > > tools only need to check if self attributes is contained compatible > > > > > > attributes. > > > > > > > > > > > > or do you mean only compatible list is enough, and the management tools > > > > > > need to find out self list by themselves? > > > > > > But I think provide a self list is easier for management tools. > > > > > > > > > > > > Thanks > > > > > > Yan > > > > > > > > > > > > > > > > > > > From ankelezhang at gmail.com Wed Aug 26 09:30:24 2020 From: ankelezhang at gmail.com (Ankele zhang) Date: Wed, 26 Aug 2020 17:30:24 +0800 Subject: nova config vCenter and creating instance failed Message-ID: Hi all I have config vCenter in my nova.conf, cinder.conf and glance-api.conf. First of all, I can create VM inner vSphere successfully and I can create VM inner OpenStack without vCenter configuration successfully. Now I config vCenter driver in nova, cinder and glance. Creating images and volumes successfully, but when I create VM instance, I got the error message "Build of instance e3e8e049-98fc-486e-95c7-e17ec0e22e59 aborted: 主机配置过程中出错。" , in english is "Build of instance e3e8e049-98fc-486e-95c7-e17ec0e22e59 aborted: an error occurred during host configuration". And error in vCSA client is just "主机配置过程中出错" while creating vm. Environment: OpenStack(Rocky), vSphere(6.7), storage(iSCSI),network(OVS vlan),vCenter is VMware-VIM-all-6.7.0-16046470.iso installed in windows2012 server. I don't know where did my configuration error in OpenStack or something error in my vSphere. nova.conf: [default] ... compute_driver = vmwareapi.VMwareVCDriver [vmware] host_ip = 192.168.3.115 host_username = administrator at vsphere.local host_password = Zl at 123456 cluster_name = mycluster datastore_regex = Datastore_iscsi insecure = True vlan_interface = vmnic0 integration_bridge = br-int api_retry_count = 10 cinder.conf: [DEFAULT] enabled_backends = vmware default_volume_type = vmware [vmware] volume_driver = cinder.volume.drivers.vmware.vmdk.VMwareVcVmdkDriver vmware_host_ip=192.168.3.115 vmware_host_password=Zl at 123456 vmware_host_username=administrator at vsphere.local vmware_wsdl_location=https://192.168.3.115/sdk/vimService.wsdl vmware_volume_folder= openstack_volume vmware_datastore_regex = Datastore_iscsi vmware_insecure = True vmware_host_version = 6.7 glance-api.conf: [default] ... known_stores = vmware default_store = vmware [glance_store] filesystem_store_datadir = /tri_fs/images/ stores = files,http,vmware default_store = vsphere vmware_server_host = 192.168.3.115 vmware_server_username = administrator at vsphere.local vmware_server_password = Zl at 123456 vmware_datastore_name = Datastore_iscsi vmware_datacenter_path = Datacenter vmware_datastores = Datacenter:Datastore_iscsi vmware_task_poll_interval = 5 vmware_store_image_dir = /openstack_glance vmware_api_insecure = True I hope you can help me. Thank you very much! -------------- next part -------------- An HTML attachment was scrubbed... 
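A generic "an error occurred during host configuration" failure is easier to
narrow down if the vCenter credentials, datastore and cluster referenced in the
configuration above are confirmed to resolve correctly outside of OpenStack.
Below is a rough sketch using the standalone pyVmomi client; this is an
assumption for illustration only (it is not part of the Nova VMware driver),
and the password is a placeholder to be replaced with the real one.

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    # Values taken from the nova.conf [vmware] section above.
    ctx = ssl._create_unverified_context()  # mirrors insecure = True
    si = SmartConnect(host="192.168.3.115",
                      user="administrator@vsphere.local",
                      pwd="<vcenter password>",
                      sslContext=ctx)
    try:
        content = si.RetrieveContent()
        for dc in content.rootFolder.childEntity:
            if not isinstance(dc, vim.Datacenter):
                continue
            print("Datacenter:", dc.name)
            print("  Datastores:", [ds.name for ds in dc.datastore])
            print("  Clusters:  ", [c.name for c in dc.hostFolder.childEntity
                                    if isinstance(c, vim.ClusterComputeResource)])
    finally:
        Disconnect(si)

If the cluster "mycluster" or a datastore matching "Datastore_iscsi" does not
show up in this listing, nova-compute will not be able to place the instance
either; the vCenter task and event log for the failed operation usually gives a
more specific reason than the generic message.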
URL: From mnaser at vexxhost.com Wed Aug 26 19:44:04 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 26 Aug 2020 15:44:04 -0400 Subject: [nova][barbican][qa] barbican-tempest-plugin change breaking bfv [ceph] Message-ID: Hi everyone, We just had our gating break due to a change merging inside barbican-tempest-plugin which is the following: https://review.opendev.org/#/c/515210/ It is resulting in an exception in our CI: 2020-08-26 18:04:32.663188 | controller | Response - Headers: {'content-length': '257', 'content-type': 'application/json', 'x-openstack-request-id': 'req-7f55e463-c3de-445e-a814-ef79c5f21235', 'connection': 'close', 'status': '409', 'content-location': 'http://glance.openstack.svc.cluster.local/v2/images/dec14e17-0870-415f-82a6-140c1b7e4a39'} 2020-08-26 18:04:32.663200 | controller | Body: b'{"message": "Image dec14e17-0870-415f-82a6-140c1b7e4a39 could not be deleted because it is in use: The image cannot be deleted because it is in use through the backend store outside of Glance.

\\n\\n\\n", "code": "409 Conflict", "title": "Conflict"}' This is usually because it's trying to delete the Glance image before deleting an instance that is using it, the specific test that is failing is: barbican_tempest_plugin.tests.scenario.test_certificate_validation.CertificateValidationTest.test_signed_image_invalid_cert_boot_failure[compute,id-6d354881-35a6-4568-94b8-2204bbf67b29,image] This test landed yesterday, we're blacklisting the scenario right now. I do find it quite interesting that in the logs here: http://paste.openstack.org/show/797186/ That the instance reports that it _does_ indeed delete it, so maybe we are trying to delete the image afterwards _too quickly_ and need to wait for Nova to clean up? I'd love to enable that test again and continue full coverage, happy to hear discussion. Thanks Mohammed -- Mohammed Naser VEXXHOST, Inc. From amy at demarco.com Wed Aug 26 21:08:07 2020 From: amy at demarco.com (Amy Marrich) Date: Wed, 26 Aug 2020 16:08:07 -0500 Subject: [Diversity] Diversity & Inclusion WG Meeting 8/31 - Removing Divisive Language Message-ID: The Diversity & Inclusion WG has taken on the task from this week's Board meeting to assist with the development of the OSF's stance on the removal of Divisive Language within the OSF projects. The WG invites members of all OSF projects to participate in this effort and to join us at our next meeting Monday, August 31, at 17:00 UTC which will be held at https://meetpad.opendev.org/osf-diversity-and-inclusion. The agenda can be found at https://etherpad.openstack.org/p/diversity-wg-agenda. If you have any questions please let me and the team know here, on #openstack-diversity on IRC, or you can email me directly. Thanks, Amy Marrich (spotz) -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Thu Aug 27 00:58:17 2020 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 26 Aug 2020 17:58:17 -0700 Subject: [openstack-community] Error add member to pool ( OCTAVIA ) when using SSL to verify In-Reply-To: <59EC5E93-FC3F-4EDC-A874-9A2F466B37DC@demarco.com> References: <692B1576-9AB1-46F9-9328-0D510DDCEE01@hxcore.ol> <59EC5E93-FC3F-4EDC-A874-9A2F466B37DC@demarco.com> Message-ID: Thank you again Amy. Hi Duc Vinh, Sorry to hear you are having trouble getting Octavia setup. It appears to be an issue with the certificate on the keystone endpoint. >From the log and your configuration I can see: Your keystone auth_url is https://192.168.10.150:5000 You CAfile for this endpoint is configured as: /etc/ssl/private/haproxy.pem Let's test that configuration by running the following command: echo "Q" | openssl s_client -connect 192.168.10.150:5000 -CAfile /etc/ssl/private/haproxy.pem This will return a lot of information about the certificate on the endpoint and test the CA file. In the output of this command, you want to see "Verification: OK". If you don't, there is a problem either with the certificate on the endpoint of the CA file being used. Check both match and are the expected files. If you are still not sure what is wrong, please send the output of the above command and the output of the following command: openssl x509 -in /etc/ssl/private/haproxy.pem -noout -text I will take a look at that information and should be able to help. Michael On Tue, Aug 25, 2020 at 7:19 AM Amy Marrich wrote: > > Adding the OpenStack discuss list. 
> > Amy (spotz) > > On Aug 24, 2020, at 11:14 PM, Vinh Nguyen Duc wrote: > >  > > Dear Openstack community, > > > > My name is Duc Vinh, I am newer in Openstack > > I am deploy Openstack Ussuri on Centos8 , I am using three nodes controller with High Availability topology and using HAproxy to verify cert for connect HTTPS, > > I have trouble with project Octavia, I cannot add member in a pool after created Loadbalancer, listener, pool ( everything is fine). > > Here is my log and configuration file: > > > > LOGS: > > > > 2020-08-25 10:55:42.872 226250 DEBUG octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Neutron extension security-group found enabled _check_extension_enabled /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > > 2020-08-25 10:55:42.892 226250 DEBUG octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Neutron extension dns-integration is not enabled _check_extension_enabled /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:70 > > 2020-08-25 10:55:42.911 226250 DEBUG octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Neutron extension qos found enabled _check_extension_enabled /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > > 2020-08-25 10:55:42.933 226250 DEBUG octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Neutron extension allowed-address-pairs found enabled _check_extension_enabled /usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > > 2020-08-25 10:55:43.068 226250 WARNING keystoneauth.identity.generic.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Failed to discover available identity versions when contacting https://192.168.10.150:5000. Attempting to parse version from URL.: keystoneauth1.exceptions.connection.SSLError: SSL exception connecting to https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Error retrieving subnet (subnet id: 035f3183-f469-415f-b536-b4a81364e814.: keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. 
SSL exception connecting to https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base chunked=chunked) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._validate_conn(conn) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 839, in _validate_conn > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base conn.connect() > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 344, in connect > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base ssl_context=context) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 367, in ssl_wrap_socket > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return context.wrap_socket(sock) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", line 365, in wrap_socket > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base _context=self, _session=session) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", line 776, in __init__ > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self.do_handshake() > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._sslobj.do_handshake() > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._sslobj.do_handshake() > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base During handling of the above exception, another exception occurred: > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 449, in send > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base timeout=timeout > > 2020-08-25 10:55:43.070 226250 
ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base _stacktrace=sys.exc_info()[2]) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base raise MaxRetryError(_pool, url, error or ResponseError(cause)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base During handling of the above exception, another exception occurred: > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1004, in _send_request > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resp = self.session.request(method, url, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 533, in request > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resp = self.send(prep, **send_kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 646, in send > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base r = adapter.send(request, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 514, in send > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base raise SSLError(e, request=request) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base requests.exceptions.SSLError: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base During handling of the above exception, another exception occurred: > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", line 138, in _do_create_plugin > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base authenticated=False) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 
610, in get_discovery > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base authenticated=authenticated) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 1452, in get_discovery > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base disc = Discover(session, url, authenticated=authenticated) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 536, in __init__ > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base authenticated=authenticated) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/discover.py", line 102, in get_version_data > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resp = session.get(url, headers=headers, authenticated=authenticated) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1123, in get > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return self.request(url, 'GET', **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 913, in request > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resp = send(**kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1008, in _send_request > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base raise exceptions.SSLError(msg) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base keystoneauth1.exceptions.connection.SSLError: SSL exception connecting to https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base During handling of the above exception, another exception occurred: > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base Traceback (most recent call last): > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py", line 193, in _get_resource > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base resource_type)(resource_id) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 869, in show_subnet > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return self.get(self.subnet_path % (subnet), params=_params) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 354, in get > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base headers=headers, params=params) > > 2020-08-25 
10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 331, in retry_request > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base headers=headers, params=params) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 282, in do_request > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base headers=headers) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 339, in do_request > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._check_uri_length(url) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 332, in _check_uri_length > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base uri_len = len(self.endpoint_url) + len(url) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/neutronclient/client.py", line 346, in endpoint_url > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return self.get_endpoint() > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 282, in get_endpoint > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return self.session.get_endpoint(auth or self.auth, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1225, in get_endpoint > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base return auth.get_endpoint(self, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 380, in get_endpoint > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base allow_version_hack=allow_version_hack, **kwargs) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 271, in get_endpoint_data > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base service_catalog = self.get_access(session).service_catalog > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 134, in get_access > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self.auth_ref = self.get_auth_ref(session) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", line 206, in get_auth_ref > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base self._plugin = self._do_create_plugin(session) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py", line 161, in _do_create_plugin > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base 'auth_url is correct. 
%s' % e) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. SSL exception connecting to https://192.168.10.150:5000: HTTPSConnectionPool(host='192.168.10.150', port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)) > > 2020-08-25 10:55:43.070 226250 ERROR octavia.network.drivers.neutron.base > > 2020-08-25 10:55:43.074 226250 DEBUG wsme.api [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Client-side error: Subnet 035f3183-f469-415f-b536-b4a81364e814 not found. format_exception /usr/lib/python3.6/site-packages/wsme/api.py:222 > > 2020-08-25 10:55:43.076 226250 DEBUG octavia.common.keystone [req-57c5b37c-e50f-4d50-b535-b0a3d19db1d5 - 8259463ce052437396afa845933afe4b - default default] Request path is / and it does not require keystone authentication process_request /usr/lib/python3.6/site-packages/octavia/common/keystone.py:77 > > 2020-08-25 10:55:43.080 226250 DEBUG octavia.common.keystone [req-5091d326-0cb4-4ae1-bf4b-9ef6b9313dca - - - - -] Request path is / and it does not require keystone authentication process_request /usr/lib/python3.6/site-packages/octavia/common/keystone.py:77 > > > > Configuration: > > [root at controller01 ~]# cat /etc/octavia/octavia.conf > > [DEFAULT] > > > > log_dir = /var/log/octavia > > debug = True > > transport_url = rabbit://openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.178:5672,openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.179:5672,openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.28:5672 > > > > [api_settings] > > api_base_uri = https://192.168.10.150:9876 > > bind_host = 192.168.10.178 > > bind_port = 9876 > > auth_strategy = keystone > > healthcheck_enabled = True > > allow_tls_terminated_listeners = True > > > > [database] > > connection = mysql+pymysql://octavia:FUkbii8AY4G6H9LxbJ2RRlOzHN61X8PI8FrMcuXQ at 192.168.10.150/octavia > > max_retries = -1 > > > > [health_manager] > > bind_port = 5555 > > bind_ip = 192.168.10.178 > > controller_ip_port_list = 192.168.10.178:5555, 192.168.10.179:5555, 192.168.10.28:5555 > > heartbeat_key = insecure > > > > [keystone_authtoken] > > service_token_roles_required = True > > www_authenticate_uri = https://192.168.10.150:5000 > > auth_url = https://192.168.10.150:5000 > > region_name = Hanoi > > memcached_servers = 192.168.10.178:11211,192.168.10.179:11211,192.168.10.28:11211 > > auth_type = password > > project_domain_name = Default > > user_domain_name = Default > > project_name = service > > username = octavia > > password = esGn3rN3iJOAD2HXmqznFPI9oAY2wQNDWYwqJaCH > > cafile = /etc/ssl/private/haproxy.pem > > insecure = false > > > > > > [certificates] > > cert_generator = local_cert_generator > > #server_certs_key_passphrase = insecure-key-do-not-use-this-key > > ca_private_key_passphrase = esGn3rN3iJOAD2HXmqznFPI9oAY2wQNDWYwqJaCH > > ca_private_key = /etc/octavia/certs/server_ca.key.pem > > ca_certificate = /etc/octavia/certs/server_ca.cert.pem > > region_name = Hanoi > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > endpoint_type = internal > > > > [networking] > > #allow_vip_network_id = True > > #allow_vip_subnet_id = True > > #allow_vip_port_id = True > > > > [haproxy_amphora] > > #bind_port = 9443 > > 
server_ca = /etc/octavia/certs/server_ca.cert.pem > > client_cert = /etc/octavia/certs/client.cert-and-key.pem > > base_path = /var/lib/octavia > > base_cert_dir = /var/lib/octavia/certs > > connection_max_retries = 1500 > > connection_retry_interval = 1 > > > > [controller_worker] > > amp_image_tag = amphora > > amp_ssh_key_name = octavia > > amp_secgroup_list = 80f44b73-dc9f-48aa-a0b8-8b78e5c6585c > > amp_boot_network_list = 04425cb2-5963-48f5-a229-b89b7c6036bd > > amp_flavor_id = 200 > > network_driver = allowed_address_pairs_driver > > compute_driver = compute_nova_driver > > amphora_driver = amphora_haproxy_rest_driver > > client_ca = /etc/octavia/certs/client_ca.cert.pem > > loadbalancer_topology = SINGLE > > amp_active_retries = 9999 > > > > [task_flow] > > [oslo_messaging] > > topic = octavia_prov > > rpc_thread_pool_size = 2 > > > > [house_keeping] > > [amphora_agent] > > [keepalived_vrrp] > > > > [service_auth] > > auth_url = https://192.168.10.150:5000 > > auth_type = password > > project_domain_name = default > > user_domain_name = default > > project_name = admin > > username = admin > > password = F35sXAYW5qDlMGfQbhmexIx12DqrQdpw6ixAseTd > > cafile = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > memcached_servers = 192.168.10.178:11211,192.168.10.179:11211,192.168.10.28:11211 > > #insecure = true > > > > > > [glance] > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > endpoint_type = internal > > insecure = false > > > > [neutron] > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > endpoint_type = internal > > insecure = false > > > > [cinder] > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > endpoint_type = internal > > insecure = false > > > > [nova] > > ca_certificates_file = /etc/ssl/private/haproxy.pem > > region_name = Hanoi > > endpoint_type = internal > > insecure = false > > > > [oslo_policy] > > #policy_file = /etc/octavia/policy.json > > > > [oslo_messaging_notifications] > > transport_url = rabbit://openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.178:5672,openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.179:5672,openstack:4ychZAT5VrWlk6KFfgAmpXvGdzfdV8hEpIgOLhyF at 192.168.10.28:5672 > > > > _______________________________________________ > Community mailing list > Community at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/community From Istvan.Szabo at agoda.com Thu Aug 27 03:45:02 2020 From: Istvan.Szabo at agoda.com (Szabo, Istvan (Agoda)) Date: Thu, 27 Aug 2020 03:45:02 +0000 Subject: DB Prune In-Reply-To: References: <859fb3c996514c2ead3fc1ce3de4210b@SG-AGMBX-6002.agoda.local> Message-ID: <188b52b4a07b41cab38858c1ae61a7fa@SG-AGMBX-6002.agoda.local> Thank you guys, can do this online or need any outage? -----Original Message----- From: Balázs Gibizer Sent: Wednesday, August 26, 2020 7:15 PM To: Szabo, Istvan (Agoda) Cc: openstack-discuss at lists.openstack.org Subject: Re: DB Prune Email received from outside the company. If in doubt don't click links nor open attachments! ________________________________ On Wed, Aug 26, 2020 at 06:57, "Szabo, Istvan (Agoda)" wrote: > Hi, > > We have a cluster where the user continuously spawn and delete > servers which makes the db even in compressed state 1.1GB. > I’m sure it has a huge amount of trash because this is a cicd > environment and the prod just uses 75MB. > How is it possible to cleanup the db on a safe way, what should be > the steps? 
> From Nova perspective you can get rid of the data of the already deleted instances via the following two commands: nova-manage db archive_deleted_rows nova-manage db purge Cheers, gibi [1]https://docs.openstack.org/nova/latest/cli/nova-manage.html > > Best regards, > Istvan > > > This message is confidential and is for the sole use of the intended > recipient(s). It may also be privileged or otherwise protected by > copyright or other legal rules. If you have received it by mistake > please let us know by reply email and delete it from your system. It > is prohibited to copy this message or disclose its content to anyone. > Any confidentiality or privilege is not waived or lost by any mistaken > delivery or unauthorized disclosure of the message. All messages sent > to and from Agoda may be monitored to ensure compliance with company > policies, to protect the company's interests and to remove potential > malware. Electronic messages may be intercepted, amended, lost or > deleted, or contain viruses. ________________________________ This message is confidential and is for the sole use of the intended recipient(s). It may also be privileged or otherwise protected by copyright or other legal rules. If you have received it by mistake please let us know by reply email and delete it from your system. It is prohibited to copy this message or disclose its content to anyone. Any confidentiality or privilege is not waived or lost by any mistaken delivery or unauthorized disclosure of the message. All messages sent to and from Agoda may be monitored to ensure compliance with company policies, to protect the company's interests and to remove potential malware. Electronic messages may be intercepted, amended, lost or deleted, or contain viruses. From elfosardo at gmail.com Thu Aug 27 07:52:23 2020 From: elfosardo at gmail.com (Riccardo Pittau) Date: Thu, 27 Aug 2020 09:52:23 +0200 Subject: [ironic] next Victoria meetup In-Reply-To: References: Message-ID: Hello everyone! Thanks to all who cast their vote, after looking at the results I'm happy to announce that the next Ironic Virtual Meetup will be held on: - Monday August 31st, at 1300 UTC until 1500 UTC - Tuesday September 1st, at 1300 UTC until 1500 UTC For latest news and topics, consult the etherpad at https://etherpad.opendev.org/p/Ironic-Victoria-midcycle Can't wait to see to you all at the Meetup, even if just virtually :) A si biri Riccardo On Thu, Aug 20, 2020 at 7:05 PM Riccardo Pittau wrote: > Hello again! > > Friendly reminder about the vote to schedule the next Ironic Virtual > Meetup! > Since a lot of people are on vacation in this period, we've decided to > postpone the final day for the vote to next Wednesday August 26 > > And we have an etherpad now! > https://etherpad.opendev.org/p/Ironic-Victoria-midcycle > Feel free to propose topics, we'll discuss also about the upcoming PTG and > Forum. > > Thanks! > > A si biri > > Riccardo > > > On Mon, Aug 17, 2020 at 6:29 PM Riccardo Pittau > wrote: > >> Hello everyone! >> >> The time for the next Ironic virtual meetup is close! >> It will be an opportunity to review what has been done in the last >> months, exchange ideas and plan for the time before the upcoming victoria >> release, with an eye towards the future. >> >> We're aiming to have the virtual meetup the first week of September >> (Monday August 31 - Friday September 4) and split it in two days, with one >> two-hours slot per day. 
>> Please vote for your best time slots here: >> https://doodle.com/poll/pi4x3kuxamf4nnpu >> >> We're planning to leave the vote open at least for the entire week until >> Friday August 21, so to have enough time to announce the final slots and >> planning early next week. >> >> Thanks! >> >> A si biri >> >> Riccardo >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Thu Aug 27 08:03:49 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?q?Bal=E1zs?= Gibizer) Date: Thu, 27 Aug 2020 10:03:49 +0200 Subject: DB Prune In-Reply-To: <188b52b4a07b41cab38858c1ae61a7fa@SG-AGMBX-6002.agoda.local> References: <859fb3c996514c2ead3fc1ce3de4210b@SG-AGMBX-6002.agoda.local> <188b52b4a07b41cab38858c1ae61a7fa@SG-AGMBX-6002.agoda.local> Message-ID: On Thu, Aug 27, 2020 at 03:45, "Szabo, Istvan (Agoda)" wrote: > Thank you guys, can do this online or need any outage? I think it is safe to run these commands while the nova services are up. However if you have a lot of data to move and then delete that can cause extra DB load. Cheers, gibi > > -----Original Message----- > From: Balázs Gibizer > Sent: Wednesday, August 26, 2020 7:15 PM > To: Szabo, Istvan (Agoda) > Cc: openstack-discuss at lists.openstack.org > Subject: Re: DB Prune > > Email received from outside the company. If in doubt don't click > links nor open attachments! > ________________________________ > > On Wed, Aug 26, 2020 at 06:57, "Szabo, Istvan (Agoda)" > wrote: >> Hi, >> >> We have a cluster where the user continuously spawn and delete >> servers which makes the db even in compressed state 1.1GB. >> I’m sure it has a huge amount of trash because this is a cicd >> environment and the prod just uses 75MB. >> How is it possible to cleanup the db on a safe way, what should be >> the steps? >> > > From Nova perspective you can get rid of the data of the already > deleted instances via the following two commands: > > nova-manage db archive_deleted_rows > nova-manage db purge > > Cheers, > gibi > > [1]https://docs.openstack.org/nova/latest/cli/nova-manage.html > > >> >> Best regards, >> Istvan >> >> >> This message is confidential and is for the sole use of the intended >> recipient(s). It may also be privileged or otherwise protected by >> copyright or other legal rules. If you have received it by mistake >> please let us know by reply email and delete it from your system. It >> is prohibited to copy this message or disclose its content to >> anyone. >> Any confidentiality or privilege is not waived or lost by any >> mistaken >> delivery or unauthorized disclosure of the message. All messages >> sent >> to and from Agoda may be monitored to ensure compliance with company >> policies, to protect the company's interests and to remove potential >> malware. Electronic messages may be intercepted, amended, lost or >> deleted, or contain viruses. > > > > ________________________________ > This message is confidential and is for the sole use of the intended > recipient(s). It may also be privileged or otherwise protected by > copyright or other legal rules. If you have received it by mistake > please let us know by reply email and delete it from your system. It > is prohibited to copy this message or disclose its content to anyone. > Any confidentiality or privilege is not waived or lost by any > mistaken delivery or unauthorized disclosure of the message. 
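As a side note on the nova-manage commands discussed earlier in this DB Prune thread, here is a minimal sketch of a batched cleanup; the batch size and cut-off date below are illustrative assumptions, not recommendations, and both commands are best tried against a copy of the database first:

    # move soft-deleted rows into the shadow tables in batches
    nova-manage db archive_deleted_rows --max_rows 1000 --until-complete --verbose
    # then delete already-archived rows older than an (example) date from the shadow tables
    nova-manage db purge --before "2020-07-01" --verbose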
All > messages sent to and from Agoda may be monitored to ensure compliance > with company policies, to protect the company's interests and to > remove potential malware. Electronic messages may be intercepted, > amended, lost or deleted, or contain viruses. From mark at stackhpc.com Thu Aug 27 08:08:44 2020 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 27 Aug 2020 09:08:44 +0100 Subject: [kolla] Focal upgrade Message-ID: Hi, For the Victoria release we will be moving our Ubuntu support from Bionic 18.04 to the Focal 20.04 LTS release. This applies to both the base container image and host OS. We would like to request feedback from any Ubuntu users about how they typically deal with a distro upgrade like this. I would assume that the following workflow would be used: 1. start with a Ussuri release on Bionic 2. distro upgrade to Focal 3. OpenStack upgrade to Victoria However, that would imply that it would not be possible to make any more changes to the Ussuri deploy after the Focal upgrade, since Kolla Ansible Ussuri release does not support Focal (it is blocked by prechecks). An alternative approach is: 1. start with a Ussuri release on Bionic 2. OpenStack upgrade to Victoria 3. distro upgrade to Focal This implies that Victoria must support both Bionic and Focal as a host OS, which it currently does. This flow matches more closely what we are currently testing in CI (steps 1 and 2 only). In both cases, Victoria container images are based on Focal. Feedback on this would be appreciated. Thanks, Mark From ssbarnea at redhat.com Thu Aug 27 08:11:45 2020 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Thu, 27 Aug 2020 09:11:45 +0100 Subject: Do you want to render ANSI in Zuul console? Message-ID: <7AC2A3FE-FAE3-4EA1-BC0F-2B104F0D13CB@redhat.com> At this moment Zuul web interfaces displays output of commands as raw, so any ANSI terminal output will display ugly artifacts. I tried enabling ANSI about half a year ago but even after providing two different implementations, I was not able to popularize it enough. As this is a UX related feature, I think would like more appropriate to ask for feedback from openstack-discuss, likely the biggest consumer of zuul web interface. Please comment/+/- on review below even if you are not a zuul core. At least it should show if this is a desired feature to have or not: https://review.opendev.org/#/c/739444/ ✅ This review also includes a screenshot that shows how the rendering looks (an alternative for using the sitepreview) Thanks Sorin Sbarnea From xin-ran.wang at intel.com Thu Aug 27 09:50:17 2020 From: xin-ran.wang at intel.com (Wang, Xin-ran) Date: Thu, 27 Aug 2020 09:50:17 +0000 Subject: [cyborg] Temporary treatment plan for the 3rd-party driver In-Reply-To: References: <94B50EE3-F888-4BFA-908C-10B416096A64.ref@yahoo.com> <94B50EE3-F888-4BFA-908C-10B416096A64@yahoo.com> <91e7b70d6dea95fce428511010bfa8e0cf2ce4e4.camel@redhat.com> Message-ID: Hi all, According to our discussion on PTG and recent discussion by mailing list. We have an agreement on using wiki to store the test report for the device drivers in the case that they do not have 3rd Party CI at present. Please see the wiki page here: https://wiki.openstack.org/wiki/Cyborg/TestReport. Currently, there is one test report, other contributor who wants to upstream a device driver in Cyborg and who do not have the condition to hold a 3rd party CI can refer to this test report and give us your report when upstreaming. 
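Picking up the Kolla Focal upgrade thread above, the second workflow (upgrade OpenStack first, then the host OS) might look roughly like the following. This is only a sketch under assumptions (inventory at /etc/kolla/multinode, hosts upgraded one at a time), so check the Kolla Ansible upgrade documentation for your release before relying on it:

    # upgrade OpenStack from Ussuri to Victoria while still on Bionic
    kolla-ansible -i /etc/kolla/multinode prechecks
    kolla-ansible -i /etc/kolla/multinode pull
    kolla-ansible -i /etc/kolla/multinode upgrade
    # then move the host OS from Bionic 18.04 to Focal 20.04, node by node
    do-release-upgrade
    # re-run bootstrap-servers/deploy afterwards to confirm hosts and containers are healthy
    kolla-ansible -i /etc/kolla/multinode bootstrap-servers
    kolla-ansible -i /etc/kolla/multinode deploy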
Reference: https://wiki.openstack.org/wiki/Cyborg/TestReport/IntelQAT Thanks, Xin-Ran -----Original Message----- From: Brin Zhang(张百林) Sent: Saturday, July 11, 2020 9:42 AM To: smooney at redhat.com; yumeng_bao at yahoo.com; openstack-discuss at lists.openstack.org Subject: 答复: [cyborg] Temporary treatment plan for the 3rd-party driver On Fri, 2020-07-10 at 13:37 +0800, yumeng bao wrote: > Brin, thanks for bringing this up! > > > Hi all: > > This release we want to introduce some 3rd party drivers > > (e.g. Intel QAT, Inspur FPGA, and Inspur SSD etc.) in Cyborg, and we discussed the handling of 3rd-party driver CI in Cyborg IRC meeting [1]. > > Due to the lack of CI test environment supported by hardware, > > we reached a temporary solution in two ways, as > > follows: > > 1. Provide a CI environment and provide a tempest test for Cyborg, > > this method is recommended; 2. If there is no CI environment, please > > provide the test results of this driver in the master branch or in > > the designated branch, which should be as complete as possible, sent to the Cyborg team, or pasted in the implementation of the commit. > > Providing test result can be our option. The test result can be part > of the driver documentation[0] as this is public to users. > And from my understanding, the test result should work as the role of > tempest case and clarify at least: necessary configuration,test operations and test results. > i would advise against including the resulsts in docuemntation add int test results to a commit or provideing tiem at the poitn it merged just tells you it once worked on the developers system likely using devstack to deploy. it does not tell you that it still work after even a singel addtional commit has been merged. so i would sugges not adding the results to the docs as they will get out dateded quickly. Good advice, this is also my original intention. Give the result verification in the submitted commit, and do not put the test verification result in the code base. As you said, this does not mean that it will always work unless a test report can be provided regularly. Of course, it is better if there is a third-party CI , we will try our best to fight for it. > maintaining a wiki is fine but i woudl suggest considring any driver that does not have first or thirdparty ci to be experimental. the generic mdev driver we talked about can be tested using sampel kernel modules that provide realy mdevs implemnetaion of srial consoles or graphics devices. so it could be validated in first party ci and consider supported/non experimaental. if other driver can similarly be tested with virtual hardware or sample kernel modules that allowed testing in the first party ci they could alos be marked as fully supported. with out that level of testing however i would not advertise a driver as anything more then experimental. > the old rule when i started working on openstack was if its not tested in ci its broken. 
> > [0] > https://docs.openstack.org/cyborg/latest/reference/support-matrix.html > #driver-support > > > > [1] > > http://eavesdrop.openstack.org/meetings/openstack_cyborg/2020/openst > > ack_cyborg.2020-07-02-03.05.log.html > > Regards, > Yumeng > From zhangbailin at inspur.com Thu Aug 27 11:19:20 2020 From: zhangbailin at inspur.com (=?utf-8?B?QnJpbiBaaGFuZyjlvKDnmb7mnpcp?=) Date: Thu, 27 Aug 2020 11:19:20 +0000 Subject: =?utf-8?B?562U5aSNOiBbY3lib3JnXSBUZW1wb3JhcnkgdHJlYXRtZW50IHBsYW4gZm9y?= =?utf-8?Q?_the_3rd-party_driver?= References: <94B50EE3-F888-4BFA-908C-10B416096A64.ref@yahoo.com> <94B50EE3-F888-4BFA-908C-10B416096A64@yahoo.com> <91e7b70d6dea95fce428511010bfa8e0cf2ce4e4.camel@redhat.com> Message-ID: <8d43c413b4564e1c9d5ad67e53dbd5a3@inspur.com> Hi all. In today's IRC meeting [1], we decide to have a wiki to maintain the 3-rd-party drivers temporary test results, like Intel QAT driver test result [2], and we also need to maintain the Driver Support docs [3], add "Temporary Test Result" as a column in the Driver Support list, we should mark the result added time, such as the QAT driver result, may we can say "This test results reported at Aug. 2020 in Victoria Release, please reference https://wiki.openstack.org/wiki/Cyborg/TestReport/IntelQAT". In the Driver Support part, we will claim the "Temporary Test Result" is a temporary result, it will not always work. If you encounter problems during the adaptation process, please contact the Cyborg Core Team [4] for help. [1] http://eavesdrop.openstack.org/irclogs/%23openstack-cyborg/%23openstack-cyborg.2020-08-27.log.html#t2020-08-27T03:11:40 [2] https://wiki.openstack.org/wiki/Cyborg/TestReport/IntelQAT [3] https://docs.openstack.org/cyborg/latest/reference/support-matrix.html#driver-support [4] https://review.opendev.org/#/admin/groups/1243,members brinzhang -----邮件原件----- 发件人: Brin Zhang(张百林) 发送时间: 2020年7月11日 9:41 收件人: 'smooney at redhat.com' ; 'yumeng_bao at yahoo.com' ; 'openstack-discuss at lists.openstack.org' 主题: 答复: [cyborg] Temporary treatment plan for the 3rd-party driver On Fri, 2020-07-10 at 13:37 +0800, yumeng bao wrote: > Brin, thanks for bringing this up! > > > Hi all: > > This release we want to introduce some 3rd party drivers > > (e.g. Intel QAT, Inspur FPGA, and Inspur SSD etc.) in Cyborg, and we discussed the handling of 3rd-party driver CI in Cyborg IRC meeting [1]. > > Due to the lack of CI test environment supported by hardware, > > we reached a temporary solution in two ways, as > > follows: > > 1. Provide a CI environment and provide a tempest test for Cyborg, > > this method is recommended; 2. If there is no CI environment, please > > provide the test results of this driver in the master branch or in > > the designated branch, which should be as complete as possible, sent to the Cyborg team, or pasted in the implementation of the commit. > > Providing test result can be our option. The test result can be part > of the driver documentation[0] as this is public to users. > And from my understanding, the test result should work as the role of > tempest case and clarify at least: necessary configuration,test operations and test results. > i would advise against including the resulsts in docuemntation add int test results to a commit or provideing tiem at the poitn it merged just tells you it once worked on the developers system likely using devstack to deploy. it does not tell you that it still work after even a singel addtional commit has been merged. 
so i would sugges not adding the results to the docs as they will get out dateded quickly. Good advice, this is also my original intention. Give the result verification in the submitted commit, and do not put the test verification result in the code base. As you said, this does not mean that it will always work unless a test report can be provided regularly. Of course, it is better if there is a third-party CI , we will try our best to fight for it. > maintaining a wiki is fine but i woudl suggest considring any driver that does not have first or thirdparty ci to be experimental. the generic mdev driver we talked about can be tested using sampel kernel modules that provide realy mdevs implemnetaion of srial consoles or graphics devices. so it could be validated in first party ci and consider supported/non experimaental. if other driver can similarly be tested with virtual hardware or sample kernel modules that allowed testing in the first party ci they could alos be marked as fully supported. with out that level of testing however i would not advertise a driver as anything more then experimental. > the old rule when i started working on openstack was if its not tested in ci its broken. > > [0] > https://docs.openstack.org/cyborg/latest/reference/support-matrix.html > #driver-support > > > > [1] > > http://eavesdrop.openstack.org/meetings/openstack_cyborg/2020/openst > > ack_cyborg.2020-07-02-03.05.log.html > > Regards, > Yumeng > From CAPSEY at augusta.edu Thu Aug 27 14:37:40 2020 From: CAPSEY at augusta.edu (Apsey, Christopher) Date: Thu, 27 Aug 2020 14:37:40 +0000 Subject: [neutron][ovn] OVN Performance Message-ID: All, I know that OVN is going to become the default neutron backend at some point and displace linuxbridge as the default configuration option in the docs, but we have noticed a pretty significant performance disparity between OVN and linuxbridge on identical hardware over the past year or so in a few different environments[1]. I know that example is unscientific, but similar results have been borne out in many different scenarios from what we have observed. There are three main problems from what we see: 1. OVN does not handle large concurrent requests as well as linuxbridge. Additionally, linuxbridge concurrent capacity grows (not linearly, but grows nonetheless) by adding additional neutron API endpoints and RPC agents. OVN does not really horizontally scale by adding additional API endpoints, from what we have observed. 2. OVN gets significantly slower as load on the system grows. We have observed a soft cap of about 2000-2500 instances in a given deployment before ovn-backed neutron stops responding altogether to nova requests (even for booting a single instance). We have observed linuxbridge get to 5000+ instances before it starts to struggle on the same hardware (and we think that linuxbridge can go further with improved provider network design in that particular case). 3. Once the southbound database process hits 100% CPU usage on the leader in the ovn cluster, it’s game over (probably causes 1+2) It's entirely possible that we just don’t understand OVN well enough to tune it [2][3][4], but then the question becomes how do we get that tuning knowledge into the docs so people don’t scratch their heads when their cool new OVN deployment scales 40% as well as their ancient linuxbridge-based one? If it is ‘known’ that OVN has some scaling challenges, is there a plan to fix it, and what is the best way to contribute to doing so? 
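As a quick way to confirm whether the southbound ovsdb-server is the bottleneck described in point 3 above, something like the following can be run on the OVN database hosts. The process name and control-socket path are assumptions for a typical packaged install and may differ in other deployments:

    # watch CPU usage of the southbound ovsdb-server process
    pidstat -u 5 -p "$(pgrep -f ovnsb_db)"
    # check RAFT cluster health and which member is currently the leader
    ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
    # rough feel for southbound DB load: the logical flow count grows with ports and ACLs
    ovn-sbctl lflow-list | wc -l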
We have observed similar results on Ubuntu 18.04/20.04 and CentOS 7/8 on Stein, Train, and Ussuri. [1] https://pastebin.com/kyyURTJm [2] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/ovsdb [3] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/neutron [4] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/compute Chris Apsey GEORGIA CYBER CENTER -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu Aug 27 15:10:30 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 27 Aug 2020 16:10:30 +0100 Subject: [neutron][ovn] OVN Performance In-Reply-To: References: Message-ID: <29d70bbb4eeb330c435ae600d14aa8cfd627d696.camel@redhat.com> On Thu, 2020-08-27 at 14:37 +0000, Apsey, Christopher wrote: > All, > > I know that OVN is going to become the default neutron backend at some point and displace linuxbridge as the default > configuration option in the docs, but we have noticed a pretty significant performance disparity between OVN and > linuxbridge on identical hardware over the past year or so in a few different environments[1]. the default backend in the docs is not linux bridge right now is it. i tought i has been ml2/ovs for many years. > I know that example is unscientific, but similar results have been borne out in many different scenarios from what > we have observed. There are three main problems from what we see: > > > 1. OVN does not handle large concurrent requests as well as linuxbridge. Additionally, linuxbridge concurrent > capacity grows (not linearly, but grows nonetheless) by adding additional neutron API endpoints and RPC agents. OVN > does not really horizontally scale by adding additional API endpoints, from what we have observed. > > 2. OVN gets significantly slower as load on the system grows. We have observed a soft cap of about 2000-2500 > instances in a given deployment before ovn-backed neutron stops responding altogether to nova requests (even for > booting a single instance). We have observed linuxbridge get to 5000+ instances before it starts to struggle on the > same hardware (and we think that linuxbridge can go further with improved provider network design in that particular > case). > > 3. Once the southbound database process hits 100% CPU usage on the leader in the ovn cluster, it’s game over > (probably causes 1+2) > > It's entirely possible that we just don’t understand OVN well enough to tune it [2][3][4], but then the question > becomes how do we get that tuning knowledge into the docs so people don’t scratch their heads when their cool new OVN > deployment scales 40% as well as their ancient linuxbridge-based one? > > If it is ‘known’ that OVN has some scaling challenges, is there a plan to fix it, and what is the best way to > contribute to doing so? > > We have observed similar results on Ubuntu 18.04/20.04 and CentOS 7/8 on Stein, Train, and Ussuri. 
> > [1] https://pastebin.com/kyyURTJm > [2] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/ovsdb > [3] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/neutron > [4] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/compute > > Chris Apsey > GEORGIA CYBER CENTER > From amuller at redhat.com Thu Aug 27 15:17:45 2020 From: amuller at redhat.com (Assaf Muller) Date: Thu, 27 Aug 2020 11:17:45 -0400 Subject: [neutron][ovn] OVN Performance In-Reply-To: References: Message-ID: The most efficient way about this is to give one or more of the Engineers working on OpenStack OVN upstream (I've added a few to this thread) temporary access to an environment that can reproduce issues you're seeing, we could then document the issues and work towards solutions. If that's not possible, if you could provide reproducer scripts, or alternatively sharpen the reproduction method, we'll take a look. What you've described is not something that's 'acceptable', OVN should definitely not scale worse than Neutron with the Linux Bridge agent. It's possible that the particular issues you ran in to is something that we've already seen internally at Red Hat, or with our customers, and we're already working on fixes in future versions of OVN - I can't tell you until you elaborate on the details of the issues you're seeing. In any case, the upstream community is committed to improving OVN scale and fixing scale issues as they pop up. Coincidentally, Red Hat scale engineers just published an article [1] about work they've done to scale RH-OSP 16.1 (== OpenStack Train on CentOS 8, with OVN 2.13 and TripleO) to 700 compute nodes. [1] https://www.redhat.com/en/blog/scaling-red-hat-openstack-platform-161-more-700-nodes?source=bloglisting On Thu, Aug 27, 2020 at 10:44 AM Apsey, Christopher wrote: > > All, > > > > I know that OVN is going to become the default neutron backend at some point and displace linuxbridge as the default configuration option in the docs, but we have noticed a pretty significant performance disparity between OVN and linuxbridge on identical hardware over the past year or so in a few different environments[1]. I know that example is unscientific, but similar results have been borne out in many different scenarios from what we have observed. There are three main problems from what we see: > > > > 1. OVN does not handle large concurrent requests as well as linuxbridge. Additionally, linuxbridge concurrent capacity grows (not linearly, but grows nonetheless) by adding additional neutron API endpoints and RPC agents. OVN does not really horizontally scale by adding additional API endpoints, from what we have observed. > > 2. OVN gets significantly slower as load on the system grows. We have observed a soft cap of about 2000-2500 instances in a given deployment before ovn-backed neutron stops responding altogether to nova requests (even for booting a single instance). We have observed linuxbridge get to 5000+ instances before it starts to struggle on the same hardware (and we think that linuxbridge can go further with improved provider network design in that particular case). > > 3. 
Once the southbound database process hits 100% CPU usage on the leader in the ovn cluster, it’s game over (probably causes 1+2) > > > > It's entirely possible that we just don’t understand OVN well enough to tune it [2][3][4], but then the question becomes how do we get that tuning knowledge into the docs so people don’t scratch their heads when their cool new OVN deployment scales 40% as well as their ancient linuxbridge-based one? > > > > If it is ‘known’ that OVN has some scaling challenges, is there a plan to fix it, and what is the best way to contribute to doing so? > > > > We have observed similar results on Ubuntu 18.04/20.04 and CentOS 7/8 on Stein, Train, and Ussuri. > > > > [1] https://pastebin.com/kyyURTJm > > [2] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/ovsdb > > [3] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/neutron > > [4] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/compute > > > > Chris Apsey > > GEORGIA CYBER CENTER > > From CAPSEY at augusta.edu Thu Aug 27 15:20:04 2020 From: CAPSEY at augusta.edu (Apsey, Christopher) Date: Thu, 27 Aug 2020 15:20:04 +0000 Subject: [EXTERNAL] Re: [neutron][ovn] OVN Performance In-Reply-To: <29d70bbb4eeb330c435ae600d14aa8cfd627d696.camel@redhat.com> References: <29d70bbb4eeb330c435ae600d14aa8cfd627d696.camel@redhat.com> Message-ID: > the default backend in the docs is not linux bridge right now is it. > i tought i has been ml2/ovs for many years. Nope – still defaults to linuxbridge on master - https://docs.openstack.org/neutron/latest/install/controller-install-rdo.html. And I don’t think that’s necessarily a bad thing if it’s the simplest option to get working well at the moment, but if the future is OVN, OVN should be at least as good in all respects. Chris Apsey GEORGIA CYBER CENTER From: Sean Mooney Sent: Thursday, August 27, 2020 11:11 AM To: Apsey, Christopher ; openstack-discuss at lists.openstack.org Subject: [EXTERNAL] Re: [neutron][ovn] OVN Performance CAUTION: EXTERNAL SENDER This email originated from an external source. Please exercise caution before opening attachments, clicking links, replying, or providing information to the sender. If you believe it to be fraudulent, contact the AU Cybersecurity Hotline at 72-CYBER (2-9237 / 706-722-9237) or 72CYBER at augusta.edu On Thu, 2020-08-27 at 14:37 +0000, Apsey, Christopher wrote: > All, > > I know that OVN is going to become the default neutron backend at some point and displace linuxbridge as the default > configuration option in the docs, but we have noticed a pretty significant performance disparity between OVN and > linuxbridge on identical hardware over the past year or so in a few different environments[1]. the default backend in the docs is not linux bridge right now is it. i tought i has been ml2/ovs for many years. > I know that example is unscientific, but similar results have been borne out in many different scenarios from what > we have observed. There are three main problems from what we see: > > > 1. OVN does not handle large concurrent requests as well as linuxbridge. Additionally, linuxbridge concurrent > capacity grows (not linearly, but grows nonetheless) by adding additional neutron API endpoints and RPC agents. OVN > does not really horizontally scale by adding additional API endpoints, from what we have observed. > > 2. OVN gets significantly slower as load on the system grows. 
We have observed a soft cap of about 2000-2500 > instances in a given deployment before ovn-backed neutron stops responding altogether to nova requests (even for > booting a single instance). We have observed linuxbridge get to 5000+ instances before it starts to struggle on the > same hardware (and we think that linuxbridge can go further with improved provider network design in that particular > case). > > 3. Once the southbound database process hits 100% CPU usage on the leader in the ovn cluster, it’s game over > (probably causes 1+2) > > It's entirely possible that we just don’t understand OVN well enough to tune it [2][3][4], but then the question > becomes how do we get that tuning knowledge into the docs so people don’t scratch their heads when their cool new OVN > deployment scales 40% as well as their ancient linuxbridge-based one? > > If it is ‘known’ that OVN has some scaling challenges, is there a plan to fix it, and what is the best way to > contribute to doing so? > > We have observed similar results on Ubuntu 18.04/20.04 and CentOS 7/8 on Stein, Train, and Ussuri. > > [1] https://pastebin.com/kyyURTJm > [2] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/ovsdb > [3] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/neutron > [4] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/compute > > Chris Apsey > GEORGIA CYBER CENTER > -------------- next part -------------- An HTML attachment was scrubbed... URL: From CAPSEY at augusta.edu Thu Aug 27 15:32:37 2020 From: CAPSEY at augusta.edu (Apsey, Christopher) Date: Thu, 27 Aug 2020 15:32:37 +0000 Subject: [EXTERNAL] Re: [neutron][ovn] OVN Performance In-Reply-To: References: Message-ID: Assaf, We can absolutely support engineering poking around in our environment (and possibly an even larger one at my previous employer that was experiencing similar issues during testing). We can take this offline so we don’t spam the mailing list. Just let me know how to proceed, Thanks! Chris Apsey GEORGIA CYBER CENTER From: Assaf Muller Sent: Thursday, August 27, 2020 11:18 AM To: Apsey, Christopher Cc: openstack-discuss at lists.openstack.org; Lucas Alvares Gomes Martins ; Jakub Libosvar ; Daniel Alvarez Sanchez Subject: [EXTERNAL] Re: [neutron][ovn] OVN Performance CAUTION: EXTERNAL SENDER This email originated from an external source. Please exercise caution before opening attachments, clicking links, replying, or providing information to the sender. If you believe it to be fraudulent, contact the AU Cybersecurity Hotline at 72-CYBER (2-9237 / 706-722-9237) or 72CYBER at augusta.edu The most efficient way about this is to give one or more of the Engineers working on OpenStack OVN upstream (I've added a few to this thread) temporary access to an environment that can reproduce issues you're seeing, we could then document the issues and work towards solutions. If that's not possible, if you could provide reproducer scripts, or alternatively sharpen the reproduction method, we'll take a look. What you've described is not something that's 'acceptable', OVN should definitely not scale worse than Neutron with the Linux Bridge agent. It's possible that the particular issues you ran in to is something that we've already seen internally at Red Hat, or with our customers, and we're already working on fixes in future versions of OVN - I can't tell you until you elaborate on the details of the issues you're seeing. 
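To make the reproducer-script idea above concrete, here is a deliberately small sketch of the kind of concurrent control-plane load that could serve as a starting point; the counts, resource names, flavor and image are made up for illustration and would need tuning for a real test:

    #!/bin/bash
    # create networks, subnets and ports concurrently to load neutron-server
    for i in $(seq 1 50); do
      (
        openstack network create "perf-net-$i" >/dev/null
        openstack subnet create --network "perf-net-$i" \
            --subnet-range "10.$i.0.0/24" "perf-subnet-$i" >/dev/null
        openstack port create --network "perf-net-$i" "perf-port-$i" >/dev/null
      ) &
    done
    wait
    # then time a single boot while the extra ports are in place
    time openstack server create --flavor m1.small --image cirros \
        --network perf-net-1 --wait perf-vm-1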
In any case, the upstream community is committed to improving OVN scale and fixing scale issues as they pop up. Coincidentally, Red Hat scale engineers just published an article [1] about work they've done to scale RH-OSP 16.1 (== OpenStack Train on CentOS 8, with OVN 2.13 and TripleO) to 700 compute nodes. [1] https://www.redhat.com/en/blog/scaling-red-hat-openstack-platform-161-more-700-nodes?source=bloglisting On Thu, Aug 27, 2020 at 10:44 AM Apsey, Christopher > wrote: > > All, > > > > I know that OVN is going to become the default neutron backend at some point and displace linuxbridge as the default configuration option in the docs, but we have noticed a pretty significant performance disparity between OVN and linuxbridge on identical hardware over the past year or so in a few different environments[1]. I know that example is unscientific, but similar results have been borne out in many different scenarios from what we have observed. There are three main problems from what we see: > > > > 1. OVN does not handle large concurrent requests as well as linuxbridge. Additionally, linuxbridge concurrent capacity grows (not linearly, but grows nonetheless) by adding additional neutron API endpoints and RPC agents. OVN does not really horizontally scale by adding additional API endpoints, from what we have observed. > > 2. OVN gets significantly slower as load on the system grows. We have observed a soft cap of about 2000-2500 instances in a given deployment before ovn-backed neutron stops responding altogether to nova requests (even for booting a single instance). We have observed linuxbridge get to 5000+ instances before it starts to struggle on the same hardware (and we think that linuxbridge can go further with improved provider network design in that particular case). > > 3. Once the southbound database process hits 100% CPU usage on the leader in the ovn cluster, it’s game over (probably causes 1+2) > > > > It's entirely possible that we just don’t understand OVN well enough to tune it [2][3][4], but then the question becomes how do we get that tuning knowledge into the docs so people don’t scratch their heads when their cool new OVN deployment scales 40% as well as their ancient linuxbridge-based one? > > > > If it is ‘known’ that OVN has some scaling challenges, is there a plan to fix it, and what is the best way to contribute to doing so? > > > > We have observed similar results on Ubuntu 18.04/20.04 and CentOS 7/8 on Stein, Train, and Ussuri. > > > > [1] https://pastebin.com/kyyURTJm > > [2] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/ovsdb > > [3] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/neutron > > [4] https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/compute > > > > Chris Apsey > > GEORGIA CYBER CENTER > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Thu Aug 27 15:37:28 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 27 Aug 2020 08:37:28 -0700 Subject: Do you want to render ANSI in Zuul console? In-Reply-To: <7AC2A3FE-FAE3-4EA1-BC0F-2B104F0D13CB@redhat.com> References: <7AC2A3FE-FAE3-4EA1-BC0F-2B104F0D13CB@redhat.com> Message-ID: On Thu, Aug 27, 2020, at 1:11 AM, Sorin Sbarnea wrote: > At this moment Zuul web interfaces displays output of commands as raw, > so any ANSI terminal output will display ugly artifacts. > > I tried enabling ANSI about half a year ago but even after providing > two different implementations, I was not able to popularize it enough. 
> > > As this is a UX related feature, I think would like more appropriate to > ask for feedback from openstack-discuss, likely the biggest consumer of > zuul web interface. > > Please comment/+/- on review below even if you are not a zuul core. At > least it should show if this is a desired feature to have or not: Without my Zuul hat on but with my "I debug a lot of openstack jobs" hat I would prefer we remove ansi color controls from our log files entirely. They make using grep and other machine processing tools more difficult. I find the utility of grep, ^F, elasticsearch, and the log level severity filtering far more useful than scrolling and looking for colors that may be arbitrarily applied by the source. > > https://review.opendev.org/#/c/739444/ ✅ > > This review also includes a screenshot that shows how the rendering > looks (an alternative for using the sitepreview) > > Thanks > Sorin Sbarnea > > > From smooney at redhat.com Thu Aug 27 16:22:38 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 27 Aug 2020 17:22:38 +0100 Subject: [EXTERNAL] Re: [neutron][ovn] OVN Performance In-Reply-To: References: <29d70bbb4eeb330c435ae600d14aa8cfd627d696.camel@redhat.com> Message-ID: <8ea879bb62bf6ec8f398b9036cbb105e5bd9ff64.camel@redhat.com> On Thu, 2020-08-27 at 15:20 +0000, Apsey, Christopher wrote: > > the default backend in the docs is not linux bridge right now is it. > > i tought i has been ml2/ovs for many years. > > Nope – still defaults to linuxbridge on master - > https://docs.openstack.org/neutron/latest/install/controller-install-rdo.html. > its not the default we use in devstack so its got much less testign then ovs so im surpised to see our docs decaulting to it. im not sure if any openstack installer default to linux bridge and i know we have had trouble in the past maintaining it when bugs arise. so its simple yes but not the best maintained or developed driver. i would be concerned that new people that deploy would hit bugs and not be able to find support on irc or the mailinglist but i guss that is more of a first contact problem then related to your ovn issue.. > And I don’t think that’s necessarily a bad thing if it’s the simplest option to get working well at the moment, but if > the future is OVN, OVN should be at least as good in all respects. > > Chris Apsey > GEORGIA CYBER CENTER > > From: Sean Mooney > Sent: Thursday, August 27, 2020 11:11 AM > To: Apsey, Christopher ; openstack-discuss at lists.openstack.org > Subject: [EXTERNAL] Re: [neutron][ovn] OVN Performance > > CAUTION: EXTERNAL SENDER This email originated from an external source. Please exercise caution before opening > attachments, clicking links, replying, or providing information to the sender. If you believe it to be fraudulent, > contact the AU Cybersecurity Hotline at 72-CYBER (2-9237 / 706-722-9237) or 72CYBER at augusta.edu 72CYBER at augusta.edu> > > On Thu, 2020-08-27 at 14:37 +0000, Apsey, Christopher wrote: > > All, > > > > I know that OVN is going to become the default neutron backend at some point and displace linuxbridge as the default > > configuration option in the docs, but we have noticed a pretty significant performance disparity between OVN and > > linuxbridge on identical hardware over the past year or so in a few different environments[1]. > > the default backend in the docs is not linux bridge right now is it. > i tought i has been ml2/ovs for many years. 
> > I know that example is unscientific, but similar results have been borne out in many different scenarios from what > > we have observed. There are three main problems from what we see: > > > > > > 1. OVN does not handle large concurrent requests as well as linuxbridge. Additionally, linuxbridge concurrent > > capacity grows (not linearly, but grows nonetheless) by adding additional neutron API endpoints and RPC agents. OVN > > does not really horizontally scale by adding additional API endpoints, from what we have observed. > > > > 2. OVN gets significantly slower as load on the system grows. We have observed a soft cap of about 2000-2500 > > instances in a given deployment before ovn-backed neutron stops responding altogether to nova requests (even for > > booting a single instance). We have observed linuxbridge get to 5000+ instances before it starts to struggle on the > > same hardware (and we think that linuxbridge can go further with improved provider network design in that particular > > case). > > > > 3. Once the southbound database process hits 100% CPU usage on the leader in the ovn cluster, it’s game over > > (probably causes 1+2) > > > > It's entirely possible that we just don’t understand OVN well enough to tune it [2][3][4], but then the question > > becomes how do we get that tuning knowledge into the docs so people don’t scratch their heads when their cool new > > OVN > > deployment scales 40% as well as their ancient linuxbridge-based one? > > > > If it is ‘known’ that OVN has some scaling challenges, is there a plan to fix it, and what is the best way to > > contribute to doing so? > > > > We have observed similar results on Ubuntu 18.04/20.04 and CentOS 7/8 on Stein, Train, and Ussuri. > > > > [1] https://pastebin.com/kyyURTJm; > > [2] > > https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/ovsdb > > ; > > [3] > > https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/neutron > > ; > > [4] > > https://github.com/GeorgiaCyber/kinetic/tree/master/formulas/compute > > ; > > > > Chris Apsey > > GEORGIA CYBER CENTER > > From smooney at redhat.com Thu Aug 27 16:24:58 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 27 Aug 2020 17:24:58 +0100 Subject: Do you want to render ANSI in Zuul console? In-Reply-To: References: <7AC2A3FE-FAE3-4EA1-BC0F-2B104F0D13CB@redhat.com> Message-ID: <16d3afb3557c4ab745a3b244abb2d94b21c8d149.camel@redhat.com> On Thu, 2020-08-27 at 08:37 -0700, Clark Boylan wrote: > On Thu, Aug 27, 2020, at 1:11 AM, Sorin Sbarnea wrote: > > At this moment Zuul web interfaces displays output of commands as raw, > > so any ANSI terminal output will display ugly artifacts. > > > > I tried enabling ANSI about half a year ago but even after providing > > two different implementations, I was not able to popularize it enough. > > > > > > As this is a UX related feature, I think would like more appropriate to > > ask for feedback from openstack-discuss, likely the biggest consumer of > > zuul web interface. > > > > Please comment/+/- on review below even if you are not a zuul core. At > > least it should show if this is a desired feature to have or not: > > Without my Zuul hat on but with my "I debug a lot of openstack jobs" hat I would prefer we remove ansi color controls > from our log files entirely. They make using grep and other machine processing tools more difficult. 
I find the > utility of grep, ^F, elasticsearch, and the log level severity filtering far more useful than scrolling and looking > for colors that may be arbitrarily applied by the source. if we can remove them form the logs but use a javascpit lib in the viewer to still highlight thing that might be the best of both worlds i do fine the syntax hyilighign nice but we dont need color codes to do that. > > > > > https://review.opendev.org/#/c/739444/ ✅ > > > > This review also includes a screenshot that shows how the rendering > > looks (an alternative for using the sitepreview) > > > > Thanks > > Sorin Sbarnea > > > > > > > > From smooney at redhat.com Thu Aug 27 16:26:04 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 27 Aug 2020 17:26:04 +0100 Subject: Do you want to render ANSI in Zuul console? In-Reply-To: <16d3afb3557c4ab745a3b244abb2d94b21c8d149.camel@redhat.com> References: <7AC2A3FE-FAE3-4EA1-BC0F-2B104F0D13CB@redhat.com> <16d3afb3557c4ab745a3b244abb2d94b21c8d149.camel@redhat.com> Message-ID: On Thu, 2020-08-27 at 17:24 +0100, Sean Mooney wrote: > On Thu, 2020-08-27 at 08:37 -0700, Clark Boylan wrote: > > On Thu, Aug 27, 2020, at 1:11 AM, Sorin Sbarnea wrote: > > > At this moment Zuul web interfaces displays output of commands as raw, > > > so any ANSI terminal output will display ugly artifacts. > > > > > > I tried enabling ANSI about half a year ago but even after providing > > > two different implementations, I was not able to popularize it enough. > > > > > > > > > As this is a UX related feature, I think would like more appropriate to > > > ask for feedback from openstack-discuss, likely the biggest consumer of > > > zuul web interface. > > > > > > Please comment/+/- on review below even if you are not a zuul core. At > > > least it should show if this is a desired feature to have or not: > > > > Without my Zuul hat on but with my "I debug a lot of openstack jobs" hat I would prefer we remove ansi color > > controls > > from our log files entirely. They make using grep and other machine processing tools more difficult. I find the > > utility of grep, ^F, elasticsearch, and the log level severity filtering far more useful than scrolling and looking > > for colors that may be arbitrarily applied by the source. > > if we can remove them form the logs but use a javascpit lib in the viewer to still highlight thing that might be the > best of both worlds > i do fine the syntax hyilighign nice but we dont need color codes to do that. i ment to say i have had some success with https://highlightjs.org/ before for that use case mainly in blogs but it might be a solution. > > > > > > > > https://review.opendev.org/#/c/739444/ ✅ > > > > > > This review also includes a screenshot that shows how the rendering > > > looks (an alternative for using the sitepreview) > > > > > > Thanks > > > Sorin Sbarnea > > > > > > > > > > > > > > > From sean.mcginnis at gmx.com Thu Aug 27 17:06:51 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 27 Aug 2020 12:06:51 -0500 Subject: [ops] Restructuring OSOPS tools Message-ID: Hello everyone, We recently expanded the scope of the Ops Docs SIG to also include any ops tooling. I think it's now time to move on to the next step of actually getting some of the old tooling in place and organized how we want it. We have several semi-abandoned repos from back when there was more work being done on ops tooling. 
During the great rebranding, those all were moved under the x/ namespace: https://opendev.org/x/?tab=&sort=recentupdate&q=osops Since these are now owned by an official SIG, we can move this content back under the openstack/ namespace. That should help increase visibility somewhat, and make things look a little more official. It will also allow contributors to tooling to get recognition for contributing to an import part of the OpenStack ecosystem. I do think it's can be a little more difficult to find things spread out over several repos though. For simplicity with finding tooling, as well as watching for reviews and helping with overall maintenance, I would like to move all of these under a common openstack/osops. Under that repo, we can then have a folder structure with tools/logging, tools/monitoring, etc. Then with everything in one place, we can have docs published in one place that helps find everything and easily links between tools. We can also capture some metadata about the tools, and use that to reflect their state in those docs. Please let me know if there are any objects to this plan. Otherwise, I will start cleaning things up and getting it staged in a new repo to be imported as an official repo owned by the SIG. Thanks! Sean From sean.mcginnis at gmx.com Thu Aug 27 17:10:08 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 27 Aug 2020 12:10:08 -0500 Subject: [all] Wallaby Release Schedule Message-ID: <68e239af-115b-3fb9-96f4-fc10130b90fe@gmx.com> Hey everyone, We have officially published the schedule for the Wallaby development cycle. That can now be found on the release.openstack.org site here: https://releases.openstack.org/wallaby/schedule.html PTLs, feel free to propose any updates if there are important project-specific deadlines you would like to include on the schedule. Just ping me if you need any examples of how that is done. Thanks! Sean From akekane at redhat.com Thu Aug 27 17:16:29 2020 From: akekane at redhat.com (Abhishek Kekane) Date: Thu, 27 Aug 2020 22:46:29 +0530 Subject: [all] Wallaby Release Schedule In-Reply-To: <68e239af-115b-3fb9-96f4-fc10130b90fe@gmx.com> References: <68e239af-115b-3fb9-96f4-fc10130b90fe@gmx.com> Message-ID: Hi Sean, I think PTG dates are not highlighted, does it need to be highlighted? Thanks & Best Regards, Abhishek Kekane On Thu, Aug 27, 2020 at 10:43 PM Sean McGinnis wrote: > Hey everyone, > > We have officially published the schedule for the Wallaby development > cycle. That can now be found on the release.openstack.org site here: > > https://releases.openstack.org/wallaby/schedule.html > > PTLs, feel free to propose any updates if there are important > project-specific deadlines you would like to include on the schedule. > Just ping me if you need any examples of how that is done. > > Thanks! > > Sean > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From its-openstack at zohocorp.com Thu Aug 27 17:47:35 2020 From: its-openstack at zohocorp.com (its-openstack at zohocorp.com) Date: Thu, 27 Aug 2020 23:17:35 +0530 Subject: per user quota not applign in openstack train Message-ID: <17431083602.fe3e34d15305.5471067663006187936@zohocorp.com> Dear openstack, We are facing a peculiar issue with regards to users quota of resources. 
e.g: +------------------------------------------------------------------------------------------------------+ | project |   user  |  instance quota            |  no: of instance created      | | -----------|------------|-----------------------------------|------------------------------------------| |  tes      |     -      |      10                            |             -                               | |  test     |  user1 |      2                              |            2                               | |  test     |  user2 |      2                              |      error "quota over"          | |  test     |  user3 |      3                              |      only 1 instance allowed  | |  test     |  user4 | no user quota defined  |    able to create 10 instance| +-------------------------------------------------------------------------------------------------------+ As you see from mentioned table. when user1,user2, has instance quota of 2 and when user1 has created 2 instance, user2 unable to create instance. but user3 able to create only 1 more instance, user 4 has no quota applied so project quota 10 will be applied and he can create 10 instance. the quota is applied to each user but not tracked for each user, so this defeats the purpose of per user quota. Please help us with resolving this issue.     Regards, sysadmin team -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Thu Aug 27 17:50:25 2020 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 27 Aug 2020 13:50:25 -0400 Subject: senlin auto scaling question Message-ID: Folks, I have created very simple cluster using following command openstack cluster create --profile myserver --desired-capacity 2 --min-size 2 --max-size 3 --strict my-asg It spun up 2 vm immediately now because the desired capacity is 2 so I am assuming if any node dies in the cluster it should spin up node to make count 2 right? so i killed one of node with "nove delete " but senlin didn't create node automatically to make desired capacity 2 (In AWS when you kill node in ASG it will create new node so is this senlin different then AWS?) From sean.mcginnis at gmx.com Thu Aug 27 18:19:11 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 27 Aug 2020 13:19:11 -0500 Subject: [all] Wallaby Release Schedule In-Reply-To: References: <68e239af-115b-3fb9-96f4-fc10130b90fe@gmx.com> Message-ID: <7494edec-c853-c8d1-4f6d-c076a97f4ed8@gmx.com> On 8/27/20 12:16 PM, Abhishek Kekane wrote: > Hi Sean, > > I think PTG dates are not highlighted, does it need to be highlighted? > > Thanks & Best Regards, > > Abhishek Kekane Yep, thanks for pointing that out Abhishek. At the time, the PTG dates were not confirmed yet. We do now have that set for October 26-30 now, so I have proposed a patch to update the schedule to reflect that: https://review.opendev.org/748504 Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Thu Aug 27 18:28:10 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 27 Aug 2020 14:28:10 -0400 Subject: [tc] Monthly meeting Message-ID: Hi everyone, Our monthly TC meeting is scheduled for next Thursday, September 3rd, at 1400 UTC. If you would like to add topics for discussion, please go to https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee#Next_Meeting and fill out your suggestions by Wednesday, September 2nd, at 1900 UTC. Thank you, Regards, Mohammed -- Mohammed Naser VEXXHOST, Inc. 
From arunkumar.palanisamy at tcs.com Wed Aug 26 19:02:22 2020 From: arunkumar.palanisamy at tcs.com (ARUNKUMAR PALANISAMY) Date: Wed, 26 Aug 2020 19:02:22 +0000 Subject: Trove images for Cluster testing. Message-ID: Hello Team, My name is ARUNKUMAR PALANISAMY, As part of our project requirement, we are evaluating trove components and need your support for experimental datastore Image for testing cluster. (Redis, Cassandra, MongoDB, Couchbase) 1.) We are running devstack enviorment with Victoria Openstack release and with this image (trove-master-guest-ubuntu-bionic-dev.qcow2), we are able to deploy mysql instance and and getting below error while creating mongoDB instances. "ModuleNotFoundError: No module named 'trove.guestagent.datastore.experimental' " 2.) While tried creating mongoDB image with diskimage-builder tool, but we are getting "Block device " element error. Regards, Arunkumar Palanisamy Cell: +49 172 6972490 =====-----=====-----===== Notice: The information contained in this e-mail message and/or attachments to it may contain confidential or privileged information. If you are not the intended recipient, any dissemination, use, review, distribution, printing or copying of the information contained in this e-mail message and/or attachments to it are strictly prohibited. If you have received this communication in error, please notify us by reply e-mail or telephone and immediately and permanently delete the message and any attachments. Thank you -------------- next part -------------- An HTML attachment was scrubbed... URL: From arunkumar.palanisamy at tcs.com Wed Aug 26 19:34:32 2020 From: arunkumar.palanisamy at tcs.com (ARUNKUMAR PALANISAMY) Date: Wed, 26 Aug 2020 19:34:32 +0000 Subject: openstack-discuss Message-ID: Hello Team, My name is ARUNKUMAR PALANISAMY, As part of our project requirement, we are evaluating trove components and need your support for experimental datastore Image for testing cluster. (Redis, Cassandra, MongoDB, Couchbase) 1.) We are running devstack enviorment with Victoria Openstack release and with this image (trove-master-guest-ubuntu-bionic-dev.qcow2), we are able to deploy mysql instance and and getting below error while creating mongoDB instances. "ModuleNotFoundError: No module named 'trove.guestagent.datastore.experimental' " 2.) While tried creating mongoDB image with diskimage-builder tool, but we are getting "Block device " element error. Regards, Arunkumar Palanisamy =====-----=====-----===== Notice: The information contained in this e-mail message and/or attachments to it may contain confidential or privileged information. If you are not the intended recipient, any dissemination, use, review, distribution, printing or copying of the information contained in this e-mail message and/or attachments to it are strictly prohibited. If you have received this communication in error, please notify us by reply e-mail or telephone and immediately and permanently delete the message and any attachments. Thank you -------------- next part -------------- An HTML attachment was scrubbed... URL: From nanthini.a.a at ericsson.com Thu Aug 27 11:09:11 2020 From: nanthini.a.a at ericsson.com (NANTHINI A A) Date: Thu, 27 Aug 2020 11:09:11 +0000 Subject: [Heat] Reg Creation of resource based on another resource attribute value Message-ID: Hi Team , I want to create the openstack subnet resource based on the openstack network resource's attribute STATUS value. i.e Create neutron subnet only when the neutron network status is ACTIVE . 
I can see currently the support of get_Attr function is not there in conditions section .Also the depends_on function accepts input as resource ids only .I cant pass a condition there . Is there any other way to implement the same .Please suggest . Thanks, A.Nanthini -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.bell at cern.ch Thu Aug 27 18:56:23 2020 From: tim.bell at cern.ch (Tim Bell) Date: Thu, 27 Aug 2020 20:56:23 +0200 Subject: per user quota not applign in openstack train In-Reply-To: <17431083602.fe3e34d15305.5471067663006187936@zohocorp.com> References: <17431083602.fe3e34d15305.5471067663006187936@zohocorp.com> Message-ID: <6559ED40-CE58-41A4-98B7-3AB90FF88E8A@cern.ch> > On 27 Aug 2020, at 19:47, its-openstack at zohocorp.com wrote: > > > > Dear openstack, > > We are facing a peculiar issue with regards to users quota of resources. > > e.g: > +------------------------------------------------------------------------------------------------------+ > | project | user | instance quota | no: of instance created | > | -----------|------------|-----------------------------------|------------------------------------------| > | tes | - | 10 | - | > | test | user1 | 2 | 2 | > | test | user2 | 2 | error "quota over" | > | test | user3 | 3 | only 1 instance allowed | > | test | user4 | no user quota defined | able to create 10 instance| > +-------------------------------------------------------------------------------------------------------+ > As you see from mentioned table. when user1,user2, has instance quota of 2 and when user1 has created 2 instance, user2 unable to create instance. > but user3 able to create only 1 more instance, user 4 has no quota applied so project quota 10 will be applied and he can create 10 instance. > > the quota is applied to each user but not tracked for each user, so this defeats the purpose of per user quota. > > Please help us with resolving this issue. > > I had understood that per-user quota was deprecated now. Have you had a look at creating dedicated per-usret projects with assigned quotas ? Tim > Regards, > sysadmin team > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Aug 27 20:09:21 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 27 Aug 2020 22:09:21 +0200 Subject: [neutron] Drivers meeting 28.08.2020 cancelled Message-ID: <20200827200921.6z7pl33zwgnk3caz@skaplons-mac> Hi, There is no any new RFEs in the agenda for tomorrow's drivers team meeting so lets cancel it and see You all next week. Have a great weekend. -- Slawek Kaplonski Principal software engineer Red Hat From tonyliu0592 at hotmail.com Thu Aug 27 20:47:36 2020 From: tonyliu0592 at hotmail.com (Tony Liu) Date: Thu, 27 Aug 2020 20:47:36 +0000 Subject: [Kolla] re-create container Message-ID: Hi, Is Kolla container created by playbook only or there is something like docker-compose to re-create container in case it's deleted after initial deployment? Thanks! Tony From ssbarnea at redhat.com Thu Aug 27 20:56:27 2020 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Thu, 27 Aug 2020 21:56:27 +0100 Subject: Do you want to render ANSI in Zuul console? 
In-Reply-To: <16d3afb3557c4ab745a3b244abb2d94b21c8d149.camel@redhat.com> References: <7AC2A3FE-FAE3-4EA1-BC0F-2B104F0D13CB@redhat.com> <16d3afb3557c4ab745a3b244abb2d94b21c8d149.camel@redhat.com> Message-ID: <381A4B67-E346-4B17-8586-A07DDBCA1F79@redhat.com> This does not make much sense to me as it sounds as: Lets convert all the images to B&W because it takes less space on disk and tell user to use JS based AI to recolor to them. Displaying ANSI does not mean colorize my logs, has nothing to do with it. Displaying ANSI is about respecting the output produced by the executed tools. Zuul should respect the output received on stderr/stdout and display it like a console/ terminal. If the job author decides to use ANSI or not is up to them. Still, Zuul itself as product should just render ANSI content, mainly because I do not see any use-case where someone would want to render that text as RAW, as we all know ANSI escapes do not add any value to the user. Still, if the ability to display raw text, without ansi conversion is a real need, I could spend few more hours to implement it and add a preference option. Still, think twice before asking for a feature that adds some code complexity and may not prove to be of real practical use. We all know that the raw text is still available inside the big json file in case someone has doubs regarding what was rendered may be wrong. > On 27 Aug 2020, at 17:24, Sean Mooney wrote: > > On Thu, 2020-08-27 at 08:37 -0700, Clark Boylan wrote: >> On Thu, Aug 27, 2020, at 1:11 AM, Sorin Sbarnea wrote: >>> At this moment Zuul web interfaces displays output of commands as raw, >>> so any ANSI terminal output will display ugly artifacts. >>> >>> I tried enabling ANSI about half a year ago but even after providing >>> two different implementations, I was not able to popularize it enough. >>> >>> >>> As this is a UX related feature, I think would like more appropriate to >>> ask for feedback from openstack-discuss, likely the biggest consumer of >>> zuul web interface. >>> >>> Please comment/+/- on review below even if you are not a zuul core. At >>> least it should show if this is a desired feature to have or not: >> >> Without my Zuul hat on but with my "I debug a lot of openstack jobs" hat I would prefer we remove ansi color controls >> from our log files entirely. They make using grep and other machine processing tools more difficult. I find the >> utility of grep, ^F, elasticsearch, and the log level severity filtering far more useful than scrolling and looking >> for colors that may be arbitrarily applied by the source. > if we can remove them form the logs but use a javascpit lib in the viewer to still highlight thing that might be the > best of both worlds > i do fine the syntax hyilighign nice but we dont need color codes to do that. >> >>> >>> https://review.opendev.org/#/c/739444/ ✅ >>> >>> This review also includes a screenshot that shows how the rendering >>> looks (an alternative for using the sitepreview) >>> >>> Thanks >>> Sorin Sbarnea -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Thu Aug 27 21:01:01 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 27 Aug 2020 14:01:01 -0700 Subject: Do you want to render ANSI in Zuul console? 
In-Reply-To: <381A4B67-E346-4B17-8586-A07DDBCA1F79@redhat.com> References: <7AC2A3FE-FAE3-4EA1-BC0F-2B104F0D13CB@redhat.com> <16d3afb3557c4ab745a3b244abb2d94b21c8d149.camel@redhat.com> <381A4B67-E346-4B17-8586-A07DDBCA1F79@redhat.com> Message-ID: <3e4a7cf3-1686-4a16-92c4-c7a52a8d9dfe@www.fastmail.com> On Thu, Aug 27, 2020, at 1:56 PM, Sorin Sbarnea wrote: > This does not make much sense to me as it sounds as: Lets convert all > the images to B&W because it takes less space on disk and tell user to > use JS based AI to recolor to them. > > Displaying ANSI does not mean colorize my logs, has nothing to do with it. > > Displaying ANSI is about respecting the output produced by the executed tools. > > Zuul should respect the output received on stderr/stdout and display it > like a console/ terminal. If the job author decides to use ANSI or not > is up to them. You asked if OpenStack would use/like to use such a feature. I'm suggesting a better option for OpenStack is to avoid adding a bunch of control codes to logs. > > Still, Zuul itself as product should just render ANSI content, mainly > because I do not see any use-case where someone would want to render > that text as RAW, as we all know ANSI escapes do not add any value to > the user. > > Still, if the ability to display raw text, without ansi conversion is > a real need, I could spend few more hours to implement it and add a > preference option. Still, think twice before asking for a feature that > adds some code complexity and may not prove to be of real practical > use. We all know that the raw text is still available inside the big > json file in case someone has doubs regarding what was rendered may be > wrong. > > > On 27 Aug 2020, at 17:24, Sean Mooney wrote: > > > > On Thu, 2020-08-27 at 08:37 -0700, Clark Boylan wrote: > >> On Thu, Aug 27, 2020, at 1:11 AM, Sorin Sbarnea wrote: > >>> At this moment Zuul web interfaces displays output of commands as raw, > >>> so any ANSI terminal output will display ugly artifacts. > >>> > >>> I tried enabling ANSI about half a year ago but even after providing > >>> two different implementations, I was not able to popularize it enough. > >>> > >>> > >>> As this is a UX related feature, I think would like more appropriate to > >>> ask for feedback from openstack-discuss, likely the biggest consumer of > >>> zuul web interface. > >>> > >>> Please comment/+/- on review below even if you are not a zuul core. At > >>> least it should show if this is a desired feature to have or not: > >> > >> Without my Zuul hat on but with my "I debug a lot of openstack jobs" hat I would prefer we remove ansi color controls > >> from our log files entirely. They make using grep and other machine processing tools more difficult. I find the > >> utility of grep, ^F, elasticsearch, and the log level severity filtering far more useful than scrolling and looking > >> for colors that may be arbitrarily applied by the source. > > if we can remove them form the logs but use a javascpit lib in the viewer to still highlight thing that might be the > > best of both worlds > > i do fine the syntax hyilighign nice but we dont need color codes to do that. 
> >> > >>> > >>> https://review.opendev.org/#/c/739444/ ✅ > >>> > >>> This review also includes a screenshot that shows how the rendering > >>> looks (an alternative for using the sitepreview) > >>> > >>> Thanks > >>> Sorin Sbarnea > From viroel at gmail.com Thu Aug 27 21:49:53 2020 From: viroel at gmail.com (Douglas) Date: Thu, 27 Aug 2020 18:49:53 -0300 Subject: [manila] Victoria Collab Review next Tuesday (Sep 1st) Message-ID: Hi everybody We will have a new edition of our collaborative review next Tuesday, September 1st, where we'll go through the code and review the proposed feature Share Server Migration[1][2]. This meeting is scheduled for two hours, starting at 5:00PM UTC. Meeting notes and videoconference links will be available here[3]. Feel free to attend if you are interested and available. Hoping to see you there, - dviroel [1] https://opendev.org/openstack/manila-specs/src/branch/master/specs/victoria/share-server-migration.rst [2] https://review.opendev.org/#/q/topic:bp/share-server-migration+(status:open) [3] https://etherpad.opendev.org/p/manila-victoria-collab-review -------------- next part -------------- An HTML attachment was scrubbed... URL: From anlin.kong at gmail.com Thu Aug 27 22:09:10 2020 From: anlin.kong at gmail.com (Lingxian Kong) Date: Fri, 28 Aug 2020 10:09:10 +1200 Subject: Trove images for Cluster testing. In-Reply-To: References: Message-ID: Hi Arunkumar, Unfortunately, for now Trove only supports MySQL and MariaDB, I'm working on adding PostgreSQL support. All other datastores are unmaintained right now. Since this(Victoria) dev cycle, docker container was introduced in Trove guest agent in order to remove the maintenance overhead for multiple Trove guest images. We only need to maintain one single guest image but could support different datastores. We have to do that as such a small Trove team in the community. If supporting Redis, Cassandra, MongoDB or Couchbase is in your feature request, you are welcome to contribute to Trove. Please let me know if you have any other questions. You are also welcome to join #openstack-trove IRC channel for discussion. --- Lingxian Kong Senior Software Engineer Catalyst Cloud www.catalystcloud.nz On Fri, Aug 28, 2020 at 6:45 AM ARUNKUMAR PALANISAMY < arunkumar.palanisamy at tcs.com> wrote: > Hello Team, > > > > My name is ARUNKUMAR PALANISAMY, > > > > As part of our project requirement, we are evaluating trove components and > need your support for experimental datastore Image for testing cluster. > (Redis, Cassandra, MongoDB, Couchbase) > > > > 1.) We are running devstack enviorment with Victoria Openstack release > and with this image (trove-master-guest-ubuntu-bionic-dev.qcow2 > ), > we are able to deploy mysql instance and and getting below error while > creating mongoDB instances. > > > > *“ModuleNotFoundError: No module named > 'trove.guestagent.datastore.experimental' “* > > > > 2.) While tried creating mongoDB image with diskimage-builder > tool, but we are > getting “Block device ” element error. > > > > > > Regards, > > Arunkumar Palanisamy > > Cell: +49 172 6972490 > > > > =====-----=====-----===== > Notice: The information contained in this e-mail > message and/or attachments to it may contain > confidential or privileged information. If you are > not the intended recipient, any dissemination, use, > review, distribution, printing or copying of the > information contained in this e-mail message > and/or attachments to it are strictly prohibited. 
If > you have received this communication in error, > please notify us by reply e-mail or telephone and > immediately and permanently delete the message > and any attachments. Thank you > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sorrison at gmail.com Thu Aug 27 22:10:50 2020 From: sorrison at gmail.com (Sam Morrison) Date: Fri, 28 Aug 2020 08:10:50 +1000 Subject: [neutron][networking-midonet] Maintainers needed In-Reply-To: <610412AF-AADF-44BD-ABA2-BA289B7C8F8A@redhat.com> References: <0AC5AC07-E97E-43CC-B344-A3E992B8CCA4@netways.de> <610412AF-AADF-44BD-ABA2-BA289B7C8F8A@redhat.com> Message-ID: <5E2F5826-559E-42E9-84C5-FA708E5A122A@gmail.com> We (Nectar Research Cloud) use midonet heavily too, it works really well and we haven’t found another driver that works for us. We tried OVN but it just doesn’t scale to the size of environment we have. I’m happy to help too. Cheers, Sam > On 31 Jul 2020, at 2:06 am, Slawek Kaplonski wrote: > > Hi, > > Thx Sebastian for stepping in to maintain the project. That is great news. > I think that at the beginning You should do 2 things: > - sync with Takashi Yamamoto (I added him to the loop) as he is probably most active current maintainer of this project, > - focus on fixing networking-midonet ci which is currently broken - all scenario jobs aren’t working fine on Ubuntu 18.04 (and we are going to move to 20.04 in this cycle), migrate jobs to zuulv3 from the legacy ones and finally add them to the ci again, > > I can of course help You with ci jobs if You need any help. Feel free to ping me on IRC or email (can be off the list). > >> On 29 Jul 2020, at 15:24, Sebastian Saemann wrote: >> >> Hi Slawek, >> >> we at NETWAYS are running most of our neutron networking on top of midonet and wouldn't be too happy if it gets deprecated and removed. So we would like to take over the maintainer role for this part. >> >> Please let me know how to proceed and how we can be onboarded easily. >> >> Best regards, >> >> Sebastian >> >> --  >> Sebastian Saemann >> Head of Managed Services >> >> NETWAYS Managed Services GmbH | Deutschherrnstr. 15-19 | D-90429 Nuernberg >> Tel: +49 911 92885-0 | Fax: +49 911 92885-77 >> CEO: Julian Hein, Bernd Erk | AG Nuernberg HRB25207 >> https://netways.de | sebastian.saemann at netways.de >> >> ** NETWAYS Web Services - https://nws.netways.de ** > > — > Slawek Kaplonski > Principal software engineer > Red Hat > > From gmann at ghanshyammann.com Thu Aug 27 22:14:57 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 27 Aug 2020 17:14:57 -0500 Subject: [kolla] Focal upgrade In-Reply-To: References: Message-ID: <17431fcfe5e.fab23a5543953.1175910699983452884@ghanshyammann.com> ---- On Thu, 27 Aug 2020 03:08:44 -0500 Mark Goddard wrote ---- > Hi, > > For the Victoria release we will be moving our Ubuntu support from > Bionic 18.04 to the Focal 20.04 LTS release. This applies to both the > base container image and host OS. > > We would like to request feedback from any Ubuntu users about how they > typically deal with a distro upgrade like this. I would assume that > the following workflow would be used: > > 1. start with a Ussuri release on Bionic > 2. distro upgrade to Focal > 3. OpenStack upgrade to Victoria > > However, that would imply that it would not be possible to make any > more changes to the Ussuri deploy after the Focal upgrade, since Kolla > Ansible Ussuri release does not support Focal (it is blocked by > prechecks). > > An alternative approach is: > > 1. 
start with a Ussuri release on Bionic > 2. OpenStack upgrade to Victoria > 3. distro upgrade to Focal > I am not ubuntu user or not done such upgrade but I think later approach is a better way. Ussuri release for sure will not work on Focal ( by seeing the fixes in various projects for Focal migration.). Victoria still is tested on Bionic so it is surly tested till we move our testing to Focal. -gmann > This implies that Victoria must support both Bionic and Focal as a > host OS, which it currently does. This flow matches more closely what > we are currently testing in CI (steps 1 and 2 only). > > In both cases, Victoria container images are based on Focal. > > Feedback on this would be appreciated. > > Thanks, > Mark > > From melwittt at gmail.com Thu Aug 27 23:51:34 2020 From: melwittt at gmail.com (melanie witt) Date: Thu, 27 Aug 2020 16:51:34 -0700 Subject: per user quota not applign in openstack train In-Reply-To: <6559ED40-CE58-41A4-98B7-3AB90FF88E8A@cern.ch> References: <17431083602.fe3e34d15305.5471067663006187936@zohocorp.com> <6559ED40-CE58-41A4-98B7-3AB90FF88E8A@cern.ch> Message-ID: <5f3f8a59-a41a-2a69-007e-1a4dbf4f0ed7@gmail.com> On 8/27/20 11:56, Tim Bell wrote: > >> On 27 Aug 2020, at 19:47, its-openstack at zohocorp.com >> As you see from mentioned table. when user1,user2, has instance quota >> of 2 and when user1 has created 2 instance, user2 unable to create >> instance. >> but user3 able to create only 1 more instance, user 4 has no quota >> applied so project quota 10 will be applied and he can create 10 instance. >> >> the quota is applied to each user but not tracked for each user, so >> this defeats the purpose of per user quota. >> >> Please help us with resolving this issue. Hi, I tried your scenario in devstack and found a bug in the [lack of] scoping for per-user quotas [1] and have a proposed a patch (still needs test coverage): https://review.opendev.org/748550 If you could please try out this patch and let me know whether you find any issues, it would be appreciated. 
With this patch, I got the following result with your same scenario (first user has instances quota of 2, second user has 2, third user has 3, last user has no per-user quota, project has 10): $ nova list --fields name,user_id,created --sort created_at:asc +--------------------------------------+-------+----------------------------------+----------------------+ | ID | Name | User Id | Created | +--------------------------------------+-------+----------------------------------+----------------------+ | 5a52f400-2bef-4f00-add4-df69b6ac195f | one | e630b64070f042e98381bb7f6be9919c | 2020-08-27T21:41:06Z | | 800c7673-0846-4c2e-a502-2d8db7ceab40 | two | e630b64070f042e98381bb7f6be9919c | 2020-08-27T21:42:07Z | | af9b35ce-6ba9-4657-aacf-aedb1915ce9a | three | b34b2b234e0545b9a54ce7f63d9b116e | 2020-08-27T23:14:36Z | | d83e9c56-ccc5-4d65-bb81-ac3be3a8f575 | four | b34b2b234e0545b9a54ce7f63d9b116e | 2020-08-27T23:16:24Z | | 56aa06d2-2d1f-49e4-a314-e2e06f68fef0 | five | 3278d32e38534016963e457f6c9d07d7 | 2020-08-27T23:16:59Z | | 38e84ebb-fb88-4e39-b5fe-32bbcdd5f062 | six | 3278d32e38534016963e457f6c9d07d7 | 2020-08-27T23:17:19Z | | 7376788f-c51a-4a6d-be91-c14bb71b3541 | seven | 3278d32e38534016963e457f6c9d07d7 | 2020-08-27T23:17:37Z | | f072d745-c37f-493d-8fa8-7dc83d520539 | eight | 06f1a9d74d214fa1b352d4a3f41e3421 | 2020-08-27T23:18:21Z | | 58688387-8f8c-4b60-acac-40d11a8ca5b9 | nine | 06f1a9d74d214fa1b352d4a3f41e3421 | 2020-08-27T23:18:37Z | | 1a25e99f-2bfe-42ea-b1b3-e88a78b11293 | ten | 06f1a9d74d214fa1b352d4a3f41e3421 | 2020-08-27T23:18:53Z | +--------------------------------------+-------+----------------------------------+----------------------+ And I was not able to create a fourth instance with the last user because that would exceed the total project quota of 10. Also, be careful with how you assign per-user quota in nova. The first time I tried your scenario, I did not make sure to use user_id UUID instead of name. The per-user quotas will not work properly if you do not specify the user_id as a UUID, example: $ nova quota-update --user 3278d32e38534016963e457f6c9d07d7 --instance 3 518c0eaec2754217bee6b67a1ec6f884 where the first UUID is the user_id and the second UUID is the project_id. > I had understood that per-user quota was deprecated now. > > Have you had a look at creating dedicated per-usret projects with > assigned quotas ? Tim is correct in that per-user quota is not encouraged because when we move to unified limits in nova [2], they will be removed [3]. If you are able to use a dedicated project per user instead of using per-user quotas, that is a better approach. Hope this helps, -melanie [1] https://bugs.launchpad.net/nova/+bug/1893284 [2] https://review.opendev.org/#/q/topic:bp/unified-limits-nova [3] https://docs.openstack.org/nova/latest/admin/quotas.html#view-and-update-quota-values-for-a-project-user From zbitter at redhat.com Fri Aug 28 01:39:53 2020 From: zbitter at redhat.com (Zane Bitter) Date: Thu, 27 Aug 2020 21:39:53 -0400 Subject: [Heat] Reg Creation of resource based on another resource attribute value In-Reply-To: References: Message-ID: On 27/08/20 7:09 am, NANTHINI A A wrote: > Hi Team , > >     I want to create the openstack subnet resource based on the > openstack network resource’s attribute STATUS value. > >    i.e Create neutron subnet only when the neutron network status is > ACTIVE . > >    I can see currently the support of get_Attr function is not there in > conditions section . Correct, and that's because they're evaluated at different times. 
When you create (or update) the stack Heat decides immediately which resources are enabled. But the attribute values are not known until after that resource is created. > Also the depends_on function accepts input as > resource ids only .I cant pass a condition there . I believe you can depend on a resource that is conditionally disabled without causing an error. >    Is there any other way to implement the same .Please suggest . There isn't. cheers, Zane. From thierry at openstack.org Fri Aug 28 10:01:59 2020 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 28 Aug 2020 12:01:59 +0200 Subject: [ops] Restructuring OSOPS tools In-Reply-To: References: Message-ID: Sean McGinnis wrote: > [...] > Since these are now owned by an official SIG, we can move this content > back under the openstack/ namespace. That should help increase > visibility somewhat, and make things look a little more official. It > will also allow contributors to tooling to get recognition for > contributing to an import part of the OpenStack ecosystem. > > I do think it's can be a little more difficult to find things spread out > over several repos though. For simplicity with finding tooling, as well > as watching for reviews and helping with overall maintenance, I would > like to move all of these under a common openstack/osops. Under that > repo, we can then have a folder structure with tools/logging, > tools/monitoring, etc. Also the original setup[1] called for moving things from one repo to another as they get more mature, which loses history. So I agree a single repository is better. However, one benefit of the original setup was that it made it really low-friction to land half-baked code in the osops-tools-contrib repository. The idea was to encourage tools sharing, rather than judge quality or curate a set. I think it's critical for the success of OSops that operator code can be brought in with very low friction, and curation can happen later. If we opt for a theme-based directory structure, we could communicate that a given tool is in "unmaintained/use-at-your-own-risk" status using metadata. But thinking more about it, I would suggest we keep a low-friction "contrib/" directory in the repo, which more clearly communicates "use at your own risk" for anything within it. Then we could move tools under the "tools/" directory structure if a community forms within the SIG to support and improve a specific tool. That would IMHO allow both low-friction landing *and* curation to happen. > [...] > Please let me know if there are any objects to this plan. Otherwise, I > will start cleaning things up and getting it staged in a new repo to be > imported as an official repo owned by the SIG. I like it! 
[1] https://wiki.openstack.org/wiki/Osops -- Thierry Carrez (ttx) From skaplons at redhat.com Fri Aug 28 10:04:11 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 28 Aug 2020 12:04:11 +0200 Subject: [neutron][gate] verbose q-svc log files and e-r indexing In-Reply-To: <62e4fcd2-0f7a-a7d3-7692-3ad9a05c8399@gmail.com> References: <20200818103323.wq5upyjn4nzsqhx7@skaplons-mac> <20200818150052.u4xkjsptejikwcny@skaplons-mac> <62e4fcd2-0f7a-a7d3-7692-3ad9a05c8399@gmail.com> Message-ID: <20200828100411.l7egqidkfzfi4xjt@skaplons-mac> Hi, On Tue, Aug 18, 2020 at 02:10:40PM -0700, melanie witt wrote: > On 8/18/20 08:00, Slawek Kaplonski wrote: > > Hi, > > > > I proposed patch [1] which seems that decreased size of the neutron-server log > > a bit - see [2] but it's still about 40M :/ > > > > [1] https://review.opendev.org/#/c/730879/ > > [2] https://48dcf568cd222acfbfb6-11d92d8452a346ca231ad13d26a55a7d.ssl.cf2.rackcdn.com/746714/1/check/tempest-full-py3/5c1399c/controller/logs/ > > Thanks for jumping in to help, Slawek! Indeed your proposed patch improves things from 60M-70M => 40M (good!). > > With your patch applied, the most frequent potential log message I see now is like this: > > Aug 18 14:40:21.294549 ubuntu-bionic-rax-iad-0019321276 neutron-server[5829]: DEBUG neutron_lib.callbacks.manager [None req-eadfbe92-eaee-4e3e-a5c0-f18aa8ba9772 None None] Notify callbacks ['neutron.services.segments.db._update_segment_host_mapping_for_agent-8764691834039', 'neutron.plugins.ml2.plugin.Ml2Plugin._retry_binding_revived_agents-4033733'] for agent, after_update {{(pid=6206) _notify_loop /opt/stack/neutron-lib/neutron_lib/callbacks/manager.py:193}} > > with the line count difference being with and without: > > $ wc -l "screen-q-svc.txt" > 102493 screen-q-svc.txt > > $ grep -v "neutron_lib.callbacks.manager" "screen-q-svc.txt" |wc -l > 83261 > > so I suppose we could predict a decrease in file size of about 40M => 32M if we were able to remove the neutron_lib.callbacks.manager output. I was looking at this again today but I'm really not sure if we should get rid of those messages from the log. For now I think that indexing of screen-q-svc.txt file is disabled so this size of the log shouldn't be big problem (I hope) and I would like to not remove any other debug messages from it if that will not be really necessary. > > But I'm not sure whether that's a critical debugging element or not. > > -melanie > -- Slawek Kaplonski Principal software engineer Red Hat From mdulko at redhat.com Fri Aug 28 11:49:32 2020 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Fri, 28 Aug 2020 13:49:32 +0200 Subject: [kuryr] vPTG October 2020 In-Reply-To: <54f84af6378e1507d1f04c0aab733922cdc2c8bd.camel@redhat.com> References: <54f84af6378e1507d1f04c0aab733922cdc2c8bd.camel@redhat.com> Message-ID: <68470a74141032e96e04e3b26fb2cab8eba80e61.camel@redhat.com> On Mon, 2020-08-17 at 09:46 +0200, Michał Dulko wrote: > Hello all, > > There's a vPTG October 2020 project signup process going on and I'd > like to ask if you want me to reserve an hour or two there for a sync > up on the priorities and plans of various parts of the team. I haven't heard much, so I've just reserved 2 one-hour slots that won't overlap with Octavia sessions: * 7-8 UTC on October 27th * 15-16 UTC on October 29th I think we'll treat those as pretty open discussion, but feel free to add stuff to the agenda etherpad, so that team members could prepare for the topics in advance. 
[1] https://etherpad.opendev.org/p/kuryr-virtual-W-ptg Thanks, Michał From adriant at catalystcloud.nz Fri Aug 28 12:36:31 2020 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Sat, 29 Aug 2020 00:36:31 +1200 Subject: [tc][telemetry][gnocchi] The future of Gnocchi in OpenStack Message-ID: <0a22dd8a-2b54-cd22-1734-619d28d6efc8@catalystcloud.nz> Hey OpenStackers, We're currently in the process of discussing what to do with OpenStack's reliance on Gnocchi, and at present it is looking like we are most likely to just fork it back under a new name (currently Farfalle to stick with the pasta theme). The discussion is mostly happening here: https://review.opendev.org/#/c/744592/ But for those running Gnocchi in prod, this is likely something you may want to know about and we'd like to hear from you. A bit of history: Gnocchi started off as a new backend for Ceilometer in OpenStack, and eventually become the defacto API for telemetry samples when that was removed from Ceilometer (as backed by MongoDB). Gnocchi was eventually spun off outside of OpenStack, but still essentially remained our API for telemetry despite not being an official part of OpenStack anymore. Since then the development around it seems to have stalled, with pull requests left unreviewed, CI broken, and even the domain for the docs lapsing once. They have essentially said the project is unmaintained themselves: https://github.com/gnocchixyz/gnocchi/issues/1049 Given that OpenStack telemetry relies on it, we needed to decide what to do. We tried talking to the devs which spun it off outside of OpenStack, but they seem disinclined to interact with the OpenStack community, or move the project back to our infra/governance despite OpenStack looking like the only consumers of Gnocchi as a project. We want to find a solution, and the feeling is that they don't. So we've opted to fork it back and now the discussion is how to approach that fork. The OpenStack community doesn't want to maintain a time series database, but our telemetry API is part of it. We are putting it under non-OpenStack namespace to start, but we need to decide what the long term place for it should be. Do we want to make it an official project again? Do we keep it just as an API and drop the time series DB part for another DB? Do we build a new API back into Ceilometer and switch to a different backend like InfluxDB? We don't know yet, and we want some input from people who use the service so we can hopefully work with OpenStack telemetry as a whole and figure out what the long term picture is. If Gnocchi matters to you at all, or you use it, we want to hear from you. Cheers, Adrian Turjak From mnaser at vexxhost.com Fri Aug 28 12:45:40 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 28 Aug 2020 08:45:40 -0400 Subject: [tc] office hours update Message-ID: Hi everyone, In order to be able to have more folks available during office hours, we have the following newly available times once this patch lands: https://review.opendev.org/746167 * 01:00 UTC on Tuesdays: http://www.timeanddate.com/worldclock/fixedtime.html?hour=01&min=00&sec=0 * 15:00 UTC on Wednesdays: http://www.timeanddate.com/worldclock/fixedtime.html?hour=15&min=00&sec=0 We look forward to seeing our community present at those office hours with anything they have. Thank you, Mohammed -- Mohammed Naser VEXXHOST, Inc. 
From mark at stackhpc.com Fri Aug 28 13:47:41 2020 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 28 Aug 2020 14:47:41 +0100 Subject: [Kolla] re-create container In-Reply-To: References: Message-ID: On Thu, 27 Aug 2020 at 21:48, Tony Liu wrote: > > Hi, > > Is Kolla container created by playbook only or there is > something like docker-compose to re-create container in > case it's deleted after initial deployment? If you run any of the deploy, reconfigure, deploy-containers or upgrade commands, kolla ansible will ensure that necessary containers exist, even if they were removed. > > Thanks! > Tony > > From zbitter at redhat.com Fri Aug 28 14:48:53 2020 From: zbitter at redhat.com (Zane Bitter) Date: Fri, 28 Aug 2020 10:48:53 -0400 Subject: [tc][telemetry][gnocchi] The future of Gnocchi in OpenStack In-Reply-To: <0a22dd8a-2b54-cd22-1734-619d28d6efc8@catalystcloud.nz> References: <0a22dd8a-2b54-cd22-1734-619d28d6efc8@catalystcloud.nz> Message-ID: <5cc54d3f-ecf3-a769-9edb-187efc1c2d3f@redhat.com> On 28/08/20 8:36 am, Adrian Turjak wrote: > Hey OpenStackers, > > We're currently in the process of discussing what to do with OpenStack's > reliance on Gnocchi, and at present it is looking like we are most > likely to just fork it back under a new name (currently Farfalle to > stick with the pasta theme). > > The discussion is mostly happening here: > https://review.opendev.org/#/c/744592/ > > But for those running Gnocchi in prod, this is likely something you may > want to know about and we'd like to hear from you. > > A bit of history: Gnocchi started off as a new backend for Ceilometer in > OpenStack, and eventually become the defacto API for telemetry samples > when that was removed from Ceilometer (as backed by MongoDB). Gnocchi > was eventually spun off outside of OpenStack, but still essentially > remained our API for telemetry despite not being an official part of > OpenStack anymore. I think a large part of the issue here is that there are multiple reasons for wanting (small-t) telemetry from OpenStack, and historically because of reasons they have all been conflated into one Thing with the result that sometimes one use case wins. At least 3 that I can think of are: 1) Monitoring the OpenStack infrastructure by the operator, including feeding into business processes like reporting, capacity planning &c. 2) Billing 3) Monitoring user resources by the user/application, either directly or via other OpenStack services like Heat or Senlin. For the first, you just want to be able to dump data into a TSDB of the operator's choice. Since all of the reporting requirements are business-specific anyway, it's up to the operator to decide how they want to store the data and how they want to interact with it. It appears that this may have been the theory behind the Gnocchi split. On the other hand, for the third one you really need something that should be an official OpenStack API with all of the attendant stability guarantees, because it is part of OpenStack's user interface. The second lands somewhere in between; AIUI CloudKitty is written to support multiple back-ends, with OpenStack Telemetry being the primary one. So it needs a fairly stable API because it's consumed by other OpenStack projects, but it's ultimately operator-facing. As I have argued before, when we are thinking about road maps we need to think of these as different use cases, and they're different enough that they are probably best served by least two separate tools. 
Mohammed has made a compelling argument in the past that Prometheus is more or less the industry standard for the first use case, and we should just export metrics to that directly in the OpenStack services, rather than going through the Ceilometer collector. I don't know what should be done about the third, but I do know that currently Telemetry is breaking Heat's gate and people are seriously discussing disabling the Telemetry-related tests, which I assume would mean deprecating the resources. Monasca offers an alternative, but isn't preferred for some distributors and operators because it brings the whole Java ecosystem along for the ride (managing the Python one is already hard enough). cheers, Zane. From mahdi.abbasi.2013 at gmail.com Fri Aug 28 08:03:07 2020 From: mahdi.abbasi.2013 at gmail.com (mahdi abbasi) Date: Fri, 28 Aug 2020 12:33:07 +0430 Subject: Horizon installation problem Message-ID: Hi openstack development team, When i want install horizon with pip3 i recieve and error: Could not satisfy constraints for horizon: installation from path or url cannot be constrained to a version. Please help me Best regards Mahdi -------------- next part -------------- An HTML attachment was scrubbed... URL: From cohuck at redhat.com Fri Aug 28 13:47:41 2020 From: cohuck at redhat.com (Cornelia Huck) Date: Fri, 28 Aug 2020 15:47:41 +0200 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200826064117.GA22243@joy-OptiPlex-7040> References: <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <20200819212234.223667b3@x1.home> <20200820031621.GA24997@joy-OptiPlex-7040> <20200825163925.1c19b0f0.cohuck@redhat.com> <20200826064117.GA22243@joy-OptiPlex-7040> Message-ID: <20200828154741.30cfc1a3.cohuck@redhat.com> On Wed, 26 Aug 2020 14:41:17 +0800 Yan Zhao wrote: > previously, we want to regard the two mdevs created with dsa-1dwq x 30 and > dsa-2dwq x 15 as compatible, because the two mdevs consist equal resources. > > But, as it's a burden to upper layer, we agree that if this condition > happens, we still treat the two as incompatible. > > To fix it, either the driver should expose dsa-1dwq only, or the target > dsa-2dwq needs to be destroyed and reallocated via dsa-1dwq x 30. AFAIU, these are mdev types, aren't they? So, basically, any management software needs to take care to use the matching mdev type on the target system for device creation? 
From smooney at redhat.com Fri Aug 28 14:04:12 2020 From: smooney at redhat.com (Sean Mooney) Date: Fri, 28 Aug 2020 15:04:12 +0100 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200828154741.30cfc1a3.cohuck@redhat.com> References: <20200814051601.GD15344@joy-OptiPlex-7040> <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <20200819212234.223667b3@x1.home> <20200820031621.GA24997@joy-OptiPlex-7040> <20200825163925.1c19b0f0.cohuck@redhat.com> <20200826064117.GA22243@joy-OptiPlex-7040> <20200828154741.30cfc1a3.cohuck@redhat.com> Message-ID: <8f5345be73ebf4f8f7f51d6cdc9c2a0d8e0aa45e.camel@redhat.com> On Fri, 2020-08-28 at 15:47 +0200, Cornelia Huck wrote: > On Wed, 26 Aug 2020 14:41:17 +0800 > Yan Zhao wrote: > > > previously, we want to regard the two mdevs created with dsa-1dwq x 30 and > > dsa-2dwq x 15 as compatible, because the two mdevs consist equal resources. > > > > But, as it's a burden to upper layer, we agree that if this condition > > happens, we still treat the two as incompatible. > > > > To fix it, either the driver should expose dsa-1dwq only, or the target > > dsa-2dwq needs to be destroyed and reallocated via dsa-1dwq x 30. > > AFAIU, these are mdev types, aren't they? So, basically, any management > software needs to take care to use the matching mdev type on the target > system for device creation? or just do the simple thing of use the same mdev type on the source and dest. matching mdevtypes is not nessiarly trivial. we could do that but we woudl have to do that in python rather then sql so it would be slower to do at least today. we dont currently have the ablity to say the resouce provider must have 1 of these set of traits. just that we must have a specific trait. this is a feature we have disucssed a couple of times and delayed untill we really really need it but its not out of the question that we could add it for this usecase. i suspect however we would do exact match first and explore this later after the inital mdev migration works. by the way i was looking at some vdpa reslated matiail today and noticed vdpa devices are nolonger usign mdevs and and now use a vhost chardev so i guess we will need a completely seperate mechanioum for vdpa vs mdev migration as a result. that is rather unfortunet but i guess that is life. > From fungi at yuggoth.org Fri Aug 28 16:11:25 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 28 Aug 2020 16:11:25 +0000 Subject: Horizon installation problem In-Reply-To: References: Message-ID: <20200828161125.vbqwob5i6ocyreor@yuggoth.org> On 2020-08-28 12:33:07 +0430 (+0430), mahdi abbasi wrote: [...] > When i want install horizon with pip3 i recieve and error: > > Could not satisfy constraints for horizon: installation from path or url > cannot be constrained to a version. [...] This sounds like you're passing a -c option to pip telling it to apply a constraints file, but you're attempting to install Horizon from source instead of from a released package so it can't be matched against the constraints list. The easy workaround is to delete the Horizon entry from the constraints list you're using, or consume a release of Horizon from PyPI instead of using a source checkout. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From gmann at ghanshyammann.com Fri Aug 28 16:35:55 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 28 Aug 2020 11:35:55 -0500 Subject: [all][tc][policy] Progress report of consistent and secure default policies pop-up team Message-ID: <17435ecf42a.129e8542b80609.3552606214442342355@ghanshyammann.com> Hello Everyone, This is a regular update on progress in 'Consistent and Secure Default Policies Popup Team'. We will try to make it a monthly report form now onwards. Progress so far: ============ * Popup team meet twice in a month and discuss and work on progress and pre-work to do. - https://wiki.openstack.org/wiki/Consistent_and_Secure_Default_Policies_Popup_Team#Meeting * Pre-work to provide a smooth migration path to the new policy ** Migrate Default Policy Format from JSON to YAML - This involves oslo side + each project side works. - oslo side work to provide tool and utils method are merged (one patch is in gate). - The new tool 'oslopolicy-convert-json-to-yaml' is available now to convert your existing JSON formatted policy file to YAML formatted in a backward-compatible way. - I have started to do it in Nova (need to update the patch though) to give example work for other projects: https://review.opendev.org/#/c/748059/ - all work is tracked here: https://review.opendev.org/#/q/topic:bp/policy-json-to-yaml+(status:open+OR+status:merged) ** Improving documentation about target resources (oslo.policy) - https://bugs.launchpad.net/oslo.policy/+bug/1886857 - raildo pushed the patch which is under review: https://review.opendev.org/#/c/743318/ * Team Progress: (list of a team interested or have volunteer to work) ** Keystone (COMPLETED; use as a reference) ** Nova (COMPLETED; use as a reference) - All APIs except deprecated APIs were done in the Ussuri cycle and deprecated APIs also done now. ** Cyborg (in-progress) - Spec is merged, code under review. ** Barbican (not started) ** Neutron (not started) ** Cinder (not started) ** Manila (not started) Why This Is Important ================= (I have copied it from Colleen email which is nicely written) Separating system, domain, and project-scope APIs and providing meaningful default roles is critical to facilitating secure cloud deployments and to fulfilling OpenStack's vision as a fully self-service infrastructure provider[1]. Until all projects have completed this policy migration, the "reader" role that exists in keystone is dangerously misleading, and the `[oslo_policy]/enforce_scope` option has limited usefulness as long as projects lack uniformity in how an administrator can use scoped APIs. How You Can Help ================ Contributor: - You can help by starting the work in your (or any other you would like to help) project and attend popup team meeting in case of any question, review request etc. Cloud operator: - Please help review the proposed policy rule changes to sanity-check the new scope and role defaults. - Migrate your JSON formatted policy file to YAML JSON formatted file can be problematic in the various way as described here[2]. You can use 'oslopolicy-convert-json-to-yaml' tool [3] to convert your existing JSON formatted policy file to YAML formatted in a backward-compatible way. 
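As a rough sketch of what that conversion could look like for a single service (the namespace and file paths here are examples, not required locations; see [3] for the authoritative usage):

# Convert an existing JSON-formatted policy file for the Nova namespace to
# YAML, preserving the current overrides in a backward-compatible way.
oslopolicy-convert-json-to-yaml --namespace nova \
  --policy-file /etc/nova/policy.json \
  --output-file /etc/nova/policy.yaml

After reviewing the generated YAML file, point the service's [oslo_policy] policy_file option at it instead of the old JSON file.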
[1] https://governance.openstack.org/tc/reference/technical-vision.html#self-service [2] https://specs.openstack.org/openstack/oslo-specs/specs/victoria/policy-json-to-yaml.html#problem-description [3] https://docs.openstack.org/oslo.policy/latest/cli/oslopolicy-convert-json-to-yaml.html -gmann & raildo From cboylan at sapwetik.org Fri Aug 28 16:54:11 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Fri, 28 Aug 2020 09:54:11 -0700 Subject: [neutron][gate] verbose q-svc log files and e-r indexing In-Reply-To: <20200828100411.l7egqidkfzfi4xjt@skaplons-mac> References: <20200818103323.wq5upyjn4nzsqhx7@skaplons-mac> <20200818150052.u4xkjsptejikwcny@skaplons-mac> <62e4fcd2-0f7a-a7d3-7692-3ad9a05c8399@gmail.com> <20200828100411.l7egqidkfzfi4xjt@skaplons-mac> Message-ID: <22d34440-fc65-4f4b-a5e7-c5725283aa58@www.fastmail.com> On Fri, Aug 28, 2020, at 3:04 AM, Slawek Kaplonski wrote: > Hi, > > On Tue, Aug 18, 2020 at 02:10:40PM -0700, melanie witt wrote: > > On 8/18/20 08:00, Slawek Kaplonski wrote: > > > Hi, > > > > > > I proposed patch [1] which seems that decreased size of the neutron-server log > > > a bit - see [2] but it's still about 40M :/ > > > > > > [1] https://review.opendev.org/#/c/730879/ > > > [2] https://48dcf568cd222acfbfb6-11d92d8452a346ca231ad13d26a55a7d.ssl.cf2.rackcdn.com/746714/1/check/tempest-full-py3/5c1399c/controller/logs/ > > > > Thanks for jumping in to help, Slawek! Indeed your proposed patch improves things from 60M-70M => 40M (good!). > > > > With your patch applied, the most frequent potential log message I see now is like this: > > > > Aug 18 14:40:21.294549 ubuntu-bionic-rax-iad-0019321276 neutron-server[5829]: DEBUG neutron_lib.callbacks.manager [None req-eadfbe92-eaee-4e3e-a5c0-f18aa8ba9772 None None] Notify callbacks ['neutron.services.segments.db._update_segment_host_mapping_for_agent-8764691834039', 'neutron.plugins.ml2.plugin.Ml2Plugin._retry_binding_revived_agents-4033733'] for agent, after_update {{(pid=6206) _notify_loop /opt/stack/neutron-lib/neutron_lib/callbacks/manager.py:193}} > > > > with the line count difference being with and without: > > > > $ wc -l "screen-q-svc.txt" > > 102493 screen-q-svc.txt > > > > $ grep -v "neutron_lib.callbacks.manager" "screen-q-svc.txt" |wc -l > > 83261 > > > > so I suppose we could predict a decrease in file size of about 40M => 32M if we were able to remove the neutron_lib.callbacks.manager output. > > I was looking at this again today but I'm really not sure if we should get rid > of those messages from the log. > For now I think that indexing of screen-q-svc.txt file is disabled so this size > of the log shouldn't be big problem (I hope) and I would like to not remove any > other debug messages from it if that will not be really necessary. Maybe as an option we split this into two log files. One that is INFO and above and the other that includes everything with DEBUG? Then we can index the INFO and above contents only. One thing to keep in mind here is that this system tends to act like a canary for when our logs would create problems for people in production. The q-svc logs here are significantly more chatty than the other services. Not necessarily a problem, but don't be surprised if people notice after they upgrade and start complaining. > > > > > But I'm not sure whether that's a critical debugging element or not. 
> > > > -melanie > > > > -- > Slawek Kaplonski > Principal software engineer > Red Hat > > > From ltoscano at redhat.com Fri Aug 28 17:33:15 2020 From: ltoscano at redhat.com (Luigi Toscano) Date: Fri, 28 Aug 2020 13:33:15 -0400 (EDT) Subject: [all][goals] Switch legacy Zuul jobs to native - update #3 In-Reply-To: <4658601.CvnuH1ECHv@whitebase.usersys.redhat.com> Message-ID: <2019711119.48014000.1598635995968.JavaMail.zimbra@redhat.com> Hi, it's time for another status report on this goal. A lot of reviews have been merged in the past 10 days, and several projects are not on the list anymore. This is a very good news, but we still have some work to complete. Please keep pushing! Status ====== The number of the project with legacy jobs is now limited, so I'm going to explain the status in more details for each of them. The links to the patches can be found in the etherpad [3] (see below for the links). cinder ------ There is just one test left, and it is definitely tricky, because it implements a cycle of "change tempest configuration"/"run tempest"/"repeat", which is not the usual pattern. In the worst case I will "port" by adding a simple bash wrapper, but I'd like to have a clean ansible solution. designate --------- A patch for the only legacy job is under review after some forth and back, but there are some open questions. heat ------ Only one legacy job left, the heat cores are aware of it. infra ----- Only one devstack-gate job in the os-loganalyze repository, which should be probably retired. There are 2 other legacy jobs, but not devstack-gate, so less urgent. ironic ------ There has been an open review with the full port of the last legacy job, but it is failing. As it is has been failing even before the porting, the patch could be probably merged as it is. karbor ------ A patch for the only legacy job is under review, but it still has some issues. manila ------ There is only one legacy-base job (not devstack-gate), so less urgent, but there is a patch for it. monasca ------- One job in the monasca-transform repository, which is most likely due for a retirement. There are 3 legacy (non devstack-gate) jobs in other repositories. murano ------ There are two legacy jobs left. I'm not sure whether murano-apps-refstackclient-unittest is still needed. murano-dashboard-sanity-check is a bit tricky, the tests still use nose and the corresponding code in horizon has seen several changes. neutron ------- There are three types of legacy jobs: * all jobs in networking-midonet, whose retirement is under discussion, but the final decision is not clear, so a porting may be needed anyway: * two grenade jobs are being worked on; * the remaining legacy job could be maybe dropped. nova ---- The team is trying to port the two legacy job left with some refactoring, but it may require some effort yet. oslo ---- Only one legacy job left, but it is part of the soon-to-be-retired devstack-plugin-zmq repository. senlin ------ A patch for the only legacy job has been proposed and it is working, needs reviews. trove ----- The trove-grenade job should be ported, but on the other hand, trove has no grenade plugin. At this point it is unlikely to be implemented before Victoria, so maybe the job can be dropped for now. zaqar ----- A few patches have been proposed and working. One of them is failing (python-zaqarclient) but it does a bit more than a simple porting, so it may be simply changed to do exactly what the old job was doing (input needed). 
References ========== [1] the goal: https://governance.openstack.org/tc/goals/selected/victoria/native-zuulv3-jobs.html [2] the up-to-date Zuul v3 porting guide: https://docs.openstack.org/project-team-guide/zuulv3.html [3] the etherpad which tracks the current status: https://etherpad.opendev.org/p/goal-victoria-native-zuulv3-migration [4] the previous reports: http://lists.openstack.org/pipermail/openstack-discuss/2020-July/016058.html http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016561.html Ciao -- Luigi From romanko at selectel.com Fri Aug 28 17:52:15 2020 From: romanko at selectel.com (=?UTF-8?B?0JjQstCw0L0g0KDQvtC80LDQvdGM0LrQvg==?=) Date: Fri, 28 Aug 2020 20:52:15 +0300 Subject: [tc][telemetry][gnocchi] The future of Gnocchi in OpenStack In-Reply-To: <0a22dd8a-2b54-cd22-1734-619d28d6efc8@catalystcloud.nz> References: <0a22dd8a-2b54-cd22-1734-619d28d6efc8@catalystcloud.nz> Message-ID: пт, 28 авг. 2020 г. в 15:40, Adrian Turjak : > But for those running Gnocchi in prod, this is likely something you may > want to know about and we'd like to hear from you. > Hello, everyone! Here at Selectel we use Gnocchi as a backend for Ceilometer – we gather different metrics from virtual machines and provide our customers with graphs in a control panel. In this scenario we rely on Gnocchi's Keystone auth support and nearly standard mappings for instances, volumes, ports, etc provided out of the box. We also use Gnocchi as a secondary target for our home-grown billing system. Billing measures are gathered from different OpenStack and custom APIs, go through the charging engine and then being POSTed to Gnocchi API in batches. Here again we need the possibility to fetch measures with project- and domain- scoped tokens on the customer side in the control panel to be able to separate scopes for resellers (domain owners) and their clients (project owners). The third way to consume Gnocchi API is through OpenStack Watcher in it's strategy for balancing load in our regions. Here we use hosts metrics as well as virtual machines metrics. What do we like in Gnocchi: - API is clean and easy to use, object model is universal and makes us able to utilize it in different scenarios; - Fast enough for our use cases; - Can store metrics for a long period of time with a ceph backend with no performance penalty – useful in billing case. What we do not like: - server-side aggregations do not work as one might think they should work – API and CLI are very hard to use, we stopped trying to use them; - very CPU and disk IO intensive, platforms are hot like hell 24/7 processing not more then 1k metrics per second; - sometimes deadlocks happen in Redis incoming metrics storage preventing measures from certain metrics from being processed. What are our plans for the nearest future: - try to switch Watcher to Grafana backend to be able to use the same Prometheus metrics we rely on for alerting and capacity planning; - continue using Gnocchi only for VMs mertics, switching billing system for something more reliable in terms of missed points on graphs. Speaking about VMs metrics, it would probably be great to be able to continue using Gnocchi API for customer-facing features as it works well with OpenStack object model, authentication and everything. But Gnocchi's TSDB is not the best on the market. 
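For reference, the batch write path mentioned above is plain HTTP against Gnocchi's batch measures endpoint; a rough sketch follows, with the endpoint path, payload shape and token handling written from memory as assumptions (check them against the Gnocchi API reference before relying on this), and the resource id and metric name as placeholders:

    import datetime
    import requests

    GNOCCHI = "https://gnocchi.example.com"   # placeholder endpoint
    TOKEN = "..."                             # Keystone token, obtained elsewhere

    # One batch: resource id -> metric name -> list of measures.
    batch = {
        "11111111-2222-3333-4444-555555555555": {   # placeholder resource id
            "billing.volume.size": [
                {"timestamp": datetime.datetime.utcnow().isoformat(),
                 "value": 100.0},
            ],
        },
    }

    resp = requests.post(
        GNOCCHI + "/v1/batch/resources/metrics/measures",
        params={"create_metrics": "true"},
        headers={"X-Auth-Token": TOKEN},
        json=batch,
    )
    resp.raise_for_status()
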
By switching it to Victoria Metrics, providing Prometheus API and working amazingly with Grafana, we would be able to gather and store metrics with node/libvirt exporters and Prometheus doing remote writes to Victoria, and consume them via Grafana/AlertManager or Gnocchi API depending on a scenario. -- Ivan Romanko Selectel -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Fri Aug 28 18:56:46 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 28 Aug 2020 13:56:46 -0500 Subject: [release] Release countdown for week R-6 Aug 31 - Sept 4 Message-ID: <20200828185646.GA128227@sm-workstation> Development Focus ----------------- Work on libraries should be wrapping up, in preparation for the various library-related deadlines coming up. Now is a good time to make decisions on deferring feature work to the next development cycle in order to be able to focus on finishing already-started feature work. General Information ------------------- We are now getting close to the end of the cycle, and will be gradually freezing feature work on the various deliverables that make up the OpenStack release. This coming week is the deadline for general libraries (except client libraries): their last feature release needs to happen before "Non-client library freeze" on September 3. Only bugfix releases will be allowed beyond this point. When requesting those library releases, you can also include the stable/victoria branching request with the review (as an example, see the "branches" section here: https://opendev.org/openstack/releases/src/branch/master/deliverables/pike/os-brick.yaml#n2 In the next weeks we will have deadlines for: * Client libraries (think python-*client libraries), which need to have their last feature release before "Client library freeze" (Sept 10) * Deliverables following a cycle-with-rc model (that would be most services), which observe a Feature freeze on that same date, Sept 10. Any feature addition beyond that date should be discussed on the mailing-list and get PTL approval. As we are getting to the point of creating stable/victoria branches, this would be a good point for teams to review membership in their victoria-stable-maint groups. Once the stable/victoria branches are cut for a repo, the ability to approve any necessary backports into those branches for Victoria will be limited to the members of that stable team. If there are any questions about stable policy or stable team membership, please reach out in the #openstack-stable channel. Upcoming Deadlines & Dates -------------------------- Non-client library freeze: September 3 (R-6 week) Client library freeze: September 10 (R-5 week) Victoria-3 milestone: September 10 (R-5 week) Cycle Highlights Due: September 10 (R-5 week) Victoria release: October 14 From zaitcev at redhat.com Fri Aug 28 19:48:44 2020 From: zaitcev at redhat.com (Pete Zaitcev) Date: Fri, 28 Aug 2020 14:48:44 -0500 Subject: [tripleo, ironic] Error: Could not retrieve ... pxelinux.0 Message-ID: <20200828144844.7787707d@suzdal.zaitcev.lan> Hello: I wanted to give the TripleO a try, so started follow our installation guide for Ussuri, and eventually made it to "openstack undercloud install". 
It fails with something like this: Aug 28 10:10:53 undercloud puppet-user[48657]: Error: /Stage[main]/Ironic::Pxe/File[/var/lib/ironic/tftpboot/ipxe.efi]: Could not evaluate: Could not retrieve information from environment production source(s) file:/usr/share/ipxe/ipxe-x86_64.efi Aug 27 20:05:42 undercloud puppet-user[37048]: Error: /Stage[main]/Ironic::Pxe/Ironic::Pxe::Tftpboot_file[pxelinux.0]/File[/var/lib/ironic/tftpboot/pxelinux.0]: Could not evaluate: Could not retrieve information from environment production source(s) file:/tftpboot/pxelinux.0 Does anyone have an idea what it wants? I added a couple of packages on the host system that provided the files mentioned in the message, but it made no difference. Ussuri is conteinerized anyway. Since I'm very new to this, I have no clue where to look at all. The nearest task is a wrapper of some kind, so the install-undercloud.log looks like this: 2020-08-28 14:11:31.397 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] TASK [Run container-puppet tasks (generate config) during step 1 with paunch] *** 2020-08-28 14:11:31.397 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] Friday 28 August 2020 14:11:31 -0400 (0:00:00.302) 0:06:28.734 ********* 2020-08-28 14:11:32.223 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] changed: [undercloud] 2020-08-28 14:11:32.325 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] 2020-08-28 14:11:32.326 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] TASK [Wait for container-puppet tasks (generate config) to finish] ************* 2020-08-28 14:11:32.326 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] Friday 28 August 2020 14:11:32 -0400 (0:00:00.928) 0:06:29.663 ********* 2020-08-28 14:11:32.948 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] WAITING FOR COMPLETION: Wait for container-puppet tasks (generate config) to finish (1200 retries left). . . . If anyone could tell roughly what is supposed to be going on here, it would be great. I may be able figure out the rest. Greetings, -- Pete From aschultz at redhat.com Fri Aug 28 20:00:11 2020 From: aschultz at redhat.com (Alex Schultz) Date: Fri, 28 Aug 2020 14:00:11 -0600 Subject: [tripleo, ironic] Error: Could not retrieve ... pxelinux.0 In-Reply-To: <20200828144844.7787707d@suzdal.zaitcev.lan> References: <20200828144844.7787707d@suzdal.zaitcev.lan> Message-ID: I've seen this in the past if there is a mismatch between the host OS and the Containers. Centos7 host with centos8 containers or vice versa. Ussuri should be CentOS8 host OS and make sure you're pulling the correct containers. The Ironic containers have some pathing mismatches when the configuration gets generated around this. It used to be compatible but we broke it at some point when switching some of the tftp location bits. Thanks, -Alex On Fri, Aug 28, 2020 at 1:55 PM Pete Zaitcev wrote: > > Hello: > > I wanted to give the TripleO a try, so started follow our > installation guide for Ussuri, and eventually made it to > "openstack undercloud install". 
It fails with something like this: > > Aug 28 10:10:53 undercloud puppet-user[48657]: Error: /Stage[main]/Ironic::Pxe/File[/var/lib/ironic/tftpboot/ipxe.efi]: Could not evaluate: Could not retrieve information from environment production source(s) file:/usr/share/ipxe/ipxe-x86_64.efi > Aug 27 20:05:42 undercloud puppet-user[37048]: Error: /Stage[main]/Ironic::Pxe/Ironic::Pxe::Tftpboot_file[pxelinux.0]/File[/var/lib/ironic/tftpboot/pxelinux.0]: Could not evaluate: Could not retrieve information from environment production source(s) file:/tftpboot/pxelinux.0 > > Does anyone have an idea what it wants? > > I added a couple of packages on the host system that provided > the files mentioned in the message, but it made no difference. > Ussuri is conteinerized anyway. > > Since I'm very new to this, I have no clue where to look at all. > The nearest task is a wrapper of some kind, so the install-undercloud.log > looks like this: > > 2020-08-28 14:11:31.397 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] TASK [Run container-puppet tasks (generate config) during step 1 with paunch] *** > 2020-08-28 14:11:31.397 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] Friday 28 August 2020 14:11:31 -0400 (0:00:00.302) 0:06:28.734 ********* > 2020-08-28 14:11:32.223 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] changed: [undercloud] > 2020-08-28 14:11:32.325 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] > 2020-08-28 14:11:32.326 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] TASK [Wait for container-puppet tasks (generate config) to finish] ************* > 2020-08-28 14:11:32.326 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] Friday 28 August 2020 14:11:32 -0400 (0:00:00.928) 0:06:29.663 ********* > 2020-08-28 14:11:32.948 60599 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] WAITING FOR COMPLETION: Wait for container-puppet tasks (generate config) to finish (1200 retries left). > . . . > > If anyone could tell roughly what is supposed to be going on here, > it would be great. I may be able figure out the rest. > > Greetings, > -- Pete > > From adriant at catalystcloud.nz Fri Aug 28 22:21:48 2020 From: adriant at catalystcloud.nz (Adrian Turjak) Date: Sat, 29 Aug 2020 10:21:48 +1200 Subject: [tc][telemetry][gnocchi] The future of Gnocchi in OpenStack In-Reply-To: <5cc54d3f-ecf3-a769-9edb-187efc1c2d3f@redhat.com> References: <0a22dd8a-2b54-cd22-1734-619d28d6efc8@catalystcloud.nz> <5cc54d3f-ecf3-a769-9edb-187efc1c2d3f@redhat.com> Message-ID: <89261d64-e88e-ff2e-0a28-894c509d50ab@catalystcloud.nz> On 29/08/20 2:48 am, Zane Bitter wrote: > I think a large part of the issue here is that there are multiple > reasons for wanting (small-t) telemetry from OpenStack, and > historically because of reasons they have all been conflated into one > Thing with the result that sometimes one use case wins. At least 3 > that I can think of are: > > 1) Monitoring the OpenStack infrastructure by the operator, including > feeding into business processes like reporting, capacity planning &c. > > 2) Billing > > 3) Monitoring user resources by the user/application, either directly > or via other OpenStack services like Heat or Senlin. > > > For the first, you just want to be able to dump data into a TSDB of > the operator's choice. Since all of the reporting requirements are > business-specific anyway, it's up to the operator to decide how they > want to store the data and how they want to interact with it. 
It > appears that this may have been the theory behind the Gnocchi split. > > On the other hand, for the third one you really need something that > should be an official OpenStack API with all of the attendant > stability guarantees, because it is part of OpenStack's user interface. > > The second lands somewhere in between; AIUI CloudKitty is written to > support multiple back-ends, with OpenStack Telemetry being the primary > one. So it needs a fairly stable API because it's consumed by other > OpenStack projects, but it's ultimately operator-facing. > > > As I have argued before, when we are thinking about road maps we need > to think of these as different use cases, and they're different enough > that they are probably best served by least two separate tools. > > Mohammed has made a compelling argument in the past that Prometheus is > more or less the industry standard for the first use case, and we > should just export metrics to that directly in the OpenStack services, > rather than going through the Ceilometer collector. > > I don't know what should be done about the third, but I do know that > currently Telemetry is breaking Heat's gate and people are seriously > discussing disabling the Telemetry-related tests, which I assume would > mean deprecating the resources. Monasca offers an alternative, but > isn't preferred for some distributors and operators because it brings > the whole Java ecosystem along for the ride (managing the Python one > is already hard enough). > > cheers, > Zane. > You are totally right about the three use cases, and we need to address this as we move forward with Not-Gnocchi and the rest of Telemetry. Internally we've never used OS-Telemetry for case 1, but we do use it for cases 2 and 3. I do think having a stable API for OpenStack for those last two cases is worth it, and I don't think merging those together is too hard. The way Cloudkitty (and our thing Distil) process the data for billing means we aren't needing to store months of data in the telemetry system because we ingest and aggregate into our own systems. The third use case doesn't need much long term data in a high level of granularity, but does (like billing) need high accuracy closer to 'now'. So again I think those line up well to fit into a single system, with maybe different granularity on specific metrics. We should try and fix the telemetry heat tests ideally, because there are people using Aodh and auto-scaling. As for case 1, I agree that trying to encourage Prometheus support in OpenStack is a good aim. Sadly though supporting it directly in each service likely won't be too easy, but Ceilometer already supports pushing to it, so that's good enough for now: https://github.com/openstack/ceilometer/blob/master/ceilometer/publisher/prometheus.py We do need a more coherent future plan for Telemetry in OpenStack, but the starting point is stabilizing and consolidating before we try and steer in a new direction. From mahdi.abbasi.2013 at gmail.com Sat Aug 29 14:33:59 2020 From: mahdi.abbasi.2013 at gmail.com (mahdi abbasi) Date: Sat, 29 Aug 2020 19:03:59 +0430 Subject: Openstack zun ui Message-ID: Hi, I want install zun ui, first i installed horizon with package and then install zun ui with pip and then python horizon_path/manage.py collectstatic But i receive error in httpd finaly. Error: ImportError: No module named zun_ui Please help me Best Regards Mahdi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hongbin034 at gmail.com Sat Aug 29 20:01:31 2020 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sat, 29 Aug 2020 16:01:31 -0400 Subject: Openstack zun ui In-Reply-To: References: Message-ID: Hi Mahdi, I need more information to help the troubleshooting: * Which version of Horizon you installed (master? stable/ussuri, etc.) * Which version of zun-ui you installed (master? stable/ussuri, etc.) * Which operating system you were using? * What are the outputs of the following commands? $ python --version $ pip freeze $ pip3 freeze On Sat, Aug 29, 2020 at 1:04 PM mahdi abbasi wrote: > Hi, > > I want install zun ui, first i installed horizon with package and then > install zun ui with pip and then python horizon_path/manage.py collectstatic > But i receive error in httpd finaly. > > Error: > ImportError: No module named zun_ui > > Please help me > > Best Regards > Mahdi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mkopec at redhat.com Sun Aug 30 10:03:56 2020 From: mkopec at redhat.com (Martin Kopec) Date: Sun, 30 Aug 2020 12:03:56 +0200 Subject: [all][infra] READMEs of zuul roles not rendered properly - missing content In-Reply-To: <14978702-3919-943f-2750-3ecae1201a68@gmail.com> References: <20200824143618.7xdecj67m5jzwpkz@yuggoth.org> <14978702-3919-943f-2750-3ecae1201a68@gmail.com> Message-ID: Let's summarize the facts mentioned in the thread to continue discussion: **1.** ".. zuul:rolevar::" structure is used to generate a documentation so using a different syntax is not an option **2.** turn rendering of rst files off: **2a:** in order to make pointing to the source code easier - that's an interesting topic, probably it would be better to discuss this separately - turning rendering completely off would be a step back from the visual perspective, however, I can imagine that referring to the specific lines of rst files can be needed in some cases. The automatic rst rendering is really useful, f.e when I open a repo I really appreciate that README.rst is rendered under the list of files (f.e. this view [5]). On the other side, when I open specifically a rst file (f.e. this view [6]) I can totally imagine that this file would be opened in a mode when a user can refer to any line like it is with any other source file. If not that, what about adding a new button besides Raw, Permalink, Blame, History called f.e. Source? This functionality would give a certain advantage to opendev.org over github.com **2b:** in order to avoid omitting content - as also mentioned in the comments in [7] turning rendering completely off would be a step back at least from the visual perspective, therefore I'd rather move to the direction where we improved rendering capabilities - either switching to a different rendering tool or just implementing some kind of try except block as Jeremy suggested in [7] - if rendering of a certain block of code throws an error, that part of the code would be rendered as is. [5] https://opendev.org/openstack/tempest [6] https://opendev.org/openstack/tempest/src/branch/master/README.rst [7] https://review.opendev.org/#/c/747796/ On Tue, 25 Aug 2020 at 16:22, Brian Rosmaita wrote: > On 8/24/20 11:05 AM, Clark Boylan wrote: > > On Mon, Aug 24, 2020, at 7:36 AM, Jeremy Stanley wrote: > >> On 2020-08-24 16:12:17 +0200 (+0200), Martin Kopec wrote: > >>> I've noticed that READMEs of zuul roles within openstack projects > >>> are not rendered properly on opendev.org - ".. zuul:rolevar::" > >>> syntax seems to be the problem. 
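As a rough illustration of the try/except fallback mentioned in point 2b, this is what the idea looks like with docutils on the Python side; Gitea actually calls an external renderer, so this is only a sketch of the proposed behaviour rather than the real integration, and the halt_level value is an assumption about how strict the parser should be:

    import html
    from docutils.core import publish_parts
    from docutils.utils import SystemMessage

    def render_rst(source):
        """Render RST to HTML; fall back to escaped plain text on error."""
        try:
            # halt_level=2 turns parser warnings (e.g. an unknown directive
            # such as zuul:rolevar) into exceptions instead of dropped content.
            return publish_parts(
                source, writer_name="html",
                settings_overrides={"halt_level": 2})["html_body"]
        except SystemMessage:
            # Could not render cleanly: show the source as-is rather than
            # omitting part of the document.
            return "<pre>" + html.escape(source) + "</pre>"
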
Although it's rendered well on > >>> github.com, see f.e. [1] [2]. > > [snip] > > >> To be entirely honest, I wish Gitea didn't automatically attempt to > >> render RST files, that makes it harder to actually refer to the > >> source code for them, and it's a source code browser not a CMS for > >> publishing documentation, but apparently this is a feature many > >> other users do like for some reason. > > > > We can change this behavior by removing the external renderer (though I > expect we're in the minority of preferring ability to link to the source > here). > > This may be a bigger minority that you think ... I put up a patch to > change the default behavior to not render RST, so anyone with a strong > opinion, please comment on the patch: > https://review.opendev.org/#/c/747796/ > > > > > [3] > https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gitea/templates/app.ini.j2#L88-L95 > > [4] > https://opendev.org/opendev/system-config/src/branch/master/docker/gitea/Dockerfile#L92-L94 > > > >> -- > >> Jeremy Stanley > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mahdi.abbasi.2013 at gmail.com Sun Aug 30 05:31:19 2020 From: mahdi.abbasi.2013 at gmail.com (mahdi abbasi) Date: Sun, 30 Aug 2020 10:01:19 +0430 Subject: Openstack zun ui In-Reply-To: References: Message-ID: - horizon and zun ui are version stable/train - my os is centos 7 Outputs of commands: https://paste.ubuntu.com/p/J8szczHrXF/ On Sun, 30 Aug 2020, 00:31 Hongbin Lu, wrote: > Hi Mahdi, > > I need more information to help the troubleshooting: > > * Which version of Horizon you installed (master? stable/ussuri, etc.) > * Which version of zun-ui you installed (master? stable/ussuri, etc.) > * Which operating system you were using? > * What are the outputs of the following commands? > > $ python --version > $ pip freeze > $ pip3 freeze > > On Sat, Aug 29, 2020 at 1:04 PM mahdi abbasi > wrote: > >> Hi, >> >> I want install zun ui, first i installed horizon with package and then >> install zun ui with pip and then python horizon_path/manage.py collectstatic >> But i receive error in httpd finaly. >> >> Error: >> ImportError: No module named zun_ui >> >> Please help me >> >> Best Regards >> Mahdi >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Sun Aug 30 15:41:26 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Sun, 30 Aug 2020 11:41:26 -0400 Subject: [octavia] usage of SELECT .. FOR UPDATE Message-ID: Hi everyone, We're being particularly hit hard across different deployments where Octavia has several SELECT .. FOR UPDATE queries which are causing load balancers to fail to provision properly. - spare_pools: This usually hits on rolling restarts of o-housekeeping as they all seem to try to capture a lock -- https://github.com/openstack/octavia/blob/73fbc05386b512aa1dd86a0ed6e8455cc6b8dc7f/octavia/controller/housekeeping/house_keeping.py#L54 - quota: This hits when provisioning a lot of load balancers in parallel. For example in cases when using Heat -- https://github.com/openstack/octavia/blob/bf3d5372b9fc670ecd08339fa989c9b738ad8d69/octavia/db/repositories.py#L565-L566 These hurt quite a lot in a busy deployment and result in a poor user experience unfortunately. We're trying to off-load Octavia to it's own database server but that is more of a "throw power at the problem" solution. I can imagine that we can probably likely look into a better/cleaner alternative that avoids this entirely? 
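For readers not familiar with the pattern under discussion, the locking in question is the SQLAlchemy SELECT ... FOR UPDATE idiom, roughly as sketched below; this is a generic illustration of where the contention comes from, not Octavia's actual quota code, and the model, column names and connection URL are made up:

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import sessionmaker

    Base = declarative_base()

    class Quota(Base):                       # made-up model, not Octavia's
        __tablename__ = "quotas"
        project_id = Column(String(36), primary_key=True)
        total = Column(Integer, nullable=False)
        in_use = Column(Integer, nullable=False, default=0)

    engine = create_engine("mysql+pymysql://user:pass@db/octavia")  # placeholder
    Session = sessionmaker(bind=engine)

    def reserve(project_id, requested):
        session = Session()
        try:
            # with_for_update() emits SELECT ... FOR UPDATE: any other
            # transaction touching this project's row blocks here until
            # this one commits or rolls back.
            quota = (session.query(Quota)
                     .filter_by(project_id=project_id)
                     .with_for_update()
                     .one())
            if quota.in_use + requested > quota.total:
                raise ValueError("quota exceeded")
            quota.in_use += requested
            session.commit()
        except Exception:
            session.rollback()
            raise
        finally:
            session.close()
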
I'm happy to try and push for some of this work on our side. Thanks, Mohammed -- Mohammed Naser VEXXHOST, Inc. From hongbin034 at gmail.com Sun Aug 30 18:31:27 2020 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sun, 30 Aug 2020 14:31:27 -0400 Subject: Openstack zun ui In-Reply-To: References: Message-ID: On Sun, Aug 30, 2020 at 1:31 AM mahdi abbasi wrote: > - horizon and zun ui are version stable/train > - my os is centos 7 > Outputs of commands: > > https://paste.ubuntu.com/p/J8szczHrXF/ > I saw you were using horizon==18.4.1, which is basically the latest version of Horizon. The zun-ui version is zun-ui==4.0.x which is stable/train. This version of zun-ui doesn't match the version of horizon. If you want to use horizon 18.4.1, suggest to install zun-ui from master branch. Alternatively, you can re-install horizon 16.x.x to match zun-ui 4.0.x. > > > > On Sun, 30 Aug 2020, 00:31 Hongbin Lu, wrote: > >> Hi Mahdi, >> >> I need more information to help the troubleshooting: >> >> * Which version of Horizon you installed (master? stable/ussuri, etc.) >> * Which version of zun-ui you installed (master? stable/ussuri, etc.) >> * Which operating system you were using? >> * What are the outputs of the following commands? >> >> $ python --version >> $ pip freeze >> $ pip3 freeze >> >> On Sat, Aug 29, 2020 at 1:04 PM mahdi abbasi >> wrote: >> >>> Hi, >>> >>> I want install zun ui, first i installed horizon with package and then >>> install zun ui with pip and then python horizon_path/manage.py collectstatic >>> But i receive error in httpd finaly. >>> >>> Error: >>> ImportError: No module named zun_ui >>> >>> Please help me >>> >>> Best Regards >>> Mahdi >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mahdi.abbasi.2013 at gmail.com Sun Aug 30 19:50:49 2020 From: mahdi.abbasi.2013 at gmail.com (mahdi abbasi) Date: Mon, 31 Aug 2020 00:20:49 +0430 Subject: Openstack zun ui In-Reply-To: References: Message-ID: Thanks a lot On Sun, 30 Aug 2020, 23:01 Hongbin Lu, wrote: > > > On Sun, Aug 30, 2020 at 1:31 AM mahdi abbasi > wrote: > >> - horizon and zun ui are version stable/train >> - my os is centos 7 >> Outputs of commands: >> >> https://paste.ubuntu.com/p/J8szczHrXF/ >> > > I saw you were using horizon==18.4.1, which is basically the latest > version of Horizon. The zun-ui version is zun-ui==4.0.x which is > stable/train. This version of zun-ui doesn't match the version of horizon. > If you want to use horizon 18.4.1, suggest to install zun-ui from master > branch. Alternatively, you can re-install horizon 16.x.x to match zun-ui > 4.0.x. > > >> >> >> >> On Sun, 30 Aug 2020, 00:31 Hongbin Lu, wrote: >> >>> Hi Mahdi, >>> >>> I need more information to help the troubleshooting: >>> >>> * Which version of Horizon you installed (master? stable/ussuri, etc.) >>> * Which version of zun-ui you installed (master? stable/ussuri, etc.) >>> * Which operating system you were using? >>> * What are the outputs of the following commands? >>> >>> $ python --version >>> $ pip freeze >>> $ pip3 freeze >>> >>> On Sat, Aug 29, 2020 at 1:04 PM mahdi abbasi < >>> mahdi.abbasi.2013 at gmail.com> wrote: >>> >>>> Hi, >>>> >>>> I want install zun ui, first i installed horizon with package and then >>>> install zun ui with pip and then python horizon_path/manage.py collectstatic >>>> But i receive error in httpd finaly. 
>>>> >>>> Error: >>>> ImportError: No module named zun_ui >>>> >>>> Please help me >>>> >>>> Best Regards >>>> Mahdi >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From iwienand at redhat.com Mon Aug 31 02:27:33 2020 From: iwienand at redhat.com (Ian Wienand) Date: Mon, 31 Aug 2020 12:27:33 +1000 Subject: Setuptools 50 and Devstack Failures [was Re: Setuptools 48 and Devstack Failures] In-Reply-To: <91325864-5995-4cf8-ab22-ab0fe3fdd353@www.fastmail.com> References: <91325864-5995-4cf8-ab22-ab0fe3fdd353@www.fastmail.com> Message-ID: <20200831022733.GA287001@fedora19.localdomain> On Fri, Jul 03, 2020 at 12:13:04PM -0700, Clark Boylan wrote: > Setuptools has made a new version 48 release. This appears to be > causing problems for devstack because `pip install -e $PACKAGE_PATH` > installs commands to /usr/bin and not /usr/local/bin on Ubuntu as it > did in the past. `pip install $PACKAGE_PATH` continues to install to > /usr/local/bin as expected. Devstack is failing because > keystone-manage cannot currently be found at the specific > /usr/local/bin/ path. This is now back with setuptools 50.0.0 [1], see the original issue [2]. The problems are limited to instances where jobs are installing with pip as root into the system environment on platforms that override the default install path (debuntu). The confluence of this set of requirements of neatly describes most devstack testing :/ There's two visible problems; both stem from the same issue. Packaged Debuntu python installs things into dist-packages; leaving site-packages for a non-packaged interpreter, should you wish to install such a thing. It patches distutils to provide this behaviour. The other thing it does is makes pip installs use /usr/local/bin, rather that /usr/bin. Thus it is unfortunately not just s,/usr/local/bin,/usr/bin,g because the new setuptools will install all the libraries into site-packages; which the packaged python intpreter doesn't know to look for. Using SETUPTOOLS_USE_DISTUTILS=stdlib with such installs is one option; it feels like it just makes for more confusing bifurcation. We can't really set this in stack.sh as a global, because we wouldn't want this to apply to subshells that are installing in virtualenv's, for example. It might be the best option. A more radical thought; perhaps we could install a non-packaged python interpreter for devstack runs. Isolation from packaging cuts both ways; while we might work around packaging issues in CI, we're also working around packaging issues that then just hit you when you're in production. The eternal question of "what are we testing". I don't think there's an easy answer. Which is probably why we've ended up here with everything broken ... -i [1] https://github.com/pypa/setuptools/commit/04e3df22df840c6bb244e9b27bc56750c44b7c85 [2] https://github.com/pypa/setuptools/issues/2232 From johnsomor at gmail.com Mon Aug 31 06:19:58 2020 From: johnsomor at gmail.com (Michael Johnson) Date: Sun, 30 Aug 2020 23:19:58 -0700 Subject: [octavia] usage of SELECT .. FOR UPDATE In-Reply-To: References: Message-ID: Hi Mohammed, Have you opened stories for these issues? I haven't seen any bug reports about this. If not, could you capture your information in stories for us to work against? I am not sure I follow the issue fully, so hopefully we can clarify. Housekeeping, when spares pool is enabled, boot spare amphora VMs. I'm not sure how that could inhibit load balancers from provisioning. 
Sure, some periodic jobs in the housekeeping process may deadlock and not complete booting spare VMs, but this will not block any load balancer provisioning. If there are no spares available the worker will simply boot a VM as it would normally do without spares enabled (This was functionality we added to Taskflow from the beginning to make sure we didn't have issues blocking load balancers from provisioning if the spares pool was depleted). This lock was added at an operators request as they did not want any "extra" amphora booted beyond the configured spares pool limit. The quota management does lock the project during the critical phase of managing the quota for the project, just like every OpenStack project. If that is not completing the quota update in a timely manner, please open a story with the logs so we can investigate. I assume your application is correctly designed to handle an asynchronous API (such as neutron, Octavia, etc.) and handle any responses that indicate the object is currently immutable and will retry the request. Michael On Sun, Aug 30, 2020 at 8:47 AM Mohammed Naser wrote: > > Hi everyone, > > We're being particularly hit hard across different deployments where > Octavia has several SELECT .. FOR UPDATE queries which are causing > load balancers to fail to provision properly. > > - spare_pools: This usually hits on rolling restarts of o-housekeeping > as they all seem to try to capture a lock -- > https://github.com/openstack/octavia/blob/73fbc05386b512aa1dd86a0ed6e8455cc6b8dc7f/octavia/controller/housekeeping/house_keeping.py#L54 > > - quota: This hits when provisioning a lot of load balancers in > parallel. For example in cases when using Heat -- > https://github.com/openstack/octavia/blob/bf3d5372b9fc670ecd08339fa989c9b738ad8d69/octavia/db/repositories.py#L565-L566 > > These hurt quite a lot in a busy deployment and result in a poor user > experience unfortunately. We're trying to off-load Octavia to it's > own database server but that is more of a "throw power at the problem" > solution. I can imagine that we can probably likely look into a > better/cleaner alternative that avoids this entirely? > > I'm happy to try and push for some of this work on our side. > > Thanks, > Mohammed > > -- > Mohammed Naser > VEXXHOST, Inc. > From iwienand at redhat.com Mon Aug 31 06:46:22 2020 From: iwienand at redhat.com (Ian Wienand) Date: Mon, 31 Aug 2020 16:46:22 +1000 Subject: Setuptools 50 and Devstack Failures [was Re: Setuptools 48 and Devstack Failures] In-Reply-To: <20200831022733.GA287001@fedora19.localdomain> References: <91325864-5995-4cf8-ab22-ab0fe3fdd353@www.fastmail.com> <20200831022733.GA287001@fedora19.localdomain> Message-ID: <20200831064622.GB287001@fedora19.localdomain> On Mon, Aug 31, 2020 at 12:27:33PM +1000, Ian Wienand wrote: > Thus it is unfortunately not just s,/usr/local/bin,/usr/bin,g because > the new setuptools will install all the libraries into site-packages; > which the packaged python intpreter doesn't know to look for. https://review.opendev.org/748937 was where I tried this before I understood the above. > Using SETUPTOOLS_USE_DISTUTILS=stdlib with such installs is one > option; it feels like it just makes for more confusing bifurcation. > We can't really set this in stack.sh as a global, because we wouldn't > want this to apply to subshells that are installing in virtualenv's, > for example. It might be the best option. https://review.opendev.org/748957 is this option; this should hook into pip_install function. 
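For anyone wanting to confirm which layout a host actually ended up with after a root pip install (the dist-packages plus /usr/local/bin split described in the first message versus the new site-packages behaviour), a small stand-alone check like the following shows where the interpreter will look for packages versus where the console script landed; keystone-manage is only used as a convenient example entry point:

    import shutil
    import site
    import sysconfig

    # Where will "import keystone" actually resolve from?
    print("purelib:", sysconfig.get_paths()["purelib"])
    print("site dirs:", site.getsitepackages())

    # Where did pip put the console script?
    print("keystone-manage ->", shutil.which("keystone-manage"))

    # On Bionic the expected answers are /usr/lib/python3/dist-packages plus
    # /usr/local/bin; with the new setuptools the libraries end up under a
    # site-packages directory the packaged interpreter does not search.
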
How many plugins do "sudo pip install ..." I don't know; they would all still be broken with this. But as mentioned, we don't want to set this globally to avoid setting it for virtualenv installs. > A more radical thought; perhaps we could install a non-packaged python > interpreter for devstack runs. Isolation from packaging cuts both > ways; while we might work around packaging issues in CI, we're also > working around packaging issues that then just hit you when you're in > production. The eternal question of "what are we testing". On further consideration I don't think this is a great idea. Lots of things do #!/usr/bin/python3 which is always going to be the packaged Python. I imagine we'd have quite a mess of things not understanding which python their libraries are installed for. Another thing that failed was just using the system packaged pip; https://review.opendev.org/748942. In theory that would be OK, and obviously patched correctly for the distro, but unfortunately the bionic pip is so old it doesn't pull down manylinux2010 wheels and so there's assorted build breakages from packages that now have to build. https://review.opendev.org/748943/ is a pin to <50 in requirements. devstack uses requirements to install setuptools in it's tools/install_pip.sh so this does move the system back to a version without this change. Obviously this doesn't fix the underlying problem, but helps the gate. -i From dtantsur at redhat.com Mon Aug 31 09:42:03 2020 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Mon, 31 Aug 2020 11:42:03 +0200 Subject: Setuptools 50 and Devstack Failures [was Re: Setuptools 48 and Devstack Failures] In-Reply-To: <20200831022733.GA287001@fedora19.localdomain> References: <91325864-5995-4cf8-ab22-ab0fe3fdd353@www.fastmail.com> <20200831022733.GA287001@fedora19.localdomain> Message-ID: On Mon, Aug 31, 2020 at 4:32 AM Ian Wienand wrote: > On Fri, Jul 03, 2020 at 12:13:04PM -0700, Clark Boylan wrote: > > Setuptools has made a new version 48 release. This appears to be > > causing problems for devstack because `pip install -e $PACKAGE_PATH` > > installs commands to /usr/bin and not /usr/local/bin on Ubuntu as it > > did in the past. `pip install $PACKAGE_PATH` continues to install to > > /usr/local/bin as expected. Devstack is failing because > > keystone-manage cannot currently be found at the specific > > /usr/local/bin/ path. > > This is now back with setuptools 50.0.0 [1], see the original issue > [2]. > > The problems are limited to instances where jobs are installing with > pip as root into the system environment on platforms that override the > default install path (debuntu). The confluence of this set of > requirements of neatly describes most devstack testing :/ > > There's two visible problems; both stem from the same issue. Packaged > Debuntu python installs things into dist-packages; leaving > site-packages for a non-packaged interpreter, should you wish to > install such a thing. It patches distutils to provide this behaviour. > The other thing it does is makes pip installs use /usr/local/bin, > rather that /usr/bin. > > Thus it is unfortunately not just s,/usr/local/bin,/usr/bin,g because > the new setuptools will install all the libraries into site-packages; > which the packaged python intpreter doesn't know to look for. > > Using SETUPTOOLS_USE_DISTUTILS=stdlib with such installs is one > option; it feels like it just makes for more confusing bifurcation. 
> We can't really set this in stack.sh as a global, because we wouldn't > want this to apply to subshells that are installing in virtualenv's, > for example. It might be the best option. > > A more radical thought; perhaps we could install a non-packaged python > interpreter for devstack runs. Isolation from packaging cuts both > ways; while we might work around packaging issues in CI, we're also > working around packaging issues that then just hit you when you're in > production. The eternal question of "what are we testing". > > I don't think there's an easy answer. Which is probably why we've > ended up here with everything broken ... > Is it the right time to discuss switching to virtual environments? We've had quite a positive experience with bifrost since we stopped trying global installations at all. Dmitry > > -i > > [1] > https://github.com/pypa/setuptools/commit/04e3df22df840c6bb244e9b27bc56750c44b7c85 > [2] https://github.com/pypa/setuptools/issues/2232 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From e0ne at e0ne.info Mon Aug 31 12:51:10 2020 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Mon, 31 Aug 2020 15:51:10 +0300 Subject: [horizon] horizon-integration-tests job is broken Message-ID: Hi team, Please, do not trigger recheck if horizon-integration-tests failed like [1]: ______________ TestDashboardHelp.test_dashboard_help_redirection _______________ 'NoneType' object is not iterable While I'm trying to figure out what is happening there [2], any help with troubleshooting is welcome. [1] https://51bb980dc10c72928109-9873e0e5415ff38d9f1a5cc3b1681b19.ssl.cf1.rackcdn.com/744847/2/check/horizon-integration-tests/62ace86/job-output.txt [2] https://review.opendev.org/#/c/749011 Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yan.y.zhao at intel.com Mon Aug 31 02:23:38 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Mon, 31 Aug 2020 10:23:38 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <20200828154741.30cfc1a3.cohuck@redhat.com> References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <20200819212234.223667b3@x1.home> <20200820031621.GA24997@joy-OptiPlex-7040> <20200825163925.1c19b0f0.cohuck@redhat.com> <20200826064117.GA22243@joy-OptiPlex-7040> <20200828154741.30cfc1a3.cohuck@redhat.com> Message-ID: <20200831022338.GA13784@joy-OptiPlex-7040> On Fri, Aug 28, 2020 at 03:47:41PM +0200, Cornelia Huck wrote: > On Wed, 26 Aug 2020 14:41:17 +0800 > Yan Zhao wrote: > > > previously, we want to regard the two mdevs created with dsa-1dwq x 30 and > > dsa-2dwq x 15 as compatible, because the two mdevs consist equal resources. > > > > But, as it's a burden to upper layer, we agree that if this condition > > happens, we still treat the two as incompatible. > > > > To fix it, either the driver should expose dsa-1dwq only, or the target > > dsa-2dwq needs to be destroyed and reallocated via dsa-1dwq x 30. > > AFAIU, these are mdev types, aren't they? So, basically, any management > software needs to take care to use the matching mdev type on the target > system for device creation? dsa-1dwq is the mdev type. there's no dsa-2dwq yet. and I think no dsa-2dwq should be provided in future according to our discussion. 
GVT currently does not support aggregator also. how to add the the aggregator attribute is currently uder discussion, and up to now it is recommended to be a vendor specific attributes. https://lists.freedesktop.org/archives/intel-gvt-dev/2020-July/006854.html. Thanks Yan From jasowang at redhat.com Mon Aug 31 03:07:53 2020 From: jasowang at redhat.com (Jason Wang) Date: Mon, 31 Aug 2020 11:07:53 +0800 Subject: [ovirt-devel] Re: device compatibility interface for live migration with assigned devices In-Reply-To: <20200821165255.53e26628.cohuck@redhat.com> References: <20200818085527.GB20215@redhat.com> <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200819033035.GA21172@joy-OptiPlex-7040> <20200819065951.GB21172@joy-OptiPlex-7040> <20200819081338.GC21172@joy-OptiPlex-7040> <20200820142740.6513884d.cohuck@redhat.com> <20200821165255.53e26628.cohuck@redhat.com> Message-ID: On 2020/8/21 下午10:52, Cornelia Huck wrote: > On Fri, 21 Aug 2020 11:14:41 +0800 > Jason Wang wrote: > >> On 2020/8/20 下午8:27, Cornelia Huck wrote: >>> On Wed, 19 Aug 2020 17:28:38 +0800 >>> Jason Wang wrote: >>> >>>> On 2020/8/19 下午4:13, Yan Zhao wrote: >>>>> On Wed, Aug 19, 2020 at 03:39:50PM +0800, Jason Wang wrote: >>>>>> On 2020/8/19 下午2:59, Yan Zhao wrote: >>>>>>> On Wed, Aug 19, 2020 at 02:57:34PM +0800, Jason Wang wrote: >>>>>>>> On 2020/8/19 上午11:30, Yan Zhao wrote: >>>>>>>>> hi All, >>>>>>>>> could we decide that sysfs is the interface that every VFIO vendor driver >>>>>>>>> needs to provide in order to support vfio live migration, otherwise the >>>>>>>>> userspace management tool would not list the device into the compatible >>>>>>>>> list? >>>>>>>>> >>>>>>>>> if that's true, let's move to the standardizing of the sysfs interface. >>>>>>>>> (1) content >>>>>>>>> common part: (must) >>>>>>>>> - software_version: (in major.minor.bugfix scheme) >>>>>>>> This can not work for devices whose features can be negotiated/advertised >>>>>>>> independently. (E.g virtio devices) >>> I thought the 'software_version' was supposed to describe kind of a >>> 'protocol version' for the data we transmit? I.e., you add a new field, >>> you bump the version number. >> >> Ok, but since we mandate backward compatibility of uABI, is this really >> worth to have a version for sysfs? (Searching on sysfs shows no examples >> like this) > I was not thinking about the sysfs interface, but rather about the data > that is sent over while migrating. E.g. we find out that sending some > auxiliary data is a good idea and bump to version 1.1.0; version 1.0.0 > cannot deal with the extra data, but version 1.1.0 can deal with the > older data stream. > > (...) Well, I think what data to transmit during migration is the duty of qemu not kernel. And I suspect the idea of reading opaque data (with version) from kernel and transmit them to dest is the best approach. > >>>>>>>>> - device_api: vfio-pci or vfio-ccw ... >>>>>>>>> - type: mdev type for mdev device or >>>>>>>>> a signature for physical device which is a counterpart for >>>>>>>>> mdev type. >>>>>>>>> >>>>>>>>> device api specific part: (must) >>>>>>>>> - pci id: pci id of mdev parent device or pci id of physical pci >>>>>>>>> device (device_api is vfio-pci)API here. >>>>>>>> So this assumes a PCI device which is probably not true. >>>>>>>> >>>>>>> for device_api of vfio-pci, why it's not true? >>>>>>> >>>>>>> for vfio-ccw, it's subchannel_type. 
>>>>>> Ok but having two different attributes for the same file is not good idea. >>>>>> How mgmt know there will be a 3rd type? >>>>> that's why some attributes need to be common. e.g. >>>>> device_api: it's common because mgmt need to know it's a pci device or a >>>>> ccw device. and the api type is already defined vfio.h. >>>>> (The field is agreed by and actually suggested by Alex in previous mail) >>>>> type: mdev_type for mdev. if mgmt does not understand it, it would not >>>>> be able to create one compatible mdev device. >>>>> software_version: mgmt can compare the major and minor if it understands >>>>> this fields. >>>> I think it would be helpful if you can describe how mgmt is expected to >>>> work step by step with the proposed sysfs API. This can help people to >>>> understand. >>> My proposal would be: >>> - check that device_api matches >>> - check possible device_api specific attributes >>> - check that type matches [I don't think the combination of mdev types >>> and another attribute to determine compatibility is a good idea; >> >> Any reason for this? Actually if we only use mdev type to detect the >> compatibility, it would be much more easier. Otherwise, we are actually >> re-inventing mdev types. >> >> E.g can we have the same mdev types with different device_api and other >> attributes? > In the end, the mdev type is represented as a string; but I'm not sure > we can expect that two types with the same name, but a different > device_api are related in any way. > > If we e.g. compare vfio-pci and vfio-ccw, they are fundamentally > different. > > I was mostly concerned about the aggregation proposal, where type A + > aggregation value b might be compatible with type B + aggregation value > a. Yes, that looks pretty complicated. > >> >>> actually, the current proposal confuses me every time I look at it] >>> - check that software_version is compatible, assuming semantic >>> versioning >>> - check possible type-specific attributes >> >> I'm not sure if this is too complicated. And I suspect there will be >> vendor specific attributes: >> >> - for compatibility check: I think we should either modeling everything >> via mdev type or making it totally vendor specific. Having something in >> the middle will bring a lot of burden > FWIW, I'm for a strict match on mdev type, and flexibility in per-type > attributes. I'm not sure whether the above flexibility can work better than encoding them to mdev type. If we really want ultra flexibility, we need making the compatibility check totally vendor specific. > >> - for provisioning: it's still not clear. As shown in this proposal, for >> NVME we may need to set remote_url, but unless there will be a subclass >> (NVME) in the mdev (which I guess not), we can't prevent vendor from >> using another attribute name, in this case, tricks like attributes >> iteration in some sub directory won't work. So even if we had some >> common API for compatibility check, the provisioning API is still vendor >> specific ... > Yes, I'm not sure how to deal with the "same thing for different > vendors" problem. We can try to make sure that in-kernel drivers play > nicely, but not much more. Then it's actually a subclass of mdev I guess in the future. 
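To make the flow outlined above a little more concrete, here is a rough sketch of what a management tool's check could look like; the attribute names (device_api, type, software_version) are the ones proposed in this thread, but the sysfs layout, the strict type match and the exact semantic-versioning rule are assumptions, since that is precisely what is still being decided:

    import os

    ATTRS = ("device_api", "type", "software_version")

    def read_attrs(sysfs_dir):
        # Layout is an assumption: one file per attribute under the device dir.
        out = {}
        for name in ATTRS:
            with open(os.path.join(sysfs_dir, name)) as f:
                out[name] = f.read().strip()
        return out

    def compatible(src_dir, dst_dir):
        src, dst = read_attrs(src_dir), read_attrs(dst_dir)
        if src["device_api"] != dst["device_api"]:
            return False
        if src["type"] != dst["type"]:          # strict mdev type match
            return False
        # major.minor.bugfix: same major, destination minor not older.
        s_major, s_minor = (int(x) for x in src["software_version"].split(".")[:2])
        d_major, d_minor = (int(x) for x in dst["software_version"].split(".")[:2])
        return s_major == d_major and d_minor >= s_minor
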
Thanks From yan.y.zhao at intel.com Mon Aug 31 04:43:44 2020 From: yan.y.zhao at intel.com (Yan Zhao) Date: Mon, 31 Aug 2020 12:43:44 +0800 Subject: device compatibility interface for live migration with assigned devices In-Reply-To: <8f5345be73ebf4f8f7f51d6cdc9c2a0d8e0aa45e.camel@redhat.com> References: <3a073222-dcfe-c02d-198b-29f6a507b2e1@redhat.com> <20200818091628.GC20215@redhat.com> <20200818113652.5d81a392.cohuck@redhat.com> <20200820003922.GE21172@joy-OptiPlex-7040> <20200819212234.223667b3@x1.home> <20200820031621.GA24997@joy-OptiPlex-7040> <20200825163925.1c19b0f0.cohuck@redhat.com> <20200826064117.GA22243@joy-OptiPlex-7040> <20200828154741.30cfc1a3.cohuck@redhat.com> <8f5345be73ebf4f8f7f51d6cdc9c2a0d8e0aa45e.camel@redhat.com> Message-ID: <20200831044344.GB13784@joy-OptiPlex-7040> On Fri, Aug 28, 2020 at 03:04:12PM +0100, Sean Mooney wrote: > On Fri, 2020-08-28 at 15:47 +0200, Cornelia Huck wrote: > > On Wed, 26 Aug 2020 14:41:17 +0800 > > Yan Zhao wrote: > > > > > previously, we want to regard the two mdevs created with dsa-1dwq x 30 and > > > dsa-2dwq x 15 as compatible, because the two mdevs consist equal resources. > > > > > > But, as it's a burden to upper layer, we agree that if this condition > > > happens, we still treat the two as incompatible. > > > > > > To fix it, either the driver should expose dsa-1dwq only, or the target > > > dsa-2dwq needs to be destroyed and reallocated via dsa-1dwq x 30. > > > > AFAIU, these are mdev types, aren't they? So, basically, any management > > software needs to take care to use the matching mdev type on the target > > system for device creation? > > or just do the simple thing of use the same mdev type on the source and dest. > matching mdevtypes is not nessiarly trivial. we could do that but we woudl have > to do that in python rather then sql so it would be slower to do at least today. > > we dont currently have the ablity to say the resouce provider must have 1 of these > set of traits. just that we must have a specific trait. this is a feature we have > disucssed a couple of times and delayed untill we really really need it but its not out > of the question that we could add it for this usecase. i suspect however we would do exact > match first and explore this later after the inital mdev migration works. Yes, I think it's good. still, I'd like to put it more explicitly to make ensure it's not missed: the reason we want to specify compatible_type as a trait and check whether target compatible_type is the superset of source compatible_type is for the consideration of backward compatibility. e.g. an old generation device may have a mdev type xxx-v4-yyy, while a newer generation device may be of mdev type xxx-v5-yyy. with the compatible_type traits, the old generation device is still able to be regarded as compatible to newer generation device even their mdev types are not equal. Thanks Yan > by the way i was looking at some vdpa reslated matiail today and noticed vdpa devices are nolonger > usign mdevs and and now use a vhost chardev so i guess we will need a completely seperate mechanioum > for vdpa vs mdev migration as a result. that is rather unfortunet but i guess that is life. > > > From mnaser at vexxhost.com Mon Aug 31 15:42:54 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 31 Aug 2020 11:42:54 -0400 Subject: [tc] weekly update Message-ID: Hi everyone, Here’s an update for what happened in the OpenStack TC this week. You can get more information by checking for changes in openstack/governance repository. 
We've also included a few references to some important mailing list threads that you should check out. # Patches ## Open Reviews - Move ansible-role-XXX-hsm projects to Barbican team https://review.opendev.org/748027 - Retire the devstack-plugin-zmq project https://review.opendev.org/748731 - Retire devstack-plugin-pika project https://review.opendev.org/748730 - Add openstack-ansible/os_senlin role https://review.opendev.org/748677 - Drop all exceptions for legacy validation https://review.opendev.org/745403 - Add openstack-helm-releases to openstack-helm https://review.opendev.org/748302 - Add assert:supports-standalone. https://review.opendev.org/722399 ## Project Updates - Add etcd3gw to Oslo https://review.opendev.org/747188 ## General Changes - Update and simplify comparison of working groups https://review.opendev.org/746763 - Move towards dual office hours in diff TZ https://review.opendev.org/746167 ## Abandoned Changes - Drop requirement of 1/3 positive TC votes to land https://review.opendev.org/746711 - Move towards single office hour https://review.opendev.org/745200 # Email Threads - vPTG October 2020 Signup: http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016497.html Thanks for reading! Mohammed & Kendall -- Mohammed Naser VEXXHOST, Inc. From kendall at openstack.org Mon Aug 31 17:37:48 2020 From: kendall at openstack.org (Kendall Waters) Date: Mon, 31 Aug 2020 12:37:48 -0500 Subject: vPTG October 2020 Team Signup Reminder Message-ID: <5F13B10F-C0C5-4761-8AD2-9B3A55F67441@openstack.org> Hello Everyone! Wanted to give you all a reminder that the deadline for signing up teams for the PTG is approaching! The virtual PTG will be held from Monday October 26th to Friday October 30th, 2020. To signup your team, you must complete BOTH the survey[1] AND reserve time in the ethercalc[2] by September 11th at 7:00 UTC. We ask that the PTL/SIG Chair/Team lead sign up for time to have their discussions in with 4 rules/guidelines. 1. Cross project discussions (like SIGs or support project teams) should be scheduled towards the start of the week so that any discussions that might shape those of other teams happen first. 2. No team should sign up for more than 4 hours per UTC day to help keep participants actively engaged. 3. No team should sign up for more than 16 hours across all time slots to avoid burning out our contributors and to enable participation in multiple teams discussions. Once your team is signed up, please register[3]! And remind your team to register! Registration is free, but since it will be how we contact you with passwords, event details, etc. it is still important! If you have any questions, please let us know. -The Kendalls (diablo_rojo & wendallkaters) [1] Team Survey: https://openstackfoundation.formstack.com/forms/oct2020_vptg_survey [2] Ethercalc Signup: https://ethercalc.openstack.org/7xp2pcbh1ncb [3] PTG Registration: https://october2020ptg.eventbrite.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ruslanas at lpic.lt Mon Aug 31 19:22:29 2020 From: ruslanas at lpic.lt (=?UTF-8?Q?Ruslanas_G=C5=BEibovskis?=) Date: Mon, 31 Aug 2020 21:22:29 +0200 Subject: [TripleO][Ussuri] image prepare, cannot download containers image prepare recently Message-ID: Hi all, I have noticed, that recently my undercloud is not able to download images [0]. 
I have provided the newly generated containers-prepare-parameter.yaml and the
outputs from container image prepare run with --verbose and, at the end, the
beginning of a --debug run [0]. Were there any changes? Running
"openstack tripleo container image prepare default --output-env-file containers-prepare-parameter.yaml --local-push-destination"
has prepared a slightly different file compared to what it generated previously:

NEW # namespace: docker.io/tripleou VS namespace: docker.io/tripleomaster # OLD

[0] - http://paste.openstack.org/show/rBCNAQJBEe9y7CKyi9aG/

--
Ruslanas Gžibovskis
+370 6030 7030

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From skaplons at redhat.com Mon Aug 31 22:02:46 2020
From: skaplons at redhat.com (Slawek Kaplonski)
Date: Tue, 1 Sep 2020 00:02:46 +0200
Subject: [neutron] Team meeting - Tuesday 01.09.2020
Message-ID: <20200831220246.cignba25ghfgqsmf@skaplons-mac>

Hi,

I will be on PTO on Tuesday and I will not be able to run our team meeting.
So let's cancel it this week and see You all at the meeting next week.

For this week there are 3 important things from me:

* Please check the doodle: https://doodle.com/poll/2ppmnua2nuva5nyp and put
  there the time slots which work best for You for the PTG in October - please
  do that this week, as next week I need to book some slots for us,
* As we are close to the Victoria-3 milestone (next week), which is also the
  feature freeze week, please focus now on reviewing patches for BPs targeted
  for this cycle: https://wiki.openstack.org/wiki/Network/Meetings#Blueprints
* The new bug deputy rotation starts this week. Please check the new schedule
  at https://wiki.openstack.org/wiki/Network/Meetings#Bug_deputy and let me
  know if it doesn't work for You.

That's all from me for this week. See You all online :)

--
Slawek Kaplonski
Principal software engineer
Red Hat

From arunkumar.palanisamy at tcs.com Mon Aug 31 19:47:15 2020
From: arunkumar.palanisamy at tcs.com (ARUNKUMAR PALANISAMY)
Date: Mon, 31 Aug 2020 19:47:15 +0000
Subject: Trove images for Cluster testing.
In-Reply-To: 
References: 
Message-ID: 

Hi Lingxian,

Hope you are doing well. Thank you for your mail and the detailed information.

We would like to join the #openstack-trove IRC channel for discussions. Could
you please advise us on the process to join the IRC channel?

We understand that currently there is no IRC meeting happening for Trove. If
any meeting is scheduled, we would like to join it to understand the ongoing
work and progress of Trove and to contribute further.

Regards,
Arunkumar Palanisamy

From: Lingxian Kong
Sent: Friday, August 28, 2020 12:09 AM
To: ARUNKUMAR PALANISAMY
Cc: openstack-discuss at lists.openstack.org; Pravin Mohan
Subject: Re: Trove images for Cluster testing.

"External email. Open with Caution"

Hi Arunkumar,

Unfortunately, for now Trove only supports MySQL and MariaDB; I'm working on
adding PostgreSQL support. All other datastores are unmaintained right now.

Since this (Victoria) dev cycle, a docker container has been introduced in the
Trove guest agent in order to remove the maintenance overhead of multiple Trove
guest images. We only need to maintain one single guest image but can still
support different datastores. We have to do that because the Trove team in the
community is so small.

If supporting Redis, Cassandra, MongoDB or Couchbase is among your feature
requests, you are welcome to contribute to Trove.

Please let me know if you have any other questions. You are also welcome to
join the #openstack-trove IRC channel for discussion.
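To illustrate the single-guest-image approach mentioned above, here is a rough
conceptual sketch (this is not Trove's actual guest agent code; the image names,
paths and helper function are invented for illustration):

    # Conceptual sketch only -- NOT Trove's actual guest agent code.
    # It only illustrates the idea of one generic guest image that starts the
    # requested datastore as a docker container at runtime.
    import docker  # docker SDK for Python, assumed to be on the guest image

    DATASTORE_IMAGES = {  # hypothetical image choices
        "mysql": "mysql:8.0",
        "mariadb": "mariadb:10.4",
    }

    def start_datastore(datastore, data_dir="/var/lib/dbdata"):
        """Run the requested datastore in a container on the guest VM."""
        client = docker.from_env()
        return client.containers.run(
            DATASTORE_IMAGES[datastore],
            detach=True,
            name="trove-%s" % datastore,
            network_mode="host",
            volumes={data_dir: {"bind": "/var/lib/data", "mode": "rw"}},
        )

With something like this, adding a new datastore is mostly a matter of pointing
the guest agent at a different container image and configuration, rather than
building and maintaining a separate guest image.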
---
Lingxian Kong
Senior Software Engineer
Catalyst Cloud
www.catalystcloud.nz

On Fri, Aug 28, 2020 at 6:45 AM ARUNKUMAR PALANISAMY wrote:

Hello Team,

My name is Arunkumar Palanisamy. As part of our project requirements, we are
evaluating Trove components and need your support with experimental datastore
images for cluster testing (Redis, Cassandra, MongoDB, Couchbase).

1.) We are running a devstack environment with the Victoria OpenStack release,
and with this image (trove-master-guest-ubuntu-bionic-dev.qcow2) we are able to
deploy a MySQL instance, but we are getting the error below while creating
MongoDB instances:

"ModuleNotFoundError: No module named 'trove.guestagent.datastore.experimental'"

2.) While trying to create a MongoDB image with the diskimage-builder tool, we
are getting a "Block device" element error.

Regards,
Arunkumar Palanisamy
Cell: +49 172 6972490

=====-----=====-----=====
Notice: The information contained in this e-mail message and/or attachments to
it may contain confidential or privileged information. If you are not the
intended recipient, any dissemination, use, review, distribution, printing or
copying of the information contained in this e-mail message and/or attachments
to it are strictly prohibited. If you have received this communication in
error, please notify us by reply e-mail or telephone and immediately and
permanently delete the message and any attachments. Thank you

-------------- next part --------------
An HTML attachment was scrubbed...
URL: